24 posts categorized "Data Management"

12 November 2009

It's in the news...

I went along to the Forum on News Analytics over in Canary Wharf on Monday evening, organised by Professor Gautam Mitra from OptiRisk / Carisma at Brunel University. We seem to be in the early days of transforming news articles into quantifiable/machine-readable data so that they can be processed automatically/systematically in trading and risk management. It was a good event with both vendors and practitioners attending, so it was reasonably balanced between vendor hype and the current state of market practice.

As background on what is meant by news analytics data, you might for example count the number of news articles about a particular company and look at whether the quantity of news articles is a predictor of some change in the company's stock price or volatility. Moving on from this simple approach (assuming that you are clever enough to be certain about which news is about which company), you can then move towards assessing whether the news is negative, neutral or positive in sentiment about a company/stock.
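
To make the two approaches just described a little more concrete, here is a minimal Python sketch that counts stories per company and then averages a sentiment score for each. The tickers, stories and sentiment labels are all invented for illustration, and the hard part (deciding which company a story is about and scoring its sentiment) is assumed to have been done already by some NLP step that is not shown.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-labelled stories: (ticker, sentiment), where sentiment
# is +1 (positive), 0 (neutral) or -1 (negative). Invented sample data.
stories = [
    ("AAA", 1), ("AAA", -1), ("AAA", 1),
    ("BBB", -1), ("BBB", -1),
    ("CCC", 0),
]


def news_volume(stories):
    """Count how many stories mention each company."""
    return Counter(ticker for ticker, _ in stories)


def net_sentiment(stories):
    """Average sentiment per company, in the range -1 to +1."""
    totals = defaultdict(int)
    counts = Counter()
    for ticker, sentiment in stories:
        totals[ticker] += sentiment
        counts[ticker] += 1
    return {ticker: totals[ticker] / counts[ticker] for ticker in counts}


volume = news_volume(stories)
sentiment = net_sentiment(stories)
for ticker in sorted(volume):
    print(ticker, volume[ticker], round(sentiment[ticker], 2))
```

Either series (story count or average sentiment per day) could then be lined up against price or volatility changes to test for any predictive value.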

The context here is about having the capability to automatically process/analyse any kind of text-based news story, not just those from research analysts that might be nicely tagged with such quantifiers of sentiment (see http://www.rixml.org/ on xml standards for analyst data). The way in which the meaning of the text is "quantified" uses some form of Natural Language Processing.

The event started with a brief talk by Dan diBartolomeo of Northfield Information Services. I hadn't heard of him or his company before (maybe I should pay more attention!) but he seemed a very solid speaker with a strong academic and practical background in investment management and modelling. He referenced a few academic papers (available via their web site) on news analytics, and how news analytics combined with implied volatility could provide better estimates of future volatility than implied volatility alone. He also made some good points about how investment "models" are calibrated to history and how such models need to adapt to "today" - he put it as "how are things different now from the past?" and put forward the idea of a framework for assessing and potentially modifying a model to respond to the "now" situation. He also suggested that the market can react very differently to "expected news" (having a range of investment "what ifs" planned for a known earnings announcement) as opposed to unexpected information (we are back into the realms of the Black Swan and the ultimate in uncertainty wisdom from Donald Rumsfeld).

Armando Gonzalez of RavenPack then began by explaining how RavenPack had become involved in applying text analysis to finance (it seems the subject has its origins, like a lot of things, in the military). RavenPack seem to be the highest-profile quantified news vendor at the moment, and whilst Armando is obviously biased towards pushing the concept that money can be made by adding quantified news data to trading models, he said that not many firms are as yet systematically processing news and most people are relying upon manual interpretation of the news they buy/use. Some of the studies RavenPack have on market news and prices are very interesting, showing how it can take up to 20 mins after a news event before the market settles on a new "fair" price level for a stock. Additionally, and maybe an interesting reflection on human behaviour, in bull markets there are usually twice as many positive stories about companies as negative ones, but strikingly in a bear market there were still almost equal amounts of positive and negative news - so humans are basically optimists! (or delusional, or just plain greedy...take your pick!)

Mark Vreijling of Semlab followed Armando and suggested that a lot of their sales prospects understandably desire "proof" of the benefits of adding quantified news to trading, but this was a little ironic since most financial institutions have been paying to receive "raw" news for years, presumably because they perceive benefit from it. Mark also mentioned that the application of quantified news to risk management was a new but growing area for him and his colleagues.

Gurvinder Brar of Macquarie then went into some of the practicalities of quantifying and using news in automated trading. He suggested that you need to understand what is really "news" (containing information on something that has just happened) and what is merely a news "article" (like a "feature" in a magazine etc). Assessing the relevance of news was also difficult, and he added that setting a hierarchy of what kind of events are important to your trading was a key step in dealing with news data. Fundamentally, his question was: why wait five days for analysts to publish their assessment of a market or company-specific event when you could react to the event in near real-time?

The event then went into "panel" mode where the following points came out:

  • Dan thought that a real challenge was integrating quantified news with all of the other relevant datasets (market data, but also reference data etc)
  • Armando picked up on Dan's point by giving the example of news about Gillette, which at one point was about Gillette the company but then on acquisition became news about the Gillette "brand", which became a part of Procter & Gamble.
  • Dan said that a key problem with processing news was also understanding what news was simply ignored by the news wires i.e. we know what is being talked about, but what could have been talked about, why was it ignored and is it (even so) relevant to trading?
  • Mark and Armando said that the "context" for the news story was vital and that market expectations can turn many "negative" news stories into positive outcomes for trading e.g. the market likes bad news when it is not as "bad" as everyone thought.
  • Dan made a very interesting point about trading in terms of categorising trades as "want to" trades and "have to" trades. He gave the example of a trade being observed that seemingly has no news associated/prompting it - so does this mean the trade is occurring because somebody "has to" make the trade (a fund facing an unwelcome client redemption for example?) or because there has been some information leak to a market participant and such a participant "wants to" make a trade before the news becomes available to the market as a whole?
  • I think all of the panel members then collectively hesitated before answering the next question from the audience, with Microsoft having one of their "text search" R&D team (think Bing...) asking about news categorisation and quantification.
  • Dan also mentioned something that I have only recently become more aware of, which is that apart from major markets in the US, most exchanges world-wide do not publish whether a trade was a "buy" or "sell" trade (they just publish the price and transaction size). Obviously knowing the direction of the trade would be useful to any trading model, and Dan referred to this as wanting to know the "signed volume".
  • A member of the audience then asked whether most quantified news had been based on just the English language and the consensus was that most was based on English, but Natural Language Processing can be trained in other languages relatively easily. A few members of the panel pointed out that all languages change, even English, requiring constant retraining, and also that certain languages, countries and cultures added further complication to the recognition process.
  • The next question asked was whether the panel could outline the major areas that quantified news is applied in - the answer included intraday (but not quite real-time) trading, algorithmic execution, lower frequency portfolio rebalancing and compliance/risk/market abuse detection.
  • A good debate ensued about whether "news" was provided by the official newswires or by the web itself. The panel (and audience) consensus seemed to favour the premise that the news wires are the source of news and the web is a reflection/regurgitation of this news. That said, Gurvinder of Macquarie gave the nice counter example of the analysts/news wires not making much of the new Apple iPod, when looking at the web it was possible to see that the public were in contrast very enthusiastic about it.
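
On Dan's "signed volume" point from the panel: where an exchange only publishes price and size, one common heuristic for guessing trade direction is the "tick rule". To be clear, this is my illustration and not something the panel described. The sketch below treats an uptick as a buy and a downtick as a sell, and the trade data is invented.

```python
def tick_rule_signed_volume(trades):
    """Infer signed volume from a list of (price, size) trades.

    An uptick is treated as a buy (+size) and a downtick as a sell (-size).
    A trade at an unchanged price inherits the sign of the previous trade.
    The first trade has no reference price, so it gets a sign of zero.
    """
    signed = []
    last_price = None
    last_sign = 0
    for price, size in trades:
        if last_price is not None:
            if price > last_price:
                last_sign = 1
            elif price < last_price:
                last_sign = -1
            # Unchanged price: keep the previous sign.
        signed.append(last_sign * size)
        last_price = price
    return signed


# Invented example trades: (price, size)
trades = [(100.0, 500), (100.5, 200), (100.5, 300), (100.2, 400)]
signed = tick_rule_signed_volume(trades)
print(signed)  # [0, 200, 300, -400]
```

This is only an estimate of direction, which is exactly why having the exchange publish the real buy/sell flag would be so useful to a trading model.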

Overall an interesting event. I think the application of "quantified news" to risk management is interesting - maths and financial theory are very interesting but markets are driven by people's behaviour, and if "quantified news" can help us understand this better it has to help in avoiding (some!) of the future problems to be faced in the market.

03 November 2009

Truly "Open" Bloomberg?

Interesting couple of articles from Inside Reference Data and Inside Market Data. The first is on Bloomberg making its codes freely available to all from its website http://bsym.bloomberg.com - given past standards-based attempts like ISINs falling short of providing the industry with unique and useful security IDs this looks to be a welcome addition. This seems to be a publicity "win" for Bloomberg, especially given rival Thomson Reuters has recently got some indifferent publicity with the EU over RIC licensing (see article). No prizes for anyone who thinks that Thomson Reuters will not respond in some way with regard to RIC usage, maybe giving us two working proprietary standards that go "open" - at least everyone would then be matching up Bloomberg Tickers and Reuters RICs in public rather than behind closed doors - and maybe a good opportunity for a Wiki site to do the matching up?

The second relates to Bloomberg providing an open-source data distribution system called "The Platform", I presume as a less expensive alternative to Reuters RMDS. Meanwhile Reuters is busying itself with the plans for its competitor to the Open Bloomberg terminal with "Project Utah". Obviously Bloomberg is comparatively unproven with regard to systems provision so this is a big change and will be very interesting to watch - from a technology point of view but also culturally, since can Bloomberg turn away from thinking in "Terminals" all of the time?

21 October 2009

Integrated Data and Analytics Management

Xenomorph was one of the sponsors on the “Integrated Data Management” webcast last week, hosted by Inside Reference Data (audio recording available here). There were a number of interesting questions that arose from the Webinar.

One fundamental although somewhat academic question was "What is Integrated Data Management?". Certainly everyone seemed convinced that there would be fewer "Enterprise Data Management" (EDM) projects in future, given the expense, scope and scale of such projects. The consensus was that whilst the need for data management was better understood across all financial institutions, data management projects would be bitten off in more manageable chunks by asset type, business function or division (so are silos back in fashion I ask myself?!). Coming back to the original question, I guess my slant on Integrated Data Management is that we are seeing more and more data management projects that have integrated reference data and market data elements to them, primarily driven by the need to sort out data quality/completeness/depth for use within risk management (in light of the financial crisis).

Related to risk management, a topic I pushed was that given the origins of data management for STP/back office, and given the interest in low latency tick data management/analysis in the front office, there seems to be a market gap (particularly in the US?) on how to manage data such as IR/credit curves, volatility surfaces and other derived data sets. These data sets seem to fall into the gap between what is thought of as market data (primarily just prices) and what is reference data (IDs and terms & conditions). This is another area where a more integrated approach to data management would be beneficial, particularly in making all these datasets available for risk management.

Coming back to a "hobby-horse" of mine, I also raised the issue that whilst it is fine to be doing great data management (high quality, complete datasets etc) what is the point if all of your data is ignored by the front office and Excel is used to download the data traders and risk managers need from Open Bloomberg? I think the management of unstructured data (spreadsheets, word docs etc) needs to be elevated as an issue since this (unfortunately?) is where most data resides currently, despite what we data management professionals like to think.

I also think that the principles of good data management (centralisation, quality and transparency) could apply to other things and not just raw "data": what about centralised pricing and valuation, centralised curves and centralised scenarios for risk? Again, what is the point of doing good data management if the ultimate "information" (e.g. a valuation) is produced using poor quality data, with a complete lack of transparency over the data and model used?

A good question was asked about models: given that pricing models and their weaknesses have formed some part of the recent crisis, do we need more complex models? Having had a few conversations about this and thought about it some more, I can see that some would say it is complexity that got us into the crisis so this is the last thing we need. My view is that we do not necessarily need more complex pricing models and valuation techniques, but we certainly need more robust ones, which does not necessarily imply more complexity. Coming back to a point raised by David Rowe previously, I think all quants and risk managers should think about a "second means of valuation" for all the theoretical models they use, and that hedgeability (see recent post on pricing model validation) seems to be the common theme in producing more robust pricing models.


18 September 2009

Pricing Model Validation: Mitigating Model Risk

I managed to catch some of the day yesterday at the "Pricing Model Validation: Mitigating Model Risk" conference. I thought it would be worthwhile going along since firstly the past 12-18 months have made model risk very topical (take a look at previous posts from Riskminds, the Modeller's Manifesto and Wilmott/Rowe).

Secondly more of our clients are looking at managing and centralising pricing models/curve calculators in addition to just managing the underlying data (see this Insight Investment client case study for a recent public example). I am calling this "Analytics Management" which is the business-focussed technology stack that combines pricing models/calculators/analytics with all of the "Data Management" underneath. But enough of my thinly-veiled positioning statements...and on with some of the (hopefully) useful content from the conference outlined below - maybe scan the headings in bold below for those talks of interest but I would particularly recommend the ones by Tanguy Dehapiot and Yuval Millo...

Model Risk 2009 defining and forecasting. First speaker was Professor Phillip Sibbertsen of the University of Hannover on defining and measuring model risk. Phillip started by saying that "Model Risk" was a new category of risk within the confines of "Operational Risk", and that operational risk as defined by the regulators does not yet include the "model risk" of market risk and credit risk, nor the "model risk" of the operational risk model itself. (I am sure I could write that up better!...). Phillip put forward that model risk is not formally a "risk" since it has no probability distribution, and he suggested it should be thought of as "model uncertainty". He also clarified that model risk applies both at the large, portfolio scale (e.g. choice of VAR model etc) and at the smaller, instrument level scale (i.e. pricing of derivatives).

Additionally, in terms of measuring model risk, he excluded human failure from model risk measurement since in his view this was difficult to quantify - this approach did not meet with the approval of some of the audience, who questioned how this could be excluded from a practical point of view. Phillip's colleague, Corinna Luedtke, then presented some work they had done on calibrating different GARCH models to observed data and showing how even a poor model could produce reasonable forecasts of risk if the time period was short. The work was interesting but again the audience highlighted that the human choice (failure?) in choosing the set of models to try was part of "model risk" and should not be excluded from the definition of model risk.

Is a model accurate? Testing the implementation of a model. Second speaker was David Chevance, Head of Equity & FX Model Validation at Dresdner Kleinwort. David outlined the different sorts of model risk: mathematical errors, missing risk factors, divergence from industry practice, model inconsistencies and implementation risk. He then outlined the sources of these risks: bugs, approximations, numerical precision, numerical boundaries and limitations on numerical methods (e.g. Sobol numbers in high dimension Monte Carlo simulations).

David said a key area to start with in validating a model implementation was the front-office documentation of the product, its inputs and payoffs, its pricing model but also details of calibration methods used/needed etc. He made the point here that the documentation can sometimes specify just the deal, but sometimes can express the pricing methodology and pricing parameters. The emphasis was on completeness, accuracy and making use of all of the information available in the documentation. Obviously the ability to review the code used to implement the model was also necessary.

He discussed the trade-offs between a simple validation approach in terms of speed and efficiency of resources against the more time-consuming, resource hungry but more accurate approach of full replication of the model. He also suggested that in choosing a method of validation it was important to balance resource demands against what is actually being validated: payoffs from a single trade, a type of pricing model or a family of financial products. Desired accuracy of the validation was also important, given the trade-off between accuracy and effort, and the fact that small bugs are much more common than large. He finally discussed model version control, the necessary discipline of documenting changes and regression tests for new models, and the regular cycle of model review. Overall it was an interesting talk with a good practical focus.

Practical aspects of valuation model control process. One of the most entertaining and interesting speakers of the day was Tanguy Dehapiot, Head of Validation and Valuation, Group Risk Management at BNP Paribas. He started by referring to a few documents: "Supervisory guidance for assessing banks’ financial instrument fair value practices", April 2009 (BCBS 153), which was then implemented within “Enhancements to the Basel II framework” (BCBS 157). The first part of his presentation was around these documents and what the regulators expect to be in place, so I guess the best approach is to read them (the BCBS 153 document content is only 12 pages long, quite short for a regulator!).

Tanguy pointed out that in his view "Mark to Market" and "Mark to Model" are often misleading as both are often required. He prefers the term "Valuation Methodology". He proposed four valuation modes: Direct Price Quotation, Use of Similar Instruments, Risk Replication, Expected Uncertain Cashflows (NPV) and categorised a useful hierarchy/matrix of which financial products fit into which valuation mode and for what purposes. Within model risk, he split off judgemental errors (choice of model etc) as part of market risk and credit risk and operational errors (model implementation and coding) as more definable and avoidable parts of operational risk.

He had some interesting slants on data, saying that he had been surprised that even getting all of the static data necessary to price simpler instruments like bonds had proven difficult. He outlined how model parameters are often stored across a variety of systems (curve definitions in one place, pricing methodology somewhere else) implying to me that this is sometimes difficult to pull together and needs some centralisation to improve transparency around this.

His view was that market parameters (both observed prices and derived data such as implied volatility surfaces) were often stored in a larger central database, but he warned that this market parameter database needs to be reviewed as part of the model validation process since some of its data is derived (i.e. calculated, maybe using a model!) and as such should not be taken as perfect for all time and for all purposes. He said that it was important to categorise the origin of data and suggested the following types:

  • Quoted on an active exchange
  • Actual private transaction in an active market
  • Tradable broker quotes
  • Consensus prices from market makers
  • Non-binding indicative prices from market makers
  • Counterparty valuation, collateral valuation
  • Actual transactions in inactive market
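
As a rough sketch of how this categorisation might look inside a market parameter database, the Python below tags each stored value with one of the seven origins listed above, plus a flag saying whether the value was derived. The class names, field names and sample records are my own invention for illustration and are not anything Tanguy presented.

```python
from dataclasses import dataclass
from enum import Enum


class DataOrigin(Enum):
    """The seven origin types listed above."""
    EXCHANGE_QUOTE = "Quoted on an active exchange"
    PRIVATE_TRANSACTION_ACTIVE = "Actual private transaction in an active market"
    TRADABLE_BROKER_QUOTE = "Tradable broker quotes"
    CONSENSUS_PRICE = "Consensus prices from market makers"
    INDICATIVE_PRICE = "Non-binding indicative prices from market makers"
    COUNTERPARTY_VALUATION = "Counterparty valuation, collateral valuation"
    TRANSACTION_INACTIVE = "Actual transactions in inactive market"


@dataclass(frozen=True)
class MarketParameter:
    """One stored value (hypothetical record layout)."""
    name: str
    value: float
    origin: DataOrigin
    derived: bool  # True if calculated, e.g. an implied volatility


def needs_model_review(parameters):
    """Names of parameters that are derived and so should be reviewed."""
    return [p.name for p in parameters if p.derived]


# Invented sample records
parameters = [
    MarketParameter("XYZ close price", 101.25, DataOrigin.EXCHANGE_QUOTE, False),
    MarketParameter("XYZ 1Y implied vol", 0.22, DataOrigin.CONSENSUS_PRICE, True),
]
print(needs_model_review(parameters))
```

Recording the origin alongside the value is what makes it possible to ask later whether a given number was observed or calculated.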

Tanguy proposed that there should be a valuation matrix for each instrument, where there might be a different valuation methodology used for end of day valuation versus intraday, for risk or for trading, for pricing individually or within a portfolio reval. I guess here the rationale is appropriateness, efficiency and transparency about what needs to be used when. He also added that he disliked the term "Model Validation" since it seemed to imply that a model was "valid" and preferred "Model Approval" to cover the decision to use a model and "Model Review" to cover model analysis. He said he found managing the "stock" of existing models (and keeping up with when to review them) more difficult than managing the "flow" of new models and products.

Overall Tanguy was a very interesting and funny speaker with lots of practical insights and a fair amount of opinion thrown in, which is always good in my view.

The usefulness of inaccurate models: Financial risk management "in the wild". This talk was given by Dr Yuval Millo of the London School of Economics and he focussed on the evolution of the use of the Black Scholes Merton (B-S-M) model at the CBOE and how the model came to be the means by which the whole options market "communicated". Yuval is a social scientist and prefaced his talk by stating that "Social Sciences are good at predicting the past".

First thing I didn't know (amongst the many things I do not know...) is that the B-S model was not published until a couple of weeks after the CBOE started trading stock options in April 1973. Yuval said that initially the B-S-M derived prices were not accurate at all (around 25% off the market price on CBOE) and that the model was based on assumptions that plainly were not the case on the exchange (only calls available, no short selling, no continuous trading). The model was used by local Chicago trading firms and the story goes that Fischer Black sold large paper "sheets" of option pricing matrices to these traders (there being no calculators/PCs/mobiles around at the time).
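
For anyone wondering what such a pricing "sheet" amounts to, here is a minimal sketch using the textbook Black Scholes formula for a European call on a non-dividend-paying stock, tabulated over a small grid of strikes and expiries. The spot price, interest rate, volatility and grid are all invented, and this is in no way a reconstruction of the actual sheets.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(spot, strike, rate, vol, expiry):
    """Black Scholes price of a European call, no dividends.

    rate and vol are annualised; expiry is in years.
    """
    d1 = (log(spot / strike) + (rate + 0.5 * vol * vol) * expiry) / (vol * sqrt(expiry))
    d2 = d1 - vol * sqrt(expiry)
    return spot * norm_cdf(d1) - strike * exp(-rate * expiry) * norm_cdf(d2)


# Illustrative inputs only
spot, rate, vol = 100.0, 0.05, 0.20
strikes = [90, 100, 110]
expiries = [0.25, 0.5, 1.0]

# The "sheet": one price per (strike, expiry) pair
sheet = {(k, t): bs_call(spot, k, rate, vol, t) for k in strikes for t in expiries}

for k in strikes:
    print(k, [round(sheet[(k, t)], 2) for t in expiries])
```

A trader with a printed grid like this could read off a price by strike and expiry, provided the assumed volatility and rate behind the sheet were close enough to the market on the day.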

As the markets developed, larger East Coast banks entered the market with stocks being held and traded in New York and options being traded in Chicago, so trading became geographically dispersed. This started the need for "early morning meetings" to discuss the market and the B-S-M model and its parameters became the "lingua franca" or means of communication of options market participants.

He described the first years of the Options Clearing Corporation (OCC) which was set up to ensure that the financial obligations of option sellers and buyers were met. Around 1979-80 the OCC worked overnight to calculate margin requirements, based on the (now?) arcane idea that different margin amounts should be associated with different option strategies (straddles, butterflies etc) and the job of the OCC was to take a portfolio of options and optimise which combination of strategies would minimise the margin required for the whole portfolio. He said that there were disputes between traders and the OCC around margin levels and difficulties for the SEC with updating their Net Capital Rules as each new option strategy was created. Eventually, the OCC adopted the B-S-M model and implied volatility as the means of calculating margin against market value which enabled them to move away from the operational difficulty of strategy optimisation.

So the B-S-M became the way in which traders communicated about the market but also the model became vital operationally within clearing for the market. By 1987 B-S-M had become the de-facto standard for the market, with the model driving the market in turn driving use of the model. During the Oct '87 crash the model proved to be very inaccurate but the use of the model did not diminish - maybe psychologically the market participants needed a model (even a wrong model) to make communication easier.

I found this talk very interesting and members of the audience asked whether any similar analysis was going to be done on the Gaussian Copula model used to price CDOs. Yuval said that one of his colleagues was undertaking this research currently. Given that he seemed to be very positive about the use of the B-S-M model within options markets I asked whether he had any opinions on Taleb's criticism of financial engineers and modelling. Yuval said that he and Nassim were friends and agreed to disagree on certain topics...

Stress testing modelling parameters. Next up was Pierpaolo Montana, Head of Model Validation at West LB. Having joined the finance industry out of a career in mathematics and then at a regulator, Pierpaolo began by saying that back in the heady days of 2004 the banks thought that their own risk management systems and practices were well ahead of the regulators. He said that in light of the crisis this proved not to be the case but he feels that this is now more evenly balanced (not sure I would agree, still lots of catching up to do for some institutions I would suggest).

He said that whilst regulators require the validation of risk models and pricing models, and require stress testing of a portfolio, the stress testing of a pricing model itself is not a requirement, has received much less attention and in his view was not done to any great degree before 2007. His point here was that pricing models should work under stress too, otherwise they are a weak foundation for building other risk measures such as stressed VAR.

Whilst focussing on pricing models, he mentioned that risk models also need to be carefully chosen and appropriate to the institution and the types of trading activities it undertakes. As an example he put forward that a simple VAR calculator might be appropriate for a long only equity fund but completely inappropriate for a relative value portfolio.

He said that stress testing had recently received much more attention as a risk management tool and cited the BIS document "Revisions to the Basel II market risk framework" where stressed VAR is introduced as part of the regulatory capital charge calculation. He also mentioned that in order to avoid "standard model" treatment of complex securitised products an institution must be able to demonstrate that its VAR model can cope with these products under times of market stress.
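
As a toy illustration of the stressed VAR idea, the sketch below runs the same historical-simulation VAR calculation over two sets of daily P&L, one from a calm period and one from a stressed period. All the P&L figures are invented, and the tiny sample size, the 90% confidence level and the simple ranking rule are chosen only to keep the example readable; they are not what the BIS document prescribes.

```python
import math


def historical_var(pnl, confidence=0.99):
    """One-day VAR as a positive loss figure, by historical simulation.

    Sorts the losses and picks the one at the given confidence level using
    a simple 'ceiling' rank rule; other quantile conventions exist.
    """
    losses = sorted(-x for x in pnl)            # losses positive, ascending
    rank = math.ceil(confidence * len(losses))  # 1-based rank in sorted list
    return losses[rank - 1]


# Invented daily P&L, in millions
calm_pnl = [0.5, -0.3, 0.2, -0.8, 0.1, 0.4, -0.2, 0.3, -0.5, 0.6]
stressed_pnl = [-2.5, 1.0, -4.0, 0.5, -1.5, 2.0, -3.0, -0.5, 1.5, -6.0]

var_calm = historical_var(calm_pnl, confidence=0.9)
var_stressed = historical_var(stressed_pnl, confidence=0.9)
print(var_calm, var_stressed)  # 0.5 4.0
```

The calculation is identical in both cases; only the history it is fed changes, which is why the quality of the underlying pricing models under stress matters so much.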

Pierpaolo then described the stress testing of base correlation in CDO pricing, and how even moving the base correlation from its usual level of 70% to 99% would not have predicted the valuations observed in the recent crisis. In this way he says that stress testing of models can detect implementation problems and some model weaknesses, but it cannot assist in coping with structural breaks in the market. He also discussed how the B-S-M model is used everywhere (even places it should not really be valid for) since it is a robust model based on the no-arbitrage hypothesis - in contrast the CDO base correlation and other models are not so robust since they are not arbitrage free.

(end of post!)
 


 

17 July 2009

Heavyweight Data Management...

...I am very concerned that I have previously missed an important requirement for data management solutions - a heavyweight one judging by this great discussion on one of the Microsoft forums.

14 May 2009

Microsoft CEP Surfaces as "Orinoco"

Seems like Microsoft have now gone public on the Microsoft TechEd site that they have a Complex Event Processing (CEP) engine that will be coming to market shortly (see MagmaSystems blog post ). One of my colleagues Mark Woodgate attended a briefing event at Microsoft for this technology back in February this year - here's an extract from some internal notes that Mark made back then:

"Microsoft CEP is very similar to StreamBase conceptually (and not unsurprisingly), in the sense that there are adapters and streams and how you merge and split them via some kind of query language is the same. However, StreamBase uses the StreamSQL which as we have seen is SQL-like in syntax but Microsoft CEP uses LINQ and .NET and although conceptually it is doing the same thing, it does not look the same. StreamBase’s argument was you can be an SQL programmer to use it and don’t need lower-level like .NET; however, it’s not SQL really as it has all these ‘extensions’ you have to learn so using .NET might look more tricky but in fact it makes sense. They don’t have a sexy GUI yet for designing CEP applications like StreamBase but it will be done in Visual Studio 2008.

 

Currently, you build various assemblies (I/O adapters, queries and functions) and then bolt them all together, called ‘binding’ by command line tool. You then deploy the application onto one or more machines using another tool so it’s a manual process right now. They are aware this needs to be made easier and more visual. They are allowing other libraries to be bolted in via the various SDKs so it’s pretty open and flexible. It works well with HPC and clusters/grids (or so they say) and of course can be used with SQL Server. The CEP engine also has a web interface based on SOAP so at least non-Windows based systems can talk to it"

 

The release of this technology will be an interesting addition to the CEP market and to the Microsoft technology stack in general. Assuming performance is at credible levels (i.e. not necessarily leading but not appalling either) it will certainly bring both technical and commercial pressure to bear on existing CEP vendors (see earlier post on Aleri/Coral8) and has the potential to broaden the usage of CEP. Obviously Linux-Lovers (sorry, I didn't mean to be personal...) will not agree with this, but Microsoft is putting together an interesting stack of technology when you see this CEP engine, Microsoft HPC and Microsoft Velocity coming together under .NET.

 

08 May 2009

Data Quality and the Future of Risk

A new survey from the Economist Intelligence Unit (sponsored by SAS) of over 300 financial institutions worldwide has put data quality and availability as a key issue to be resolved if risk management is to be fit for purpose following the financial crisis:

"Culture, expertise and data are weak points in current risk management"

A summary of the survey report is available here.

08 April 2009

High Performance Spreadsheets

Another article about the operational risk generated by the usage of spreadsheets within the financial markets (see earlier posts) appeared in the April issue of Waters Magazine.
 
The article highlights how widely spreadsheets are used within financial institutions and suggests that the current regulatory requirements for more transparency and ad-hoc risk management might push the proliferation of spreadsheets even further. The article also refers to the progress and improvements made by Microsoft in recent versions of Excel to increase the security of spreadsheets.
 
Xenomorph has worked closely with Microsoft on hosting its time series database within SQL Server 2008. The case study we have written together describes how SQL Server 2008 offers integration with Office Excel 2007 so that whilst the spreadsheet is still the end-user viewing tool, operational risk is reduced by engaging Excel 2007 as an analytics and reporting tool and not as a means of storing data.
 
Our TimeScape solution offers more than 700 easy-to-use add-in functions for Office Excel 2007 and we are currently working on the use of Excel Services, part of Microsoft Office SharePoint Server 2007, to further enhance the centralized approach to spreadsheets.
 
If you are interested in how Xenomorph solves the problem of spreadsheet management, then take a look at our (newly updated) website. There we explain how Xenomorph Spreadsheet Inside technology can bring unstructured spreadsheet data and complex calculations within a centralized data management system, increasing transparency and reducing operational risk.

13 February 2009

Data management, derivative analytics and the spreadsheet

Interesting article out doing the rounds on the newswires announcing a forthcoming report called "The Enterprise Spreadsheet: Pushing towards Transparency" by the analyst firm the Tabb Group. It is great to see an analyst firm acknowledging the importance of spreadsheets within the markets, particularly in the area of combining data and analytics together in OTC derivatives management (see earlier post).

Adam Sussman of the Tabb Group reckons that despite its shortcomings, Excel is a valuable tool: “Spreadsheets, either alone or in conjunction with other components, can meet the same requirements as a business application.” In this he seems to be agreeing with the UK Regulator the FSA, who have recently been advocating that spreadsheets and spreadsheet data need to be actively managed as an institutional resource. The findings of the Tabb Group on management also seem to echo a recent report called "Buy-Side Data Management in a Changing Landscape" done by Lepus for Asset Control (registered link to report here).

Spreadsheets are a great tool and fulfil a real need in the market to pull together pricing models and data quickly, easily and within a timeframe that is meaningful to the business (see earlier post for some work by Xenomorph in this area). Spreadsheets are a big problem to manage, but they are also the symptom of failings in core systems that are not able to rapidly support new instrument types and pricing models. An institution that ignores analytics, spreadsheets and spreadsheet data within any EDM transparency initiative has already failed before it begins, and so to paraphrase the author Aldous Huxley:

"Spreadsheets do not cease to contain data because they are ignored."

28 January 2009

EDM Council on the Meaning of Data

The EDM Council has today released its first draft of a Semantics Repository for review by the industry. Its intent is to establish a common meaning for terms used in financial markets and data management in particular. How it is meant to be used and how it fits with other standards such as FpML and MDDL is briefly outlined by the following article.

As already mentioned in our earlier post, it would be great for the industry if this time of crisis provides the need and motivation to get standards in place. I guess we are all fighting against the profound Tanenbaum quote of "the nice thing about standards is that there are so many of them to choose from".

I wish the EDM Council every success with this effort, although the irony is that we vendors and users of centralised data management systems, and proponents of the single "Golden Copy", need to do better ourselves on establishing centralised industry standards for data. As I say to my five year old son "do as I say, not as I do"...

25 January 2009

CEP in 2009

Interesting predictions for complex event processing (CEP) in 2009 (click here for link) - sounds like some form of reality is appearing in this area of the market, accelerated by the current financial crisis. Entry of bigger players and usage of LINQ in CEP will be interesting too.

15 January 2009

Happy Birthday Spreadsheet!

Article on PCMag saying that the spreadsheet is 30 years old. Whilst wishing it a happy birthday, the author, John C. Dvorak, has a good rant about how spreadsheets have been the major weapon in the rise to power of the accountant in business.

Good job he did not spend too much time looking at their usage in financial markets or else his rant would have been much longer, given past issues with spreadsheets in financial markets. The spreadsheet (which means Excel at the moment) is a great tool that is:

  • a calculator;
  • a report writer;
  • a database

In my view it is the last of these, the usage of desktop spreadsheets to store data, where the problems mainly reside, not their usage as an analysis tool. Faced with inflexible trading and risk management systems that do not allow instrument and trade data to be represented quickly or correctly, it is unsurprising that traders, portfolio managers and risk managers resort to spreadsheets as the "pressure relief valve" for their business activities.

Delivering systems that can support both complex and non-standard instruments and trades in a transparent manner should be a focus in a world where the lack of transparency over credit derivative pricing has been such an issue. The inappropriate usage of spreadsheets is a very small part of the current problems we are experiencing in the markets, but addressing this would be a positive step in creating a data management foundation that encompasses all data used by a financial institution, not just the data that is easy for software vendors to represent in their systems.

Anyway, enough of my spreadsheet hobby-horse. For some light relief, and to celebrate 30 years of summing rows and columns, take a look at the EuSpRIG web site for a list of the most notable spreadsheet failures.

14 January 2009

Libor no more...

Following the ongoing story of Libor diverging from the OIS rate (see earlier post), Risk magazine reports that Libor risks losing its place as a funding benchmark. Spreads against the OIS have tightened recently (see recent article in the FT), but Mustafa Chowdhury, head of US interest rate research at Deutsche Bank in New York, says that Libor is becoming less relevant as a benchmark due to banks accessing other sources of funding such as Federal Reserve Funds.

Time to change all of those benchmark yield curves across the entire institution and understand all of the pricing differences? Ouch! Maybe wait a while yet...

25 November 2008

MiFID Market Data Deterioration

Members of the Investment Management Association (IMA) are not happy about the quality of market data following the first year of MiFID being with us. According to an article in the FT this morning, a survey of IMA members has voiced concern over the fragmentation of trading venues leading to a deterioration in the quality of market data available.

Most institutions seem positive about the benefits of competition that multiple trading venues such as Chi-X, Turquoise and BATS bring, but IMA members would like to see a centralised venue where prices are consolidated and made available to the market (for free?), similar to the "consolidated tape" available for US markets.
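
To make the idea of consolidation concrete, here is a minimal sketch of one narrow reading of it: taking the quotes from several venues and reporting the best bid and best offer across them. The prices are made up for illustration and the function name is hypothetical; a real consolidated feed would also have to deal with timestamps, currencies, trade reports and venue-specific conventions.

```python
from typing import Dict, Tuple

# A quote is (bid, ask). All prices below are invented for illustration only.
Quote = Tuple[float, float]


def consolidate(quotes: Dict[str, Quote]) -> dict:
    """Return the best bid and best ask across venues, with the venue for each."""
    best_bid_venue = max(quotes, key=lambda venue: quotes[venue][0])
    best_ask_venue = min(quotes, key=lambda venue: quotes[venue][1])
    return {
        "bid": quotes[best_bid_venue][0],
        "bid_venue": best_bid_venue,
        "ask": quotes[best_ask_venue][1],
        "ask_venue": best_ask_venue,
    }


quotes = {
    "Chi-X": (100.1, 100.4),
    "Turquoise": (100.2, 100.5),
    "BATS": (100.0, 100.3),
}
best = consolidate(quotes)
print(best)
```

With three venues the logic is trivial; the hard part in practice is agreeing who collects, cleans and pays for the data feeding it.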

Sounds like the institutional need for centralised data management across different departments and systems is also becoming apparent at a market and exchange level following MiFID. Multiple trading venues and decentralisation are necessary to bring the benefits of competition, but these benefits unsurprisingly do not come without costs or issues (see earlier post).

20 October 2008

Unstructured Data Management Anyone?

Good example on Finextra this week of another spreadsheet debacle, this time involving Barclays and Lehmans (see article).

Not sure when the financial markets will get serious about managing unstructured data and spreadsheet usage. We can all apply Enterprise Data Management until the cows come home, but if the "real" business is being managed in spreadsheets then really why bother with grand aims like EDM?

Even the regulators are not totally against the usage of spreadsheets (see presentation), they simply want unstructured data to be managed properly as part of an institution's formal processes and not ignored until the inevitable problems arise...

06 October 2008

Transparency for troubled times

I came across this pair of quotes on a Google search, bringing data management into the context of the current financial crisis:

"Where is the wisdom? Lost in the knowledge. Where is the knowledge? Lost in the information." - T.S. Eliot

"Where is the information? Lost in the data. Where is the data? Lost in the ******* database." - Joe Celko

Here's to hoping that wisdom is not in short supply at the moment...

01 October 2008

Here today. Where tomorrow?

These may be the words on the lips of many bankers today, as they survey the continuing turmoil in global financial markets. In fact, this was the incredibly apposite tagline on a recent magazine advertisement for a major bank which (maybe unsurprisingly) was subsequently nationalised.

In the fluid (many would say “bloody”) landscape of financial services, with the next merger or acquisition just around the corner, data integration is, now more than ever, a growing challenge. Accompanying this activity is the ever-growing need for consistency, accuracy, transparency and control of both the data itself and the movement of that data.

Data architecture itself is an evolving discipline and one approach worth looking at is data federation – deftly described in an article by Dain Hansen. Basically, the approach is to leave the data where it is but aggregate it into a single view, available as a service to your applications. It is an approach that Xenomorph has advocated for many years, going back to our founding days in the mid-90s, with the normalized database driver approach implemented in our Connectivity Services.

Hansen’s article explains both the advantages (simplicity, no need to copy or synchronize) and the disadvantages (performance) of this approach, and argues for a solution that incorporates both federation and consolidation of data. He shows that it is possible to architect a system that will provide consistency and control as well as agility.
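
As a rough illustration of the trade-off, the sketch below shows a single view over several sources that fetches from each one on demand (federation) but keeps a local copy for any source named in `consolidate` (consolidation). It is only a sketch under stated assumptions: the class, the source functions and the field names are all hypothetical, and it is not a description of Xenomorph's Connectivity Services or of the design in Hansen's article.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical source: any callable returning records for an instrument id.
Source = Callable[[str], List[dict]]


class FederatedView:
    """Single view over several sources; data stays where it is by default."""

    def __init__(self, sources: Dict[str, Source], consolidate: Iterable[str] = ()):
        self.sources = sources
        self.consolidate = set(consolidate)  # sources whose results are copied locally
        self._copies: Dict[Tuple[str, str], List[dict]] = {}

    def query(self, instrument_id: str) -> List[dict]:
        rows: List[dict] = []
        for name, fetch in self.sources.items():
            if name in self.consolidate:
                # Consolidation: fetch once, then serve the local copy.
                key = (name, instrument_id)
                if key not in self._copies:
                    self._copies[key] = fetch(instrument_id)
                found = self._copies[key]
            else:
                # Federation: always go back to the source.
                found = fetch(instrument_id)
            rows.extend({**row, "source": name} for row in found)
        return rows


# Two stand-in sources that count how often they are hit.
calls = {"risk": 0, "trading": 0}


def risk_db(instrument_id: str) -> List[dict]:
    calls["risk"] += 1
    return [{"id": instrument_id, "exposure": 1_000_000}]


def trading_db(instrument_id: str) -> List[dict]:
    calls["trading"] += 1
    return [{"id": instrument_id, "position": 250}]


view = FederatedView({"risk": risk_db, "trading": trading_db}, consolidate={"risk"})
first = view.query("XYZ")
second = view.query("XYZ")
print(calls)
```

The federated source is hit on every query, so it is always current but pays the performance cost each time. The consolidated source is hit once, which is faster but leaves a copy that then has to be kept in sync (this sketch never refreshes it).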

It’s difficult to say whether better data management would have assisted the world’s banks in avoiding their current troubles, but greater transparency of where exactly their exposures lay would certainly have helped.

02 June 2008

Real-time holidays...

...on a happier note than real-time death data (see earlier post), holiday data is also real-time - a link with statistics on holiday data from one of our hedge fund contacts:

http://www.financialcalendar.com/freestuff/getreal.htm

21 May 2008

Vhayu and Streambase - positioning clarified?

Partner announcement on Finextra with Vhayu and Streambase coming together:

http://www.finextra.com/fullpr.asp?id=21477

Defining what vendors mean by a "Data Management System" is difficult enough for clients, but in the area of the somewhat fuzzy technology definitions around automated trading it is interesting to see Streambase clarify their offering around CEP (and not database too, which was one of their first messages around bringing real-time and historic data together), and that Vhayu seems to be emphasising its tick database capabilities (and de-emphasising its original perception in the market as a CEP vendor).

02 April 2008

Streaming Blue Genes...

The supercomputer continues to make a come-back - just up on Finextra with TD Bank testing IBM's Blue Gene supercomputers to amalgamate and analyse real-time structured and unstructured data:

http://www.finextra.com/fullstory.asp?id=18293

01 April 2008

Far away from low latency...

...is the OTC derivatives market - still a lot to be done according to a Finextra report and letter from the Fed:

http://www.finextra.com/fullstory.asp?id=18285

19 March 2008

Time Series inside SQL Server

A case study of some of the work we have been doing with Microsoft on hosting our time series storage inside SQL Server has just gone up on their site at:

http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=4000001637

17 March 2008

Higher quality data from the front office?

Sungard had a good event on Thursday night, with four risk managers taking the stage for a "thought leadership" seminar entitled "Regulatory Impact of Market Events" (if the advert is still around on their site, see http://www.sungard.com/ADAPTIV/default.aspx?id=4678&formAction=takeit&formid=48).

The Dresdner risk manager (Ted Macdonald, good speaker) was emphasising that data quality is a real issue for risk management, and all participants thought that risk managers should spend more time on risk and less on validating/cleaning data (no great surprises there then but interesting to hear it validated again as an issue).

He suggested that more pressure should be put on the front office to get data right first time (as opposed to leaving everyone else to sort out the mess!), even going so far as to suggest charging the front office for each wrongly-booked trade in the trading and risk management systems - not sure how that would go down with the trading desks, but sounds a good approach if you could agree on (and unambiguously measure) these mistakes!

Seems like transfer-costing is becoming a recurring theme - also recently mentioned by a grid computing specialist from Credit Suisse about "metering" each desk for the amount of compute power used...anyone retraining as a management accountant out there? - sounds like the banks will be hiring soon!

07 March 2008

Best Instrument Data Model?

Some positive feedback about the instrument and market data model within our data management system TimeScape:

http://mdavey.wordpress.com/2008/03/01/xenomorph-best-data-modelling/
