4 posts categorized "Analytics Management"

12 November 2009

It's in the news...

I went along to the Forum on News Analytics over in Canary Wharf on Monday evening, organised by Professor Gautam Mitra from OptiRisk / Carisma at Brunel University. We seem to be in the early days of transforming news articles into quantifiable/machine-readable data so that they can be processed automatically/systematically in trading and risk management. It was a good event with both vendors and practitioners attending, so it was reasonably balanced between vendor hype and the current state of market practice.

As background on what is meant by news analytics data: for example, you might count the number of news articles about a particular company and look at whether the quantity of news articles is a predictor of some change in the company's stock price or volatility. Moving on from this simple approach (and assuming that you are clever enough to be certain about which news is about which company), you can then move towards assessing whether the news is negative, neutral or positive in sentiment about a company/stock.
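To make the "count the articles, then score the sentiment" idea concrete, here is a minimal Python sketch. The sample stories, the company tags and the sentiment scale of -1 to +1 are all made up for illustration; a real feed would first need the entity matching mentioned above, and nothing here reflects any vendor's actual data format.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical, hand-made sample: (publication date, company tag, sentiment score in [-1, 1]).
stories = [
    (date(2009, 11, 9), "ACME", 0.6),
    (date(2009, 11, 9), "ACME", -0.2),
    (date(2009, 11, 9), "GLOBEX", -0.7),
    (date(2009, 11, 10), "ACME", 0.1),
]

def daily_news_counts(stories):
    """Count stories per (date, company) pair: the simplest news 'signal'."""
    return Counter((day, company) for day, company, _ in stories)

def daily_mean_sentiment(stories):
    """Average the sentiment scores per (date, company) pair."""
    scores = defaultdict(list)
    for day, company, score in stories:
        scores[(day, company)].append(score)
    return {key: sum(values) / len(values) for key, values in scores.items()}

counts = daily_news_counts(stories)
sentiment = daily_mean_sentiment(stories)
print(counts[(date(2009, 11, 9), "ACME")])     # number of ACME stories that day
print(sentiment[(date(2009, 11, 9), "ACME")])  # average ACME sentiment that day
```

The resulting daily series could then be lined up against next-day returns or volatility to test whether it has any predictive value, which is the part the vendors and academics are arguing about.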

The context here is about having the capability to automatically process/analyse any kind of text-based news story, not just those from research analysts that might be nicely tagged with such quantifiers of sentiment (see http://www.rixml.org/ on xml standards for analyst data). The way in which the meaning of the text is "quantified" uses some form of Natural Language Processing.

The event started with a brief talk by Dan diBartolomeo of Northfield Information Services. I hadn't heard of him or his company before (maybe I should pay more attention!) but he seemed a very solid speaker with a strong academic and practical background in investment management and modelling. He referenced a few academic papers (available via their web site) on news analytics, and on how news analytics combined with implied volatility could provide better estimates of future volatility than implied volatility alone. He also made some good points about how investment "models" are calibrated to history and how such models need to adapt to "today" - he put it as "how are things different now from the past?" and put forward the idea of a framework for assessing and potentially modifying a model to respond to the "now" situation. He also suggested that the market can react very differently to "expected news" (having a range of investment "what ifs" planned for a known earnings announcement) as opposed to unexpected information (we are back into the realms of the Black Swan and the ultimate in uncertainty wisdom from Donald Rumsfeld).

Armando Gonzalez of RavenPack then began by explaining how RavenPack had become involved in applying text analysis to finance (it seems the subject has its origins, like a lot of things, in the military). RavenPack seem to be the highest-profile quantified news vendor at the moment, and whilst Armando is obviously biased towards pushing the concept that money can be made by adding quantified news data to trading models, he said that not many firms are as yet systematically processing news and most people are relying upon manual interpretation of the news they buy/use. Some of the studies RavenPack have on market news and prices are very interesting, showing how it can take up to 20 minutes after a news event before the market settles on a new "fair" price level for a stock. Additionally, and maybe an interesting reflection on human behaviour, in bull markets there are usually twice as many positive stories about companies as negative ones, but strikingly in a bear market there were still almost equal amounts of positive and negative news - so humans are basically optimists! (or delusional, or just plain greedy...take your pick!)

Mark Vreijling of Semlab followed Armando and suggested that a lot of their sales prospects understandably desire "proof" of the benefits of adding quantified news to trading, but this was a little ironic since most financial institutions have been paying to receive "raw" news for years, presumably because they perceive benefit from it. Mark also mentioned that the application of quantified news to risk management was a new but growing area for him and his colleagues.

Gurvinder Brar of Macquarie then went into some of the practicalities of quantifying and using news in automated trading. He suggested that you need to understand what is really "news" (containing information on something that has just happened) and what is merely a news "article" (like a "feature" in a magazine etc). Assessing the relevance of news was also difficult and he added that setting a hierarchy of what kinds of events are important to your trading was a key step in dealing with news data. Fundamentally, his question was: why wait five days for analysts to publish their assessment of a market or company-specific event when you could react to the event in near real-time?

The event then went into "panel" mode where the following points came out:

  • Dan thought that a real challenge was integrating quantified news with all of the other relevant datasets (market data, but also reference data etc)
  • Armando picked up on Dan's point by giving the example of news about Gillette, which at one point was about Gillette the company but then on acquisition became news about the Gillette "brand" which became a part of Procter & Gamble.
  • Dan said that a key problem with processing news was also understanding what news was simply ignored by the news wires i.e. we know what is being talked about, but what could have been talked about, why was it ignored and is it (even so) relevant to trading?
  • Mark and Armando said that the "context" for the news story was vital and that market expectations can turn many "negative" news stories into positive outcomes for trading e.g. the market likes bad news when it is not as "bad" as everyone thought.
  • Dan made a very interesting point about trading in terms of categorising trades as "want to" trades and "have to" trades. He gave the example of a trade being observed that seemingly has no news associated with/prompting it - so does this mean the trade is occurring because somebody "has to" make the trade (a fund facing an unwelcome client redemption for example?) or because there has been some information leak to a market participant and such a participant "wants to" make a trade before the news becomes available to the market as a whole?
  • I think all of the panel members then collectively hesitated before answering the next question from the audience, with Microsoft having one of their "text search" R&D team (think Bing...) asking about news categorisation and quantification.
  • Dan also mentioned something that I have only recently become more aware of, which is that apart from major markets in the US, most exchanges world-wide do not publish whether a trade was a "buy" or "sell" trade (they just publish the price and transaction size). Obviously knowing the direction of the trade would be useful to any trading model, and Dan referred to this as wanting to know the "signed volume".
  • A member of the audience then asked whether most quantified news had been based on just the English language and the consensus was that most was based on English, but Natural Language Processing can be trained in other languages relatively easily. A few members of the panel pointed out that all languages change, even English, requiring constant retraining, and also that certain languages, countries and cultures added further complication to the recognition process.
  • The next question asked was whether the panel could outline the major areas that quantified news is applied in - the answer included intraday (but not quite real-time) trading, algorithmic execution, lower frequency portfolio rebalancing and compliance/risk/market abuse detection.
  • A good debate ensued about whether "news" was provided by the official newswires or by the web itself. The panel (and audience) consensus seemed to favour the premise that the news wires are the source of news and the web is a reflection/regurgitation of this news. That said, Gurvinder of Macquarie gave the nice counter-example of the analysts/news wires not making much of the new Apple iPod, whereas looking at the web it was possible to see that the public were in contrast very enthusiastic about it.
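On Dan's "signed volume" point from the panel: where an exchange publishes only price and size, one widely used heuristic for guessing trade direction is the tick rule (an uptick is treated as a buy, a downtick as a sell). To be clear, nobody on the panel described this method; the sketch below is my own illustration with made-up trades, and real implementations usually also use quote data where it is available.

```python
def tick_rule_signed_volume(trades):
    """Sign each trade's size using the tick rule.

    trades: list of (price, size) tuples in time order.
    An uptick is treated as a buy (+size) and a downtick as a sell (-size);
    a trade at an unchanged price inherits the previous trade's sign.
    The first trade has no reference price, so it is left unsigned (0).
    """
    signed = []
    last_price, last_sign = None, 0
    for price, size in trades:
        if last_price is None or price == last_price:
            sign = last_sign
        elif price > last_price:
            sign = 1
        else:
            sign = -1
        signed.append(sign * size)
        last_price, last_sign = price, sign
    return signed

# Hypothetical prints: (price, size)
trades = [(100.0, 500), (100.5, 200), (100.5, 300), (100.2, 400)]
signed = tick_rule_signed_volume(trades)
print(signed)  # [0, 200, 300, -400]
```

The obvious weakness is that the inferred sign is only a guess, which is presumably why Dan would rather the exchanges just published it.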

Overall an interesting event. I think the application of "quantified news" to risk management is interesting - maths and financial theory are very interesting but markets are driven by people's behaviour and if "quantified news" can help us understand this better it has to help in avoiding some (!) of the future problems to be faced in the market.

21 October 2009

Integrated Data and Analytics Management

Xenomorph was one of the sponsors on the “Integrated Data Management” webcast last week, hosted by Inside Reference Data (audio recording available here). There were a number of interesting questions that arose from the Webinar.

One fundamental although somewhat academic question was "What is Integrated Data Management?". Certainly everyone seemed convinced that there would be fewer "Enterprise Data Management" (EDM) projects in future, given the expense, scope and scale of such projects. The consensus was that whilst the need for data management was better understood across all financial institutions, data management projects would be bitten off in more manageable chunks by asset type, business function or division (so are silos back in fashion I ask myself?!). Coming back to the original question, I guess my slant on Integrated Data Management is that we are seeing more and more data management projects that have integrated reference data and market data elements to them, primarily driven by the need to sort out data quality/completeness/depth for use within risk management (in light of the financial crisis).

Related to risk management, a topic I pushed was that given the origins of data management for STP/back office, and given the interest in low latency tick data management/analysis in the front office, there seems to be a market gap (particularly in the US?) on how to manage data such as IR/credit curves, volatility surfaces and other derived data sets. These data sets seem to fall into the gap between what is thought of as market data (primarily just prices) and what is reference data (IDs and terms & conditions). This is another area where a more integrated approach to data management would be beneficial, particularly in making all these datasets available for risk management.

Coming back to a "hobby-horse" of mine, I also raised the issue that whilst it is fine to be doing great data management (high quality, complete datasets etc), what is the point if all of your data is ignored by the front office and Excel is used to download the data traders and risk managers need from Open Bloomberg? I think the management of unstructured data (spreadsheets, word docs etc) needs to be elevated as an issue since this (unfortunately?) is where most data resides currently, despite what we data management professionals like to think.

I also think that the principles of good data management (centralisation, quality and transparency) could apply to things other than just raw "data": what about centralised pricing and valuation, centralised curves and centralised scenarios for risk? Again, what is the point of doing good data management if the ultimate "information" (e.g. a valuation) is produced using poor quality data, with a complete lack of transparency over the data and model used?

A good question was asked about models, which was that given pricing models and their weaknesses have formed some part of the recent crisis, do we need more complex models? Having had a few conversations about this and thought about it some more, some would say it is complexity that got us into the crisis so this is the last thing we need. My view is that we do not necessarily need more complex pricing models and valuation techniques, but we certainly need more robust ones, which does not necessarily imply more complexity. Coming back to a point raised by David Rowe previously, I think all quants and risk managers should think about a "second means of valuation" for all the theoretical models they use, and that hedgeability (see recent post on pricing model validation) seems to be the common theme in producing more robust pricing models.


18 September 2009

Pricing Model Validation: Mitigating Model Risk

I managed to catch some of the day yesterday at the "Pricing Model Validation: Mitigating Model Risk" conference. I thought it would be worthwhile going along since firstly the past 12-18 months have made model risk very topical (take a look at previous posts from Riskminds, the Modeller's Manifesto and Wilmott/Rowe).

Secondly, more of our clients are looking at managing and centralising pricing models/curve calculators in addition to just managing the underlying data (see this Insight Investment client case study for a recent public example). I am calling this "Analytics Management", which is the business-focussed technology stack that combines pricing models/calculators/analytics with all of the "Data Management" underneath. But enough of my thinly-veiled positioning statements...and on with some of the (hopefully) useful content from the conference outlined below - maybe scan the headings in bold below for those talks of interest but I would particularly recommend the ones by Tanguy Dehapiot and Yuval Millo...

Model Risk 2009 defining and forecasting. First speaker was Professor Phillip Sibbertsen of the University of Hannover on defining and measuring model risk. Phillip started by saying that "Model Risk" was a new category of risk within the confines of "Operational Risk", and that operational risk as defined by the regulators does not currently include the "model risk" of market risk and credit risk, nor the "model risk" of the operational risk model itself. (I am sure I could write that up better!...). Phillip put forward that model risk is not formally a "risk" since it has no probability distribution, and suggested it should be thought of as "model uncertainty". He also clarified that model risk applies both at the large, portfolio scale (e.g. choice of VAR model etc) and at the smaller, instrument-level scale (i.e. pricing of derivatives).

Additionally, in terms of measuring model risk, he excluded human failure from model risk measurement since in his view this was difficult to quantify - this approach did not meet with the approval of some of the audience, who questioned how this could be excluded from a practical point of view. Phillip's colleague, Corinna Luedtke, then presented some work they had done on calibrating different GARCH models to observed data and showing how even a poor model could produce reasonable forecasts of risk if the time period was short. The work was interesting but again the audience highlighted that the human choice (failure?) in choosing the set of models to try was part of "model risk" and should not be excluded from the definition of model risk.
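For anyone who has not met GARCH before, the simplest member of the family, GARCH(1,1), updates a variance estimate from yesterday's squared return and yesterday's variance. I did not note which GARCH variants were used in the talk, so the sketch below is only the textbook recursion with made-up parameter values (omega, alpha and beta are not fitted to anything).

```python
def garch11_variance_path(returns, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion

        sigma2[t+1] = omega + alpha * r[t]**2 + beta * sigma2[t]

    started at the unconditional variance omega / (1 - alpha - beta).
    Returns len(returns) + 1 values; the last one is the one-step-ahead forecast.
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0 and alpha + beta < 1")
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2

# Hypothetical daily returns with one large shock, and illustrative parameters.
path = garch11_variance_path([0.001, -0.05, 0.002], omega=1e-6, alpha=0.1, beta=0.8)
print([round(v ** 0.5, 4) for v in path])  # daily volatility estimates
```

The point about short horizons is visible in the recursion: the forecast one step ahead is dominated by the most recent data, so a mis-specified model has less room to go wrong than it does over a long horizon.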

Is a model accurate? Testing the implementation of a model. Second speaker was David Chevance, Head of Equity & FX Model Validation at Dresdner Kleinwort. David outlined the different sorts of model risk: mathematical errors, missing risk factors, divergence from industry practice, model inconsistencies and implementation risk. He then outlined the sources of these risks: bugs, approximations, numerical precision, numerical boundaries and limitations on numerical methods (e.g. Sobol numbers in high-dimension Monte Carlo simulations).

David said a key area to start with in validating a model implementation was the front-office documentation of the product, its inputs and payoffs, its pricing model but also details of calibration methods used/needed etc. He made the point here that the documentation can sometimes specify just the deal, but sometimes can express the pricing methodology and pricing parameters. The emphasis was on completeness, accuracy and making use of all of the information available in the documentation. Obviously the ability to review the code used to implement the model was also necessary.

He discussed the trade-offs between a simple validation approach in terms of speed and efficiency of resources against the more time-consuming, resource-hungry but more accurate approach of full replication of the model. He also suggested that in choosing a method of validation it was important to balance resource demands against what is actually being validated: payoffs from a single trade, a type of pricing model or a family of financial products. Desired accuracy of the validation was also important, given the trade-off between accuracy and effort, and the fact that small bugs are much more common than large ones. He finally discussed model version control, the necessary discipline of documenting changes and regression tests for new models, and the regular cycle of model review. Overall it was an interesting talk with a good practical focus.

Practical aspects of valuation model control process. One of the most entertaining and interesting speakers of the day was Tanguy Dehapiot, Head of Validation and Valuation, Group Risk Management at BNP Paribas. He started by referring to a couple of documents: "Supervisory guidance for assessing banks' financial instrument fair value practices", April 2009 (BCBS 153), which was then implemented within "Enhancements to the Basel II framework" (BCBS 157). The first part of his presentation was around these documents and what the regulators expect to be in place, so I guess the best approach is to read them (the BCBS 153 document content is only 12 pages long, quite short for a regulator!).

Tanguy pointed out that in his view "Mark to Market" and "Mark to Model" are often misleading as both are often required. He prefers the term "Valuation Methodology". He proposed four valuation modes (Direct Price Quotation, Use of Similar Instruments, Risk Replication, and Expected Uncertain Cashflows (NPV)) and set out a useful hierarchy/matrix of which financial products fit into which valuation mode and for what purposes. Within model risk, he split off judgemental errors (choice of model etc) as part of market risk and credit risk, and operational errors (model implementation and coding) as more definable and avoidable parts of operational risk.

He had some interesting slants on data, saying that he had been surprised that even getting all of the static data necessary to price simpler instruments like bonds had proven difficult. He outlined how model parameters are often stored across a variety of systems (curve definitions in one place, pricing methodology somewhere else) implying to me that this is sometimes difficult to pull together and needs some centralisation to improve transparency around this.

His observation on market parameters (both observed prices and derived data such as implied volatility surfaces) was that they are often stored in a larger central database, but he warned that this market parameter database needs to be reviewed as part of the model validation process since some of its data is derived (i.e. calculated, maybe using a model!) and as such should not be taken as perfect for all time and for all purposes. He said that it was important to categorise the origin of data and suggested the following types:

  • Quoted on an active exchange
  • Actual private transaction in an active market
  • Tradable broker quotes
  • Consensus prices from market makers
  • Non-binding indicative prices from market makers
  • Counterparty valuation, collateral valuation
  • Actual transactions in inactive market

Tanguy proposed that there should be a valuation matrix for each instrument, where there might be a different valuation methodology used for end of day valuation versus intraday, for risk or for trading, for pricing individually or within a portfolio reval. I guess here the rationale is appropriateness, efficiency and transparency about what needs to be used when. He also added that he disliked the term "Model Validation" since it seemed to imply that a model was "valid" and preferred "Model Approval" to cover the decision to use a model and "Model Review" to cover model analysis. He said he found managing the "stock" of existing models (and keeping up with when to review them) more difficult than managing the "flow" of new models and products.
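As a way of picturing the valuation matrix idea, here is a small Python sketch of a lookup keyed by instrument type and purpose, using the four valuation modes listed earlier. The instrument types, purposes and the assignments between them are purely my own illustrative guesses, not the contents of Tanguy's slides.

```python
from enum import Enum

class ValuationMode(Enum):
    DIRECT_PRICE_QUOTATION = "Direct Price Quotation"
    SIMILAR_INSTRUMENTS = "Use of Similar Instruments"
    RISK_REPLICATION = "Risk Replication"
    EXPECTED_CASHFLOWS = "Expected Uncertain Cashflows (NPV)"

# Illustrative entries only: keys are (instrument type, purpose).
VALUATION_MATRIX = {
    ("listed_equity", "end_of_day"): ValuationMode.DIRECT_PRICE_QUOTATION,
    ("listed_equity", "intraday_risk"): ValuationMode.DIRECT_PRICE_QUOTATION,
    ("illiquid_corporate_bond", "end_of_day"): ValuationMode.SIMILAR_INSTRUMENTS,
    ("exotic_option", "end_of_day"): ValuationMode.RISK_REPLICATION,
    ("exotic_option", "portfolio_reval"): ValuationMode.EXPECTED_CASHFLOWS,
}

def valuation_mode(instrument_type, purpose):
    """Return the approved valuation mode, failing loudly if none is recorded."""
    try:
        return VALUATION_MATRIX[(instrument_type, purpose)]
    except KeyError:
        raise LookupError(
            f"No approved valuation methodology for {instrument_type!r} / {purpose!r}"
        ) from None

print(valuation_mode("exotic_option", "end_of_day").value)
```

Failing loudly on a missing combination seems in the spirit of the talk: the transparency comes from every instrument and purpose having an explicitly approved methodology rather than a silent default.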

Overall Tanguy was a very interesting and funny speaker with lots of practical insights and a fair amount of opinion thrown in, which is always good in my view.

The usefulness of inaccurate models: Financial risk management "in the wild". This talk was given by Dr Yuval Millo of the London School of Economics and he focussed on the evolution of the use of the Black-Scholes-Merton (B-S-M) model at the CBOE and how the model came to be the means by which the whole options market "communicated". Yuval is a social scientist and prefaced his talk by stating that "Social Sciences are good at predicting the past".

First thing I didn't know (amongst the many things I do not know...) is that the B-S model was not published until a couple of weeks after the CBOE started trading stock options in April 1973. Yuval said that initially the B-S-M derived prices were not accurate at all (around 25% off the market price on CBOE) and that the model was based on assumptions that plainly were not the case on the exchange (only calls available, no short selling, no continuous trading). The model was used by local Chicago trading firms and the story goes that Fischer Black sold large paper "sheets" of option pricing matrices to these traders (there being no calculators/PCs/mobiles around at the time).
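For the curious, the kind of "sheet" described above is easy to reproduce today. The sketch below uses the standard Black-Scholes formula for a European call on a non-dividend-paying stock and tabulates it over a grid of spot prices and strikes; the rate, volatility and expiry values are made up, and I have no idea what the layout of the original paper sheets actually was.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_price(spot, strike, rate, vol, expiry):
    """Black-Scholes price of a European call on a non-dividend-paying stock.

    rate and vol are annualised (continuously compounded rate), expiry is in years.
    """
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * expiry) / (vol * math.sqrt(expiry))
    d2 = d1 - vol * math.sqrt(expiry)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * expiry) * norm_cdf(d2)

def pricing_sheet(spots, strikes, rate, vol, expiry):
    """One row per spot price, one column per strike, prices rounded to cents."""
    return [[round(bs_call_price(s, k, rate, vol, expiry), 2) for k in strikes] for s in spots]

# Illustrative inputs: 5% rate, 20% volatility, three months to expiry.
spots, strikes = [95, 100, 105], [90, 100, 110]
for spot, row in zip(spots, pricing_sheet(spots, strikes, 0.05, 0.20, 0.25)):
    print(spot, row)
```

A trader with such a sheet only needed to agree on the volatility input to agree on a price, which is part of how the model became a common language.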

As the markets developed, larger East Coast banks entered the market with stocks being held and traded in New York and options being traded in Chicago, so trading became geographically dispersed. This started the need for "early morning meetings" to discuss the market and the B-S-M model and its parameters became the "lingua franca" or means of communication of options market participants.

He described the first years of the Options Clearing Corporation (OCC), which was set up to ensure that the financial obligations of option sellers and buyers were met. Around 1979-80 the OCC worked overnight to calculate margin requirements, based on the (now?) arcane idea that different margin amounts should be associated with different option strategies (straddles, butterflies etc), and the job of the OCC was to take a portfolio of options and optimise which combination of strategies would minimise the margin required for the whole portfolio. He said that there were disputes between traders and the OCC around margin levels and difficulties for the SEC with updating their Net Capital Rules as each new option strategy was created. Eventually, the OCC adopted the B-S-M model and implied volatility as the means of calculating margin against market value, which enabled them to move away from the operational difficulty of strategy optimisation.
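"Implied volatility" here just means the volatility input that makes the model price match the observed market price. A minimal sketch of backing it out by bisection is below; it says nothing about how the OCC actually computed margin (I do not know), and the search bounds and tolerance are arbitrary choices of mine.

```python
import math

def bs_call_price(spot, strike, rate, vol, expiry):
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * expiry) / (vol * math.sqrt(expiry))
    d2 = d1 - vol * math.sqrt(expiry)
    return spot * cdf(d1) - strike * math.exp(-rate * expiry) * cdf(d2)

def implied_vol(price, spot, strike, rate, expiry, lo=1e-4, hi=5.0, tol=1e-8):
    """Find the volatility at which the model price equals the market price.

    Bisection works because the call price increases with volatility.
    """
    low_price = bs_call_price(spot, strike, rate, lo, expiry)
    high_price = bs_call_price(spot, strike, rate, hi, expiry)
    if not low_price <= price <= high_price:
        raise ValueError("price is outside the range attainable within the volatility bounds")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call_price(spot, strike, rate, mid, expiry) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip with made-up inputs: price an option at 25% vol, then recover the 25%.
market_price = bs_call_price(100, 110, 0.03, 0.25, 0.5)
print(round(implied_vol(market_price, 100, 110, 0.03, 0.5), 6))
```

Once every option can be summarised by a single implied volatility number, margin can be driven from market value and that number rather than from a catalogue of named strategies.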

So B-S-M became the way in which traders communicated about the market, but the model also became vital operationally within clearing for the market. By 1987 B-S-M had become the de-facto standard for the market, with the model driving the market in turn driving use of the model. During the Oct '87 crash the model proved to be very inaccurate but the use of the model did not diminish - maybe psychologically the market participants needed a model (even a wrong model) to make communication easier.

I found this talk very interesting and members of the audience asked whether any similar analysis was going to be done on the Gaussian Copula model used to price CDOs. Yuval said that one of his colleagues was undertaking this research currently. Given that he seemed to be very positive about the use of the B-S-M model within options markets I asked whether he had any opinions on Taleb's criticism of financial engineers and modelling. Yuval said that he and Nassim were friends and agreed to disagree on certain topics...

Stress testing modelling parameters. Next up was Pierpaolo Montana, Head of Model Validation at West LB. Having joined the finance industry after a career in mathematics and then at a regulator, Pierpaolo began by saying that back in the heady days of 2004 the banks thought that their own risk management systems and practices were well ahead of the regulators. He said that in light of the crisis this proved not to be the case, but he feels that things are now more evenly balanced (not sure I would agree, still lots of catching up to do for some institutions I would suggest).

He said that whilst regulators require the validation of risk models and pricing models, and require stress testing of a portfolio, the stress testing of a pricing model is not a requirement, has received much less attention and in his view was not done to any great degree before 2007. His point here was that pricing models should work under stress too, otherwise they are a weak foundation for building other risk measures such as stressed VAR.

Whilst focussing on pricing models, he mentioned that risk models also need to be carefully chosen and appropriate to the institution and the types of trading activities it undertakes. As an example he put forward that a simple VAR calculator might be appropriate for a long only equity fund but completely inappropriate for a relative value portfolio.

He said that stress testing had recently received much more attention as a risk management tool and cited the BIS document "Revisions to the Basel II market risk framework" where stressed VAR is introduced as part of the regulatory capital charge calculation. He also mentioned that in order to avoid "standard model" treatment of complex securitised products an institution must be able to demonstrate that its VAR model can cope with these products under times of market stress.
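As a rough illustration of the difference between ordinary and stressed VAR, the sketch below computes a simple historical-simulation VAR and then applies the same calculation to P&L taken from a stressed period. The P&L figures are invented, the percentile convention is just one of several in use, and this is not the regulatory calculation, which has its own rules on confidence level, horizon and choice of stress window.

```python
import math

def historical_var(pnl, confidence=0.99):
    """Historical-simulation VAR, reported as a positive loss number.

    Sorts the observed losses and picks the one at the chosen confidence level,
    so larger past losses translate directly into a larger VAR.
    """
    if not pnl:
        raise ValueError("need at least one P&L observation")
    losses = sorted(-x for x in pnl)  # ascending: smallest loss first
    index = min(len(losses) - 1, max(0, math.ceil(confidence * len(losses)) - 1))
    return losses[index]

# Made-up daily P&L: a calm window, and a stressed window with moves five times larger.
calm_pnl = [-1, 2, -3, 1, 0, -2, 3, 1, -1, 2]
stressed_pnl = [5 * x for x in calm_pnl]

calm_var = historical_var(calm_pnl, confidence=0.9)
stressed_var = historical_var(stressed_pnl, confidence=0.9)
print(calm_var, stressed_var)
```

The calculation is identical in both cases; the only thing that changes is which history you feed it, which is exactly why the quality of the pricing models used to generate that P&L under stress matters so much.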

Pierpaolo then described the stress testing of base correlation in CDO pricing, and how even moving the base correlation from its usual level of 70% to 99% would not have predicted the valuations observed in the recent crisis. In this way he says that stress testing of models can detect implementation problems and some model weaknesses, but it cannot assist in coping with structural breaks in the market. He also discussed how the B-S-M model is used everywhere (even places it should not really be valid for) since it is a robust model based on the no-arbitrage hypothesis - in contrast the CDO base correlation and other models are not so robust since they are not arbitrage free.

(end of post!)
 


 

08 May 2009

Analytics Management from Celent

A new report from the analyst firm Celent advocating enterprise transparency and consistency in the pricing of OTC derivatives and structured products - great that an analyst firm is acknowledging the need for analytics management as a complementary discipline to the more established principles of data management.

