21 posts categorized "Database Technology"

13 April 2012

CVA - a business driver for breaking down asset silos

Xenomorph's analytics partner Numerix sponsored a PRMIA event at New York's Harvard Club this week on Credit Valuation Adjustment (CVA). The event also involved Microsoft, with a surprisingly relevant contribution to the evening on CVA and "Big Data" (I still don't feel comfortable losing the quotes yet, maybe soon...). Credit Valuation Adjustment seems to be the hot topic in risk management and pricing at the moment, with Numerix's competitor Quantifi having held another PRMIA event on CVA only a few months back. 

The event started with an introduction to CVA from Aletta Ely of JP Morgan Chase. Aletta started by defining CVA as the market value of counterparty credit risk. I am new to CVA as a topic, and my own experience of any kind of valuation adjustment for an instrument was back at JP Morgan in the mid-90s (those of you under 30 are allowed to start yawning at this point...). We used to maintain separate risk-free curves (what are they now?) and counterparty spread curves, which would be combined to discount the cashflows in the model.
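As a rough illustration of that older instrument-by-instrument approach, the sketch below discounts a set of cashflows at a risk-free rate plus a counterparty spread. The flat, continuously compounded rates, the function name and the sample numbers are all assumptions made for the example, not a description of any real system.

```python
import math

def risky_present_value(cashflows, risk_free_rate, counterparty_spread):
    """Discount (time_in_years, amount) cashflows at risk-free rate plus spread.

    Flat, continuously compounded rates are an illustrative simplification;
    a real implementation would read both curves point by point.
    """
    total_rate = risk_free_rate + counterparty_spread
    return sum(amount * math.exp(-total_rate * t) for t, amount in cashflows)

# Hypothetical three-year bond-like cashflows: (time in years, amount)
cashflows = [(1.0, 5.0), (2.0, 5.0), (3.0, 105.0)]

pv_risk_free = risky_present_value(cashflows, 0.03, 0.0)
pv_risky = risky_present_value(cashflows, 0.03, 0.02)

# The difference is the per-instrument valuation adjustment
adjustment = pv_risk_free - pv_risky
```

Note that the adjustment here depends only on the one instrument, which is exactly the limitation that a portfolio-level CVA is meant to address.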

Whilst such an adjustment could be calibrated to come up with an adjusted valuation which would be better than having no counterparty risk modelled at all, it seems one of the key aspects of how CVA differs is that a credit valuation adjustment needs to be done in the context of the whole portfolio of exposures to the counterparty, and not in isolation instrument by instrument. The fact that a trader in equity derivatives is long exposure to a counterparty cannot be looked at in isolation from a short exposure to a portfolio of swaps with the same counterparty on the fixed income desk.

Put another way, CVA only has context if we stand to lose money if our counterparty defaults, and so an aggregated approach is needed to calculate the size of the positive exposures to the counterparty over the lifetime of the portfolio. Also, given this one-sided payoff aspect of the CVA calculation, instrument types such as vanilla interest rate swaps suddenly move from being relatively simple instruments that can be priced off a single curve to instruments that need optionality to be modelled for the purposes of CVA.
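The difference between looking at each desk in isolation and looking at the aggregate can be sketched in a few lines. The desk names and values below are hypothetical, and the example assumes a netting agreement is in place so that positions with the same counterparty can be offset.

```python
def positive_exposure(value):
    """One-sided payoff: we only lose if the counterparty owes us money."""
    return max(value, 0.0)

# Hypothetical mark-to-market values against one counterparty, by desk
trade_values = {
    "equity_derivatives": 12.0,    # counterparty owes us
    "interest_rate_swaps": -9.0,   # we owe the counterparty
}

# Desk by desk: each positive value is counted, negatives are ignored
standalone_exposure = sum(positive_exposure(v) for v in trade_values.values())

# Portfolio view (assumes netting applies): offset first, then floor at zero
netted_exposure = positive_exposure(sum(trade_values.values()))
```

The max(value, 0) floor is also why a vanilla swap picks up option-like behaviour for CVA purposes: the exposure is the payoff of an option on the future value of the swap.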

So why has CVA become such a hot topic at the banks? Prior to the 2008/2009 crisis CVA was already around (credit risk has existed for a long time I guess, regardless of whether you regulate or report to it), but given that bank credit spreads were at that time consistently low and stable, CVA had minimal effects on valuations and P&L. Obviously with the collapse of Lehmans this changed, and CVA has been pushed into prominence since it has directly affected P&L in a significant manner for many institutions (for example see these FT articles on Citi and JPMorgan).

A key and I think positive point for the whole industry is that CVA requires a completely multi-asset view, and given the regulatory focus on CVA and capital adequacy, it will as a result drive banks away from a siloed approach to data and valuation management. If capital is scarcer and more costly, then banks will invest in understanding both their aggregate CVA and the incremental contribution to CVA of a new trade in the context of all exposures to the counterparty. Looking at incremental CVA, you can see that this also drives investment into real-time or near real-time CVA calculation, which brings me on to the next talks of the evening by Numerix on CVA calculation methods and a surprisingly good presentation on CVA and "Big Data" from David Cox of Microsoft.
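A minimal sketch of aggregate and incremental CVA follows, using the common textbook discretisation CVA ≈ (1 - recovery) × Σ discount factor × expected positive exposure × default probability for each period. The function names, the 40% recovery rate and all of the simulated values are assumptions for illustration; nothing here reflects how any of the vendors mentioned actually calculate CVA.

```python
def expected_positive_exposure(simulated_values):
    """Average of the floored portfolio values across scenarios at one date."""
    return sum(max(v, 0.0) for v in simulated_values) / len(simulated_values)

def cva(values_by_date, discount_factors, default_probs, recovery_rate=0.4):
    """Textbook discretised CVA over a grid of future dates.

    values_by_date: one list of simulated netted portfolio values per date.
    default_probs: probability of counterparty default within each period.
    """
    loss_given_default = 1.0 - recovery_rate
    return loss_given_default * sum(
        df * expected_positive_exposure(values) * pd
        for values, df, pd in zip(values_by_date, discount_factors, default_probs)
    )

def incremental_cva(portfolio, new_trade, discount_factors, default_probs):
    """CVA of the portfolio with the new trade minus CVA without it."""
    combined = [
        [p + n for p, n in zip(port_values, trade_values)]
        for port_values, trade_values in zip(portfolio, new_trade)
    ]
    return (cva(combined, discount_factors, default_probs)
            - cva(portfolio, discount_factors, default_probs))

# Hypothetical inputs: two future dates, two scenarios per date
portfolio = [[10.0, -4.0], [6.0, -2.0]]
offsetting_trade = [[-5.0, 5.0], [-3.0, 3.0]]
discount_factors = [0.99, 0.97]
default_probs = [0.01, 0.01]

portfolio_cva = cva(portfolio, discount_factors, default_probs)
change = incremental_cva(portfolio, offsetting_trade, discount_factors, default_probs)
```

In this made-up example the new trade offsets existing exposure, so its incremental CVA is negative even though it would carry a positive CVA on a standalone basis. That is the kind of result that only becomes visible with an aggregated, cross-asset view.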

Denny Yu of Numerix did a good job of explaining some of the methods of calculating CVA. In addition to being cross-asset, with all the implications that has for needing the ability to price anything, CVA is both data and computationally expensive. It requires both the simulation of scenarios for the default of counterparties through time and the valuation of cross-asset portfolios at different points in time. Denny mentioned techniques such as American Monte Carlo, which reduce the computation needed by using the same simulation paths for both default scenarios and valuation.

So on to Microsoft. I have seen some appalling presentations on "Big Data" recently, mainly from the larger software and hardware companies trying to jump on the marketing bandwagon (main marketing premise: the data problems you have are "Big"...enough said I hope). Surprisingly, David Cox of Microsoft gave a very good presentation around the computation challenges of CVA, and how technologies such as Hadoop take the computational power closer to the data that needs acting on, bringing the analytics and data together. (As an aside, his presentation was notably "Metro" GUI in style, something that seems to work well for PowerPoint where the slide is very visual and it puts more emphasis on the speaker to overlay the information). David was obviously keen to talk up some of the cloud technology that Microsoft is currently pushing, but he knew the CVA business topic well and did a good job of telling a good story around CVA, "Big Data" and Cloud technologies. Fundamentally, his pitch was for banks and other institutions to become "Analytic Enterprises" with a common, scaleable and flexible infrastructure for data management and analysis.

In summary it was a great event - the Harvard Club is always worth a visit (bars and grandiose portraits as expected but also a barber shop in the basement and squash courts in the loft!), the wine afterwards was tolerably good and the speakers were informative without over-selling their products or company. Quick thank you to Henry Hu of IBM for transportation on the night, and thanks also to Henry for sending through this link to a great introductory paper on CVA and credit risk from King's College London. Whilst the title of the King's paper is a bit long and scary, it takes the form of a dialogue between a new employee and a CVA expert, and as such is very readable with lots of background links.

 

 

 

04 April 2012

NoSQL - the benefit of being specific

NoSQL is an unfortunate name in my view for the loose family of non-relational database technologies associated with "Big Data". NotRelational might be a better description (catchy eh? thought not...), but either way I don't like the negatives in both of these titles, due to aesthetics and in this case because it could be taken to imply that these technologies are critical of the SQL and relational technology that we have all been using for years. For those of you who are relatively new to NoSQL (which is most of us), this link contains a great introduction. Also, if you can put up with a slightly annoying reporter, then the Cloudera CEO is worth a listen to on YouTube.

In my view NoSQL databases are complementary to relational technology, and as many have said relational tech and tabular data are not going away any time soon. Ironically, some of the NoSQL technologies need more standardised query languages to gain wider acceptance, and there are no prizes for guessing which existing query language will be drawn on for ideas in putting these new languages together (at this point as an example I will now say SPARQL, not that this should be taken to mean that I know a lot about it, but that has never stopped me before...)

Going back into the distant history of Xenomorph and our XDB database technology, then when we started in 1995 the fact that we then used a proprietary database technology was sometimes a mixed blessing on sales. The XDB database technology we had at the time was based around answering a specific question, which was "give me all of the history for this attribute of this instrument as quickly as possible".
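To make the "specific question" concrete, here is a toy in-memory structure organised around exactly that access pattern: one contiguous, date-ordered list per (instrument, attribute) pair. It is purely a hypothetical sketch and says nothing about how XDB is actually implemented; the class name, identifiers and values are invented for the example.

```python
from bisect import insort
from collections import defaultdict

class TimeSeriesStore:
    """Toy store keyed by (instrument, attribute), holding sorted history."""

    def __init__(self):
        self._series = defaultdict(list)

    def add(self, instrument, attribute, date, value):
        # ISO-format date strings sort chronologically, so insort keeps order
        insort(self._series[(instrument, attribute)], (date, value))

    def history(self, instrument, attribute):
        # A single key lookup returns the whole history, with no joins or scans
        return list(self._series.get((instrument, attribute), []))

store = TimeSeriesStore()
store.add("VOD.L", "close", "2012-04-03", 171.5)
store.add("VOD.L", "close", "2012-04-02", 172.0)
vodafone_closes = store.history("VOD.L", "close")
```

The design choice is that the data is laid out for the question being asked, rather than in generic rows that have to be filtered and sorted at query time.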

The risk managers and traders loved the performance aspects of our object/time series database - I remember one client with a historical VaR calc that we got running in around 30 minutes on a laptop PC that was taking 12 hours in an RDBMS on a (then quite meaty) Sun Sparc box. It was a great example of how specific database technology designed for specific problems could offer performance that was not possible from more generic relational technology. The use of our database for these problems was never intended as a replacement for relational databases dealing with relational-type "set-based" problems though; it was complementary technology designed for very specific problem sets.
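For readers unfamiliar with the calculation, a historical VaR boils down to revaluing the portfolio over each day of history and reading a loss off the tail of the resulting P&L distribution, which is why fast access to full price histories matters so much. The sketch below shows only the final tail-reading step, under one simple quantile convention (others exist); the function name and the synthetic P&L series are assumptions for illustration.

```python
def historical_var(pnl_history, confidence=0.99):
    """Historical VaR as a positive loss figure.

    Takes the k-th worst P&L outcome, where k is the number of observations
    expected to fall in the tail at the given confidence level.
    """
    ordered = sorted(pnl_history)  # worst (most negative) outcomes first
    tail_count = max(1, round(len(ordered) * (1 - confidence)))
    return -ordered[tail_count - 1]

# Synthetic daily P&L history: 100 outcomes from -50 to +49
pnl_history = list(range(-50, 50))
var_95 = historical_var(pnl_history, confidence=0.95)
```

With 100 observations at 95% confidence, this picks the fifth-worst outcome in the history.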

The technologists were much more reserved: some were more accepting and knew of products such as FAME around then, but some were sceptical over the use of non-standard DBMS tech. Looking back, I think this attitude was in part due to a desire to build their own vector/time series store, but also understandably (but incorrectly) they were concerned that our proprietary database would require specialist database admin skills. Not that the mainstream RDBMS systems were expensive or specialist to maintain then (Oracle DBA anyone?), but many proprietary database systems with proprietary languages can require expensive and on-going specialist consultant support even today.

The feedback from our clients and sales prospects was that our database performance was liked, but that the proprietary database admin aspects were sometimes a sales objection. This caused us to take a look at hosting some of our vector database structures in Microsoft SQL Server. A long time back we had already implemented a layer within our analytics and data management system where we could replace our XDB database with other databases, most notably FAME. You can see a simple overview of the architecture in the diagram below, where other non-XDB databases (and datafeeds) can be "plugged in" to our TimeScape system without affecting the APIs or indeed the object data model being used by the client:

[Diagram: TimeScape Data Unification Layer]

Using this layer, we then worked with the Microsoft UK SQL team to implement/host some of our vector database structures inside of Microsoft SQL Server. As a result, we ended up with a database engine that maintained the performance aspects of our proprietary database, but offered clients a standards-based DBMS for maintaining and managing the database. This is going back a few years, but we tested this database at Microsoft with a 12TB database (since this was then the largest disk they had available), and this still contained 500 billion tick data records, which even today could be considered "Big" (if indeed I fully understand "Big" these days?). So you can see some of the technical effort we put into getting non-mainstream database technology to be more acceptable to an audience adopting a "SQL is everything" mantra.

Fast forward to 2012, and the explosion of interest in "Big Data" (I guess I should drop the quotes soon?) and in NoSQL databases. It finally seems that, due to the usage of these technologies on internet data problems that no relational database could address, the technology community has much more willingness to accept non-RDBMS technology where the problem being addressed warrants it. I guess for me and Xenomorph it has been a long (and mostly enjoyable) journey from 1995 to 2012, and it is great to see a more open-minded approach being taken towards database technology and the recognition of the benefits of specific databases for (some) specific problems. Hopefully some good news on TimeScape and NoSQL technologies to follow in coming months - this is an exciting time to be involved in analytics and data management in financial markets and this tech couldn't come a moment too soon given the new reporting requirements being requested by regulators.

 

 

 

15 March 2012

The Semantics are not yet clear.

I went along to "Demystifying Financial Services Semantics" on Tuesday, a one day conference put together by the EDM Council and the Object Management Group. Firstly, what are semantics? Good question, to which the general answer is that semantics is the "study of meaning". Secondly, were semantics demystified during the day? Sadly for me I would say that they weren't, but ironically I would put that down mainly to poor presentations rather than a lack of substance, but more of that later.

Quoting from Euzenat (no expert me, just search for Semantics in Wikipedia), semantics "provides the rules for interpreting the syntax which do not provide the meaning directly but constrains the possible interpretations of what is declared." John Bottega (now of BofA) gave an illustration of this in his welcoming speech at the conference by introducing himself and the day in Pig Latin, where all of the information he wanted to convey was contained in what he said, but only a small minority of the audience who knew the rules of Pig Latin understood what he was saying. The rest of us were "upidstay"...

Putting this more in the context of financial markets technology and data management, the main use of semantics and semantic data models seems to be as a conceptual data modelling technique that abstracts away from any particular data model or database implementation. To humour the many disciples of the "Church of Semantics", such a conceptual data model would also be self-describing in nature, such that you would not need a separate meta data model to understand it. For example, take a look at the equity example from what Mike Atkin and the EDM Council have put together so far with their "Semantics Repository".

Abstraction and self-description are not new techniques (OO/SOA design anyone?) but I guess even the semantic experts are not claiming that all is new with semantics. So what are they saying? The main themes from the day seem to be that Semantics:

  • can bridge the gaps between business understanding and technology understanding
  • can reduce the innumerable transformations of data that go on within large organisations
  • is scaleable and adaptable to change and new business requirements
  • facilitates greater and more granular analysis of data
  • reduces the cost of data management
  • enables more efficient business processes

Certainly the issue of business and technology not understanding each other (enough) has been a constant theme of most of my time working in financial services (and indeed is one of the gaps we bridge here at Xenomorph). For example, one project I heard of a few years back was where an IT department had just delivered a tick database project, only for the business users to find that it did not cope with stock splits and for their purposes was unusable for data analysis. The business people had assumed that IT would know about the need for stock split adjustments, and as such had never felt the need to explicitly specify the requirement. The IT people obviously did not know the business domain well enough to catch this lack of specification.
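For anyone on the technology side wondering what the missing requirement looks like, the sketch below back-adjusts a price history so that prices before a split are comparable with prices after it. Without this, a 2-for-1 split shows up in the raw data as a one-day 50% price fall. The function name, dates and prices are hypothetical, and real systems also need to handle volumes, dividends and other corporate actions.

```python
def adjust_for_splits(prices, splits):
    """Back-adjust (date, price) history for stock splits.

    splits: list of (ex_date, ratio), where ratio is 2.0 for a 2-for-1 split.
    Prices dated before an ex-date are divided by that split's ratio.
    Dates are ISO-format strings, so string comparison is chronological.
    """
    adjusted = []
    for date, price in prices:
        factor = 1.0
        for ex_date, ratio in splits:
            if date < ex_date:
                factor *= ratio
        adjusted.append((date, price / factor))
    return adjusted

# Hypothetical raw prices around a 2-for-1 split effective 2012-01-04
raw_prices = [("2012-01-02", 100.0), ("2012-01-03", 102.0), ("2012-01-04", 51.5)]
adjusted_prices = adjust_for_splits(raw_prices, [("2012-01-04", 2.0)])
```

On the raw series the last day looks like a collapse from 102.0 to 51.5; on the adjusted series it is a move from 51.0 to 51.5.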

I think there is a need to involve business people in the design of systems, particularly at the data level (whilst not quite a "semantic" data model, the data model in TimeScape presents business objects and business data types to the end user, so both business people and technologists can use it without showing any detail of an underlying table or physical data structure). You can see a lot of this around with the likes of CADIS pushing its "you don't need a fixed data model" ETL/no datawarehouse type approach against the more rigid (and to some, more complete) data models/datawarehouses of the likes of Asset Control and GoldenSource. You also get the likes of Polarlake pushing its own semantic web and big data approach to data management as a next stage on from relational data models (however I get a bit worried when "semantic web" and "big data" are used together, sounds like we are heading into marketing hype overdrive, warp factor 11...)

So if Semantics is to become prevalent and deliver some of these benefits in bringing greater understanding between business staff and technologists, the first thing that has to be addressed is that Semantics is a techy topic at the moment, which would cause drooping eyelids on even the most technically enthused members of the business. Ontology, OWL, RDF, CLIF are all great if you are already in the know, but guaranteed to turn a non-technical audience off if they are trying to understand (demystify?) Semantics in financial markets technology.

Looking at the business benefits, many of the presenters (particularly vendors) put forward slides where "BAM! Look at what semantics delivered here!" was the mantra, whereas I was left with a huge gap in seeing how what they had explained had actually translated into the benefits they were shouting about. There needed to be a much more practical focus to these presentations, rather than semantic "magic" delivering a 50% reduction in cost with no supporting detail of just how this was achieved. Some of the "magic" seemed to be that there was no unravelling of any relational data model to effect new attributes and meanings in the semantic model, but I would suggest that abstracting away from relational representation has always been a good thing if you want to avoid collapsing under the weight of database upgrades, so nothing too new there, but maybe a new approach for some.

So overall I was a little disappointed by the day, especially given the "Demystifying" title, although there were a few highlights, with Mike Bennett's talk on FIBO (Financial Industry Business Ontology) being interesting (sorry to use the "O" word). The discussion of the XBRL success story was also good, especially how regulators mandating this standard had enforced its adoption, and how following its adoption many end consumers were now doing more with the data, enhancing its adoption further. In fact the XBRL story seemed to be a model for how regulators could improve the world of data in financial markets, through the provision and enforcement of the data semantics to be used with each new reporting requirement as it is mandated. In summary, a mixed day and one in which I learned that the technical fog that surrounds semantics in financial markets technology is only just beginning to clear.

 

14 December 2011

PRMIA - From Risk Measurement to Risk Management by Samuel Won

I attended the PRMIA event last night "Risk Year in Review" at Moody's New York offices. It was a good event, but by far the most interesting topic of the evening for me was from Samuel Won, who gave a talk about some of the best and most innovative risk management techniques being used in the market today. Sam said that he was inspired to do this after reading the book "The Information" by James Gleick about the history of information and its current exponential growth. Below are some of the notes I took on Sam's talk; please accept my apologies in advance for any errors, but hopefully the main themes are accurate.

Early '80s ALM - Sam gave some context to risk management as a profession through his own personal experiences. He started work in the early 80's at a supra-regional bank, managing interest rate risk on a long portfolio of mortgages. These were the days before the role of "risk manager" was formally defined, and really revolved around Asset and Liability Management (ALM).

Savings and Loans Crisis - Sam then changed roles and had some first hand experience in sorting out the Savings and Loans crisis of the mid '80s. In this role he became more experienced with products such as mortgage backed securities, and more familiar with some of the more data intensive processes needed to manage such products in order to account for factors such as prepayment risk, convexity and cashflow mapping.

The Front Office of the '90s - In the '90s he worked in the front office at a couple of tier one investment banks, where the role was more of optimal allocation of available balance sheet rather than "risk management" in the traditional sense. In order to do this better, Sam approached the head of trading for budget to improve and systemise this balance sheet allocation but was questioned as to why he needed budget when the central Risk Control department had a large staff and large budget already.

Eventually, he successfully argued the case that Risk Control were involved in risk measurement and control, whereas what he wanted to implement was active decision support to improve P&L and reduce risk. He was given a total budget of just $5M (small for a big bank) and told to get on with it. These two themes of implementing active decision support (not just risk measurement) and having a profit motive driving better risk management ran through the rest of his talk.

A Datawarehouse for End-Users Too - With a small team and a small budget, Sam made use of postgraduate students to leverage what his team could develop. They had seen that (at the time) getting systems talking to each other was costly and unproductive, and decided as a result to implement a datawarehouse for the front office, implementing data normalisation and data scrubbing, with a data dashboard over the top that was easy enough for business users to do data mining. Sam made the point that usability was key in allowing the business people to extract full value from the solution.

Sam said that the techniques used by his team and the developers were not necessarily that new; things like regression and correlation analysis were used at first. These were used to establish key variables/factors, with a view to establishing key risk and investment triggers in as near to real-time as possible. The expense of all of this development work was justified through its effects on P&L, and its success resulted in more funding from the business.

Poor Sell-Side Risk Innovation - Sam has seen the most innovative risk techniques being used on the buy-side and was disappointed by the lack of innovation in risk management at the banks. He listed the following sell-side problems for risk innovation:

  • politically driven requirements, not economically driven
  • arbitrary increases in capital levels required is not a rigorous approach
  • no need for decision analysis with risk processes
  • just passing a test mentality
  • just do the marginal work needed to meet the new rules
  • no P&L justification driving risk management

Features of Innovative Approaches - Sam said that he had noted a few key features of some of the initiatives he admired at some of the asset managers:

  1. Based on a sophisticated data warehouse (not usually Oracle or Sybase, but Microsoft and other databases used - maybe driven by ease of use or cost?)
  2. Traders/Portfolio Managers are the people using the system and implementing it, not the technical staff.
  3. Dedicated teams within the trading division to support this, so not relying on central data team.

A Forward-Looking Risk Model Example - The typical output from such decision analysis systems he found was in the form of scenarios for users to consider. A specific example was a portfolio manager involved in event-driven long-short equity strategies around mergers and acquisitions. The manager is interested in the risk that a particular deal breaks, and in this case techniques such as Value at Risk (VaR) do not work, since the arbitrage usually requires going long the company being acquired and short the acquiror (VaR would indicate little risk in this long-short case). The manager implemented a forward-looking model that was based on information relevant to the deal in question plus information from similar historic deals. The probabilities used in the model were gathered from a range of sources, and techniques such as triangulation were used to verify the probabilities. Sam's view is that forward-looking models to assist in decision support are real risk management, as opposed to the backward-looking risk measurement models implemented at banks to support regulatory reporting.
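Sam did not go into the detail of the manager's model, so the following is only a guess at the general shape of such a scenario-based calculation, with every number and name invented for illustration. It weights the P&L of the "deal completes" and "deal breaks" outcomes by an estimated break probability, which is the kind of event risk that a VaR based on recent price history would not pick up.

```python
def deal_scenarios(gain_if_completes, loss_if_breaks, break_probability):
    """Probability-weighted view of a merger arbitrage position.

    gain_if_completes: P&L if the deal closes (spread captured), positive.
    loss_if_breaks: P&L if the deal fails (spread blows out), negative.
    break_probability: estimated probability that the deal breaks.
    """
    complete_probability = 1.0 - break_probability
    expected_pnl = (complete_probability * gain_if_completes
                    + break_probability * loss_if_breaks)
    return {
        "completes": (complete_probability, gain_if_completes),
        "breaks": (break_probability, loss_if_breaks),
        "expected_pnl": expected_pnl,
    }

# Hypothetical position: earn 2.0 if the deal closes, lose 15.0 if it breaks
scenarios = deal_scenarios(gain_if_completes=2.0, loss_if_breaks=-15.0,
                           break_probability=0.10)
```

Even at a 10% break probability the downside scenario dominates the shape of the risk, which a small historical volatility on the long-short pair would give no hint of.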

Summary - Sam was a great speaker, and for a change it was refreshing not to have presentation slides backing up what the speaker was saying. His thoughts on forward-looking models being true risk management and moving away from risk measurement seem to echo those of Riccardo Rebonato of a few years back at RiskMinds (see post). I think his thoughts on P&L motivation being the only way that risk management advances are correct, although I think there is a lot of risk innovation at the banks, but at a trading desk level and not at the firm-wide level, which is caught up in regulation - the trading desks know that capital is scarce and want to use it better. I think this siloed risk management flies in the face of much of the firm-wide risk management and indeed firm-wide data management talked about in the industry, and potentially still shows that we have a long way to go in getting innovation and forward-looking risk management at a firm level, particularly when it is dominated by regulatory requirements. However, having a truly integrated risk data platform is something of a hobby-horse for me; I think it is the foundation for answering all of the regulatory and risk requirements to come, whatever their form. Finally, I could not agree more that easy analysis for end-users is a vital part of data management for risk, allowing business users to do risk management better. Too many times IT is focussed on systems that require more IT involvement, when the IT investment and focus should be on systems that enable business users (trading, risk, compliance) to do more for themselves. Data management for risk is a key area for improvement in the industry, where many risk management system vendors assume that the world of data they require is perfect. Ask any risk manager - the world of data is not perfect and manual data validation continues to be a task that takes time away from actually doing risk management.

18 October 2011

A-Team event – Data Management for Risk, Analytics and Valuations

My colleagues Joanna Tydeman and Matthew Skinner attended the A-Team Group's Data Management for Risk, Analytics and Valuations event today in London. Here are some of Joanna's notes from the day:

Introductory discussion

Andrew Delaney, Amir Halton (Oracle)

Drivers of the data management problem – regulation and performance.

Key challenges that are faced – the complexity of the instruments is growing, managing data across different geographies, increase in M&As because of volatile market, broader distribution of data and analytics required etc. It’s a work in progress but there is appetite for change. A lot of emphasis is now on OTC derivatives (this was echoed at a CityIQ event earlier this month as well).

Having an LEI is becoming standard, but has its problems (e.g. China has already said it wants its own LEI which defeats the object). This was picked up as one of the main topics by a number of people in discussions after the event, seeming to justify some of the journalistic over-exposure to LEI as the "silver bullet" to solve everyone's counterparty risk problems.

Expressed the need for real time data warehousing and integrated analytics (a familiar topic for Xenomorph!) – analytics now need to reflect reality and to be updated as the data is running - coined as ‘analytics at the speed of thought’ by Amir. Hadoop was mentioned quite a lot during the conference, also NoSQL which is unsurprising from Oracle given their recent move into this tech (see post - a very interesting move given Oracle's relational foundations and history)

Impact of regulations on Enterprise Data Management requirements

Virginie O’Shea, Selwyn Blair-Ford (FRS Global), Matthew Cox (BNY Mellon), Irving Henry (BBA), Chris Johnson (HSBC SS)

Discussed the new regulations, how there is now a need to change practice as regulators want to see your positions immediately. Pricing accuracy was mentioned as very important so that valuations are accurate.

Again, said how important it is to establish which areas need to be worked on and make the changes. Firms are still working on a micro level, need a macro level. It was discussed that good reasons are required to persuade management to allocate a budget for infrastructure change. This takes preparation and involving the right people.

Items that panellists considered should be on the priority list for next year were:

· Reporting – needs to be reliable and meaningful

· Long term forecasts – organisations should look ahead and anticipate where future problems could crop up.

· Engage more closely with Europe (I guess we all want the sovereign crisis behind us!)

· Commitment of firm to put enough resource into data access and reporting including on an ad hoc basis (the need for ad hoc was mentioned in another session as well).

Technology challenges of building an enterprise management infrastructure

Virginie O’Shea, Colin Gibson (RBS), Sally Hinds (Reuters), Chris Thompson (Mizuho), Victoria Stahley (RBC)

Coverage and reporting were mentioned as the biggest challenges.

Front office used to be more real time, back office used to handle the reference data, now the two must meet. There is a real requirement for consistency: front office and risk need the same data so that they arrive at the same conclusions.

Money needs to be spent in the right way and firms need to build for the future. There is real pressure for cost efficiency and for doing more for less. Discussed that timelines should perhaps be longer so that a good job can be done, but there should be shorter milestones to keep business happy.

Panellists described the next pain points/challenges that firms are likely to face as:

· Consistency of data including transaction data.

· Data coverage.

· Bringing together data silos, knowing where data is from and how to fix it.

· Getting someone to manage the project and uncover problems (which may be a bit scary, but problems are required in order to get funding).

· Don’t underestimate the challenges of using new systems.

Better business agility through data-driven analytics

Stuart Grant, Sybase

Discussed Event Stream Processing, that now analytics need to be carried out whilst data is running, not when it is standing still. This was also mentioned during other sessions, so seems to be a hot topic.

Mentioned that the buy side’s challenge is that their core competency is not IT. Now with cloud computing they are more easily able to outsource. He mentioned that buy side shouldn’t necessarily build in order to come up with a different, original solution.

Data collection, normalisation and orchestration for risk management

Andrew Delaney, Valerie Bannert-Thurner (FTEN), Michael Coleman (Hyper Rig), David Priestley (CubeLogic), Simon Tweddle (Mizuho)

Complexity of the problem is the main hindrance. When problems are small, it is hard for them to get budget so they have to wait for problems to get big – which is obviously not the best place to start from.

There is now a change in behaviour of senior front office management – now they want reports, they want a global view. Front office do in fact care about risk because they don’t want to lose money. Now we need an open dialogue between front office and risk as to what is required.

Integrating data for high compute enterprise analytics

Andrew Delaney, Stuart Grant (Sybase), Paul Johnstone (independent), Colin Rickard (DataFlux)

The need for granularity and transparency is only just being recognised by regulators. The amount of data is an overwhelming problem for regulators, not just financial institutions.

Discussed how OTCs should be treated more like exchange-traded instruments – need to look at them as structured data.

04 May 2011

More formal management of instrument valuation needed

Xenomorph has today released its white paper “Instrument Valuation Management: management of derivative and fixed income valuations in a multi-asset, multi-model, multi-datasource and multi-timeframe environment”.

The white paper expands on the “Rates, Curves and Surfaces – Golden Copy Management of Complex Datasets” white paper Xenomorph published recently (see earlier post) and describes how, despite the increasing importance of instrument valuation to investment, trading and risk management decisions, valuation management is not yet formally and fully addressed within data management strategies and remains a big concern for financial institutions.

Too often, says Xenomorph, valuations (and the analytics used to process input and calculate output data) fall between traditional data management providers and pricing model vendors. This leads to the over-use of tactical desktop spreadsheets where data “escapes” the control of the data management system, leading to an increased operational risk.

Whilst instrument valuation is certainly not the primary cause of the recent financial crisis, the lack of high quality, transparent valuations of many complex securities resulted in market uncertainty and in the failure of many risk models fed by untrustworthy valuations.

“A deeper understanding of financial products reduces operational risk and promotes quality, consistency and auditability, ensuring regulatory compliance”, says Brian Sentance, CEO Xenomorph. “Clients’ requirements have evolved and portfolio managers, traders and risk managers recognize that it is no longer sufficient to treat valuation as an external, black-box process offered by pricing service providers”, he adds.

Nowadays, regulators, auditors, clients and investors demand even more drill-down to the underlying details of an instrument’s valuation. It is therefore important to implement an integrated, consistent analytics and data management strategy which cuts across different departments and glues together reference and market data, pricing and analytics models, for transparent, high quality, independent valuation management.

“Our TimeScape solution provides a valuation environment which offers rapid and timely support for even the most complex instruments, allowing our clients to check easily the external valuation numbers, based on their choice of model and data providers”, says Sentance. “Otherwise, what is the point of good data management if the valuations and the analytics used are not based on the same data management infrastructure principles?”

For those who are interested, the white paper is available here.

 

20 October 2010

Analytics Management by Sybase and Platform

I went along to a good event at Sybase New York this morning, put on by Sybase and Platform Computing (the grid/cluster/HPC people, see an old article for some background). As much as some of Sybase's ideas in this space are competitive with Xenomorph's, some are very complementary and I like their overall technical and marketing direction in focussing on the management of data and analytics within financial markets (given that direction I would, wouldn't I?...). Specifically, I think their marketing pitch based on moving away from batch to intraday risk management is a good one, but one that many financial institutions are unfortunately (?) a long way away from.

The event started with a decent breakfast, a wonderful sunny window view of Manhattan and then proceeded with the expected corporate marketing pitch for Sybase and Platform - this was ok but to be critical (even of some of my own speeches) there is only so much you can say about the financial crisis. The presenters described two reference architectures that combined Platform's grid computing technology with Sybase RAP and the Aleri CEP Engine, and from these two architectures they outlined four usage cases.

The first use case was for strategy back testing. The architecture for this looked fine but some questions were raised from the audience about the need for distributed data caching within the proposed architecture to ensure that data did not become the bottleneck. One of the presenters said that distributed caching was one option, although data caching (involving "binning" of data) can limit the computational flexibility of a grid solution. The audience member also added that when market data changes, this can cause temporary but significant issues of cache consistency across a grid as the change cascades from one node to another.

Apparently a cache could be implemented in the Aleri CEP engine on each grid node, or the Platform guy said that it was also possible to hook a client's own C/C++ solution into Platform to achieve this, and that their "Data Affinity" offering was designed to assist with this type of issue. In summary, their presentation would have looked better with the distributed caching illustrated in my view, and it raised the question as to why they did not have an offering or partner in this technical space. To be fair, when asked whether the architecture had any performance issues of this kind, they said that for the usage case they had, no it didn't - so on that simple and fundamental aspect they were covered.

They had three usage cases for the second architecture: one was intraday market risk, one was counterparty risk exposure and one was intraday option pricing. On the option pricing case, there was some debate about whether the architecture could "share" real-time objects such as zero curves, volatility surfaces etc. Apparently this is possible, but again it would have benefited from being illustrated first as an explicit part of the architecture.

There was one question about the usage of the architecture applied to transactional problems, and as usual for an event full of database specialists there was some confusion as to whether we were talking about database "transactions" or financial transactions. I think it was the latter, but this wasn't answered too clearly, although neither was the question asked clearly I guess - maybe they could have explained the counterparty exposure usage case a bit more to see if this met some of the audience member's needs.

The question on transactions above got a conversation going about resilience within the architecture, given that the Sybase ASE database engine is held in-memory for real-time updates whilst the historic data resides on shared disk in Sybase IQ, their column-based database offering. Again full resilience is possible across the whole architecture (Sybase ASE, IQ, Aleri and the Symphony Grid from Platform) but this was not illustrated this time round.

Overall good event with some decent questions and interaction.

17 May 2010

Cloudy definitions

Given that I am English and can tend to start many personal introductions with a short conversation about the weather (generally either "awful" or "not bad for this time of year"...), then maybe I should be very receptive to the use of weather-related expressions in technology such as the "cloud". Maybe not however, since the "cloud" and "cloud computing" have reached that zenith of marketing hype when everyone is talking about a new technology regardless of whether they are sure what it actually is (or might be, or could become...).

Anyway, I finally swallowed my cynicism and on Thursday morning went along to "Migrating Business to the Cloud", an event by Microsoft hosted at Bafta (small venue where the UK deals out its equivalent (?) of the Oscars). The master of ceremonies was Mark Taylor of Microsoft, who gave a general introduction to what Microsoft are doing in the "cloud", and of particular note he described the four types of computing scenarios where cloud computing can optimally be applied:

  • Predictable Bursting - where computing needs come and go in predictable waves of usage/demand
  • Growing Fast - where computing needs are rising exponentially like in a successful internet start-up
  • Unpredictable Bursting - where computing demand comes in unpredictable bursts, such as that associated with say usage of a backup computer centre in disaster recovery
  • On and Off - where you might run a process once a month or at an interval you decide

The above definitions seem ok to me but there is (probably understandably) some overlap in usage cases. The "Growing Fast" case for start-ups is interesting and more of that later.

Mark handed over to David Chappell who gave his perspective on cloud platforms as they are today in the market. David was a very entertaining and knowledgeable speaker, despite wearing a dodgy suit (what happened to those trousers?!) and having a peculiar wide foot stance when speaking. Anyway I digress, on to what he said. David started by saying what the "Cloud" is comprised of:

  • Cloud Applications - basically this is Software as a Service (SaaS) and some current examples of this would be Salesforce.com CRM, Microsoft Exchange Online and Google Apps.
  • Cloud Platforms - a platform for developing cloud applications, with the following characteristics that it:
    • is aimed at developers for creating and running cloud applications, not end consumers
    • provides self-service access to computing resources
    • allows very granular, on-demand allocation of computing resources
    • charges for the consumption of computing resources in a very granular manner

David then explained that due to its ambiguity he disliked the usage of the term "Private Cloud" in the ongoing debate about publicly available cloud services (such as those provided by Amazon, Microsoft and Google) vs. private clouds deployed within private institutions. David said the main difference was that private clouds do not have the economics of public clouds (i.e. pay for what you use only when you need it). That point seemed straightforward, however I would have thought that with a large global organisation with many different departmental computing demands the economics of a private cloud would be similar to a public one.

David then went on to explain that there are two kinds of Cloud Platform:

  • Infrastructure as a Service (IaaS) - this is a cloud platform that provides a developer with a virtual machine (VM) that has (almost) full access within it; put another way the development environment gives the developer total control but with that control comes responsibility.
  • Platform as a Service (PaaS) - this is a cloud platform that runs an application that a developer has created; it is easy to use but has limited control for the developer.

David put forward that there have been only 5 major software technology platforms over the past 50 years:

  • Mainframe
  • Mini-Computer
  • PC
  • PC-based Server
  • Mobile

He perceives that the Cloud is the 6th major software technology platform, and as such he is extremely enthusiastic about the opportunity and benefits that this presents to the whole of the software industry and its consumers.

David categorised Microsoft's cloud platform as (mostly) PaaS, which had three main components:

  • Windows Azure - the environment for running cloud applications within the platform
  • SQL Azure - relational storage within the platform
  • Windows Azure Platform AppFabric – (David noted the long name and sympathised with trying to name things sensibly) this provides and manages the infrastructure within the platform

He then moved on to describe the main usage scenarios for Windows Azure, for applications that:

  • need massive scale, such as Web 2.0 applications
  • need high reliability
  • have highly variable loading
  • have short or unpredictable lifetimes
  • need parallel processing
  • will either fail fast or scale fast
  • do not fit easily in a single organisation's data centre, such as a joint venture
  • need external storage

David said that in the fail quickly or scale quickly scenario, this was squarely aimed at technology start-ups where using Cloud technologies would effectively increase the frequency at which new ideas could be tried out at less economic cost if they go wrong, but are ready to scale massively if they become the new "Facebook" - so much so that many of the VCs in Silicon Valley are now insisting that start-ups use cloud technology as a condition of funding.

Amazon's Elastic Compute Cloud (Amazon EC2) was the first major commercial cloud platform, and David categorised this as IaaS, where effectively you get a Virtual Machine (VM) environment that provides a lot of control but requires more effort to manage than a PaaS such as Azure.

David said that he was surprised that the Google App Engine, which has Python and now Java as its programming languages, did not come with any traditional relational storage (unlike most other cloud platforms) but on speaking with Google he found that the storage engine and the whole platform is again designed primarily for Web 2.0 apps and as such storage usage was more about retrieving photos, video etc and less about querying across many records.

David was very complimentary about the cloud platform from Salesforce.com called Force.com. He said that the sales pitch from Salesforce.com would be straight to business users, effectively saying that they could build scalable, resilient applications without involving the IT department and without needing programming expertise. He asked the audience if anyone had used these tools and a few folks confirmed that they were extremely impressed by what the platform offered.

Bob Muglia (President, Server and Business Tools, Microsoft) then gave a quick talk on Microsoft's plans for Azure. He mentioned how Microsoft's new search engine, Bing, was based on several hundred thousand servers running in Azure but only had a handful of operating staff, in contrast with the usual economics (taken from Gartner) of 1 operations person being needed for every 50 servers. He emphasised that Microsoft was committed to the further development of "on premises" operating systems but that Microsoft was totally committed to cloud computing, its development and its support.

He said that some of the tools found in the Microsoft technology suite, such as SQL Reporting Services, are not yet available in the cloud on Azure/SQL Azure (due end of year though) - he said that he hoped that people understood that re-engineering an existing application for the cloud sometimes took time to ensure the scalability and reliability demanded when providing the functionality through the cloud. The vision put forward by Bob for development of cloud applications seemed very compelling, with Microsoft aiming to make things such as enabling resilience for a globally available cloud application as simple as ticking a check-box in Microsoft Visual Studio. He put forward that the major barrier to cloud adoption was the human aspect of trust in moving applications "off premises". He said that he saw a fundamental shift across all industries to cloud development and deployment, but added there may be some areas such as government and finance where this process takes a lot longer.

The event then switched to presentations by EasyJet, RiskMetrics and SeeTheDifference. The head of IT at EasyJet gave his pitch first. His department get an annual budget of 0.75% (small?) of turnover of £2.5bn (larger, so translating to £18.75m) and has around 60 people. He presented how EasyJet has taken an incremental approach to the adoption of cloud computing, utilising both "on-premises" and cloud ("off-premises") technology together (exposing end points of applications into the cloud at first). He advised this approach since it:

  • was a smaller step than full-blown adoption
  • was lower risk
  • demonstrated big value in a short time-frame
  • leveraged the rich functionality available in Azure
  • accelerated acceptance of cloud technology

Dr Rob Fraser of RiskMetrics was next up. He explained that whilst Moore's Law says that computing power doubles every 18 months, the calculations needed for risk management have doubled every six months. This has driven the need for parallel computing, and RiskMetrics' RiskBurst service uses around 2,500 64-bit Opteron cores in their data centre, combined with use of Azure to meet the peaks in calculation needed during each day (the similarities with power consumption management were pretty apparent). He said that average CPU consumption was around 18% of peak, hence a combination of both on and off premises compute power was a good solution for them. He mentioned that the management of this hybrid combination of technologies, and in particular being able to show real-time billing for it, was a key area of investment for RiskMetrics.
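
To put rough numbers on those two points, here is a minimal Python sketch. The doubling periods (18 months and six months) and the 18% average-to-peak figure are the ones quoted in the talk; the function names and the one, three and five year horizons are purely illustrative assumptions, and this is not a description of how RiskMetrics actually sizes its capacity.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiplicative growth after `months` for a quantity that doubles every period."""
    return 2.0 ** (months / doubling_period_months)


def compute_gap(months: float,
                demand_doubling: float = 6.0,
                supply_doubling: float = 18.0) -> float:
    """How many times faster calculation demand has grown than per-core compute power."""
    return growth_factor(months, demand_doubling) / growth_factor(months, supply_doubling)


def idle_fraction_if_sized_for_peak(average_to_peak: float = 0.18) -> float:
    """Share of capacity sitting idle on average if everything is sized for the peak."""
    return 1.0 - average_to_peak


if __name__ == "__main__":
    for years in (1, 3, 5):  # illustrative horizons only
        months = 12 * years
        print(f"{years} yr: demand x{growth_factor(months, 6.0):,.0f}, "
              f"Moore's Law x{growth_factor(months, 18.0):,.1f}, "
              f"gap x{compute_gap(months):,.1f}")
    print(f"Idle share of a peak-sized data centre: {idle_fraction_if_sized_for_peak():.0%}")
```

The gap compounds quickly, which is the argument for parallelism, and the idle share is the argument for renting the peak rather than owning it.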

The final presentation was by SeeTheDifference. The main point of this presentation was that this charitable organisation had zero permanent staff involved in IT, but regardless was able to deliver a very professional, reliable and scalable website using external consultants to build on Azure.

The final section of the morning was a roundtable discussion with questions from the audience. The EasyJet guy said that the human mindset was key to the adoption of cloud computing. What keeps him awake at night is the thought of what would happen, and how attitudes would change, if any of the cloud infrastructure failed - so far it has experienced 100% up time. Rob of RiskMetrics was concerned about the stability of the platform, trying to ensure that any changes introduced do not damage reliability. He added that he disagreed with Bob Muglia and thought that financial institutions would adopt public clouds quickly - he cited their own experience of 90% of their revenues now coming from service provision rather than on-premises applications. David said that he took some of the comments from Bob to indicate that Microsoft would also offer more of a pure VM (IaaS) soon in addition to the PaaS approach of Azure. David said that trust was the major issue in cloud adoption and he advised an incremental approach, so "get your feet wet" then build from there.

On the whole the presentations were good and my knowledge of cloud technology has improved a bit - certainly it is fantastically appealing to develop globally available applications with no scaling, no resilience or data replication issues - it sounds too good to be true which generally means it is, so I guess there is much more work to do in gaining trust and acceptance for this technology. So my (pragmatic?) cynicism remains - but cloudy days are certainly coming and for a change maybe this is something to very much look forward to.

 

04 February 2010

More CEP Events

Sybase have acquired Aleri according to Finextra. It was less than a year ago that the complex event processing (“CEP”) vendors Aleri and Coral8 announced their merger (see press release); there was also a big buzz when Sybase announced a CEP capability based on Coral8 and Streambase decided to offer an Amnesty Program for Aleri-Coral8 Customers (see earlier post 'Merging in public is difficult...'). And only a few months later, Microsoft announced that their CEP engine Orinoco (now integrated with SQL Server 2008 as StreamInsight) was heading to market (see post 'Microsoft CEP Surfaces as "Orinoco"').

Another sign that CEP is moving more mainstream and that real-time everything is becoming more important? Or a good market for acquisitions?

17 July 2009

Heavyweight Data Management...

...I am very concerned that I have previously missed an important requirement for data management solutions - a heavyweight one judging by this great discussion on one of the Microsoft forums.

14 May 2009

Microsoft CEP Surfaces as "Orinoco"

Seems like Microsoft have now gone public on the Microsoft TechEd site that they have a Complex Event Processing (CEP) engine that will be coming to market shortly (see MagmaSystems blog post). One of my colleagues Mark Woodgate attended a briefing event at Microsoft for this technology back in February this year - here's an extract from some internal notes that Mark made back then:

"Microsoft CEP is very similar to StreamBase conceptually (and not unsurprisingly), in the sense that there are adapters and streams and how you merge and split them via some kind of query language is the same. However, StreamBase uses the StreamSQL which as we have seen is SQL-like in syntax but Microsoft CEP uses LINQ and .NET and although conceptually it is doing the same thing, it does not look the same. StreamBase’s argument was you can be an SQL programmer to use it and don’t need lower-level like .NET; however, it’s not SQL really as it has all these ‘extensions’ you have to learn so using .NET might look more tricky but in fact it makes sense. They don’t have a sexy GUI yet for designing CEP applications like StreamBase but it will be done in Visual Studio 2008.

 

Currently, you build various assemblies (I/O adapters, queries and functions) and then bolt them all together, called ‘binding’ by command line tool. You then deploy the application onto one or more machines using another tool so it’s a manual process right now. They are aware this needs to be made easier and more visual. They are allowing other libraries to be bolted in via the various SDKs so it’s pretty open and flexible. It works well with HPC and clusters/grids (or so they say) and of course can be used with SQL Server. The CEP engine also has a web interface based on SOAP so at least non-Windows based systems can talk to it"
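
For readers new to CEP, the adapter/stream/merge/split idea in Mark's notes can be sketched in a few lines of plain Python. This is only an illustration of the concept: it is not Microsoft's LINQ-based API, nor StreamSQL, and every name in it (`Tick`, `input_adapter`, `merge`, `split`) is hypothetical.

```python
import heapq
from typing import Callable, Iterable, Iterator, List, NamedTuple, Tuple


class Tick(NamedTuple):
    """A single event on a stream (hypothetical event shape)."""
    timestamp: float
    symbol: str
    price: float


def input_adapter(raw_rows: Iterable[Tuple[float, str, float]]) -> Iterator[Tick]:
    """Hypothetical input adapter: turns raw source rows into typed events."""
    for timestamp, symbol, price in raw_rows:
        yield Tick(timestamp, symbol, price)


def merge(*streams: Iterable[Tick]) -> Iterator[Tick]:
    """Merge several time-ordered streams into one time-ordered stream."""
    return heapq.merge(*streams, key=lambda tick: tick.timestamp)


def split(stream: Iterable[Tick],
          predicate: Callable[[Tick], bool]) -> Tuple[List[Tick], List[Tick]]:
    """Split a stream into (matching, non-matching) events.

    Results are collected into lists for simplicity; a real engine
    would keep both outputs as live streams.
    """
    matching: List[Tick] = []
    rest: List[Tick] = []
    for tick in stream:
        (matching if predicate(tick) else rest).append(tick)
    return matching, rest


if __name__ == "__main__":
    feed_a = input_adapter([(1.0, "VOD", 101.0), (3.0, "VOD", 102.5)])
    feed_b = input_adapter([(2.0, "BP", 455.0), (4.0, "BP", 454.0)])
    vod, others = split(merge(feed_a, feed_b), lambda tick: tick.symbol == "VOD")
    print(vod)
    print(others)
```

In a real CEP engine the "query" would be declarative (StreamSQL or LINQ) and would run continuously over unbounded streams; the generators here just show the plumbing.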

 

The release of this technology will be an interesting addition to the CEP market and to the Microsoft technology stack in general. Assuming performance is at credible levels (i.e. not necessarily leading but not appalling either) it will certainly bring both technical and commercial pressure to bear on existing CEP vendors (see earlier post on Aleri/Coral8) and has the potential to broaden the usage of CEP. Obviously Linux-Lovers (sorry, I didn't mean to be personal...) will not agree with this, but Microsoft is putting together an interesting stack of technology when you see this CEP engine, Microsoft HPC and Microsoft Velocity coming together under .NET.

 

08 April 2009

High Performance Spreadsheets

Another article about the operational risk generated by the usage of spreadsheets within the financial markets (see earlier posts), appeared in the April issue of Waters Magazine.
 
The article highlights how widely spreadsheets are used within financial institutions and suggests that the current regulatory requirements for more transparency and ad-hoc risk management might push the proliferation of spreadsheets even further. The article also refers to the progress and improvements made by Microsoft in recent versions of Excel to increase the security of spreadsheets.
 
Xenomorph has worked closely with Microsoft on hosting its time series database within SQL Server 2008. The case study we have written together describes how SQL Server 2008 offers integration within Office Excel 2007 so that whilst the spreadsheet is still the end-user viewing tool, operational risk is reduced by engaging Excel 2007 as an analytics and reporting tool and not as a means of storing data.
 
Our TimeScape solution offers more than 700 easy to use add-in functions to Office Excel 2007 and we are currently working on the use of Excel Services, part of Microsoft Office SharePoint Server 2007, to further enhance the centralized approach to spreadsheets.
 
If you are interested in how Xenomorph solves the problem of spreadsheet management, then take a look at our (newly updated) website. Here we explain how to solve the problem and how Xenomorph Spreadsheet Inside technology can bring unstructured spreadsheet data and complex calculations within a centralized data management system, increasing transparency and reducing operational risk.

25 January 2009

CEP in 2009

Interesting predictions for complex event processing (CEP) in 2009 (click here for link) - sounds like some form of reality is appearing in this area of the market, accelerated by the current financial crisis. Entry of bigger players and usage of LINQ in CEP will be interesting too.

06 October 2008

Transparency for troubled times

I came across this pair of quotes on a Google search, bringing data management into the context of the current financial crisis:

"Where is the wisdom? Lost in the knowledge. Where is the knowledge? Lost in the information." - T.S. Eliot

"Where is the information? Lost in the data. Where is the data? Lost in the ******* database." - Joe Celko

Here's to hoping that wisdom is not in short supply at the moment...

24 September 2008

Solid State Drives - the promise of a free lunch?

I read an interesting article a few weeks ago on the SQL Server Magazine web-site where the issue of Solid State Drives (SSD) and their potential to impact the future need to tune databases was being discussed.

The article raised the question that as SSD becomes more mainstream, and its capacity increases significantly, then could it eventually eliminate the need for database designers/administrators to have to optimise table structures to deliver acceptable levels of performance?

The argument used was along the lines that with SSD there's less traditional disk i/o going on (making reads a thousand times quicker than hard disks), so query performance levels may just be acceptable by virtue of the SSD memory delivering data quickly to the consumer process. This makes good sense, but also reminds me of previous technology advances in this area such as RAM disks and even paging files, which all promised such things but eventually needed cleverer system infrastructure around them to fulfil an overall business need.

That said, I have absolutely no doubt that SSD will make a significant impact on data storage access times (it has to). However, my guess is that it will just push the problem elsewhere. So, as much as we developers & technicians would like to think that it may deliver us a 'free lunch', I would suggest it's more likely to be a ‘free starter’ and that (sadly) we will continue to have much more work to do to produce the main course and dessert that will keep our customers happy and coming back for more... 

The original SQL Server Magazine article can be found at http://www.sqlmag.com/Articles/ArticleID/100181/100181.html

21 May 2008

Vhayu and Streambase - positioning clarified?

Partner announcement on Finextra with Vhayu and Streambase coming together:

http://www.finextra.com/fullpr.asp?id=21477

Defining what vendors mean by a "Data Management System" is difficult enough for clients, but in the area of the somewhat fuzzy technology definitions around automated trading it is interesting to see Streambase clarify their offering around CEP (and not database too, which was one of their first messages around bringing real-time and historic data together), and that Vhayu seems to be emphasising its tick database capabilities (and de-emphasising its original perception in the market as a CEP vendor).

30 April 2008

Sun and MySQL - implications for Oracle/SQL Server

Interesting article on Sun's $1billion acquisition of MySQL and how it may affect Oracle and SQL Server:

http://www.sqlmag.com/Articles/ArticleID/98951/98951.html?Ad=1

02 April 2008

Streaming Blue Genes...

The supercomputer continues to make a come-back - just up on Finextra with TD Bank testing IBM's Blue Gene supercomputers to amalgamate and analyse real-time structured and unstructured data:

http://www.finextra.com/fullstory.asp?id=18293

19 March 2008

Time Series inside SQL Server

Case study of some of the work we have been doing with Microsoft on hosting our time series storage inside SQL Server has just gone up on their site at:

http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=4000001637

12 March 2008

IQ grows revenue for Sybase

Interesting article saying that Sybase IQ revenues were up 70% in 2007, and formed a very significant part of overall revenues:

http://news.yahoo.com/s/cmp/20080301/tc_cmp/206901052

Also mentions Mike Stonebraker with his column-based database start-up, Vertica, and how one of the senior IBM technologists puts forward that the full benefits at the back-end of higher performance are often not seen by the end user, and so the complexity involved in proprietary solutions outweighs the benefit. Element of truth in both: a standards-based approach is preferred by most institutions, but I think financial markets are a special case where back-end performance is transparent to the user.

Are databases going green?

Just doing a bit of catching up with what is going on with Sybase IQ (1,000 terabyte benchmark sounds impressive) and came across the Wikipedia entry for this tech (http://en.wikipedia.org/wiki/Sybase_IQ) which mentions at the end that IQ's compression ability "...achieves a 90 percent reduction in CO2 emissions".

So now we have a column-oriented high performance database that is doing its bit to save the planet? I think I need to lie down for a bit and think how I can fit the Toyota Prius into our next marketing campaign...

Xenomorph: analytics and data management

About Xenomorph

Xenomorph is the leading provider of analytics and data management solutions to the financial markets. Risk, trading, quant research and IT staff use Xenomorph’s TimeScape analytics and data management solution at investment banks, hedge funds and asset management institutions across the world’s main financial centres.
