Heavyweight Data Management...
...I am very concerned that I have previously missed an important requirement for data management solutions - a heavyweight one judging by this great discussion on one of the Microsoft forums.
It seems Microsoft have now gone public on the Microsoft TechEd site that they have a Complex Event Processing (CEP) engine that will be coming to market shortly (see MagmaSystems blog post). One of my colleagues, Mark Woodgate, attended a briefing event at Microsoft for this technology back in February this year - here's an extract from some internal notes that Mark made back then:
"Microsoft CEP is very similar to StreamBase conceptually (and not unsurprisingly), in the sense that there are adapters and streams and how you merge and split them via some kind of query language is the same. However, StreamBase uses the StreamSQL which as we have seen is SQL-like in syntax but Microsoft CEP uses LINQ and .NET and although conceptually it is doing the same thing, it does not look the same. StreamBase’s argument was you can be an SQL programmer to use it and don’t need lower-level like .NET; however, it’s not SQL really as it has all these ‘extensions’ you have to learn so using .NET might look more tricky but in fact it makes sense. They don’t have a sexy GUI yet for designing CEP applications like StreamBase but it will be done in Visual Studio 2008.
Currently, you build various assemblies (I/O adapters, queries and functions) and then bolt them all together, called ‘binding’ by command line tool. You then deploy the application onto one or more machines using another tool so it’s a manual process right now. They are aware this needs to be made easier and more visual. They are allowing other libraries to be bolted in via the various SDKs so it’s pretty open and flexible. It works well with HPC and clusters/grids (or so they say) and of course can be used with SQL Server. The CEP engine also has a web interface based on SOAP so at least non-Windows based systems can talk to it"
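To make the adapter/stream/query idea in Mark's notes a little more concrete, here is a minimal sketch in plain Python of merging two time-ordered input streams and running a simple windowed query over the result. It is not Microsoft CEP, LINQ or StreamSQL code; the `Tick` type, the function names and the sample feeds are all hypothetical, chosen purely for illustration.

```python
import heapq
from collections import defaultdict
from typing import Iterable, Iterator, NamedTuple, Tuple


class Tick(NamedTuple):
    """Hypothetical event type, as an input adapter might deliver it."""
    ts: float      # event time in seconds
    symbol: str
    price: float


def merge_streams(*streams: Iterable[Tick]) -> Iterator[Tick]:
    """Merge several time-ordered streams into one time-ordered stream."""
    return heapq.merge(*streams, key=lambda t: t.ts)


def tumbling_average(stream: Iterable[Tick],
                     window: float) -> Iterator[Tuple[float, str, float]]:
    """Yield (window_start, symbol, mean price) per fixed, non-overlapping window."""
    current_start = None
    sums = defaultdict(lambda: [0.0, 0])  # symbol -> [price total, tick count]
    for tick in stream:
        start = (tick.ts // window) * window
        if current_start is not None and start != current_start:
            # The window has closed: emit its results and reset the state.
            for symbol, (total, count) in sorted(sums.items()):
                yield current_start, symbol, total / count
            sums.clear()
        current_start = start
        sums[tick.symbol][0] += tick.price
        sums[tick.symbol][1] += 1
    if current_start is not None:
        # Flush the final, still-open window.
        for symbol, (total, count) in sorted(sums.items()):
            yield current_start, symbol, total / count


# Two made-up feeds standing in for input adapters.
feed_a = [Tick(0.2, "ABC", 10.0), Tick(1.5, "ABC", 12.0)]
feed_b = [Tick(0.7, "ABC", 11.0), Tick(1.1, "XYZ", 20.0)]

results = list(tumbling_average(merge_streams(feed_a, feed_b), window=1.0))
for row in results:
    print(row)
```

A real engine would add the deployment, binding and output-adapter pieces the notes describe; this only shows the "merge streams, then query them" shape that both products share conceptually.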
The release of this technology will be an interesting addition to the CEP market and to the Microsoft technology stack in general. Assuming performance is at credible levels (i.e. not necessarily leading but not appalling either) it will certainly bring both technical and commercial pressure to bear on existing CEP vendors (see earlier post on Aleri/Coral8) and has the potential to broaden the usage of CEP. Obviously Linux-Lovers (sorry, I didn't mean to be personal...) will not agree with this, but Microsoft is putting together an interesting stack of technology when you see this CEP engine, Microsoft HPC and Microsoft Velocity coming together under .NET.
Another article about the operational risk generated by the usage of spreadsheets within the financial markets (see earlier posts) appeared in the April issue of Waters Magazine.
The article highlights how widely spreadsheets are used within financial institutions and suggests that the current regulatory requirements for more transparency and ad-hoc risk management might push the proliferation of spreadsheets even further. The article also refers to the progress and improvements made by Microsoft in recent versions of Excel to increase the security of spreadsheets.
Xenomorph has worked closely with Microsoft on hosting its time series database within SQL Server 2008. The case study we have written together describes how SQL Server 2008 offers integration within Office Excel 2007 so that whilst the spreadsheet is still the end-user viewing tool, operational risk is reduced by engaging Excel 2007 as an analytics and reporting tool and not as a means of storing data.
Our TimeScape solution offers more than 700 easy to use add-in functions to Office Excel 2007 and we are currently working on the use of Excel Services, part of Microsoft Office SharePoint Server 2007, to further enhance the centralized approach to spreadsheets.
If you are interested in how Xenomorph solves the problem of spreadsheet management, then take a look at our (newly updated) website. Here we explain how to solve the problem and how Xenomorph Spreadsheet Inside technology can bring unstructured spreadsheet data and complex calculations within a centralized data management system, increasing transparency and reducing operational risk.
Interesting predictions for complex event processing (CEP) in 2009 (click here for link) - sounds like some form of reality is appearing in this area of the market, accelerated by the current financial crisis. Entry of bigger players and usage of LINQ in CEP will be interesting too.
I came across this pair of quotes on a Google search, bringing data management into the context of the current financial crisis:
"Where is the wisdom? Lost in the knowledge. Where is the knowledge? Lost in the information." - T.S. Eliot
"Where is the information? Lost in the data. Where is the data? Lost in the ******* database." - Joe Celko
Here's to hoping that wisdom is not in short supply at the moment...
I read an interesting article a few weeks ago on the SQL Server Magazine web-site where the issue of Solid State Drives (SSD) and their potential to impact the future need to tune databases was being discussed.
The article asked whether, as SSD becomes more mainstream and its capacity increases significantly, it could eventually eliminate the need for database designers/administrators to optimise table structures to deliver acceptable levels of performance.
The argument used was along the lines that with SSD there's less traditional disk I/O going on (making reads a thousand times quicker than hard disks), so query performance levels may just be acceptable by virtue of the SSD memory delivering data quickly to the consumer process. This makes good sense, but also reminds me of previous technology advances in this area such as RAM disks and even paging files, which all promised such things but eventually needed cleverer system infrastructure around them to fulfil an overall business need.
That said, I have absolutely no doubt that SSD will make a significant impact on data storage access times (it has to). However, my guess is that it will just push the problem elsewhere. So, as much as we developers & technicians would like to think that it may deliver us a 'free lunch', I would suggest it's more likely to be a ‘free starter’ and that (sadly) we will continue to have much more work to do to produce the main course and dessert that will keep our customers happy and coming back for more...
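One way to put rough numbers on the 'free starter' argument is Amdahl's law: if only the I/O part of a query gets faster, the overall gain is capped by everything else. The sketch below takes the "thousand times quicker" figure from the article as given; the I/O fractions are purely hypothetical and are not measurements of any real workload.

```python
def overall_speedup(io_fraction: float, io_speedup: float) -> float:
    """Amdahl's law: whole-query speedup when only the I/O share gets faster.

    io_fraction -- share of the original query time spent on disk I/O (0..1)
    io_speedup  -- how many times faster that I/O becomes
    """
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    if io_speedup <= 0.0:
        raise ValueError("io_speedup must be positive")
    return 1.0 / ((1.0 - io_fraction) + io_fraction / io_speedup)


SSD_READ_SPEEDUP = 1000.0  # the "thousand times quicker" figure quoted above

# Hypothetical shares of query time spent waiting on disk.
for fraction in (0.5, 0.9, 0.99):
    gain = overall_speedup(fraction, SSD_READ_SPEEDUP)
    print(f"I/O share {fraction:.0%}: overall speedup {gain:.1f}x")
```

Under these assumptions a query that spends half its time on disk ends up just under twice as fast, and even one that is 90% I/O-bound gains less than tenfold; the remaining time is CPU, locking, network and so on, which is where the problem gets pushed.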
The original SQL Server Magazine article can be found at http://www.sqlmag.com/Articles/ArticleID/100181/100181.html
Partner announcement on Finextra with Vhayu and Streambase coming together:
http://www.finextra.com/fullpr.asp?id=21477
Defining what vendors mean by a "Data Management System" is difficult enough for clients, but in the area of the somewhat fuzzy technology definitions around automated trading it is interesting to see Streambase clarify their offering around CEP (and not database too, which was one of their first messages around bringing real-time and historic data together), and that Vhayu seems to be emphasising its tick database capabilities (and de-emphasising its original perception in the market as a CEP vendor).
Interesting article on Sun's $1billion acquisition of MySQL and how it may affect Oracle and SQL Server:
http://www.sqlmag.com/Articles/ArticleID/98951/98951.html?Ad=1
The supercomputer continues to make a come-back - just up on Finextra with TD Bank testing IBM's Blue Gene supercomputers to amalgamate and analyse real-time structured and unstructured data:
Case study of some of the work we have been doing with Microsoft on hosting our time series storage inside SQL Server has just gone up on their site at:
http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=4000001637
Interesting article saying that Sybase IQ revenues were up 70% in 2007, and formed a very significant part of overall revenues:
http://news.yahoo.com/s/cmp/20080301/tc_cmp/206901052
The article also mentions Mike Stonebraker with his column-based database start-up, Vertica, and how one of the senior IBM technologists puts forward that the full benefits at the back-end of higher performance are often not seen by the end user, and so the complexity involved in proprietary solutions outweighs the benefit. There is an element of truth in both: a standards-based approach is preferred by most institutions, but I think financial markets are a special case where back-end performance is transparent to the user.
Just doing a bit of catching up with what is going on with Sybase IQ (1,000 terabyte benchmark sounds impressive) and came across the Wikipedia entry for this tech (http://en.wikipedia.org/wiki/Sybase_IQ) which mentions at the end that IQ's compression ability "...achieves a 90 percent reduction in CO2 emissions".
So now we have a column-oriented high performance database that is doing its bit to save the planet? I think I need to lie down for a bit and think how I can fit the Toyota Prius into our next marketing campaign...
Xenomorph is the leading provider of data and analytics management solutions to the financial markets. Risk, trading, quant research and IT staff use Xenomorph’s TimeScape data and analytics management solution at investment banks, hedge funds and asset management institutions across the world’s main financial centres.