Interest in data warehousing has been at fever pitch for some time now, as enterprises look to derive the greatest possible benefit from the enormous amounts of information they are generating.
But even as many of the top platform providers incorporate warehousing technology into their portfolios, signs are starting to emerge that the technology, at least as we recognize it today, may have a limited lifespan as information systems stretch beyond the data center.
The latest move into warehousing comes from EMC, which announced this week it is acquiring Greenplum, developer of a data warehousing platform that combines the open source PostgreSQL database with a massively parallel processing (MPP) architecture. The move is part of EMC’s broader strategy of expanding beyond raw storage into business intelligence (BI) and analytics – tools that add value to data stored on its traditional platforms.
We’ve seen this kind of integration before. Microsoft bought DATAllegro back in 2008, while IBM, SAP and Oracle have all snapped up warehousing firms in recent years (that would be, respectively, Cognos, Business Objects and Hyperion). SAP is even looking to further its reach in the market with the planned acquisition of Sybase. All this essentially leaves Teradata as the only large, pure-play warehousing firm on the market.
Ideally, each of these firms would like to lock in customers’ warehousing capabilities by tying them directly to its broader data center platform. But already, that kind of control is looking doubtful with the rise of new migration technologies that will allow enterprises to shift platforms with greater ease. Netezza, for example, just introduced the Netezza Migrator, which automates the process of migrating data from Oracle warehouses to its TwinFin appliance. The system is based on technology licensed from EnterpriseDB called the Postgres Plus Advanced Server, which allows data to be pulled out of Oracle warehouses while still maintaining full compatibility with legacy databases.
But does the traditional warehouse really have much of a future? Not if some of the latest BI software approaches catch fire, according to Technology Evaluation Centers’ Jorge Garcia. Companies like QlikTech and Prelytis are dumping the standard warehouse structure in favor of an analytic data model said to provide for much quicker analysis and a more robust visualization process. Under these architectures, the warehouse as a central metadata repository actually inhibits performance because it impedes the flow of data into the BI framework.
Indeed, the rapid expansion of metadata that accompanies the increase in actual data loads could overwhelm even the most advanced warehouse relatively soon, says Dr. Barry Devlin, a former IBM researcher and founder and principal of 9sight Consulting. For most relational databases, metadata is rarely used beyond the modeling stage anyway, with most analysis relying directly on hard data. Much better, then, to devise a combined data/metadata approach that allows for integrated access to both sets.
Does this mean you should hold off on an upcoming warehouse purchase or expansion? Not really, unless you’re keen on becoming a test subject for cutting-edge analytics. But it does mean that warehousing is not likely to be the ultimate business intelligence resource it was made out to be only a few years ago.
Technology changes, and even warehousing will have to change along with it if it hopes to stay relevant beyond the next decade.