Containers Expo Blog: Article

Five Ways Data Virtualization Improves Data Warehousing

Data virtualization fills the EDW agility gap

Business intelligence (BI), predictive analytics, data and content mining, portals and other applications tap a growing volume of information sourced from enterprise data warehouses (EDWs).  However, significant volumes of business-critical enterprise data reside outside the enterprise data warehouse.  To deliver the most comprehensive information to business decision-makers, IT teams are implementing data virtualization to preserve and extend their existing enterprise data warehouse investments.

This article discusses five integration patterns that combine both enterprise data warehouses and data virtualization to solve real business and IT problems along with examples from Composite Software's data virtualization customers.  The five patterns include:

  1. Data Warehouse Augmentation
  2. Data Warehouse Federation
  3. Data Warehouse Hub and Virtual Data Mart Spoke
  4. Complementing the ETL Process
  5. Data Warehouse Prototyping

Maximizing Value from Enterprise Data Warehouse Investments
Supporting critical, yet ever-changing information requirements in an environment of ever-increasing data volumes and complexity is a challenge well understood by large enterprises and government agencies today.

This inexorable pressure has driven, and will continue to drive, the demand for enterprise data warehouses as an array of BI, predictive analytics, data and content mining, portals and other key applications rely on data sourced from enterprise data warehouses.

However, business change often outpaces enterprise data warehouse evolution.  And while the enterprise data warehouse is useful for physically consolidating and transforming a large portion of enterprise data, significant volumes of enterprise data reside outside its confines.  Further, enterprise data warehouses themselves require support throughout their lifecycles, driving demand for solutions that prototype, migrate, extend, federate and leverage enterprise data warehouse assets.

Data virtualization middleware, an advanced version of earlier data federation or enterprise information integration (EII) middleware, complements enterprise data warehouses by providing a range of flexible data integration techniques that preserve, extend and thereby drive greater business value from existing enterprise data warehouse investments.

1. Data Warehouse Augmentation
Organizations overwhelmed by scattered data silos and exponentially growing data volumes have deployed data warehouses to meet many of their reporting requirements.  However, a number of data sources remain outside the warehouse.  Providing users with complete business insight in support of revenue, cost and risk management goals often requires the following:

  • Historical data from the warehouse and up-to-the-minute data from transaction systems or operational data stores;
  • Summarized data from the warehouse and drill-down detail from transaction systems or operational data stores;
  • Master customer, product or employee data from an MDM hub or warehouse and detail from transaction systems or operational data stores; and
  • Internal data from the warehouse and external data from outside sources, including cloud-based sources.

Data virtualization federates data warehouse information with additional sources, thereby extending existing data warehouse schemas and data.  These complementary views make it possible to add current data to historical warehouse data, detailed data to summarized warehouse data, and external data to internal warehouse data.
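
The augmentation pattern can be sketched in a few lines of Python.  This is a minimal illustration, not any vendor's implementation: it assumes SQLite stands in for both the warehouse and the operational store, and the table, column and view names (well_history, well_status, well_production) and the sample rows are hypothetical.

```python
import sqlite3


def build_augmented_view() -> sqlite3.Connection:
    """Return a connection exposing one view over two separate stores."""
    # "main" plays the warehouse; "ops" plays an operational data store.
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS ops")

    # Hypothetical historical data held in the warehouse.
    conn.execute(
        "CREATE TABLE main.well_history (well_id TEXT, day TEXT, barrels REAL)"
    )
    conn.executemany(
        "INSERT INTO main.well_history VALUES (?, ?, ?)",
        [("W1", "2024-01-01", 120.0), ("W1", "2024-01-02", 118.0)],
    )

    # Hypothetical up-to-the-minute data held outside the warehouse.
    conn.execute(
        "CREATE TABLE ops.well_status (well_id TEXT, day TEXT, barrels REAL)"
    )
    conn.executemany(
        "INSERT INTO ops.well_status VALUES (?, ?, ?)",
        [("W1", "2024-01-03", 95.0)],
    )

    # In SQLite a TEMP view may reference tables in any attached database.
    # No rows are copied; the view is evaluated when it is queried.
    conn.execute(
        """
        CREATE TEMP VIEW well_production AS
        SELECT well_id, day, barrels, 'warehouse' AS source
        FROM main.well_history
        UNION ALL
        SELECT well_id, day, barrels, 'operational' AS source
        FROM ops.well_status
        """
    )
    return conn
```

A consumer would query well_production as if it were a single table, for example SELECT day, barrels, source FROM well_production ORDER BY day, without knowing which store holds which rows.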

Energy Company Combines Up-to-the-minute and Historical Data - To optimize deployment of repair crews and equipment across more than 10,000 production oil wells, an energy company uses data virtualization to federate real-time crew, equipment and well status data from their wells and SAP's maintenance management system with historical surface, subsurface and business data from their enterprise data warehouse.  The net result is faster repairs for more uptime and thus more revenue.

2. Data Warehouse Federation
A primary reason enterprises implement data warehouses is to overcome the various transaction and analytic system silos typical in most large enterprises and government agencies today.  However, for a number of often pragmatic reasons, the single "enterprise" data warehouse remains elusive.  Instead, for these same reasons, multiple data warehouses and data marts have been developed and deployed, in effect perpetuating, rather than overcoming, the data silo issue.

Optimizing business performance requires data from across these various warehouses and marts.  But physically combining multiple marts and warehouses into a single, complete enterprise-wide data warehouse is often too costly and time-consuming.

Data virtualization federates multiple physical warehouses.  Two examples include combining data from the sales and financial warehouses, or combining two sales data warehouses after a corporate merger. This approach achieves logical consolidation of warehouses by creating an integrated view across them, using abstraction to rationalize the different schema designs.
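
As a rough sketch of how an integrated view can rationalize two different schema designs, the Python below federates two stand-in sales warehouses.  Everything here is an assumption made for illustration: SQLite plays both warehouses, and the names (wh_a, wh_b, sales, orders, all_sales), the differing units and the sample rows are hypothetical.

```python
import sqlite3


def federate_sales() -> sqlite3.Connection:
    """Expose two differently designed sales warehouses through one view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS wh_a")
    conn.execute("ATTACH DATABASE ':memory:' AS wh_b")

    # Warehouse A stores amounts in dollars, keyed by cust_id.
    conn.execute("CREATE TABLE wh_a.sales (cust_id TEXT, amount_usd REAL)")
    conn.executemany(
        "INSERT INTO wh_a.sales VALUES (?, ?)",
        [("C1", 100.0), ("C2", 50.0)],
    )

    # Warehouse B (say, from an acquired company) stores cents, keyed by customer.
    conn.execute("CREATE TABLE wh_b.orders (customer TEXT, amount_cents INTEGER)")
    conn.executemany(
        "INSERT INTO wh_b.orders VALUES (?, ?)",
        [("C1", 2500), ("C3", 1000)],
    )

    # The view maps both designs onto one logical schema:
    # (customer_id, amount_usd, warehouse).
    conn.execute(
        """
        CREATE TEMP VIEW all_sales AS
        SELECT cust_id AS customer_id, amount_usd, 'wh_a' AS warehouse
        FROM wh_a.sales
        UNION ALL
        SELECT customer, amount_cents / 100.0, 'wh_b'
        FROM wh_b.orders
        """
    )
    return conn


def totals_by_customer(conn: sqlite3.Connection) -> dict:
    """Aggregate across both warehouses through the federated view."""
    rows = conn.execute(
        "SELECT customer_id, SUM(amount_usd) FROM all_sales GROUP BY customer_id"
    ).fetchall()
    return dict(rows)
```

Reports written against all_sales keep working if one of the underlying warehouses is later redesigned, as long as the view definition is updated to match.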

Investment Bank Federates Financial Trading Data Warehouses - To enable more flexible customer self-service reporting and meet SEC compliance reporting mandates, a prime brokerage uses data virtualization to federate equity, fixed income and other investment positions and trades information from siloed trading data warehouses.  The net result is higher customer satisfaction and lower reporting costs.

3. Data Warehouse Hub and Virtual Data Mart Spoke
A typical data warehouse pattern is a central data warehouse hub with satellite data marts as spokes around the hub.  These marts use a subset of the warehouse data and are used by a subset of the data warehouse users.   Sometimes these marts are created because the analytic tools require data in a different form than the warehouse.  On the other hand, they may be created to work around the controls provided by the warehouse, and thus act as "rogue" data marts.  Regardless of the reason, every additional mart adds cost and compromises data quality.

Data virtualization provides virtual data marts that eliminate, or at least significantly reduce, the need for physical data marts around the data warehouse hubs.  This approach abstracts the warehouse data to meet specific consuming tool and user query requirements, while still preserving the quality and controls inherent in the data warehouse.
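
A virtual data mart can be pictured as a named view over the hub rather than a copied subset.  The Python sketch below assumes SQLite as a stand-in for the warehouse; the table research_warehouse, the view tech_mart and the sample rows are hypothetical.

```python
import sqlite3


def build_virtual_mart() -> sqlite3.Connection:
    """Create a warehouse table and a virtual mart defined as a view on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE research_warehouse (
            ticker TEXT, sector TEXT, fiscal_year INTEGER,
            revenue REAL, eps REAL
        )
        """
    )
    conn.executemany(
        "INSERT INTO research_warehouse VALUES (?, ?, ?, ?, ?)",
        [
            ("AAA", "energy", 2023, 500.0, 2.1),
            ("BBB", "tech", 2023, 900.0, 4.0),
            ("CCC", "tech", 2023, 300.0, 1.2),
        ],
    )
    # The "mart" exposes only the rows and columns one analyst group needs.
    # It stores nothing itself, so there is no second copy to reconcile.
    conn.execute(
        """
        CREATE VIEW tech_mart AS
        SELECT ticker, fiscal_year, eps
        FROM research_warehouse
        WHERE sector = 'tech'
        """
    )
    return conn


def mart_row_count(conn: sqlite3.Connection) -> int:
    """Count the rows currently visible through the virtual mart."""
    return conn.execute("SELECT COUNT(*) FROM tech_mart").fetchone()[0]
```

Because the view reads the warehouse on demand, a row added to research_warehouse is visible through tech_mart immediately, with no refresh job.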

Mutual Fund Manager Eliminates "Rogue" Financial Data Marts - A mutual fund company uses data virtualization to enable more than 150 financial analysts to build portfolio analysis models with MATLAB® and other analysis tools, drawing on a wide range of equity financial data from a 10-terabyte financial research data warehouse.  Prior to introducing data virtualization, analysts frequently spawned new satellite data marts with useful data subsets for every new project.  To accelerate and simplify data access and to stop the proliferation of costly, unnecessary physical marts, the firm instead used data virtualization to create virtual data marts formed from a set of robust, reusable views that directly access the financial warehouse on demand.  This enables analysts to spend more time on analysis and less on access, thereby improving portfolio returns.  The IT team has also eliminated extra, unneeded marts and all the costs that go with maintaining them.

4. Complementing the ETL Process
Extract, Transform, and Load (ETL) middleware is the tool of choice for loading data warehouses.  However, there are some cases where ETL tools are not the most effective approach.  Some examples include:

  • ETL tools lack interfaces to easily access source data, for example data from packaged applications such as SAP or new technologies such as web services;
  • Readily available, existing virtual views or data services can be reused rather than building new ETL scripts from scratch; and
  • Tight batch windows require access, abstraction and federation activities to be pre-processed and virtually staged in advance of ETL processes.

ETL tools can leverage data virtualization views and data services as inputs to their batch processes; to the ETL tool, these appear as just another data source.  This integration pattern also brings in data source types that ETL tools cannot easily access and reuses existing views and services, saving time and cost.  Further, these abstractions do not require ETL developers to understand the structure of, or interact directly with, actual data sources, significantly simplifying their work and reducing time to solution.
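
The sketch below shows the idea of an ETL job reading from a virtual view instead of the raw source.  It is a simplified assumption-laden example: SQLite stands in for both source and target, and the cryptic source table raw_ledger, the view ledger_view, the target table fact_ledger and the functions build_source and run_etl are all hypothetical.

```python
import sqlite3


def build_source() -> sqlite3.Connection:
    """Create a source with an awkward layout plus a clean view over it."""
    src = sqlite3.connect(":memory:")
    # Hypothetical raw layout: terse column names, amounts stored as text.
    src.execute("CREATE TABLE raw_ledger (co_cd TEXT, doc_no TEXT, amt_txt TEXT)")
    src.executemany(
        "INSERT INTO raw_ledger VALUES (?, ?, ?)",
        [(" us01 ", "D-1", "100.50"), ("de02", "D-2", "75.25"), ("de02", "D-3", None)],
    )
    # The view hides the raw structure behind readable names and types.
    src.execute(
        """
        CREATE VIEW ledger_view AS
        SELECT co_cd AS company, doc_no AS document,
               CAST(amt_txt AS REAL) AS amount
        FROM raw_ledger
        """
    )
    return src


def run_etl(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Extract from the view, apply a light transform, load into the target."""
    rows = source.execute(
        "SELECT company, document, amount FROM ledger_view"
    ).fetchall()
    # Transform: normalize company codes and skip rows with no amount.
    cleaned = [
        (company.strip().upper(), document, amount)
        for company, document, amount in rows
        if amount is not None
    ]
    target.execute(
        "CREATE TABLE IF NOT EXISTS fact_ledger "
        "(company TEXT, document TEXT, amount REAL)"
    )
    target.executemany("INSERT INTO fact_ledger VALUES (?, ?, ?)", cleaned)
    target.commit()
    return len(cleaned)
```

The ETL code never names raw_ledger or its columns, so a change to the source layout is absorbed by redefining ledger_view rather than by editing the batch job.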

Energy Company Preprocesses SAP Data - To provide the SAP financial data required for their financial data warehouse, an energy company uses data virtualization to access and abstract SAP R/3 FICO data.  This replaces an error-prone, SAP data-expert-intensive, flat-file-extraction process that would not scale across a complex SAP landscape.  The results include more complete and timely data in the financial data warehouse enabling better performance management.

5. Data Warehouse Prototyping
Building a new data warehouse from scratch is a large undertaking that requires significant design, development and deployment efforts.  One of the biggest issues is schema change, a frequent activity early in a warehouse's lifecycle.   This change process requires modification of both the ETL scripts and physical data in the warehouse and thus becomes a bottleneck that slows new warehouse deployments.  This problem does not go away later in the lifecycle; it just lessens as the pace of change slows.

Data virtualization middleware can serve as the prototype development environment for a new data warehouse.  In this prototype stage, a virtual data warehouse is built, rather than a physical one, saving the time to build the physical warehouse.  This virtual warehouse includes a full schema that is easy to iterate as well as a complete functional testing environment.  Performance testing is somewhat constrained at this stage, however.

Once the actual warehouse is deployed, the views and data services built during the prototype stage still have value.  These are useful for prototyping and testing subsequent warehouse schema changes that arise as business needs or underlying data sources change.
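
One way to picture the prototyping cycle: the warehouse schema starts life as a view that can be redefined at will, and is turned into a physical table only once it settles.  The Python below is a minimal sketch under stated assumptions; SQLite stands in for the middleware, and src_orders, dw_orders, dw_orders_physical and the helper functions are hypothetical names.

```python
import sqlite3


def define_virtual_warehouse(conn: sqlite3.Connection, select_sql: str) -> None:
    """(Re)define the prototype schema as a view; no data is moved or reloaded."""
    conn.execute("DROP VIEW IF EXISTS dw_orders")
    conn.execute(f"CREATE VIEW dw_orders AS {select_sql}")


def materialize(conn: sqlite3.Connection) -> None:
    """Turn the settled virtual schema into a physical table."""
    conn.execute("DROP TABLE IF EXISTS dw_orders_physical")
    conn.execute("CREATE TABLE dw_orders_physical AS SELECT * FROM dw_orders")


def build_prototype() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE src_orders "
        "(order_id INTEGER, region TEXT, qty INTEGER, unit_price REAL)"
    )
    conn.executemany(
        "INSERT INTO src_orders VALUES (?, ?, ?, ?)",
        [(1, "east", 2, 10.0), (2, "west", 1, 40.0)],
    )
    # Iteration 1: a first guess at the schema.
    define_virtual_warehouse(conn, "SELECT order_id, region FROM src_orders")
    # Iteration 2: the business asks for order totals; only the view changes.
    define_virtual_warehouse(
        conn,
        "SELECT order_id, region, qty * unit_price AS total FROM src_orders",
    )
    # Once the schema stops changing, build the physical table from it.
    materialize(conn)
    return conn
```

The same define-then-materialize loop can be rerun later to test a proposed schema change against the view before touching the physical table.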

Government Agency Prototypes New Data Warehouses - To reduce data warehousing time-to-solution for new data warehouse projects and changes to existing ones, a government agency uses data virtualization.  Getting the data right this way has proven to be four times faster than directly building the ETL and warehouse, even when the subsequent translation of these working views into ETL scripts and physical warehouse schemas is factored in.

Key Takeaways
As data sources proliferate, including many web-based and cloud computing sources outside the traditional enterprise data warehouse, enterprises and government agencies are deploying solutions that combine enterprise data warehouses and data virtualization to deliver the most comprehensive information to decision-makers.  The results are extended life for existing information system investments, greater agility for adding new BI and other analytic technologies, and less disruption from corporate activities such as mergers and acquisitions.

More Stories By Robert Eve

Robert Eve is the EVP of Marketing at Composite Software, the data virtualization gold standard, and co-author of Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility. Bob's experience includes executive-level roles at leading enterprise software companies such as Mercury Interactive, PeopleSoft, and Oracle. Bob holds a Master of Science from the Massachusetts Institute of Technology and a Bachelor of Science from the University of California at Berkeley.
