Ten Mistakes to Avoid When Virtualizing Data (Revisited)

New guidance and insights three years later

In late 2008, I wrote the cover article for the November edition of Virtualization Journal.

Ten Mistakes to Avoid When Virtualizing Data described the ten most common mistakes made by data virtualization's early adopters.

My objective was to provide ‘lessons learned’ guidance that would help new data virtualization users accelerate their success and benefits realization.

Fast Forward to 2011
In the nearly three years since, both data virtualization technology and its adoption have advanced significantly.  Early adopters have expanded their data virtualization deployments to a far wider set of use cases.  Hundreds of enterprises across multiple industry segments, as well as dozens of federal government agencies, have started similar data virtualization journeys.  The following articles showcase a subset of this success:

Industry Analysts Report Data Virtualization Acceleration
Industry analysts also recognize this acceleration.  According to a June 2011 Forrester Research report, Data Virtualization Reaches Critical Mass: Technology Advancements, New Patterns, And Customer Successes Make This Enterprise Technology Both A Short- And Long-Term Solution, data virtualization has reached critical mass, with adoption expected to accelerate over the coming 18-30 months as new usage patterns and successes increase awareness and interest.

The July 2011 Gartner Hype Cycle for Data Management reports that data virtualization has moved into the Slope of Enlightenment, with mainstream adoption expected within the next two to five years.

Looking Back - The Ten Mistakes from 2008
Let's consider the ten mistakes identified in the 2008 article.  Determining where and when to use data virtualization was the source of five common mistakes.  Implementing data virtualization, from the design and enabling technology points of view, was the source of three potential mistakes.  Failing to determine who implements it and failing to correctly estimate how much value may result were also common mistakes.

  • Are these the same mistakes data virtualization adopters are making today?
  • If so, what additional advice and insight is available today to complement this earlier counsel and mitigate negative impacts?
  • If not, are there other mistakes that are more relevant today?

Mistake #1 - Trying to Virtualize Too Much
Data virtualization, like storage, server, and application virtualization, delivers significant top- and bottom-line benefits.  However, data virtualization is not the right solution for every data integration problem.  For instance, when the use case requires multidimensional analysis, pre-aggregating the data using physical data consolidation is a more effective, albeit more expensive, approach.

Trying to virtualize too much has only recently become a common mistake, typically among the most successful data virtualization adopters.  For an updated look at this topic, check out When Should We Use Data Virtualization? and Successful data integration projects require a diverse approach.  These articles provide updated counsel and tools for making data virtualization versus data consolidation decisions.

Mistake #2 - Failing to Virtualize Enough
Failing to virtualize enough carries a large opportunity cost because physical data consolidation necessitates longer time-to-solution, more costly development and operations, and lower business and IT agility.

This continues as perhaps the biggest mistake today.  The main issue is that familiarity with other data integration approaches closes one's mind to better options.  To counteract this tendency, become more adept at evaluating data virtualization's measurable impacts, especially in contrast to other integration approaches.  To better understand data virtualization's business and IT value propositions, take a look at How to Justify Data Virtualization Investments.

Mistake #3 - Missing the Hybrid Opportunity
In many cases, the best data integration solution is a combination of virtual and physical approaches.  There is no reason to be locked into one way or the other.   This remains true today.  For more insights into hybrid combinations of data virtualization and data warehousing, check out:

Mistake #4 - Assuming Perfect Data Is Prerequisite
Poor data quality was a pervasive problem in enterprises three years ago and remains so today.  While correcting all your data is the ultimate goal, most of the time enterprises settle for a clean data warehouse.  With source data left as is, they assume that the quality of virtualized data can never match the quality of warehouse data.

Nothing could be further from the truth.  And this myth has dissipated rapidly of late.  How Data Virtualization Improves Data Quality addresses the many ways enterprises are applying data virtualization as a solution to the data quality problem, rather than treating data quality as a reason to avoid data virtualization.  New capabilities developed over the past two years provide data virtualization platforms with a number of important data quality improvement mechanisms and techniques that complement and extend data quality tools.
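To make this concrete, here is a minimal sketch of the on-the-fly standardization idea, using Python and SQLite rather than any particular vendor's platform; the customer table, phone column, and cleansing rule are all hypothetical.

import re
import sqlite3

# Hypothetical operational source with inconsistent phone formats.
# In practice the source system is left exactly as-is.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, phone TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "(555) 123-4567"), (2, "555.987.6543"), (3, "5551112222")],
)

def clean_phone(raw):
    # Standardize a 10-digit US phone number to NNN-NNN-NNNN at query time.
    digits = re.sub(r"\D", "", raw or "")
    return f"{digits[0:3]}-{digits[3:6]}-{digits[6:10]}" if len(digits) == 10 else None

# Register the cleansing rule and expose it through a virtual view, so
# consumers see standardized data while the source remains unmodified.
conn.create_function("clean_phone", 1, clean_phone)
conn.execute(
    "CREATE VIEW customers_v AS SELECT id, clean_phone(phone) AS phone FROM customers"
)

for row in conn.execute("SELECT * FROM customers_v"):
    print(row)  # e.g., (1, '555-123-4567')

The same pattern, applied at enterprise scale inside a data virtualization platform, lets quality rules travel with the virtual view rather than requiring a cleansed physical copy.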

Mistake #5 - Anticipating Negative Impact on Operational Systems
Although operational systems are often a data virtualization source, the run-time performance of these systems is not typically impacted as a result.  Yet, designers have been schooled to think about data volumes in terms of the size of the data warehouse or the throughput of the nightly ETLs.

When using a virtual approach, designers should instead consider the size of any individual query and how often these queries will run.  If the queries are relatively small (for example, 100,000 rows) and broad (across multiple systems and/or tables), or run relatively infrequently (several hundred times per day), then the impact on operational systems will be light.
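As a back-of-the-envelope illustration of this sizing exercise (the workload numbers below are hypothetical, not benchmarks), a short Python calculation makes the comparison with a nightly batch explicit:

# Hypothetical federated workload against one operational source.
federated_queries_per_day = 300      # "several hundred times per day"
rows_per_query = 100_000             # "relatively small" per-query result

# Hypothetical nightly ETL extract from the same source.
etl_rows_per_night = 50_000_000

virtual_rows_per_day = federated_queries_per_day * rows_per_query
print(f"Federated load: {virtual_rows_per_day:,} rows/day")   # 30,000,000
print(f"Nightly ETL:    {etl_rows_per_night:,} rows/night")

# The federated load is also spread across the business day rather than
# concentrated in a single batch window, so the peak demand on any one
# source is typically lighter still.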

Even these and larger queries are less of an issue today.  Data virtualization query performance is faster than ever due to many internal advancements and general technology improvements (Moore's Law).  These improvements are starting to eliminate this mistake from the top ten.

As an example of these advancements, Composite Software's recent Composite 6 release included a number of innovative optimization and caching techniques.  Why Query Optimization Matters provides a good summary of the state of the art in query optimization.

Mistake #6 - Failing to Simplify the Problem
While the enterprise data environment is understandably complex, it is unnecessary to develop complex data virtualization solutions.  The most successful data virtualization projects are broken into smaller components, each addressing pieces of the overall need.  This simplification can occur in two ways: by leveraging tools and by right-sizing integration components.

Roles and Reference Architecture for Data Abstraction Success is an article that directly addresses this recurring mistake with common sense advice about using a well-organized team and data virtualization reference architecture to rationalize complex data landscapes into a set of reusable data objects.
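As a minimal sketch of that layering (plain SQL views standing in for a data virtualization platform's data objects; every table, view, and column name here is hypothetical), each layer below is a reusable component built only from the layer beneath it:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical physical sources from two different systems.
CREATE TABLE crm_accounts (acct_id INTEGER, name TEXT, region TEXT);
CREATE TABLE erp_orders  (order_id INTEGER, acct_id INTEGER, amount REAL);

-- Layer 1: source views that standardize names and shapes per system.
CREATE VIEW v_account AS
  SELECT acct_id AS account_id, name, region FROM crm_accounts;
CREATE VIEW v_order AS
  SELECT order_id, acct_id AS account_id, amount FROM erp_orders;

-- Layer 2: a reusable business object built only from layer-1 views.
CREATE VIEW v_account_revenue AS
  SELECT a.account_id, a.name, a.region, SUM(o.amount) AS revenue
  FROM v_account a JOIN v_order o USING (account_id)
  GROUP BY a.account_id, a.name, a.region;

-- Layer 3: consumer-specific views select only from business objects.
CREATE VIEW v_emea_revenue AS
  SELECT * FROM v_account_revenue WHERE region = 'EMEA';
""")

conn.executemany("INSERT INTO crm_accounts VALUES (?, ?, ?)",
                 [(1, "Acme", "EMEA"), (2, "Globex", "AMER")])
conn.executemany("INSERT INTO erp_orders VALUES (?, ?, ?)",
                 [(10, 1, 500.0), (11, 1, 250.0), (12, 2, 900.0)])
print(conn.execute("SELECT * FROM v_emea_revenue").fetchall())
# [(1, 'Acme', 'EMEA', 750.0)]

Because each component is small and right-sized, a change in a source system is absorbed in layer 1 without disturbing the business objects or their consumers.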

Mistake #7 - Treating SQL/Relational and XML/Hierarchical as Separate Silos
Historically, data integration focused on supporting business intelligence needs, whereas process integration focused on optimizing business processes.  These two divergent approaches led to different architectures, tools, middleware, methods, teams and more.  However, because today's data virtualization middleware is equally adept at relational and hierarchical data, it is a mistake to silo work on these key data forms.

Over the past three years, these technology silos have broken down to support business requirements that cross them.  How Data Virtualization Increases Business Intelligence Agility identifies a number of ways that data virtualization can federate relational and hierarchical data sources.
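As a toy illustration of such federation (Python standing in for the middleware; the XML order feed and customer table are hypothetical), the sketch below flattens a hierarchical document and joins it to relational rows:

import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical hierarchical source: an XML order feed.
orders_xml = """
<orders>
  <order id="100"><customer>1</customer><total>250.00</total></order>
  <order id="101"><customer>2</customer><total>75.50</total></order>
</orders>
"""

# Hypothetical relational source: a customer master table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Flatten the hierarchy into rows, then join with the relational data.
customer_names = dict(conn.execute("SELECT id, name FROM customers"))
for order in ET.fromstring(orders_xml):
    cust_id = int(order.findtext("customer"))
    print(order.get("id"), customer_names.get(cust_id), order.findtext("total"))
# 100 Acme 250.00
# 101 Globex 75.50

A data virtualization platform performs the equivalent mapping declaratively, exposing the combined result as either a relational view or a hierarchical service.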

Mistake #8 - Implementing Data Virtualization Using the Wrong Infrastructure
The loose coupling of data services in a service-oriented architecture (SOA) environment is an excellent and frequent use for data virtualization.  However, there is sometimes confusion about when to deploy enterprise service bus (ESB) middleware and when to use data virtualization platforms to design and run the data services typically required.

There is greater clarity here today than three years ago, and as such fewer organizations now make this mistake.  SOA + Data Virtualization = Enterprise Data Sharing and Data Services Platforms--Bringing Order to Chaos provide advice on how best-in-class data virtualization implementations leverage SOA principles and technologies.
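For a sense of what a data service looks like at the code level (a deliberately minimal sketch using Python's standard library, with a local table standing in for a federated view; all names are hypothetical), the snippet below exposes a virtual view as a read-only JSON service:

import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

# A local table stands in for the federated view a platform would provide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_v (sku TEXT, name TEXT, price REAL)")
conn.execute("INSERT INTO product_v VALUES ('A1', 'Widget', 9.99)")

class DataService(BaseHTTPRequestHandler):
    # Serves the virtual view as JSON on every GET request.
    def do_GET(self):
        rows = [dict(zip(("sku", "name", "price"), r))
                for r in conn.execute("SELECT * FROM product_v")]
        body = json.dumps(rows).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), DataService).serve_forever()

An ESB would typically sit in front of many such services to handle routing and orchestration, while the data virtualization platform does the federation work behind the service interface.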

Mistake #9 - Segregating Data Virtualization People and Processes
As physical data consolidation technology and approaches have matured, so too have the Integration Competency Centers (ICCs) that support them.  It was a mistake to assume that these ICCs could not or should not also be leveraged in support of data virtualization.

This mistake has been recognized.  Today, data virtualization strategy, design, development, and deployment are often delivered side-by-side with other integration techniques within a larger ICC.  What Is the "Best" Data Virtualization Best Practice? highlights the importance of this integrated approach to the data virtualization competency center.

Mistake #10 - Failing to Identify and Communicate Benefits
While data virtualization can accelerate new development, speed iterative changes, and reduce both development and operating costs, it is a mistake to assume these benefits sell themselves.

This remains true today.  What's So Great About Data Virtualization? provides an excellent summary of the challenges data virtualization addresses as well as the benefits delivered.  In addition, others' data virtualization successes can provide a lens through which you can view your own.  Use these as guides when communicating your data virtualization successes.

Netting It Out: 2011 vs. 2008
Data virtualization's early adopters gained critical knowledge when implementing their data virtualization solutions.  Mistakes were made.  But lessons were learned.

Many of the mistakes experienced back then are no longer an issue today.  And those that remain have been mitigated with improved technology and implementation best practices.

To err is human.  But if you are willing to learn from your peers, your errors with data virtualization will be less frequent.

More Stories By Robert Eve

Robert Eve is the EVP of Marketing at Composite Software, the data virtualization gold standard, and co-author of Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility.  Bob's experience includes executive-level roles at leading enterprise software companies such as Mercury Interactive, PeopleSoft, and Oracle.  Bob holds a Master of Science from the Massachusetts Institute of Technology and a Bachelor of Science from the University of California at Berkeley.
