Data Virtualization Helps Build Self-Reliance

Data virtualization can act as the bridge between IT and business by providing a common language for both groups

How Data Virtualization Helps Build Self-Reliance for Information Self-Service

Information self-service is undoubtedly one of the main drivers of Modern Data Management. From "data services marketplaces" to "self-service Big Data analytics," one of the objectives of most data-related initiatives today is to give business professionals new ways to meet their information needs, with the goals of achieving self-reliance and minimizing the IT bottleneck. However, is it realistic to expect business users to assume this job?

Studies [1] report that more than 60 percent of companies grade their experience with self-service initiatives as "average" or lower, with nearly three out of four (73 percent) claiming that "...it requires more training than expected." So, what is the problem and what can we do to solve it? Let's start with the easy part: data visualization, which is the last stage of the data analysis process. Self-service BI tools have been around for some years now, allowing business data analysts to create their own graphical reports. Although those tools are not for every business user, business analysts with data experience, basic knowledge of statistics, and a bit of SQL can use them successfully.

The problem is that these tools are only effective when users work on previously curated/integrated datasets, such as the ones we typically find in data marts and other tightly controlled data repositories. In those environments, the source data has been carefully transformed and integrated so it can be presented in a business-friendly form. Table names correspond with business entities, column names and data formats follow business conventions, and all the data has been centralized in the same place.

Even the new breed of data preparation tools has problems similar to those of self-service BI tools. They enable data analysts to perform some simple data transformations as a step prior to visualization, but they inherently assume that all data has already been moved to the same repository and that the datasets are understandable to the user.

The problem with this, of course, is that only IT can create such curated, integrated repositories. While data analysts have some technical knowledge, they cannot be expected to understand the technical details of each source system: these systems use different data representation models (relational, NoSQL, HDFS, multidimensional, etc.) and different query languages, and each is suited to different types of queries. Neither can they be expected to understand technical naming conventions (e.g., SAP technical names) or specify complex connection details (e.g., Kerberos settings). In addition, their basic SQL knowledge is far from sufficient to write the complex queries and transformations required to obtain data from the original sources in business-friendly form. Moreover, in most cases the data for a report is distributed across several systems, which further complicates things.

Where does this leave self-service? As soon as the user deviates a bit from what the designers of the curated repository had in mind, we are back to the same old process. IT needs to get involved, with the added problem that now they have more users to worry about. It's no surprise that the above-mentioned report also lists, among the main problems, that self-service can "...spawn more requests to IT than before."

Data virtualization (DV) provides a way out of this apparent dead end by separating IT-related integration concerns from business-related ones. The key point is that it allows IT to easily create reusable, virtual data views that expose the data in business-friendly form. These virtual datasets appear to the analyst as if they came from a single system with a consistent data representation model and query language. They use business terminology instead of technical terminology, and hide the complex transformations that are needed to present data in a form familiar to business users. When descriptive metadata does not exist in the data sources, data virtualization allows the user to add it to these virtual views, for example business-friendly column descriptions and the relationships between different datasets.
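As a rough illustration of the idea, the following minimal sketch uses Python's built-in sqlite3 module, with a single in-memory database standing in for a source system. A real data virtualization platform would span several heterogeneous sources; the table, view, and column names here, as well as the `column_descriptions` dictionary, are assumptions made up for the example.

```python
import sqlite3

# One in-memory database stands in for a source system (assumption).
conn = sqlite3.connect(":memory:")

# Source table with technical, SAP-style names (illustrative only).
conn.execute("CREATE TABLE KNA1 (KUNNR TEXT, NAME1 TEXT, LAND1 TEXT)")
conn.executemany(
    "INSERT INTO KNA1 VALUES (?, ?, ?)",
    [("0001", "Acme Corp", "US"), ("0002", "Globex", "ES")],
)

# The "IT side": a reusable view that exposes business-friendly names
# and hides the technical ones. No data is copied.
conn.execute(
    """
    CREATE VIEW customer AS
    SELECT KUNNR AS customer_id,
           NAME1 AS customer_name,
           LAND1 AS country_code
    FROM KNA1
    """
)

# Hypothetical metadata added on top of the view, since the source has none.
column_descriptions = {
    "customer_id": "Unique customer number",
    "customer_name": "Legal name of the customer",
    "country_code": "Two-letter country of the customer",
}


def customers_in(country_code):
    """The "analyst side": query the view using business terms only."""
    rows = conn.execute(
        "SELECT customer_name FROM customer "
        "WHERE country_code = ? ORDER BY customer_name",
        (country_code,),
    )
    return [row[0] for row in rows]


print(customers_in("US"))  # ['Acme Corp']
```

The analyst-facing function never mentions KNA1 or its column names, which is the separation of concerns described above.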

It's also easy to expose different logical views over the same physical data, adapted to the needs of different business units and at different levels of granularity. Because data virtualization requires no data replication, this process is much faster than traditional data integration strategies. Without data virtualization, a new physical repository would be required for each business unit and each desired data granularity, a process well known to be slow, costly, and prone to data inconsistencies. In contrast, new logical views can be created with data virtualization at almost negligible cost.
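The point about several logical views over one physical dataset can be sketched in a few lines. This is only an illustration under simplifying assumptions: an in-memory SQLite table stands in for the physical data, and the `sales` table, both view names, and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A single physical table (illustrative data).
conn.execute("CREATE TABLE sales (region TEXT, sale_month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("EMEA", "2017-01", 100.0),
        ("EMEA", "2017-02", 150.0),
        ("APAC", "2017-01", 80.0),
    ],
)

# Two logical views at different granularity; neither copies any data.
conn.execute(
    """
    CREATE VIEW sales_by_region_month AS
    SELECT region, sale_month, SUM(amount) AS total
    FROM sales GROUP BY region, sale_month
    """
)
conn.execute(
    """
    CREATE VIEW sales_by_region AS
    SELECT region, SUM(amount) AS total
    FROM sales GROUP BY region
    """
)


def totals_by_region():
    """Coarse-grained view, e.g. for an executive summary."""
    return dict(conn.execute("SELECT region, total FROM sales_by_region"))


def monthly_totals(region):
    """Fine-grained view, e.g. for a regional analyst."""
    rows = conn.execute(
        "SELECT sale_month, total FROM sales_by_region_month "
        "WHERE region = ? ORDER BY sale_month",
        (region,),
    )
    return rows.fetchall()


print(totals_by_region())      # {'EMEA': 250.0, 'APAC': 80.0} (key order may vary)
print(monthly_totals("EMEA"))  # [('2017-01', 100.0), ('2017-02', 150.0)]
```

Because both views read from the same table, a new row inserted into `sales` shows up in each of them immediately, with no second copy to keep in sync.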

Finally, with this approach, data analysts are insulated from changes in the underlying infrastructure. For instance, they will not notice if the data for a report now originates from Hadoop instead of the data warehouse.

To create these business-friendly views, IT manages the details of connecting to data sources, defining data transformations, and combining information from several data sources. Unlike self-service BI and data preparation tools, data virtualization tools apply sophisticated distributed query execution techniques to ensure optimal performance even when dealing with very large data sets distributed across several data sources (see my posts at datavirtualizationblog.com for details).

Data virtualization also provides IT with a single entry point for monitoring usage, and for creating virtual "trust domains" to apply different security and governance rules for different types of users. For example, data analysts performing an exploratory job for their own use can benefit from "light" governance rules, while shared reports go through a "hard" governance process. Notice that governance is key to avoiding another of the problems mentioned in the above report: "report chaos," which consists of different business users having supposedly similar reports that, in practice, offer different results.

Data virtualization also provides a single entry point to establish workload management policies that set limits on resource usage. For example, IT may want to set limits on the number of requests that a certain business unit can make against a certain data source.
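A request limit of this kind can be pictured with a small counter. The sketch below is not how any particular product implements workload management; the `RequestQuota` class, the unit and source names, and the limit of two requests are all assumptions chosen for illustration.

```python
from collections import defaultdict


class RequestQuota:
    """Counts requests per (business unit, data source) and enforces a cap."""

    def __init__(self, limits):
        # limits maps (unit, source) -> maximum number of allowed requests.
        self.limits = limits
        self.counts = defaultdict(int)

    def allow(self, unit, source):
        """Return True and record the request if it is within the limit."""
        key = (unit, source)
        limit = self.limits.get(key)
        if limit is not None and self.counts[key] >= limit:
            return False  # over quota: reject without counting
        self.counts[key] += 1
        return True


# Hypothetical policy: marketing may send two requests to the warehouse.
quota = RequestQuota({("marketing", "data_warehouse"): 2})
results = [quota.allow("marketing", "data_warehouse") for _ in range(3)]
print(results)  # [True, True, False]
```

Pairs with no configured limit are always allowed in this sketch; a production policy would more likely reset counts per time window and apply a default cap.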

Data virtualization not only helps IT; it also makes it easier for data analysts to search and discover data. Best-of-breed DV tools allow browsing of all datasets and their relationships, and provide a Google-like interface to search both data and metadata across all the datasets.
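To make the discovery idea concrete, here is a deliberately simple sketch of keyword search over view metadata. It covers metadata only, not the data itself, and the `CATALOG` structure, its entries, and the `search_catalog` function are hypothetical rather than taken from any DV tool.

```python
# Hypothetical catalog of virtual views with business-friendly metadata.
CATALOG = [
    {
        "view": "customer",
        "description": "Customer master data",
        "columns": {
            "customer_id": "Unique customer number",
            "country_code": "Two-letter country of the customer",
        },
    },
    {
        "view": "sales_by_region",
        "description": "Total sales amount per region",
        "columns": {
            "region": "Sales region",
            "total": "Sum of sales amounts",
        },
    },
]


def search_catalog(term):
    """Return names of views whose name, description, or columns mention term."""
    term = term.lower()
    hits = []
    for entry in CATALOG:
        texts = [
            entry["view"],
            entry["description"],
            *entry["columns"].keys(),
            *entry["columns"].values(),
        ]
        if any(term in text.lower() for text in texts):
            hits.append(entry["view"])
    return hits


print(search_catalog("country"))  # ['customer']
print(search_catalog("sales"))    # ['sales_by_region']
```

A plain substring match keeps the example short; a real search interface would typically add ranking and an index over the data values as well.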

The capability of creating different logical views over the same data is also crucial for enabling iterative processes. Using data virtualization, the cost of iterating over several versions of a report is very low. This also changes the interaction with IT when a new component gains enough usage to be worth "operationalizing," or when we need a new business-friendly view to be added. In this instance, instead of sending a written request, the analyst can share a prototype of the desired new component.

In summary, data virtualization can act as the bridge between IT and business by providing a common language for both groups, enabling an effective division of work where each group focuses on its core functions, and making it easier for them to collaborate. DV also allows IT to monitor and govern the process, avoid chaos and provide help when needed.


More Stories By Alberto Pan

Alberto Pan is Chief Technical Officer at Denodo and Associate Professor at University of A Coruña. He leads Product Development tasks for all versions of the Denodo Platform. He has authored more than 25 scientific papers in areas such as data virtualization, data integration and web automation.
