Welcome!

Containers Expo Blog Authors: Elizabeth White, JP Morgenthal, Pat Romanski, Donald Meyer, Reinhard Brandstädter

Related Topics: @BigDataExpo, Microservices Expo, Containers Expo Blog, IoT User Interface, @CloudExpo, SDN Journal

@BigDataExpo: Article

From Metrics to Success

Ford scours for more big data to bolster quality, improve manufacturing, streamline processes

Ford has exploited the strengths of big data analytics by directing them internally to improve business results. In doing so, they scour the metrics from the company’s best processes across myriad manufacturing efforts and through detailed outputs from in-use automobiles -- all to improve and help transform their business.

So explains Michael Cavaretta, PhD, Technical Leader of Predictive Analytics for Ford Research and Advanced Engineering in Dearborn, Michigan. Cavaretta is one of a group of experts assembled this week for The Open Group Conference in Newport Beach, California.

Cavaretta has led multiple data-analytic projects at Ford to break down silos inside the company to best define Ford’s most fruitful data sets. Ford has successfully aggregated customer feedback, and extracted all the internal data to predict how best new features in technologies will improve their cars.

As a contributor to the The Open Group conference and its focus on "Big Data -- The Transformation We Need to Embrace Today," Cavaretta explains how big data is fostering business transformation by allowing deeper insights into more types of data efficiently, and thereby improving processes, quality control, and customer satisfaction.

The interview was moderated by Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: The Open Group is a sponsor of BriefingsDirect podcasts.]

Here are some excerpts:

Gardner: What's different now in being able to get at this data and do this type of analysis from five years ago?

Cavaretta: The biggest difference has to do with the cheap availability of storage and processing power, where a few years ago people were very much concentrated on filtering down the datasets that were being stored for long-term analysis. There has been a big sea change with the idea that we should just store as much as we can and take advantage of that storage to improve business processes.

Gardner: How did we get here? What's the process behind the benefits?

Sea change in attitude

Cavaretta: The process behind the benefits has to do with a sea change in the attitude of organizations, particularly IT within large enterprises. There's this idea that you don't need to spend so much time figuring out what data you want to store and worry about the cost associated with it, and more about data as an asset. There is value in being able to store it, and being able to go back and extract different insights from it. This really comes from this really cheap storage, access to parallel processing machines, and great software.

Cavaretta

I like to talk to people about the possibility that big data provides and I always tell them that I have yet to have a circumstance where somebody is giving me too much data. You can pull in all this information and then answer a variety of questions, because you don't have to worry that something has been thrown out. You have everything.

You may have 100 questions, and each one of the questions uses a very small portion of the data. Those questions may use different portions of the data, a very small piece, but they're all different. If you go in thinking, "We’re going to answer the top 20 questions and we’re just going to hold data for that," that leaves so much on the table, and you don't get any value out of it.

The process behind the benefits has to do with a sea change in the attitude of organizations, particularly IT within large enterprises.

We're a big believer in mash-ups and we really believe that there is a lot of value in being able to take even datasets that are not specifically big-data sizes yet, and then not go deep, not get more detailed information, but expand the breadth. So it's being able to augment it with other internal datasets, bridging across different business areas as well as augmenting it with external datasets.

A lot of times you can take something that is maybe a few hundred thousand records or a few million records, and then by the time you’re joining it, and appending different pieces of information onto it, you can get the big dataset sizes.

Gardner: You’re really looking primarily at internal data, while also availing yourself of what external data might be appropriate. Maybe you could describe a little bit about your organization, what you do, and why this internal focus is so important for you.

Internal consultants
Cavaretta: I'm part of a larger department that is housed over in the research and advanced-engineering area at Ford Motor Company, and we’re about 30 people. We work as internal consultants, kind of like Capgemini or Ernst & Young, but only within Ford Motor Company. We’re responsible for going out and looking for different opportunities from the business perspective to bring advanced technologies. So, we’ve been focused on the area of statistical modeling and machine learning for I’d say about 15 years or so.

And in this time, we’ve had a number of engagements where we’ve talked with different business customers, and people have said, "We'd really like to do this." Then, we'd look at the datasets that they have, and say, "Wouldn’t it be great if we would have had this. So now we have to wait six months or a year."

These new technologies are really changing the game from that perspective. We can turn on the complete fire-hose, and then say that we don't have to worry about that anymore. Everything is coming in. We can record it all. We don't have to worry about if the data doesn’t support this analysis, because it's all there. That's really a big benefit of big-data technologies.

The real value proposition definitely is changing as things are being pushed down in the company to lower-level analysts who are really interested in looking at things from a data-driven perspective. From when I first came in to now, the biggest change has been when Alan Mulally came into the company, and really pushed the idea of data-driven decisions.

The real value proposition definitely is changing as things are being pushed down in the company to lower-level analysts.

Before, we were getting a lot of interest from people who are really very focused on the data that they had internally. After that, they had a lot of questions from their management and from upper level directors and vice-president saying, "We’ve got all these data assets. We should be getting more out of them." This strategic perspective has really changed a lot of what we’ve done in the last few years.

Gardner: Are we getting to the point where this sort of Holy Grail notion of a total feedback loop across the lifecycle of a major product like an automobile is really within our grasp? Are we getting there, or is this still kind of theoretical. Can we pull it altogether and make it a science?

Cavaretta: The theory is there. The question has more to do with the actual implementation and the practicality of it. We still are talking a lot of data where even with new advanced technologies and techniques that’s a lot of data to store, it’s a lot of data to analyze, there’s a lot of data to make sure that we can mash-up appropriately.

And, while I think the potential is there and I think the theory is there. There is also a work in being able to get the data from multiple sources. So everything which you can get back from the vehicle, fantastic. Now if you marry that up with internal data, is it survey data, is it manufacturing data, is it quality data? What are the things do you want to go after first? We can’t do everything all at the same time.

Highest value

Our perspective has been let’s make sure that we identify the highest value, the greatest ROI areas, and then begin to take some of the major datasets that we have and then push them and get more detail. Mash them up appropriately and really prove up the value for the technologists.

Gardner: Clearly, there's a lot more to come in terms of where we can take this, but I suppose it's useful to have a historic perspective and context as well. I was thinking about some of the early quality gurus like Deming and some of the movement towards quality like Six Sigma. Does this fall within that same lineage? Are we talking about a continuum here over that last 50 or 60 years, or is this something different?

Cavaretta: That’s a really interesting question. From the perspective of analyzing data, using data appropriately, I think there is a really good long history, and Ford has been a big follower of Deming and Six Sigma for a number of years now.

The difference though, is this idea that you don't have to worry so much upfront about getting the data. If you're doing this right, you have the data right there, and this has some great advantages. You’ll have to wait until you get enough history to look for somebody’s patterns. Then again, it also has some disadvantage, which is you’ve got so much data that it’s easy to find things that could be spurious correlations or models that don’t make any sense.

The piece that is required is good domain knowledge, in particular when you are talking about making changes in the manufacturing plant. It's very appropriate to look at things and be able to talk with people who have 20 years of experience to say, "This is what we found in the data. Does this match what your intuition is?" Then, take that extra step.

Gardner: How has the notion of the Internet of things being brought to bear on your gathering of big data and applying it to the analytics in your organization?

Cavaretta: It is a huge area, and not only from the internal process perspective -- RFID tags within the manufacturing plans, as well as out on the plant floor, and then all of the information that’s being generated by the vehicle itself.

The Ford Energi generates about 25 gigabytes of data per hour. So you can imagine selling couple of million vehicles in the near future with that amount of data being generated. There are huge opportunities within that, and there are also some interesting opportunities having to do with opening up some of these systems for third-party developers. OpenXC is an initiative that we have going on to add at Research and Advanced Engineering.

Huge number of sensors

We have a lot of data coming from the vehicle. There’s huge number of sensors and processors that are being added to the vehicles. There's data being generated there, as well as communication between the vehicle and your cell phone and communication between vehicles.

There's a group over at Ann Arbor Michigan, the University of Michigan Transportation Research Institute (UMTRI), that’s investigating that, as well as communication between the vehicle and let’s say a home system. It lets the home know that you're on your way and it’s time to increase the temperature, if it’s winter outside, or cool it at the summer time.

The amount of data that’s been generated there is invaluable information and could be used for a lot of benefits, both from the corporate perspective, as well as just the very nature of the environment.

Gardner: Just to put a stake in the ground on this, how much data do cars typically generate? Do you have a sense of what now is the case, an average?

Cavaretta: The Energi, according to the latest information that I have, generates about 25 gigabytes per hour. Different vehicles are going to generate different amounts, depending on the number of sensors and processors on the vehicle. But the biggest key has to do with not necessarily where we are right now but where we will be in the near future.

With the amount of information that's being generated from the vehicles, a lot of it is just internal stuff. The question is how much information should be sent back for analysis and to find different patterns? That becomes really interesting as you look at external sensors, temperature, humidity. You can know when the windshield wipers go on, and then to be able to take that information, and mash that up with other external data sources too. It's a very interesting domain.

With the amount of information that's being generated from the vehicles, a lot of it is just internal stuff.

Gardner: What skills do you target for your group, and what ways do you think that you can improve on that?

Cavaretta: The skills that we have in our department, in particular on our team, are in the area of computer science, statistics, and some good old-fashioned engineering domain knowledge. We’ve really gone about this from a training perspective. Aside from a few key hires, it's really been an internally developed group.

Targeted training

The biggest advantage that we have is that we can go out and be very targeted with the amount of training that we have. There are such big tools out there, especially in the open-source realm, that we can spin things up with relatively low cost and low risk, and do a number of experiments in the area. That's really the way that we push the technologies forward.

Talking with The Open Group really gives me an opportunity to be able to bring people on board with the idea that you should be looking at a difference in mindset. It's not "Here’s a way that data is being generated, look, try and conceive of some questions that we can use, and we’ll store that too." Let's just take everything, we’ll worry about it later, and then we’ll find the value.

It's important to be thinking about data as an asset, rather than as a cost. You even have to spend some money, and it may be a little bit unsafe without really solid ROI at the beginning. Then, move towards pulling that information in, and being able to store it in a way that allows not just the high-level data scientist to get access to and provide value, but people who are interested in the data overall. Those are very important pieces.

The last one is how do you take a big-data project, how do you take something where you’re not storing in the traditional business intelligence (BI) framework that an enterprise can develop, and then connect that to the BI systems and look at providing value to those mash-ups. Those are really important areas that still need some work.

There are many companies, especially large enterprises, that are looking at their data assets and wondering what can they do to monetize this, not only to just pay for the efficiency improvement but as a new revenue stream.

Gardner: For those organizations that want to get started on this, how do you get started?

Understand that it maybe going to be a little bit more costly and the ROI isn't going to be there at the beginning.

Cavaretta: We're definitely a huge believer in pilot projects and proof of concept, and we like to develop roadmaps by doing. So get out there. Understand that it's going to be messy. Understand that it maybe going to be a little bit more costly and the ROI isn't going to be there at the beginning.

But get your feet wet. Start doing some experiments, and then, as those experiments turn from just experimentation into really providing real business value, that’s the time to start looking at a more formal aspect and more formal IT processes. But you've just got to get going at this point.

You may also be interested in:

More Stories By Dana Gardner

At Interarbor Solutions, we create the analysis and in-depth podcasts on enterprise software and cloud trends that help fuel the social media revolution. As a veteran IT analyst, Dana Gardner moderates discussions and interviews get to the meat of the hottest technology topics. We define and forecast the business productivity effects of enterprise infrastructure, SOA and cloud advances. Our social media vehicles become conversational platforms, powerfully distributed via the BriefingsDirect Network of online media partners like ZDNet and IT-Director.com. As founder and principal analyst at Interarbor Solutions, Dana Gardner created BriefingsDirect to give online readers and listeners in-depth and direct access to the brightest thought leaders on IT. Our twice-monthly BriefingsDirect Analyst Insights Edition podcasts examine the latest IT news with a panel of analysts and guests. Our sponsored discussions provide a unique, deep-dive focus on specific industry problems and the latest solutions. This podcast equivalent of an analyst briefing session -- made available as a podcast/transcript/blog to any interested viewer and search engine seeker -- breaks the mold on closed knowledge. These informational podcasts jump-start conversational evangelism, drive traffic to lead generation campaigns, and produce strong SEO returns. Interarbor Solutions provides fresh and creative thinking on IT, SOA, cloud and social media strategies based on the power of thoughtful content, made freely and easily available to proactive seekers of insights and information. As a result, marketers and branding professionals can communicate inexpensively with self-qualifiying readers/listeners in discreet market segments. BriefingsDirect podcasts hosted by Dana Gardner: Full turnkey planning, moderatiing, producing, hosting, and distribution via blogs and IT media partners of essential IT knowledge and understanding.

@ThingsExpo Stories
The essence of data analysis involves setting up data pipelines that consist of several operations that are chained together – starting from data collection, data quality checks, data integration, data analysis and data visualization (including the setting up of interaction paths in that visualization). In our opinion, the challenges stem from the technology diversity at each stage of the data pipeline as well as the lack of process around the analysis.
The IoT is changing the way enterprises conduct business. In his session at @ThingsExpo, Eric Hoffman, Vice President at EastBanc Technologies, discuss how businesses can gain an edge over competitors by empowering consumers to take control through IoT. We'll cite examples such as a Washington, D.C.-based sports club that leveraged IoT and the cloud to develop a comprehensive booking system. He'll also highlight how IoT can revitalize and restore outdated business models, making them profitable...
A strange thing is happening along the way to the Internet of Things, namely far too many devices to work with and manage. It has become clear that we'll need much higher efficiency user experiences that can allow us to more easily and scalably work with the thousands of devices that will soon be in each of our lives. Enter the conversational interface revolution, combining bots we can literally talk with, gesture to, and even direct with our thoughts, with embedded artificial intelligence, wh...
The 19th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Containers, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit y...
SYS-CON Events announced today that MangoApps will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. MangoApps provides modern company intranets and team collaboration software, allowing workers to stay connected and productive from anywhere in the world and from any device. For more information, please visit https://www.mangoapps.com/.
Internet of @ThingsExpo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with the 19th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world and ThingsExpo New York Call for Papers is now open.
SYS-CON Events announced today that ContentMX, the marketing technology and services company with a singular mission to increase engagement and drive more conversations for enterprise, channel and SMB technology marketers, has been named “Sponsor & Exhibitor Lounge Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York City, New York. “CloudExpo is a great opportunity to start a conversation with new prospects, but what happens after the...
Designing IoT applications is complex, but deploying them in a scalable fashion is even more complex. A scalable, API first IaaS cloud is a good start, but in order to understand the various components specific to deploying IoT applications, one needs to understand the architecture of these applications and figure out how to scale these components independently. In his session at @ThingsExpo, Nara Rajagopalan is CEO of Accelerite, will discuss the fundamental architecture of IoT applications, ...
In his session at 18th Cloud Expo, Bruce Swann, Senior Product Marketing Manager at Adobe, will discuss how the Adobe Marketing Cloud can help marketers embrace opportunities for personalized, relevant and real-time customer engagement across offline (direct mail, point of sale, call center) and digital (email, website, SMS, mobile apps, social networks, connected objects). Bruce Swann has more than 15 years of experience working with digital marketing disciplines like web analytics, social med...
SYS-CON Events announced today that Enzu, a leading provider of cloud hosting solutions, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Enzu’s mission is to be the leading provider of enterprise cloud solutions worldwide. Enzu enables online businesses to use its IT infrastructure to their competitive advantage. By offering a suite of proven hosting and management services, Enzu wants companies to foc...
Customer experience has become a competitive differentiator for companies, and it’s imperative that brands seamlessly connect the customer journey across all platforms. With the continued explosion of IoT, join us for a look at how to build a winning digital foundation in the connected era – today and in the future. In his session at @ThingsExpo, Chris Nguyen, Group Product Marketing Manager at Adobe, will discuss how to successfully leverage mobile, rapidly deploy content, capture real-time d...
IoT generates lots of temporal data. But how do you unlock its value? How do you coordinate the diverse moving parts that must come together when developing your IoT product? What are the key challenges addressed by Data as a Service? How does cloud computing underlie and connect the notions of Digital and DevOps What is the impact of the API economy? What is the business imperative for Cognitive Computing? Get all these questions and hundreds more like them answered at the 18th Cloud Expo...
As cloud and storage projections continue to rise, the number of organizations moving to the cloud is escalating and it is clear cloud storage is here to stay. However, is it secure? Data is the lifeblood for government entities, countries, cloud service providers and enterprises alike and losing or exposing that data can have disastrous results. There are new concepts for data storage on the horizon that will deliver secure solutions for storing and moving sensitive data around the world. ...
What a difference a year makes. Organizations aren’t just talking about IoT possibilities, it is now baked into their core business strategy. With IoT, billions of devices generating data from different companies on different networks around the globe need to interact. From efficiency to better customer insights to completely new business models, IoT will turn traditional business models upside down. In the new customer-centric age, the key to success is delivering critical services and apps wit...
SYS-CON Events announced today that 24Notion has been named “Bronze Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. 24Notion is full-service global creative digital marketing, technology and lifestyle agency that combines strategic ideas with customized tactical execution. With a broad understand of the art of traditional marketing, new media, communications and social influence, 24Notion uniquely understands how to con...
WebRTC is bringing significant change to the communications landscape that will bridge the worlds of web and telephony, making the Internet the new standard for communications. Cloud9 took the road less traveled and used WebRTC to create a downloadable enterprise-grade communications platform that is changing the communication dynamic in the financial sector. In his session at @ThingsExpo, Leo Papadopoulos, CTO of Cloud9, will discuss the importance of WebRTC and how it enables companies to fo...
SYS-CON Events announced today TechTarget has been named “Media Sponsor” of SYS-CON's 18th International Cloud Expo, which will take place on June 7–9, 2016, at the Javits Center in New York City, NY, and the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. TechTarget is the Web’s leading destination for serious technology buyers researching and making enterprise technology decisions. Its extensive global networ...
Korean Broadcasting System (KBS) will feature the upcoming 18th Cloud Expo | @ThingsExpo in a New York news documentary about the "New IT for the Future." The documentary will cover how big companies are transmitting or adopting the new IT for the future and will be filmed on the expo floor between June 7-June 9, 2016, at the Javits Center in New York City, New York. KBS has long been a leader in the development of the broadcasting culture of Korea. As the key public service broadcaster of Korea...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo 2016 in New York and Silicon Valley. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 17th Cloud Expo and will feature technical sessions from a rock star conference faculty ...
There are several IoTs: the Industrial Internet, Consumer Wearables, Wearables and Healthcare, Supply Chains, and the movement toward Smart Grids, Cities, Regions, and Nations. There are competing communications standards every step of the way, a bewildering array of sensors and devices, and an entire world of competing data analytics platforms. To some this appears to be chaos. In this power panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists will discuss the vast to...