Navigating the Fog - Billing, Metering & Measuring the Cloud

The 400,000+ Amazon Web Services consumers await their monthly bill with great anticipation and horror

It's that dreaded time of the month again, the time that we, the 400,000+ Amazon Web Services consumers, await with great anticipation and horror. I'm talking about the Amazon Web Services billing statement sent at the beginning of each month. A surprise every time. In honor of this monthly event, I thought I'd take a minute to discuss some of the hurdles, as well as the opportunities, in billing, metering and measuring the cloud.

I keep hearing that one of the biggest issues facing IaaS users today is a lack of insight into costing, billing and metering. The AWS costing problem is straightforward enough: unlike other cloud services, Amazon has decided not to offer any kind of real-time reporting or API for its cloud billing (EC2, S3, etc.). There are some reporting features for DevPay and Flexible Payments Service (Amazon FPS), as well as an Account Activity page, but who has time for a dashboard when what we really want is a real-time API?

To give some background, when Amazon launched S3 and later EC2, the reasoning was fairly straightforward: they were new services still in beta. Without anything being officially confirmed, the word was that a billing API was coming soon. But three years later, there is still no billing API. So I have to ask, what gives?

Other cloud services have done a great job of providing a real-time view of what the cloud is costing you. One of the best examples is GoGrid's myaccount.billing.get API and widget, which offer a variety of metrics through the open source GoGrid API.

Billing APIs aside, another major problem remains for most cloud users: the lack of a basis for comparing the quality and cost of compute capacity between cloud providers. This brings us to the problem of metering the cloud, which Yi-Jian Ngo at Microsoft pointed out last year. In his post he stated that "Failing to come up with an appropriate yardstick could lead to hairy billing issues, savvy customers tinkering with clever arbitrage schemes and potentially the inability of cloud service providers to effectively predict how much to charge in order to cover their costs."

Yi-Jian Ngo couldn't have been more right in pointing to Wittgenstein's Rule: "Unless you have confidence in the ruler's reliability, if you use a ruler to measure a table, you may as well be using the table to measure the ruler."



A few companies have attempted to define cloud capacity. Most notably, Amazon's Elastic Compute Cloud service uses an EC2 Compute Unit as the basis for its EC2 pricing scheme (alongside bandwidth and storage). Amazon states that it uses a variety of measurements to provide each EC2 instance with a consistent and predictable amount of CPU capacity. The amount of CPU allocated to a particular instance is expressed in terms of EC2 Compute Units, and Amazon explains that it uses several benchmarks and tests to manage the consistency and predictability of the performance from an EC2 Compute Unit. One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor, which Amazon claims is equivalent to an early-2006 1.7 GHz Xeon processor. Amazon makes no mention of how it runs its benchmarks, and users of EC2 are given no real insight into how the benchmark numbers were derived. Currently there are no standards for cloud capacity, and therefore there is no effective way for users to compare cloud providers in order to make the best decision for their application demands.
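To make the comparison problem concrete, here is a minimal sketch of how a price could be normalized by a capacity unit. It assumes each provider publishes an hourly price and a capacity rating in some shared unit, which is exactly what does not exist today. The offer names, prices and capacity figures below are made up for illustration and do not describe any real provider.

```python
def price_per_compute_unit_hour(hourly_price_usd: float, compute_units: float) -> float:
    """Return the hourly price divided by the rated capacity of the instance."""
    if compute_units <= 0:
        raise ValueError("compute_units must be positive")
    return hourly_price_usd / compute_units


# Hypothetical offers: name -> (hourly price in USD, rated compute units).
# These numbers are invented for the example, not taken from any price list.
offers = {
    "offer-x": (0.10, 1.0),
    "offer-y": (0.40, 5.0),
}

# Normalize every offer so they can be compared on the same scale.
normalized = {
    name: price_per_compute_unit_hour(price, units)
    for name, (price, units) in offers.items()
}

# The cheapest offer per unit of capacity, not per instance.
cheapest = min(normalized, key=normalized.get)

for name, cost in sorted(normalized.items()):
    print(f"{name}: ${cost:.3f} per compute-unit hour")
print(f"cheapest per unit of capacity: {cheapest}")
```

The arithmetic is trivial. The hard part is the denominator: the result only means something if both providers measure a "compute unit" the same way.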

An idea I suggested in a post last year was to create an open universal compute unit that could be used to make an "apples-to-apples" comparison between cloud capacity providers. My rough concept was to create a Universal Compute Unit (UCU) specification and benchmark test based on integer operations that could form an (approximate) indicator of the likely performance of a given virtual application within a given cloud such as Amazon EC2 or GoGrid, or even a virtualized data center running something like VMware. One potential point of analysis could be a standard clock rate measured in hertz, derived by multiplying the instructions per cycle by the clock speed (measured in cycles per second). It could be more accurately defined within the context of both a virtual machine kernel and standard single-core and multicore processor types.
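As a rough sketch of that calculation, the snippet below multiplies instructions per cycle by clock speed to get an effective rate. The `CpuProfile` type, the function name, the choice to scale by core count and all of the numbers are assumptions made for illustration; no UCU specification exists that defines them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuProfile:
    """Hypothetical description of a (virtual) CPU, not part of any provider's API."""
    name: str
    instructions_per_cycle: float  # average IPC observed on an integer benchmark
    clock_hz: float                # clock speed in cycles per second
    cores: int = 1                 # assumption: capacity scales linearly with cores


def universal_compute_units(cpu: CpuProfile) -> float:
    """Effective instructions per second: IPC x clock speed x cores."""
    return cpu.instructions_per_cycle * cpu.clock_hz * cpu.cores


# Illustrative values only; they are not measurements of any real provider.
provider_a = CpuProfile("provider-a-small", instructions_per_cycle=1.5, clock_hz=2.0e9)
provider_b = CpuProfile("provider-b-small", instructions_per_cycle=1.0, clock_hz=2.5e9, cores=2)

for cpu in (provider_a, provider_b):
    rate = universal_compute_units(cpu) / 1e9
    print(f"{cpu.name}: {rate:.2f} billion instructions per second")
```

Note that the instance with the lower clock speed can still come out ahead per core if it retires more instructions per cycle, which is why clock speed alone makes a poor yardstick.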

My other suggestion was to create a Universal Compute Cycle (UCC), the inverse of the Universal Compute Unit. The UCC would be used when direct access to the underlying system and/or operating system is not available, as with Google's App Engine or Microsoft Azure. The UCC could be based on clock cycles per instruction, that is, the number of clock cycles that elapse while an instruction is being executed. This allows an inverse calculation to be performed to determine the UCU value, and it provides a secondary level of performance evaluation and benchmarking.
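The inverse relationship can be sketched in a few lines. This assumes a cycle count and an instruction count for a workload are available from somewhere; how they would be obtained on a platform without system access is left open here. The function names and sample figures are hypothetical.

```python
def universal_compute_cycle(cycles_elapsed: float, instructions_executed: float) -> float:
    """Cycles per instruction (CPI) for a workload."""
    if instructions_executed <= 0:
        raise ValueError("instructions_executed must be positive")
    return cycles_elapsed / instructions_executed


def ucu_from_ucc(ucc: float, clock_hz: float) -> float:
    """Invert CPI to instructions per cycle, then multiply by the clock speed."""
    if ucc <= 0:
        raise ValueError("ucc must be positive")
    instructions_per_cycle = 1.0 / ucc
    return instructions_per_cycle * clock_hz


# Made-up workload: 4 billion cycles spent executing 2 billion instructions.
ucc = universal_compute_cycle(cycles_elapsed=4.0e9, instructions_executed=2.0e9)

# Assumed clock speed of 3 GHz, used only to complete the inverse calculation.
ucu = ucu_from_ucc(ucc, clock_hz=3.0e9)

print(f"UCC: {ucc:.2f} cycles per instruction")
print(f"UCU: {ucu / 1e9:.2f} billion instructions per second")
```

A lower UCC means a higher UCU at the same clock speed, so either number can serve as a cross-check on the other.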

I'm not the only one thinking about this. One company trying to address this need is Satori Tech, with a capacity measurement metric it calls the Computing Resource Unit ("CRU"). The company claims that the CRU allows for dynamic monitoring of available and used computing capacity on physical servers and virtual pools/instances. It also claims the CRU allows for uniform comparison of capacity, usage and cost efficiency in heterogeneous computing environments, and for abstraction away from operating details for financial optimization. Unfortunately, the format is patented and closed, and it is available only to customers of Satori Tech.

And before you say it, I know that the UCU, UCC or CRU could be "gamed" by unsavory cloud providers attempting to pull an "Enron". This is why we would need to create an auditable specification that includes a "certified measurement" to address this kind of cloud benchmarking. A potential avenue is IBM's new "Resilient Cloud Validation" program, which I've come to appreciate lately. (Sorry about my previous lipstick-on-a-pig remarks.) The program will allow businesses that collaborate with IBM on a rigorous, consistent and proven program of benchmarking and design validation to use the IBM "Resilient Cloud" logo when marketing their services. These types of certification programs may serve as the basis for defining a level playing field among cloud providers, although I feel that a more impartial trade group such as the IEEE may be a better entity to handle the certification process.

More Stories By Reuven Cohen

An instigator, part time provocateur, bootstrapper, amateur cloud lexicographer, and purveyor of random thoughts, 140 characters at a time.

Reuven is an early innovator in the cloud computing space as the founder of Enomaly in 2004 (acquired by Virtustream in February 2012). Enomaly was among the first to develop a self-service infrastructure as a service (IaaS) platform (ECP), circa 2005, as well as SpotCloud (2011), the first commodity-style cloud computing spot market.

Reuven is also the co-creator of CloudCamp (100+ cities around the globe). CloudCamp is an unconference where early adopters of cloud computing technologies exchange ideas, and it is the largest of the ‘barcamp’ style of events.
