Containers Expo Blog Authors: Yeshim Deniz, Liz McMillan, Elizabeth White, Pat Romanski, Amit Gupta


The Level of Uptime - Increasing Pressure Syndrome

We are now placing the same expectations on everyday software that we placed on over-engineered systems

When I was earning my bachelor’s, I joined the Association for Computing Machinery (ACM) and, through them, several special interest groups. One of those groups was SIGRISK, which focused on high-risk software engineering. At the time the focus was on complex systems whose loss was irretrievable – like satellite guidance systems or deep sea locomotion systems – and those whose failure could result in death or injury to individuals – like power plant operations systems, medical equipment, and traffic light systems. The approach to engineering these controls was rigorous, more rigorous than most IT staff would consider reasonable.

And the reason was simple. As we’ve seen since, more than once, a spaceship that has one line of bad code can end up veering off course and never returning, the data it collects completely different from what it was designed to collect. Traffic lights that are mis-programmed offer a best case of traffic snarls and people late for whatever they were doing, and a worst case of fatal accidents. These systems were categorized by the ACM as high-risk, and special processes were put in place to handle high-risk software development (interestingly, processes that don’t appear to have been followed by space agencies, who did a lot of the writing and presenting for SIGRISK when I was a member). SIGRISK no longer shows up on the list of SIGs at the ACM website. It is the only SIG I belonged to that seems to have gone away or been merged into something else.

What interests me in all of this is a simple truth that I’ve noticed of late. We are now placing the same expectations on everyday software that we placed on these over-engineered systems, which were designed to function even in the face of complete failure. And everyday software is not designed for this level of protection. Think about it a bit. Critical medical systems can be locked down so that the only interface is the operator’s interface, and upgrades are only allowed with a specific hardware key. Things being launched into space don’t require serious protection from hackers; they’re a long way out of reach. Traffic lights have been hacked, but they’re not easy targets, and the public nature of the interfaces makes it difficult to pull off in busy times… But Facebook and Microsoft? They have massive interfaces, global connectedness, and by definition IT staff tweaking them constantly. Configuration, new features, uncensored third party development… The mind spins.

Ariane 5 courtesy of SpaceFlightNow.com

Makes me wonder if Apple (and to a lesser extent RIM) wasn’t smart to lock down development. RIM has long had a policy of “you have a key for all of your apps, if you want to touch protected APIs, your app will have your key, and if you are a bad kid and crash our phones, we’ll shut off your key”. Okay, that last bit might be assumed; it’s been a while since I read the agreement (I’ve got a RIM dev license), but that was the impression I was left with three years ago when I read through the documentation. Apple took a lot of grief for their policies, but seriously, they want their phone to work. Note that Microsoft often gets blamed for problems caused by “rogue” applications.

But locking down development doesn’t address the issue of software stability in a highly exposed, highly dynamic environment. We’re putting pressures on IT folks, who are already under time pressure, that used to be reserved for scientists in laboratories. And we’re expecting them to get it right with an impatience more indicative of a two-year-old than a pool of adults. Every time a big vendor has a crash or a security breach, we act like they’re idiots. Truth is that they have highly complex systems that are exposed to both inexperienced users and experienced hackers, and we don’t give them the years of development time that critical systems get.

So what’s my point? When you’re making demands of your staff, yes, business needs and market timing are important, but give them time to do their job right, or don’t complain about the results. And in an increasingly connected enterprise, don’t assume that some back-office corner piece of software/hardware is less critical than user-facing systems. After all, the bug that bit Microsoft not too long ago was a misconfiguration in lab systems. I’ve worked in a test lab before, and they’re highly volatile. When big tests are going on, the rest of the architecture can change frequently while things are pulled in and returned from the big test. Complete wipe and reconfigure is common, from switches to servers, and where I worked security was considered less important than delivering test results. And the media attention lavished on the Facebook outage in September is enough that you’d think people had died from the failure… Which was caused by a software configuration change.


www.TechCrunch.com graphic of Facebook downtime

Nice and easy. Don’t demand more than can be delivered, or you’ll get sloppy work, both in App Dev and in Systems Management. Use process to double-check everything, making sure that it is right. Better to take an extra day or even ten than to find your application down and people unable to do anything. Because while Microsoft and Facebook can apologize and move on, internal IT rarely gets off that easily.

Automation tools like those presented by the Infrastructure 2.0 crowd (Lori is one of them) can help a lot, but in the end, people are making changes, even if they’re making them through a push button on a browser… Make sure you’ve got a plan to make it go right, and an understanding of how you’ll react if it doesn’t.
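To make that last point concrete, here is a minimal sketch of what "a plan to make it go right, and an understanding of how you'll react if it doesn't" might look like for a configuration change. It is an illustration only: the function names `apply_with_rollback` and `no_empty_values`, the config keys, and the check itself are all hypothetical, and this is not how Microsoft, Facebook, or any particular tool handles changes.

```python
from typing import Callable, Dict

Config = Dict[str, str]


def apply_with_rollback(
    current: Config,
    change: Config,
    health_check: Callable[[Config], bool],
) -> Config:
    """Return the changed config if it passes the check, else a copy of the old one."""
    # Build the proposed config without touching the live one.
    candidate = {**current, **change}
    if health_check(candidate):
        return candidate
    # The check failed: react by keeping the known-good configuration.
    return dict(current)


def no_empty_values(cfg: Config) -> bool:
    # Hypothetical check: reject any setting that is blank.
    return all(value.strip() for value in cfg.values())


# Hypothetical settings, purely for illustration.
live = {"cache_ttl": "300", "db_host": "db1.internal"}
good = apply_with_rollback(live, {"cache_ttl": "600"}, no_empty_values)
bad = apply_with_rollback(live, {"db_host": ""}, no_empty_values)
```

In this sketch the change that passes the check is accepted, while the one that blanks out `db_host` is refused and the known-good settings are kept. A real check would be far more involved, but the shape is the same: decide in advance what "right" means and what happens when the change is not.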

And the newly coined “DevOps” hype-word might be helpful too – where Dev meets Operations is a good place to start building in those checks.


More Stories By Don MacVittie

Don MacVittie is founder of Ingrained Technology, a technical advocacy and software development consultancy. He has experience in application development, architecture, infrastructure, technical writing, DevOps, and IT management. MacVittie holds a B.S. in Computer Science from Northern Michigan University, and an M.S. in Computer Science from Nova Southeastern University.
