Welcome!

Containers Expo Blog Authors: Elizabeth White, Liz McMillan, Pat Romanski, Yeshim Deniz, Amit Gupta

Related Topics: Containers Expo Blog, @CloudExpo, SDN Journal

Containers Expo Blog: Blog Post

The Network as a Complex Distributed System By @MartenT1999 | @CloudExpo

Yes, we have gotten better at protecting this distributed animal that is the network

Through http://blog.ipspace.net I landed on this article on acm.org discussing the complexity of distributed systems. Through some good examples, George Neville-Neil makes it clear that creating and scaling distributed systems is very complex and “any one that tells you it is easy is either drunk or lying, and possibly both”.

Networks are of course inherently distributed systems. Most everyone that has managed a good sized network before knows that like the example in the article, minor changes in traffic or connectivity can have huge implications on the overall performance of a network. In my time supporting some very large networks I have seen huge chain reactions of events based on what appear to be some minor issues.

Very few networks are extensively modeled before they are implemented. Manufacturers of machines, cars and many other things go through extensive modeling to understand the behaviors of what they created and their design choices. Using modeling they will look at all possible inputs and outputs, conditions, failure scenarios and anything else we can think of to see how their product behaves.

There are few if any true modeling tools for networks. We build networks with extensive distributed protocols to control connectivity and reachability. The protocols and their behaviors are fairly well understood and we design our networks based on our knowledge of how these protocols will react to failures. There is very little modeling of actual traffic or expected traffic patterns. We have all seen networks melt down because of forwarding loops, excessive multicast or broadcast that affect portions of the networks we had not expected it would affect. And a chain reaction of failures follows. I have seen single transceivers go through soft failures (lots of errors and intermittent link state) and bring large networks to their knees.

Of course we learn from our mistakes (mostly) and created protection mechanisms for the things we know can go wrong. We created “hello” based protocols to guard against those soft transceiver failures. We create rate limiting and overload protection mechanisms to guard against those broadcast and multicast storms. We have implemented dampening mechanisms in many of our protocols to ensure state transitions can be managed more gracefully.

But in the end, like the example in the article, we have created artificial limits and protection mechanisms. And they are based on the things we know, because they have hurt us in the past. Less protective but more pro-active, we have implemented hashing mechanisms in our networks to distribute the traffic. As explained, these are generic mechanisms that are designed for uniform best case scenarios. Provide it with less uniform traffic, and the expected distribution is not what we want it to be. The 2014 favorite example of “elephant flow” is perfect in how a single change in a network can drastically change its behavior and performance.

Yes, we have gotten better at protecting this distributed animal that is the network. We have stubbed our toes, bashed our heads and figured out how to protect against it. But we have not taken it to that next level of engineering. The network is a product. It is a highly distributed system where we know way too little about the inputs. We do not model the network and its behavior. We do not test its resiliency. I have asked this question before in this forum and to customers. Every customer has maintenance windows to do proactive maintenance on servers, OSs and the like. Every school and company has fire drills. I have yet to meet a single customer that actually schedules network failure drills to ensure that the resiliency designed into the distributed system actually performs as expected.

We have to start treating the network like the complex distributed system it is. That means that complex algorithms that manage the distribution of traffic and take failure scenarios into account need to be in charge of our networks. Local decision making based on partially complete information can never be enough to truly create a network that pushes the boundaries of what it is capable of. It is not easy, but we have to recognize what the network is. And then approach it with the right tools and methods to model and manage it.

The post The Network as a Complex Distributed System appeared first on Plexxi.

Read the original blog entry...

More Stories By Marten Terpstra

Marten Terpstra is a Product Management Director at Plexxi Inc. Marten has extensive knowledge of the architecture, design, deployment and management of enterprise and carrier networks.

@ThingsExpo Stories
SYS-CON Events announced today that Dasher Technologies will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Dasher Technologies, Inc. ® is a premier IT solution provider that delivers expert technical resources along with trusted account executives to architect and deliver complete IT solutions and services to help our clients execute their goals, plans and objectives. Since 1999, we'v...
Enterprises have taken advantage of IoT to achieve important revenue and cost advantages. What is less apparent is how incumbent enterprises operating at scale have, following success with IoT, built analytic, operations management and software development capabilities – ranging from autonomous vehicles to manageable robotics installations. They have embraced these capabilities as if they were Silicon Valley startups. As a result, many firms employ new business models that place enormous impor...
SYS-CON Events announced today that Massive Networks, that helps your business operate seamlessly with fast, reliable, and secure internet and network solutions, has been named "Exhibitor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. As a premier telecommunications provider, Massive Networks is headquartered out of Louisville, Colorado. With years of experience under their belt, their team of...
SYS-CON Events announced today that Taica will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Taica manufacturers Alpha-GEL brand silicone components and materials, which maintain outstanding performance over a wide temperature range -40C to +200C. For more information, visit http://www.taica.co.jp/english/.
SYS-CON Events announced today that TidalScale, a leading provider of systems and services, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. TidalScale has been involved in shaping the computing landscape. They've designed, developed and deployed some of the most important and successful systems and services in the history of the computing industry - internet, Ethernet, operating s...
SYS-CON Events announced today that MIRAI Inc. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. MIRAI Inc. are IT consultants from the public sector whose mission is to solve social issues by technology and innovation and to create a meaningful future for people.
SYS-CON Events announced today that IBM has been named “Diamond Sponsor” of SYS-CON's 21st Cloud Expo, which will take place on October 31 through November 2nd 2017 at the Santa Clara Convention Center in Santa Clara, California.
SYS-CON Events announced today that TidalScale will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. TidalScale is the leading provider of Software-Defined Servers that bring flexibility to modern data centers by right-sizing servers on the fly to fit any data set or workload. TidalScale’s award-winning inverse hypervisor technology combines multiple commodity servers (including their ass...
Join IBM November 1 at 21st Cloud Expo at the Santa Clara Convention Center in Santa Clara, CA, and learn how IBM Watson can bring cognitive services and AI to intelligent, unmanned systems. Cognitive analysis impacts today’s systems with unparalleled ability that were previously available only to manned, back-end operations. Thanks to cloud processing, IBM Watson can bring cognitive services and AI to intelligent, unmanned systems. Imagine a robot vacuum that becomes your personal assistant tha...
Widespread fragmentation is stalling the growth of the IIoT and making it difficult for partners to work together. The number of software platforms, apps, hardware and connectivity standards is creating paralysis among businesses that are afraid of being locked into a solution. EdgeX Foundry is unifying the community around a common IoT edge framework and an ecosystem of interoperable components.
Infoblox delivers Actionable Network Intelligence to enterprise, government, and service provider customers around the world. They are the industry leader in DNS, DHCP, and IP address management, the category known as DDI. We empower thousands of organizations to control and secure their networks from the core-enabling them to increase efficiency and visibility, improve customer service, and meet compliance requirements.
As popularity of the smart home is growing and continues to go mainstream, technological factors play a greater role. The IoT protocol houses the interoperability battery consumption, security, and configuration of a smart home device, and it can be difficult for companies to choose the right kind for their product. For both DIY and professionally installed smart homes, developers need to consider each of these elements for their product to be successful in the market and current smart homes.
With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend 21st Cloud Expo October 31 - November 2, 2017, at the Santa Clara Convention Center, CA, and June 12-14, 2018, at the Javits Center in New York City, NY, and learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.
SYS-CON Events announced today that mruby Forum will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. mruby is the lightweight implementation of the Ruby language. We introduce mruby and the mruby IoT framework that enhances development productivity. For more information, visit http://forum.mruby.org/.
Digital transformation is changing the face of business. The IDC predicts that enterprises will commit to a massive new scale of digital transformation, to stake out leadership positions in the "digital transformation economy." Accordingly, attendees at the upcoming Cloud Expo | @ThingsExpo at the Santa Clara Convention Center in Santa Clara, CA, Oct 31-Nov 2, will find fresh new content in a new track called Enterprise Cloud & Digital Transformation.
Most technology leaders, contemporary and from the hardware era, are reshaping their businesses to do software. They hope to capture value from emerging technologies such as IoT, SDN, and AI. Ultimately, irrespective of the vertical, it is about deriving value from independent software applications participating in an ecosystem as one comprehensive solution. In his session at @ThingsExpo, Kausik Sridhar, founder and CTO of Pulzze Systems, will discuss how given the magnitude of today's applicati...
Smart cities have the potential to change our lives at so many levels for citizens: less pollution, reduced parking obstacles, better health, education and more energy savings. Real-time data streaming and the Internet of Things (IoT) possess the power to turn this vision into a reality. However, most organizations today are building their data infrastructure to focus solely on addressing immediate business needs vs. a platform capable of quickly adapting emerging technologies to address future ...
SYS-CON Events announced today that NetApp has been named “Bronze Sponsor” of SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. NetApp is the data authority for hybrid cloud. NetApp provides a full range of hybrid cloud data services that simplify management of applications and data across cloud and on-premises environments to accelerate digital transformation. Together with their partners, NetApp emp...
In a recent survey, Sumo Logic surveyed 1,500 customers who employ cloud services such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). According to the survey, a quarter of the respondents have already deployed Docker containers and nearly as many (23 percent) are employing the AWS Lambda serverless computing framework. It’s clear: serverless is here to stay. The adoption does come with some needed changes, within both application development and operations. Tha...
SYS-CON Events announced today that Avere Systems, a leading provider of enterprise storage for the hybrid cloud, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Avere delivers a more modern architectural approach to storage that doesn't require the overprovisioning of storage capacity to achieve performance, overspending on expensive storage media for inactive data or the overbui...