Welcome!

Linux Containers Authors: Elizabeth White, Pat Romanski, Akhil Sahai, AppDynamics Blog, Kareen Kircher

Related Topics: SDN Journal, Java IoT, Microservices Expo, Linux Containers, Containers Expo Blog

SDN Journal: Blog Feed Post

SDN's Eventually Consistent Network Problem

Clustering controllers to address scalability concerns introduces a well-understood problem: consistency

One of the benefits of SDN is centralized control. That is, there is a single repository containing the known current state of the entire network. It is this centralization that enables intelligent application of new policies to govern and control the network - from new routes to user experience services like QoS. Because there is a single entity which has visibility into the state of the network as a whole, it can examine the topology at any given point and make determinations as to where this packet and that should be routed, how it is prioritized and even whether or not it is allowed to traverse the network.

It's a pretty powerful concept for networks, which traditionally distribute network state as individual configuration files across the data path.

network-state-traditional

Most of the focus of SDN is on the replacement of manual and scripted configuration methods with an API-driven mechanism. Whether that's OpenFlow or OpFlex or some other protocol is not really important as the benefit of operationalization is to provide a consistent interface from the perspective of the operator, not the device.

network-state-sdn

This is a real benefit; operationalization across operations and dev has proven to produce tangible benefits in the form of improved time to market and a reduction in errors. By centralizing network state in a controller, this model provides a comprehensive view of the network at any given moment. Because the controller is not just a repository but an active participant in the flow of data across the network, this visibility enables the controller to understand how to (ostensibly) non-disruptively change routes or apply new policies in real-time.

The benefit itself is not in question. What is in question is what happens when the controller of this new software-defined architecture becomes overwhelmed, and how to preserve that benefit when the centralized model must decentralize in order to scale.

The Eventually Consistent Problem Comes to the Network

Eventual consistency is nothing new. It has always been an issue when scaling applications, particularly those that rely on shared data. Consider Amazon, if you will. If you and I are both shopping for the same thing, and I order before you, it may take seconds or more before the database is updated. If you were in the middle of ordering at the same time, you and I may be contending for the same item. Because my order takes a moment or two to propagate through the system, your view of the database (the availability of the item) is inconsistent with mine.

It is assumed that eventually our views will be consistent, and that this age old unsolved problem of distributed computing simply must be accepted as unsolvable for now,  Thus systems are designed with this principle in mind. Which means we end up back with Brewer's CAP Theorem staring us in the face and reminding us we can't be perfectly consistent in a distributed system, so we must deal with systems in such a way as to achieve eventually consistency.

At issue is the ability of a software controller to scale. The controller is, by design and necessity, part of the data path. That is both a blessing and a curse. It is from this fact that the real-time adaption of network behavior can be achieved, but it is also this fact which forces issues of scale and introduces the need for a distributed system from which the problem of eventual consistency derives. That's because more than one system will be the "master" repository for a given portion of network state. Even if one controller is designated as master of the network universe and thus maintains the "official" state of the network, there are those moments when the secondary (or tertiary) controller has modified the "official" state and introduces inconsistency. In the moments between when the two network states merge, there is the possibility that the first (master) controller will also try to make a decision based on information that relies on network state that is no longer valid. If Controller B, for example, removes a port from a VLAN, and before that state can propagate to the master, a packet arrives in the fabric, destined for that port, Controller A will have no way to know that it is no longer participating in the VLAN and will, as expected, tell the switch to route to that port.

The issue will be shortly resolved, assuming timely synchronization of network state across the cluster, but in the meantime performance (or availability) may be negatively impacted.

clustered-sdn

The problem with eventual consistency in the network is one of magnitude. Eventually consistent views of books in stock at Amazon has a very different impact than an eventually consistent view of the network underpinning today's applications and ultimately the business. We're not talking about losing out on a book, we're talking about potentially disrupting hundreds or thousands of applications that translates into hundreds of thousands or even millions of dollars. Ponemon's 2013 Cost of Data Center Outages proves this case out: "The average reported outage incident length was 86 minutes, resulting in average cost per incident of about $690,200."

Eventual consistency of the network may turn out to be quite costly.

Common Themes: Reliability and Control

This is not a new problem. This issue of stateful failover as applied to scalability of both infrastructure and applications is one that application delivery has been dealing with, well, for over a decade now. The issue when dealing with distributed state is always one of replication and synchronization between those devices providing for reliability. That doesn't change just because we move from one form factor to another, or from on-premise to cloud. The issue remains: how do we maintain an authoritative view of the state of an <application or network> while still enabling the scale necessary to meet demand?

While we (as in the industry "we") recognize that true stateful reliability - and thus perfect consistency - is currently unachievable due to the constraints of distributed system design, we also recognize that we can get pretty darn close. From an application perspective, the intelligence embedded in a service fabric is more than able to deal with the problem with minimal introduction of latency. That is, there will be a slight pause and some disruption when failure or disruption occurs in the network but if the service fabric is smart enough, the disruption is experienced by the end user as no more than a slight hiccup - likely unnoticeable.

But the further down the stack you go, toward core network function, the more disruptive such a hiccup is going to be.

That's one of the reasons a "centralized control, decentralized execution" architecture makes more sense from a network perspective. Such a model maintains authoritative control over the state of the network, but empowers individual components in the various fabrics (stateless L2-4 and stateful L4-7) that make up "the network" to maintain its own prescriptive configuration and take action when necessary based on the abstracted policies of the network as a whole.

Everyone likes to posit an answer to what will be the "killer app" for SDN. But before we can worry about that, we might want to consider what may be the "showstopper" obstacles for SDN. Eventual consistency when scaling controllers is one of those issues.

Because without a reliable and consistent network world, there is no application world. Or at least not one that users will be excited to rely on.

Read the original blog entry...

More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.

@ThingsExpo Stories
On Dice.com, the number of job postings asking for skill in Amazon Web Services increased 76 percent between June 2015 and June 2016. Salesforce.com saw its own skill mentions increase 37 percent, while DevOps and Cloud rose 35 percent and 28 percent, respectively. Even as they expand their presence in the cloud, companies are also looking for tech professionals who can manage projects, crunch data, and figure out how to make systems run more autonomously. Mentions of ‘data science’ as a skill ...
SYS-CON Events announced today that LeaseWeb USA, a cloud Infrastructure-as-a-Service (IaaS) provider, will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. LeaseWeb is one of the world's largest hosting brands. The company helps customers define, develop and deploy IT infrastructure tailored to their exact business needs, by combining various kinds cloud solutions.
Amazon has gradually rolled out parts of its IoT offerings in the last year, but these are just the tip of the iceberg. In addition to optimizing their back-end AWS offerings, Amazon is laying the ground work to be a major force in IoT – especially in the connected home and office. Amazon is extending its reach by building on its dominant Cloud IoT platform, its Dash Button strategy, recently announced Replenishment Services, the Echo/Alexa voice recognition control platform, the 6-7 strategic...
For basic one-to-one voice or video calling solutions, WebRTC has proven to be a very powerful technology. Although WebRTC’s core functionality is to provide secure, real-time p2p media streaming, leveraging native platform features and server-side components brings up new communication capabilities for web and native mobile applications, allowing for advanced multi-user use cases such as video broadcasting, conferencing, and media recording.
SYS-CON Events announced today that Venafi, the Immune System for the Internet™ and the leading provider of Next Generation Trust Protection, will exhibit at @DevOpsSummit at 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Venafi is the Immune System for the Internet™ that protects the foundation of all cybersecurity – cryptographic keys and digital certificates – so they can’t be misused by bad guys in attacks...
The cloud market growth today is largely in public clouds. While there is a lot of spend in IT departments in virtualization, these aren’t yet translating into a true “cloud” experience within the enterprise. What is stopping the growth of the “private cloud” market? In his general session at 18th Cloud Expo, Nara Rajagopalan, CEO of Accelerite, explored the challenges in deploying, managing, and getting adoption for a private cloud within an enterprise. What are the key differences between wh...
There will be new vendors providing applications, middleware, and connected devices to support the thriving IoT ecosystem. This essentially means that electronic device manufacturers will also be in the software business. Many will be new to building embedded software or robust software. This creates an increased importance on software quality, particularly within the Industrial Internet of Things where business-critical applications are becoming dependent on products controlled by software. Qua...
In addition to all the benefits, IoT is also bringing new kind of customer experience challenges - cars that unlock themselves, thermostats turning houses into saunas and baby video monitors broadcasting over the internet. This list can only increase because while IoT services should be intuitive and simple to use, the delivery ecosystem is a myriad of potential problems as IoT explodes complexity. So finding a performance issue is like finding the proverbial needle in the haystack.
Machine Learning helps make complex systems more efficient. By applying advanced Machine Learning techniques such as Cognitive Fingerprinting, wind project operators can utilize these tools to learn from collected data, detect regular patterns, and optimize their own operations. In his session at 18th Cloud Expo, Stuart Gillen, Director of Business Development at SparkCognition, discussed how research has demonstrated the value of Machine Learning in delivering next generation analytics to imp...
The 19th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Digital Transformation, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportuni...
The Internet of Things will challenge the status quo of how IT and development organizations operate. Or will it? Certainly the fog layer of IoT requires special insights about data ontology, security and transactional integrity. But the developmental challenges are the same: People, Process and Platform. In his session at @ThingsExpo, Craig Sproule, CEO of Metavine, demonstrated how to move beyond today's coding paradigm and shared the must-have mindsets for removing complexity from the develo...
SYS-CON Events announced today that MangoApps will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. MangoApps provides modern company intranets and team collaboration software, allowing workers to stay connected and productive from anywhere in the world and from any device.
Large scale deployments present unique planning challenges, system commissioning hurdles between IT and OT and demand careful system hand-off orchestration. In his session at @ThingsExpo, Jeff Smith, Senior Director and a founding member of Incenergy, will discuss some of the key tactics to ensure delivery success based on his experience of the last two years deploying Industrial IoT systems across four continents.
Basho Technologies has announced the latest release of Basho Riak TS, version 1.3. Riak TS is an enterprise-grade NoSQL database optimized for Internet of Things (IoT). The open source version enables developers to download the software for free and use it in production as well as make contributions to the code and develop applications around Riak TS. Enhancements to Riak TS make it quick, easy and cost-effective to spin up an instance to test new ideas and build IoT applications. In addition to...
Identity is in everything and customers are looking to their providers to ensure the security of their identities, transactions and data. With the increased reliance on cloud-based services, service providers must build security and trust into their offerings, adding value to customers and improving the user experience. Making identity, security and privacy easy for customers provides a unique advantage over the competition.
"We've discovered that after shows 80% if leads that people get, 80% of the conversations end up on the show floor, meaning people forget about it, people forget who they talk to, people forget that there are actual business opportunities to be had here so we try to help out and keep the conversations going," explained Jeff Mesnik, Founder and President of ContentMX, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
"There's a growing demand from users for things to be faster. When you think about all the transactions or interactions users will have with your product and everything that is between those transactions and interactions - what drives us at Catchpoint Systems is the idea to measure that and to analyze it," explained Leo Vasiliou, Director of Web Performance Engineering at Catchpoint Systems, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York Ci...
Internet of @ThingsExpo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with the 19th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world and ThingsExpo Silicon Valley Call for Papers is now open.
I wanted to gather all of my Internet of Things (IOT) blogs into a single blog (that I could later use with my University of San Francisco (USF) Big Data “MBA” course). However as I started to pull these blogs together, I realized that my IOT discussion lacked a vision; it lacked an end point towards which an organization could drive their IOT envisioning, proof of value, app dev, data engineering and data science efforts. And I think that the IOT end point is really quite simple…
"My role is working with customers, helping them go through this digital transformation. I spend a lot of time talking to banks, big industries, manufacturers working through how they are integrating and transforming their IT platforms and moving them forward," explained William Morrish, General Manager Product Sales at Interoute, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.