The growth of data centers and the concept of infrastructure as a service are bringing significant focus to cloud computing architectures. Pundits have proclaimed cloud computing the ultimate merger of information technology with communication. The promise of cloud computing is immense: it aims to create a virtual world of applications, giving users unparalleled computational power through a simple front-end and a reasonably fast broadband connection.
A differentiator between clouds and grids, utility computing, service-oriented networking, and application networks has been the scalability of the former in a business setting. In other words, cloud computing has a more appealing business driver for the general IT audience than the others ever had. Cloud computing also aligns with the roadmaps of tomorrow's business executives: access to data and applications from handheld devices that are mobile, energy-efficient, secure, and yet data-intensive.
Essentially, when we distribute computational needs across a network, many aspects of the computational entities can be provided as a service: Software as a Service (SaaS), Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and many others. To abstract the offerings of a cloud computing environment, we need to understand the composite entities that make up a "cloud": data, storage, processors, the network, and finally a super-control plane. Any service the cloud can offer is essentially a derivative of one or more of these entities. Our goal is to analyze what the network provides to these entities and what configurations can exist.
From the perspective of network resources (storage, switches, and processors), the following tasks are mandated by a cloud computing application: virtualization at various layers of the end solution, network orchestration, automation of consolidation and virtualization functions, integration of multiple standalone and diverse solutions to obtain scalable cloud computing, and finally automation of networking entities and sub-modules to meet end-to-end cloud computing needs. We now consider each of these networking tasks and what each layer must do to meet them.
Virtualization: Virtualization implies the ability to make resources available across disparate physical entities, with a view to enabling virtual connections to those resources. When resources are virtualized, we can achieve higher levels of performance. For example, data may be backed up in multiple data centers, but to an end-user the backed-up data resides on a single "virtual disk." Similarly, an end-user might want to crunch numbers for a pet application using a supercomputer. The network, in conjunction with a parallel message-passing paradigm, may support this application over several hundred server blades, many of which reside in different locations. This is known as server virtualization, or processor virtualization. Graphically, virtualization can be one-to-many, many-to-one, or in some cases many-to-many.
In the one-to-many paradigm, a single customer is connected to multiple storage units or servers virtualized across a network. It is the responsibility of the network to provide this virtualization of resources. The intra-data-center version of this problem has been widely studied and can be solved efficiently for static traffic demands. As demands become dynamic, however, the problem assumes enormous complexity, and solutions are suboptimal and non-trivial. Achieving hard SLAs (service level agreements) is difficult in such situations, and a heuristic approach that uses load balancers to create virtual servers is often proposed as an approximate solution. The virtualization can occur at the network, data, or transport (physical) layer. Communication between data centers (storage and process centers) also needs significant provisioning and control-plane effort: synchronous backup between data centers and distributed processing between server blades in different locations are paramount to the success of this approach. In the many-to-one paradigm, multiple end-users are connected to a single network-attached entity, such as a data center. The engineering problem lies in segregating and creating boundaries within the data center to meet the service level agreements of each customer. A second problem arises when customers are not static and move between the membership domains of data centers.
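The heuristic load-balancing approach to one-to-many virtualization can be sketched in a few lines: a single virtual endpoint fronts several physical backends and spreads requests across them. This is a minimal illustration (the addresses and blade names are hypothetical, and real load balancers track health and load rather than rotating blindly):

```python
from itertools import cycle

class VirtualServer:
    """One virtual endpoint fronting several physical backends,
    selected round-robin -- the heuristic approximation described
    above, not a hard-SLA-guaranteeing mechanism."""
    def __init__(self, virtual_ip, backends):
        self.virtual_ip = virtual_ip   # the one address the end-user sees
        self._pool = cycle(backends)   # static backend set; dynamic churn is the hard case

    def route(self, request):
        """Pick the next backend for this request."""
        return next(self._pool)

lb = VirtualServer("10.0.0.1", ["blade-a", "blade-b", "blade-c"])
targets = [lb.route(f"req{i}") for i in range(6)]
```

The round-robin rotation distributes the six requests evenly over the three blades; the end-user only ever addresses the virtual IP.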
The many-to-many virtualization case is quite complicated, with both a static and a dynamic membership option. In the static many-to-many option, a large number of end-users use services provided by multiple network-attached entities. The dynamic option involves a fluid set of end-users and/or network-attached devices. Many-to-many virtualization is significantly complex, requiring immense network control.
Consolidation of Resources: By definition, consolidation of resources implies the ability of a network to facilitate virtualization. At a certain level of abstraction, consolidation and virtualization appear paradoxical, yet we cannot do one without the other. Consolidation implies optimizing network resources and reducing footprint. If virtualization tells us how to implement one-to-many, then consolidation helps us compute how many is, indeed, many! Hence, consolidation is a micro-engineering problem compared to virtualization. At the network layer, the routing/switching fabric is well entrenched to provide consolidated services, complete with service differentiation, multi-layer support, and resiliency. Consolidation at the data layer is around the corner: startups claim to consolidate resources using customized ASIC designs that lead to data-layer consolidated solutions (switches). Considerable emphasis must go into the design of layer-2 switches for cloud computing. To that end, the proposed Carrier Ethernet solutions may go a long way in making switches more application-aware, better managed (not just telecom-wise but also application-wise), and cost-efficient.
Network Orchestration: Orchestration is an important characteristic of cloud computing. Orchestration implies the creation of networked entities that facilitate successful implementation of cloud computing. The key is facilitating this communication, for which we require a common control plane. The control plane orchestrates network function among disparate entities, thereby provisioning a cloud computing service. To do so, it must create an interaction between the end-user requirements (the application) and the network entities. Orchestration requires cross-layer and cross-domain intelligence. Once again, such functionality is best achieved at the data layer, with VLAN (virtual LAN) tags and MPLS (multiprotocol label switching) labels defined in a proprietary manner.
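One small piece of such a control plane can be sketched as a mapping from cloud services to VLAN tags, so that data-layer switches can isolate each service's traffic. This is only an illustrative fragment under assumed names (the service identifiers and tag range are hypothetical; a real orchestrator spans many layers and domains):

```python
class ControlPlane:
    """Toy data-layer control plane: assigns one VLAN tag per cloud
    service so switches can segregate traffic per service."""
    def __init__(self, first_tag=100):
        self._next_tag = first_tag
        self.service_to_vlan = {}

    def provision(self, service):
        # Allocate a tag only once per service; repeated requests are
        # idempotent, which the orchestrated network relies on.
        if service not in self.service_to_vlan:
            self.service_to_vlan[service] = self._next_tag
            self._next_tag += 1
        return self.service_to_vlan[service]

cp = ControlPlane()
backup_tag = cp.provision("storage-backup")    # first service -> tag 100
compute_tag = cp.provision("compute-cluster")  # second service -> tag 101
```

The idempotent `provision` call is the essential property: every network element asking about the same service must receive the same answer, which is why the text insists on a common, network-wide-accepted control plane.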
Automation of Consolidation and Virtualization Functions: Scalability implies automation in virtualizing resources and orchestrating services on demand. Automation is achieved through a user-defined control plane that is subject to network-wide acceptance. Hence, at one level the control plane must offer user interfaces with rich graphical content and programmable knobs for user definition of services, and at another level it must be standardized and accepted by even the most basic network elements so that it can control those boxes. If we do not automate the collection of storage, servers, and network elements that supports cloud computing, we cannot achieve scalability, without which the benefits of cloud computing are clearly missing. Conversely, if we do automate, the solution can become relatively generic, meaning it may be suboptimal (in performance) or unable to meet the specific needs of a customer.
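The translation that such automation performs, from a user-facing service definition down to element-level actions, can be sketched as follows. The request fields and action names are illustrative assumptions, not a standardized schema; the point is the shape of the mapping, not its vocabulary:

```python
def expand_request(request):
    """Expand a high-level, user-defined service request into
    element-level actions, the kind of generic translation an
    automated control plane performs. Field and action names
    are hypothetical."""
    actions = []
    for n in range(request["servers"]):
        actions.append(("allocate-blade", f"blade-{n}"))
    actions.append(("attach-storage-gb", request["storage_gb"]))
    actions.append(("set-bandwidth-mbps", request["mbps"]))
    return actions

plan = expand_request({"servers": 2, "storage_gb": 500, "mbps": 100})
```

The genericity trade-off noted above is visible even here: the same expansion rule serves every customer, which is what makes it scalable, and also what can make it suboptimal for any one customer's specific needs.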
Integration of Sub-Modules: The last aspect of cloud computing that the network must take into account is completeness. When tasks are divided across a network, they are often managed by different entities, and it is the job of the network to provide fault-tolerant paths between these entities or sub-modules, integrating them to produce a seamless cloud computing environment. Protection and restoration functions in a cloud are best handled at the lower layers, with layer-2 fast-reroute or layer-1 line protection.
In particular, WDM (wavelength-division multiplexing) at the optical layer, SONET/SDH or Ethernet at the data layer, and IP at the network layer are the dominant choices in the metro network. This implies an IP mesh network of core routers at select locations, supported by SONET/SDH rings or Carrier Ethernet, which is in turn provisioned over WDM rings. The protocol stack serves different classes of network elements: end-users, network elements (for transport, switching, etc.), storage elements, and processors (server blades).
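The protection idea, keeping a backup path ready that shares no link with the working path, can be illustrated on a small ring topology. This sketch uses a greedy two-step search (find a working path, prune its links, search again), which is simpler than optimal disjoint-path algorithms such as Suurballe's; the four-node ring is purely illustrative:

```python
from collections import deque

def bfs_path(graph, src, dst):
    """Shortest path by hop count via breadth-first search."""
    prev, seen, q = {src: None}, {src}, deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in graph.get(u, []):
            if v not in seen:
                seen.add(v)
                prev[v] = u
                q.append(v)
    return None

# A four-node metro ring: A-B-C-D-A (illustrative topology).
ring = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}
working = bfs_path(ring, "A", "C")
# Remove the working path's links (both directions), then search again
# for an edge-disjoint protection path.
used = set(zip(working, working[1:])) | set(zip(working[1:], working))
pruned = {u: [v for v in vs if (u, v) not in used] for u, vs in ring.items()}
backup = bfs_path(pruned, "A", "C")
```

On the ring, the working path runs A-B-C and the backup A-D-C; a failure anywhere on the working side leaves the backup intact, which is exactly the property layer-1 line protection and layer-2 fast-reroute provide.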
Network Change: While efficient planning can do a great deal to reduce CAPEX (capital expenditures), planning alone can fail for on-demand service provisioning. The key is to get the fundamentals of networking correct when supporting cloud computing applications. The triple mantras of virtualization-consolidation-orchestration are to be followed at every step of the network design. Classically, the cloud is a virtual topology, an overlay over a physical mesh of layer 1-4 boxes. As the above figure shows, it is a tic-tac-toe game among storage, processors, and end-users on one side, and the network elements (optical, switches, routers) on the other. As these two sets of entities are brought together, the result must be an element of the virtualization-consolidation-orchestration mantra. This ensures low CAPEX and low OPEX (operating expenses).
A word on OPEX is in order. Cloud OPEX can be reduced by focusing on one or more of the following: energy efficiency, footprint of the solution, optimally loaded CPUs, keeping data in lower-layer format, and availability of resources for traffic churn (capacity planning). Issues such as security and interoperability continue to evade a critical mass of support. Another aspect of cloud computing is private clouds vs. public clouds: as the name suggests, private clouds are within the purview of an enterprise domain, while public clouds have universal membership.
Conclusions and Future Challenges: Clearly, there is a need for a cloud control plane that enables greater control over the consolidation-virtualization-orchestration paradigms. We should acknowledge that cloud computing is possible because of virtualization in the control plane, the data plane, and the management plane. We should also understand which network characteristics must be preserved, and in certain cases highlighted, when the network is an integral aspect of a cloud. Latency and jitter are of paramount importance, especially when clouds are designed for real-time transactions. What is more important is to create a dual interface (for the user and the network) for application-aware communication. The generality of user demands is an impediment to such a control interface. A user-definable, application-oriented universal language like UML (but with advances) needs to be rapidly standardized so that applications can control the underlying network, not just for passage of data but for more cloud-like tasks such as virtualization, consolidation, and orchestration. The control plane must be automated to achieve scale, for which universal acceptance is mandatory. So while cloud computing may be the merger of IT and communications, it will be telecommunication-heavy, following the path of heavy standardization.
This article was written by Ashwin Gumaste, Ph.D., Visiting Scientist, MIT (Cambridge, MA), and Tony Antony, Sr. Marketing Manager, Cisco Systems (San Jose, CA). For more information, contact Dr. Gumaste at