Akihiro Koizumi

Evolution of Network in Data Center vol.2

Blog Post created by Akihiro Koizumi on Jan 20, 2016

Challenges in the Network

 

In the previous post, I covered changes in the Data Center (DC) network.

Server virtualization was introduced into DCs, and networking began to face new challenges accordingly.

 

In addition to the 'traditional' challenges of networking, namely (1) Lack of agility, (2) Complicated maintenance, and (3) High cost, we began to experience new challenges such as (4) MAC address explosion, (5) VLAN limitation, and (6) Inefficiency in path utilization.

 

[Figure 2-0]

 

In order to address these new challenges in the DC network, several approaches have been proposed.

The countermeasure on the networking side is to use tunneling technology and a fabric network architecture.

 

 

Countermeasures to address the new challenges in the network

 

[Figure 2-1]

 

One of the traditional network architectures is the 'fat tree', in which servers are connected to a top-of-rack (ToR) switch first, ToR switches are connected to aggregation switches, and a couple of core switches consolidate all paths from the aggregation switches. It is basically designed for north-south traffic, which comes into the core switch(es) from outside the DC and eventually reaches the servers. In this architecture, when a server tries to communicate with another server in the same data center, the traffic goes up to the core switches and comes back down to the other server. The capacity of the core switches tends to become a bottleneck, while the other paths at the bottom (near the servers) tend to be underutilized.
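As a rough illustration of that bottleneck, here is a sketch with made-up numbers (the rack count, server count, and link speeds are all assumptions, not figures from any real DC): when east-west traffic is hairpinned through the core, the demand at the core quickly exceeds its capacity while the edge links stay idle.

```python
# Hypothetical fat-tree sizing; all numbers are assumptions for illustration.
racks = 32                     # ToR switches
servers_per_rack = 40          # servers behind each ToR
server_nic_gbps = 10           # per-server link speed
core_capacity_gbps = 2 * 640   # two core switches

# Worst case: every server sends east-west traffic at line rate via the core.
east_west_demand = racks * servers_per_rack * server_nic_gbps
oversubscription = east_west_demand / core_capacity_gbps

print(f"demand at core: {east_west_demand} Gbps, "
      f"oversubscription: {oversubscription:.1f}:1")
# -> demand at core: 12800 Gbps, oversubscription: 10.0:1
# Meanwhile the links near the servers sit far below full utilization.
```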

 

Recently, the 'fabric' or 'Clos' architecture, which consists of spine and leaf switches, has begun to be adopted.

Every leaf switch is connected to all spine switches, which enables each server under a leaf switch to communicate with any other server at the same hop count (two hops: the origin leaf --> spine --> the destination leaf).

On this basis, multipath technology such as Equal Cost Multi-Path (ECMP) is used in this architecture. In a traditional layer 2 network, there cannot be multiple paths, since multiple paths mean a loop in the network (if you can reach the destination by another path, you can come back by that route, which means there is a loop in the topology), and a loop results in the nightmare of a broadcast 'storm'. So layer 3 fabric technology (or a new layer 2 fabric technology) is applied to realize multipath forwarding. (See also the additional explanation about fabric technologies at the bottom.)

In a nutshell, by combining the Clos architecture with fabric technologies that leverage ECMP, challenge no. 6, inefficiency in path utilization, can be solved.
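To make this concrete, here is a minimal sketch (the switch names are hypothetical) that enumerates the equal-cost, two-hop paths between two leaves; ECMP can then spread flows across all of them.

```python
# Minimal leaf-spine sketch: every leaf connects to every spine, so the
# number of equal-cost two-hop paths between two leaves equals the number
# of spines. All names are hypothetical.
spines = ["spine1", "spine2", "spine3", "spine4"]

def paths(src_leaf, dst_leaf):
    """All equal-cost paths: origin leaf -> some spine -> destination leaf."""
    return [(src_leaf, spine, dst_leaf) for spine in spines]

for path in paths("leaf1", "leaf3"):
    print(" -> ".join(path))              # every path is exactly two hops
print(len(paths("leaf1", "leaf3")), "equal-cost paths for ECMP to use")
```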

 

The fabric network can be the basis of networking (the physical network, often referred to as the underlay network), and overlay network technologies such as VXLAN are used on top of the fabric network. In fact, many fabric switches support various overlay network technologies.

By using the tunneling technology of the overlay network, MAC address explosion (challenge no. 4) can be solved: MAC addresses are encapsulated into IP packets, and an IP network scales. Similarly, challenge no. 5, the VLAN limitation, can also be solved by tunneling technology, since it uses a different ID for isolation and the number of available IDs is far larger than the number of VLAN IDs.
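As a rough sketch of how the encapsulation works (a simplified VXLAN-style MAC-in-UDP/IP wrap following the RFC 7348 header layout; this is illustrative, not a complete VTEP implementation):

```python
import struct

def vxlan_encapsulate(inner_ethernet_frame: bytes, vni: int) -> bytes:
    """Wrap an inner Ethernet frame in a VXLAN header (RFC 7348 layout).

    The underlay then only sees the outer IP/UDP addresses (the VTEPs),
    not the tenant MAC addresses inside, which is what keeps the physical
    switches' MAC tables small.
    """
    assert 0 <= vni < 2 ** 24          # the VNI is a 24-bit field
    flags = 0x08                       # 'I' flag: VNI field is valid
    vxlan_header = struct.pack("!B3s3sB", flags, b"\x00\x00\x00",
                               vni.to_bytes(3, "big"), 0)
    # In a real VTEP this payload is then placed in UDP (dst port 4789),
    # an outer IP header (VTEP to VTEP), and an outer Ethernet frame.
    return vxlan_header + inner_ethernet_frame

frame = bytes(60)                      # dummy inner frame for illustration
packet = vxlan_encapsulate(frame, vni=5001)
print(len(packet))                     # 8-byte VXLAN header + inner frame
```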

 

Tunneling technology is not only for physical network gear but also for virtual switches, which reside next to virtual machines. So let's look at it from the server side.

 

The virtual switch, which was brought about by server virtualization, supports overlay tunneling technologies such as VXLAN, NVGRE, and STT. These protocols were introduced to overcome the VLAN limitation by extending the virtualization identifier (ID) field. They use a 24-bit ID space instead of the 12-bit VLAN ID space and realize an enormous number of isolated segments.

Thanks to these technologies, virtual machines can be separated into a large number of logically dedicated networks.
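A quick sanity check on the numbers behind that claim:

```python
vlan_ids = 2 ** 12            # 12-bit VLAN ID field -> 4,096 values
vxlan_vnis = 2 ** 24          # 24-bit VNI field     -> 16,777,216 values
print(vlan_ids, vxlan_vnis)   # 4096 16777216
# In practice a couple of VLAN IDs (0 and 4095) are reserved, so roughly
# 4,094 usable segments versus about 16 million with a 24-bit ID.
```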

 

[Figure 2-2]

 

Actually, if we use overlay technology on the server side, the network does not have to be changed. However, since the changes on the network side (Clos architecture with ECMP) and on the server side (overlay network) have happened in parallel, we see both in modern DCs.

 

 

In this post, we covered the new challenges that we began to face after server virtualization and the ways to address them.

However, the 'traditional' challenges in networking, namely (1) Lack of agility, (2) Complicated maintenance, and (3) High cost, remain unsolved.

[Figure 2-3]

 

Software-Defined Networking (SDN) was introduced to address these traditional challenges.

 

The fabric network is a new idea that emerged to address the challenges in the DC network, but each element it consists of is not brand-new.

The Clos architecture was first formalized back in 1952; traditional dynamic routing protocols such as OSPF and BGP can be used to realize ECMP; and overlay technologies (e.g., Provider Backbone Bridging, or PBB) have existed for many years.

Those who are interested in innovation might have heard of Joseph Schumpeter's concept of “neue Kombinationen”, or new combinations.

I think this is a good example of how a new combination of known things brings innovation. That’s why it is said to be important to meet and talk with people from different backgrounds ;-).

 

Stay tuned.

 

 

* Fabric technology

Some of you might have noticed that there are other ways to realize a fabric or Clos architecture.

Yes, that's true.

I picked the combination of ECMP + overlay as the fabric example, but there are some other options, which include…

 

A. Layer 2 (L2) Clos with MLAG (Multi-chassis Link Aggregation):

Each leaf switch has two (or more) connections to spine switches with MLAG technology, and the two connections can be used in active-active mode. A possible downside is that MLAG implementations are often proprietary, and the MAC table explosion issue is not solved. The VLAN limitation also remains.

 

B. L2 Fabric with TRILL (Transparent Interconnection of Lots of Links) or SPB (Shortest Path Bridging):

All switches, both spine and leaf, are connected to each other in a mesh topology, with routing and ECMP performed at layer 2. Both TRILL (IETF) and SPB (IEEE 802.1aq) have gone through a standardization process, but some of the vendor implementations are proprietary.

 

C. L2 Fabric with proprietary technology:

Some vendors provide their own proprietary fabric technology that makes a set of switches behave as one switch.

 

D. Layer 3 (L3) Fabric with ECMP:

(Layer 3) ECMP is a technology that balances traffic and improves network utilization efficiency based on the five-tuple (source IP address, destination IP address, source port number, destination port number, and protocol).
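A minimal sketch of the idea (the hash function and names below are my own choices, not any particular switch's implementation): the switch hashes the five-tuple and uses the result to pick one of its equal-cost uplinks, so packets of the same flow always take the same path while different flows spread across the links.

```python
import hashlib

uplinks = ["spine1", "spine2", "spine3", "spine4"]   # equal-cost next hops (hypothetical)

def ecmp_pick(src_ip, dst_ip, src_port, dst_port, proto):
    """Pick an uplink from the five-tuple; the same flow always maps to the same uplink."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return uplinks[digest % len(uplinks)]

# Two flows between the same hosts can land on different uplinks,
# which is how ECMP spreads load, yet each flow stays on one path.
print(ecmp_pick("10.0.1.5", "10.0.2.9", 40000, 443, "tcp"))
print(ecmp_pick("10.0.1.5", "10.0.2.9", 40001, 443, "tcp"))
```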

 

E. L3 Fabric with ECMP and overlay (L2 over L3):

This is what I picked as the example above: adding overlay technology such as VXLAN to an L3 fabric with ECMP. It obtains VM mobility as well (VM mobility usually requires layer 2 connectivity in order to use the same IP address at the target location).
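A minimal sketch of why the overlay gives VM mobility (all names and addresses are hypothetical): the VM keeps its own MAC and IP, and only the overlay's mapping from the VM to the VTEP (the outer, underlay IP address) changes after a migration.

```python
# VTEP mapping table as a VXLAN-capable virtual switch might keep it
# (hypothetical addresses). The VM's own IP and MAC never change.
vm = {"mac": "02:00:00:aa:bb:cc", "ip": "192.168.10.25", "vni": 5001}

vtep_of_vm = {vm["mac"]: "10.0.1.11"}   # VM currently behind VTEP 10.0.1.11

def migrate(vm_mac, new_vtep_ip):
    """After a live migration, only the MAC-to-VTEP mapping is updated."""
    vtep_of_vm[vm_mac] = new_vtep_ip

migrate(vm["mac"], "10.0.2.42")         # VM moves to another rack / L3 segment
print(vm["ip"], "still reachable via VTEP", vtep_of_vm[vm["mac"]])
```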
