Sam Walker

VCP-NV 6 Blueprint Section 2 Notes

Blog Post created by Sam Walker on Jul 13, 2015

Section 2 – Describe VMware NSX Physical Infrastructure Requirements

Objective 2.1 – Define Benefits of Running VMware NSX on Physical Network Fabrics

Identify physical network topologies (Layer 2 Fabric, Multi-Tier, Leaf/Spine, etc.)

L2 Fabric (all hosts on same broadcast domain):


Existing / traditional three tier networks:

Traditional distributed architecture: a main core network connects to distribution switches, which in turn connect to top-of-rack switches. Routing happens higher up the stack.


Smaller (and virtualised) environments are more likely to use a distributed core setup:



Now, moving towards Leaf and Spine architecture – more scalable, routing can happen at the leaf layer.  No VLAN trunking between leaves and spine switches, so leaf switches make a routing decision and can forward on to spine switches if necessary.



This is a non-blocking fabric model, best used with equal cost multipathing (ECMP).

Also known as a Clos fabric (TRILL is one of the protocols that can be run across such a fabric).
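To make the ECMP point a bit more concrete, here is a minimal sketch of how a leaf might pick one of several equal-cost spines for a flow. The spine names, the example flow and the use of SHA-256 are my own assumptions for illustration; real switches use their own hardware hash functions.

```python
import hashlib


def pick_spine(flow, spines):
    """Pick one equal-cost next hop for a flow by hashing its 5-tuple.

    Hashing per flow (rather than spraying packet by packet) keeps every
    packet of a flow on the same path, which avoids reordering.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(spines)
    return spines[index]


# Hypothetical spine names and flow: (src IP, dst IP, protocol, src port, dst port).
spines = ["spine-1", "spine-2", "spine-3", "spine-4"]
flow = ("10.0.1.10", "10.0.2.20", "tcp", 49152, 443)
chosen = pick_spine(flow, spines)
```

The same flow always lands on the same spine, while different flows are spread across all of the spines.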


Identify physical network trends

  • Moving away from traditional Core / Dist / Access towards Leaf and Spine architecture. 
  • L3 being moved away from distribution layer to the TOR layer (Leaf switches).
  • L2 broadcast being kept within a single rack (with dedicated racks for each particular purpose – i.e., a compute rack, edge rack, management & compute rack, etc.)


Explain the purpose of a Spine node

One of a handful of switches that sit at the top of the network stack and interconnect all leaf nodes. Only leaf nodes connect to the spine nodes; no clients or servers do.

Explain the purpose of a Leaf node

A leaf node sits at the top of the rack, normally as a pair of switches for resiliency. Leaf nodes connect all hosts within the rack.
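The leaf/spine relationship can be sketched in a few lines. This is only an illustration under my own assumptions (made-up switch names, every leaf wired to every spine); it shows that any two leaves are two hops apart, with one equal-cost path per spine.

```python
from itertools import product


def build_fabric(leaves, spines):
    """Return the set of leaf-to-spine links in a full-mesh leaf/spine fabric."""
    return {(leaf, spine) for leaf, spine in product(leaves, spines)}


def equal_cost_paths(src_leaf, dst_leaf, links, spines):
    """List the two-hop paths (leaf -> spine -> leaf) between two leaves."""
    return [
        (src_leaf, spine, dst_leaf)
        for spine in spines
        if (src_leaf, spine) in links and (dst_leaf, spine) in links
    ]


# Hypothetical fabric: three racks (leaves) and two spines.
leaves = ["leaf-1", "leaf-2", "leaf-3"]
spines = ["spine-1", "spine-2"]
links = build_fabric(leaves, spines)
paths = equal_cost_paths("leaf-1", "leaf-3", links, spines)
```

Adding a spine adds one more equal-cost path between every pair of leaves, which is where the scalability of this design comes from.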

Identify virtual network topologies (Enterprise, Service Provider Multi-Tenant, Multi-Tenant Scalable)




Service Provider multi-tenant:


Multitenant Scalable:



Explain benefits of Multi-Instance TCP/IP stack

Allows multiple environments to run on the same physical hardware, i.e., Microsegmentation.

Describe challenges in a Layer 2 Fabric topology

All hosts are on the same broadcast domain, so there is none of the segmentation or security offered by VLANs. All communication is via MAC addresses; the switch maintains a MAC address table which holds a physical address for every node. Suitable for small deployments only, due to limitations of the ARP table.

Describe challenges in a Multi-Tier topology

Requires a huge initial investment in core infrastructure. If sized incorrectly, once saturated, it requires another huge investment. Excessive traffic flows higher up the stack because access switches handle L2 only, so routing is handled by the distribution switches. Requires more switching infrastructure. Uses STP, so it is highly resilient, but redundant links are blocked, which keeps 50% of capacity reserved for a failure.

Describe challenges in a Leaf/Spine topology

Slow / difficult to migrate to without a greenfield site.

Differentiate physical/virtual QoS implementation

Physical is implemented by switch and/or router. Difficult to manage.

Virtual is tagged by the hypervisor; this is the preferred option. The distributed switch implements CoS, DSCP marking, or both.

It is possible to tag in the VM guest, but the admin loses control and NIOC cannot assign a separate queue based on the tag.
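For reference, CoS is a 3-bit Layer 2 priority carried in the 802.1Q tag and DSCP is a 6-bit Layer 3 value carried in the IP header. The sketch below shows where each value sits in its header field; the function names and the example VLAN are my own, not anything from NSX.

```python
def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper bits of the IP ToS / Traffic Class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")
    return dscp << 2  # the low 2 bits are left for ECN


def cos_to_tci(cos, vlan_id):
    """Build the 16-bit 802.1Q Tag Control Information field from CoS and VLAN ID."""
    if not 0 <= cos <= 7:
        raise ValueError("CoS is a 3-bit value (0-7)")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("VLAN ID is a 12-bit value (0-4095)")
    return (cos << 13) | vlan_id  # DEI bit (bit 12) left at 0


tos = dscp_to_tos(46)     # DSCP 46 is Expedited Forwarding (EF)
tci = cos_to_tci(5, 100)  # hypothetical: CoS 5 on VLAN 100
```

Because CoS lives in the VLAN tag it only survives while the frame stays tagged, whereas DSCP travels with the IP packet across routed hops.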


Differentiate single/multiple vSphere Distributed Switch (vDS) Distributed Logical Router


Single = VDS across all compute and management clusters


Multiple = VDS across compute clusters, VDS for management and/or edge.



Multiple is the recommended approach for a number of reasons (taken from the design guide):

  • Flexibility of span of operational control: typically compute/virtual infrastructure admin and network admin are separate entities and thus each domain can manage the cluster specific tasks. These benefits are already a factor in designing a dedicated cluster and rack for specific services and further substantiated by the VDS design choices.
  • Flexibility in managing uplink connectivity on computes and edge clusters - see for example the above discussion on uplink design and the recommendation of using different teaming options for compute and edge clusters.
  • Typically the VDS boundary is aligned with the transport zone and thus VMs connected to logical switches can span the transport zone. However, the vMotion boundary is always limited by the extension of a VDS, so keeping a separate VDS for compute resources ensures that those workloads will never be moved (by mistake or by choice) to ESXi hosts dedicated to other services.
  • Flexibility in managing VTEP configuration – see above discussion on VTEP design choices.
  • Avoiding exposing VLAN-backed port-groups used by the services deployed in the edge racks (NSX L3 routing and NSX L2 bridging) to the compute racks.



Differentiate NSX Edge High Availability (HA)/Scale-out NSX Edge HA implementations

  1. Active-Standby – Enabled with a check-box; automatically creates a secondary standby VM which will fail over within 15 secs (15 secs is the default Declared Dead Time, which can be tuned down to 6 seconds via UI or API).
    a. Extend the OSPF or BGP hello/hold timers to 40/120 seconds so that the standby has enough time to take over before the hold timer expires.
    b. In the event of a failure, the standby NSX Edge will send GARPs (gratuitous ARPs) so that external devices update their mapping information.
    c. Exchanges information between Active and Standby on the 1st internal vNIC (by default); this can be changed.
  2. Standalone – Deploy two independent ESGs. The option in NSX 6.0 for Active-Active (as no ECMP, see below).
    a. Hello/hold timers reduced down to 1/3 seconds to minimise the impact of a node failure.
    b. No logical services are supported (so routing only) as there is no statefulness between nodes.
      • Leverage DFW and one-arm LB
    c. In the event of Active Control VM failure, both edge devices flush their forwarding tables, which can impact North-South traffic. Answer = static routes on ESG.
  3. ECMP (Equal cost multipathing) – Deploy up to 8 ESGs per tenant (so up to 80Gbps throughput) and use ECMP to distribute load between all nodes evenly. In the event of a failure, the traffic will be reduced by only the affected node (i.e., in an 8 node cluster, 12.5% capacity will be lost; in a 4 node cluster, 25% capacity will be lost) – smaller failure impact.
    a. Hello/hold timers reduced down to 1/3 seconds to minimise the impact of a node failure.
    b. No logical services are supported (so routing only) as there is no statefulness between nodes.
      • Leverage DFW and one-arm LB
    c. In the event of Active Control VM failure, both edge devices flush their forwarding tables, which can impact North-South traffic. Answer = static routes on ESG.
    d. Using 2x uplinks to 2x VLANs for resiliency is recommended.
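The ECMP failure-impact figures above are simple arithmetic, sketched below. The 10 Gbps per ESG default is an assumption I have inferred from the "8 ESGs = 80Gbps" figure, and the function name is my own.

```python
def ecmp_capacity(nodes, gbps_per_node=10.0, failed=1):
    """Return (total Gbps, remaining Gbps, percent lost) for an ECMP edge cluster."""
    if not 1 <= nodes <= 8:
        raise ValueError("the notes above give a maximum of 8 ESGs per tenant")
    if not 0 <= failed <= nodes:
        raise ValueError("failed must be between 0 and nodes")
    total = nodes * gbps_per_node
    remaining = (nodes - failed) * gbps_per_node
    percent_lost = 100.0 * failed / nodes
    return total, remaining, percent_lost


eight_node = ecmp_capacity(8)  # (80.0, 70.0, 12.5)
four_node = ecmp_capacity(4)   # (40.0, 30.0, 25.0)
```

Compare this with Active-Standby, where a single failure takes out the only forwarding node until the standby takes over.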

Differentiate Collapsed/Separate vSphere Cluster topologies

Recommended to keep clusters (and racks) separated out into their particular roles, i.e., management, edge, compute, etc. This allows more granular control over each particular role cluster: installation of NSX components, and use of a common teaming / failover method at the physical layer (which must be uniform across a VDS). This is particularly important for the management cluster, as it avoids circular dependencies.

Collapsed clusters are more common for smaller deployments, although NSX design dictates that a management cluster of 3 hosts is recommended.

Differentiate Layer 3 and Converged cluster infrastructures

L3 =


Allows future growth and can scale in line with requirements. VLANs stay within a single rack, so intra-VLAN access (i.e., within a single rack) stays within a leaf, while communication between VLANs heads to the L3 TOR.


Converged =

I think the point that is being made here is that in some converged infrastructures, this architecture will limit the rack scalability – so all VLANs scale across all racks.  Limited scale.



Objective 2.2 – Describe Physical Infrastructure Requirements for a VMware NSX Implementation

Identify management and edge cluster requirements

3 hosts, so that in the event of a host failure, Controller cluster quorum isn’t lost.
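The reasoning behind three hosts can be shown with a quick quorum calculation. This is a sketch that assumes a simple majority quorum, with one Controller node per host; the function names are my own.

```python
def quorum_size(members):
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be lost while a majority is still running."""
    return members - quorum_size(members)


three_nodes = tolerated_failures(3)  # one host can fail
two_nodes = tolerated_failures(2)    # no failures tolerated
```

With two nodes a single failure already loses the majority, which is why three is the practical minimum.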

Describe minimum/optimal physical infrastructure requirements for a VMware NSX implementation


  • Minimum – MTU 1550
  • Optimal – MTU 1600, Multicast, IGMP snooping, simple, scalable, resilient, high-bandwidth, QoS.  Leaf and spine architecture to offer future growth options
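The 1550 minimum comes from the VXLAN encapsulation overhead added to a standard 1500-byte guest packet. Below is a rough sketch of the arithmetic, assuming an IPv4 underlay; the constant and function names are my own, and the optional inner VLAN tag is included only to show why some headroom above 1550 is useful.

```python
INNER_ETHERNET = 14  # Ethernet header of the encapsulated guest frame
INNER_VLAN_TAG = 4   # optional 802.1Q tag on the encapsulated frame
VXLAN_HEADER = 8
UDP_HEADER = 8
OUTER_IPV4 = 20


def required_underlay_mtu(guest_mtu=1500, inner_vlan=False):
    """Underlay IP MTU needed to carry a guest packet inside VXLAN without fragmenting."""
    size = guest_mtu + INNER_ETHERNET
    if inner_vlan:
        size += INNER_VLAN_TAG
    return size + VXLAN_HEADER + UDP_HEADER + OUTER_IPV4


minimum = required_underlay_mtu()                    # 1550
with_inner_tag = required_underlay_mtu(inner_vlan=True)  # 1554
```

Setting the fabric to 1600 covers both cases with room to spare.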

Describe how traffic types are handled in a physical infrastructure

L2 – single broadcast domain – hosts communicate with each other using their MAC addresses.

L3 – multiple broadcast domains – joined by a common router (or series of routers). Traffic communication is IP based.
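A host's own forwarding decision illustrates the L2 / L3 split: destinations in the same subnet are reached directly at Layer 2, and everything else is sent to the router. A minimal sketch, with made-up addresses and a function name of my own:

```python
import ipaddress


def next_hop(src_ip, dst_ip, prefix_len, gateway):
    """Return ('L2', dst_ip) for on-link destinations, else ('L3', gateway)."""
    network = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    if ipaddress.ip_address(dst_ip) in network:
        return ("L2", dst_ip)   # same broadcast domain: resolve the MAC and send directly
    return ("L3", gateway)      # different subnet: hand the packet to the router


# Hypothetical addressing: a /24 per rack with .1 as the gateway.
same_rack = next_hop("10.0.1.10", "10.0.1.20", 24, "10.0.1.1")
other_rack = next_hop("10.0.1.10", "10.0.2.20", 24, "10.0.1.1")
```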

Determine use cases for available virtual architectures

  • Enterprise
    • Used for ‘in-house’ companies with a single VMware management team and various development teams that may need microsegmentation and improved network flexibility.
  • Service Provider Multi-Tenant
  • Multi-Tenant Scalable

Describe ESXi host vmnic requirements

1Gb = 4x NICs

10Gb = 2x NICs


10Gb adapters are recommended, as VMware’s performance tests are based on Intel 10Gb NICs.

Differentiate virtual to physical switch connection methods

  • ESG – Edge Services Gateway. Acts as a point for north-south traffic. Has an uplink adapter which will connect to northbound network devices (corporate / internet router, etc).
  • L2 Bridge – Resides on the DLR. Connects a VXLAN to a VLAN (more in section 5.3).

Describe VMkernel networking recommendations

A separate VMkernel adapter will be installed for VTEP. From the design guide:


When deploying NSX over a routed fabric infrastructure, the VTEP interfaces for hosts connected to different compute and edge racks must be deployed as part of different IP subnets (the so called “VXLAN Transport Subnets”). Combining this with the recommendation of striping compute and edge clusters across separate racks brings an interesting deployment challenge. Provisioning VXLAN on ESXi hosts is in fact done at the cluster level and during this process the user must specify how to address the VTEP interfaces for the hosts belonging to that cluster (see previous Figure 91).

The challenge consists of being able to assign to the VTEP interfaces IP addresses in different subnets. This can be achieved in two ways:

  1. Leveraging the “Use DHCP” option, so that each VTEP will receive an address in the proper IP subnet, depending on the specific rack it is connected to. This is the recommended approach for production deployment scenarios.
  2. Statically assigning IP addresses to the VTEPs on a per host basis, either via ESXi CLI or via the vSphere Web Client UI. The recommendation in this case is to select the DHCP option when provisioning VXLAN to an ESXi cluster, wait for the DHCP process to time-out and then statically assign the VTEP address. Obviously this can be done only for smaller scale deployments and this approach should be limited to lab environments.
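To picture the "one VXLAN transport subnet per rack" idea, here is a small planning sketch. The 172.16.0.0/16 block, the /24 per rack and the rack names are all assumptions of mine for illustration; the design guide does not prescribe any of them.

```python
import ipaddress


def plan_vtep_subnets(transport_block, racks, new_prefix=24):
    """Carve one VTEP subnet per rack out of a larger transport block."""
    block = ipaddress.ip_network(transport_block)
    subnets = list(block.subnets(new_prefix=new_prefix))
    if len(racks) > len(subnets):
        raise ValueError("transport block is too small for the number of racks")
    return dict(zip(racks, subnets))


def rack_for_vtep(vtep_ip, plan):
    """Find which rack a VTEP address belongs to, or None if it matches no rack."""
    address = ipaddress.ip_address(vtep_ip)
    for rack, subnet in plan.items():
        if address in subnet:
            return rack
    return None


# Hypothetical racks and transport block.
plan = plan_vtep_subnets("172.16.0.0/16", ["compute-rack-1", "compute-rack-2", "edge-rack-1"])
rack = rack_for_vtep("172.16.2.15", plan)
```

With the DHCP option, each rack's scope would hand out addresses from its own subnet, so a host picks up the right range simply by being connected in that rack.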