As a former Cisco UCS architect I have been watching the VXLAN initiative for almost two years, so I was looking forward to a real customer project. Finally, it is here. I'm working on a vSphere design for vCloud Director (vCD). To be honest, I'm responsible only for the vSphere design; someone else is doing the vCD design, because I'm not a vCD expert and have only conceptual, high-level vCD knowledge. I'm not planning to change that in the near future, because I'm more focused on next-generation infrastructure, and vCD is, in my opinion, just another piece of software for selling IaaS. I'm not saying it is not important. It is actually very important, because IaaS is not just a technology but a business process. However, nobody knows everything, and I leave some work for other architects :-)
We all know that vCD sits on top of vSphere, providing multi-tenancy and other IaaS constructs, and since vCD 5.1 network multi-tenancy segmentation is done with the VXLAN network overlay. Therefore I finally have the opportunity to plan, design, and implement VXLAN for a real customer.
Right now I'm designing the network part of the vSphere architecture, and I describe the VXLAN-oriented design decision point below.
VMware VXLAN Information sources:
- S1: VMware vShield Administration Guide [Official source]
- S2: VMware KB 2050697 [Official source]
- S3: Duncan Epping's blog post here. [Unofficial source]
- S4: VMware VXLAN Deployment Guide available here. [Official source]
Design decision point:
What type of NIC teaming, load balancing, and physical switch configuration should be used for VMware's VXLAN?
Requirements:
- R1: Fully supported solution
- R2: vSphere 5.1 and vCloud Director 5.1
- R3: VMware vCloud Networking and Security (aka vCNS or vShield) with the VMware distributed virtual switch
- R4: Network Virtualization and multi-tenant segmentation with VXLAN network overlay
- R5: Leverage standard access datacenter switches like the Cisco Nexus 5000, Force10 S4810, etc.
Constraints:
- C1: The LACP 5-tuple hash algorithm is not available on the current standard access datacenter physical switches mentioned in requirement R5
- C2: VMware Virtual Port ID load balancing is not supported with VXLAN (Source: S3)
- C3: VMware LBT load balancing is not supported with VXLAN (Source: S3)
- C4: LACP must be used with the 5-tuple hash algorithm (Source: S3, S2, and S1 on page 48). [This is a strange constraint; why is it hash dependent? See the hashing sketch below.]
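Why would LACP be hash dependent at all? The usual reasoning is that VXLAN hides the tenant flows: every encapsulated packet between two hosts carries the same outer source and destination IP (the VTEPs), so a plain IP-pair hash pins all of that traffic to a single port-channel member. The per-flow entropy lives in the outer UDP source port, which the VTEP derives from the inner frame, so only a hash that includes L4 ports can spread the flows. Below is a minimal Python sketch of that effect; the VTEP addresses, port range, and flow count are made-up illustration values, not anything from the sources above:

```python
# Illustration only (my sketch, not from the cited sources): why the hash
# algorithm matters once traffic is VXLAN-encapsulated.
import random

NUM_UPLINKS = 2
VTEP_SRC, VTEP_DST = "10.0.0.1", "10.0.0.2"  # hypothetical VTEP addresses
VXLAN_UDP_PORT = 8472  # UDP port commonly used by vCNS-era VXLAN

def ip_pair_hash(src_ip, dst_ip):
    # 2-tuple hash (src/dst IP only), as on many access switches
    return hash((src_ip, dst_ip)) % NUM_UPLINKS

def five_tuple_hash(src_ip, dst_ip, proto, src_port, dst_port):
    # 5-tuple hash including the L4 ports
    return hash((src_ip, dst_ip, proto, src_port, dst_port)) % NUM_UPLINKS

# Simulate 1000 inner tenant flows. The VTEP encodes per-flow entropy into
# the outer UDP *source* port, so that is the only outer field that varies.
outer_src_ports = [random.randint(49152, 65535) for _ in range(1000)]

by_ip_pair = [0] * NUM_UPLINKS
by_5tuple = [0] * NUM_UPLINKS
for sport in outer_src_ports:
    by_ip_pair[ip_pair_hash(VTEP_SRC, VTEP_DST)] += 1
    by_5tuple[five_tuple_hash(VTEP_SRC, VTEP_DST, "udp",
                              sport, VXLAN_UDP_PORT)] += 1

print("IP-pair hash:", by_ip_pair)  # everything lands on one uplink
print("5-tuple hash:", by_5tuple)   # roughly even split across uplinks
```

On a typical run the IP-pair hash puts all 1000 flows on one uplink while the 5-tuple hash splits them roughly 50/50, which is exactly why the 5-tuple hash is recommended (though, as the update below shows, not strictly required).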
Options:
- Option 1: Virtual Port ID
- Option 2: Load based Teaming
- Option 3: LACP
- Option 4: Explicit fail-over
Option comparison:
- Option 1: not supported because of C2
- Option 2: not supported because of C3
- Option 3: supported (although C4 combined with C1 would rule it out on the switches from R5; that is the decision circle mentioned below)
- Option 4: supported, but not optimal because only one NIC is used for network traffic
Based on the available information, options 3 and 4 comply with the requirements and constraints. Option 3 is better because network traffic is load balanced across the physical NICs, which is not the case for option 4.
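For illustration, here is a hedged pyVmomi sketch of how option 3 could be configured; vSphere 5.1 exposes LACP as a policy on the vDS uplink port group. The vCenter address, credentials, and the port group name "dvs-vcloud-DVUplinks" are hypothetical placeholders, and this is a sketch of the API shape rather than a definitive procedure:

```python
# Hedged sketch: enable LACP (active mode) on a vDS uplink port group via
# pyVmomi. Host, credentials, and port group name are hypothetical.
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com",
                  user="administrator", pwd="***")  # hypothetical vCenter
content = si.RetrieveContent()

def find_dvportgroup(name):
    # Walk the inventory for a distributed port group by name
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.dvs.DistributedVirtualPortgroup], True)
    try:
        return next(pg for pg in view.view if pg.name == name)
    finally:
        view.Destroy()

uplink_pg = find_dvportgroup("dvs-vcloud-DVUplinks")  # hypothetical name

spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec()
spec.configVersion = uplink_pg.config.configVersion
port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy()
# vSphere 5.1 models LACP as a policy on the uplink port group:
port_cfg.lacpPolicy = vim.dvs.VmwareDistributedVirtualSwitch.UplinkLacpPolicy(
    enable=vim.BoolPolicy(inherited=False, value=True),
    mode=vim.StringPolicy(inherited=False, value="active"))  # or "passive"
spec.defaultPortConfig = port_cfg
# Note: with LACP, vSphere 5.1 also expects "Route based on IP hash"
# ("loadbalance_ip") as the teaming policy on the port groups.
task = uplink_pg.ReconfigureDVPortgroup_Task(spec)
```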
Other alternatives not compliant with all requirements:
- Alt 1: Use physical switches with 5-tuple hash load balancing. That means high-end switch models like the Nexus 7000, Force10 E-Series, etc.
- Alt 2: Use the Cisco Nexus 1000V with VXLAN. It supports LACP with any hash algorithm; the 5-tuple hash is also recommended but not strictly required.
I hope some of the information in constraints C2, C3, and C4 is wrong and will be clarified by VMware. I'll tweet this blog post to some VMware experts and hope someone will help me jump out of the decision circle.
If you have any official or unofficial information related to this topic, or you see anything where I'm wrong, please feel free to speak up in the comments.
Updated 2013-09-11: Constraint C4 doesn't exist and the VMware documentation will be updated.
Based on the updated information, both LACP and "Explicit fail-over" teaming/load balancing are supported for VXLAN. LACP is the better way to go, and "Explicit fail-over" is the alternative when LACP is not achievable in your environment.
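And if LACP is not achievable, here is a similar hedged sketch for the "Explicit fail-over" alternative, setting the teaming policy on the VXLAN transport port group (the function, port group, and uplink names are hypothetical; it reuses find_dvportgroup() from the previous sketch):

```python
# Hedged sketch: "Explicit fail-over" teaming on a VXLAN transport port
# group. Names are hypothetical placeholders.
from pyVmomi import vim

def set_explicit_failover(portgroup, active_uplinks, standby_uplinks):
    # "Use explicit failover order" with the given active/standby uplinks
    spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec()
    spec.configVersion = portgroup.config.configVersion
    teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy(
        policy=vim.StringPolicy(inherited=False, value="failover_explicit"),
        uplinkPortOrder=vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortOrderPolicy(
            inherited=False,
            activeUplinkPort=active_uplinks,
            standbyUplinkPort=standby_uplinks))
    port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy()
    port_cfg.uplinkTeamingPolicy = teaming
    spec.defaultPortConfig = port_cfg
    return portgroup.ReconfigureDVPortgroup_Task(spec)

# e.g.: set_explicit_failover(vxlan_pg, ["Uplink 1"], ["Uplink 2"])
```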
2 comments:
Hi David,
It could be that I am misreading this post, but there are two supported ways of deploying VXLAN today. My apologies for the VMware documentation not being up to date and my blog post being slightly confusing. I have updated my post and have requested the documentation and KB to be updated.
Anyway, these are the two options you have:
1) port channel (static / lacp)
2) specified fail-over order
In the case of port channels, it is recommended to use a 5-tuple hash for load-balancing effectiveness. This is not a hard requirement though, so if your switches do not support it, that is not a problem for VXLAN; it might just lead to a less balanced network.
Hope this helps. Again, I have updated my post and requested the docs and KB to be updated (this will take time though).
Thanks Duncan for the absolutely clear public statement. I would also like to thank Fojta, who replied to me privately with the same information.