With VXLAN design, the easiest thing to overlook is how communication occurs between subnets. Network engineers often take for granted that traffic will simply flow in a VXLAN environment. It’s also easy to get confused when trying to trace the path traffic takes between your overlay and underlay.

When I work with customers on designing VXLAN infrastructures, one of the first questions I always ask is: “Where do you expect the servers’ gateway to be?”

This always leads to one of three designs, which I will outline over the next two posts. Before we start, know that all of these designs leverage BGP EVPN. Ethernet Virtual Private Network (EVPN) is an address family within BGP that is used to exchange VXLAN-related information. This blog won’t go into detail about EVPN, but we have previous blogs to help fill in the gaps.

With that said, let’s get started with the first VXLAN design example.

The first case is the simplest environment: the gateway sits on an internet edge service. In this case, VXLAN acts as a strict L2 overlay, and the L3 routed BGP underlay is hidden from the end hosts and servers.

VXLAN designs

Since Server01 and Server03 are in different subnets, they have to route through their respective gateways to communicate with each other. In the above diagram, the server gateway is the firewall. The firewall has an interface on each VNI, and is the gatekeeper for all communication between VLANs. The firewall can be replaced with any L3 routed device.
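To make the host-side decision concrete, here is a minimal Python sketch of how a server decides whether to send a frame directly or hand it to its gateway. The addresses and the `next_hop` helper are hypothetical and chosen only for illustration; they are not taken from the diagram.

```python
import ipaddress


def next_hop(src_ip, prefix_len, dst_ip, gateway_ip):
    """Return the IP a host forwards toward: the destination itself
    if it is on the same subnet, otherwise the configured gateway."""
    src_net = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    if ipaddress.ip_address(dst_ip) in src_net:
        return dst_ip      # same subnet: stays inside the L2 overlay
    return gateway_ip      # different subnet: must be routed by the gateway


# Hypothetical addressing: two subnets, gateway at .1 on the firewall.
same_subnet = next_hop("10.1.10.101", 24, "10.1.10.102", "10.1.10.1")
other_subnet = next_hop("10.1.10.101", 24, "10.1.20.103", "10.1.10.1")
print(same_subnet, other_subnet)
```

In the external gateway design, the second case is the one that sends traffic across the VXLAN overlay to the firewall before it can reach the other subnet.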

Invariably, we have customers who do not have their own dedicated layer 3 device to perform the inter-VLAN routing. For this, we expand the first design into our second design, which is called centralized routing. In centralized routing, the exit switch performs both VXLAN termination and routing. The gateways are SVIs that exist on exit01, and all the servers use those SVIs as the gateways for their respective VLANs. This solution is similar to the previous external gateway solution. The only difference is that the gateway exists on the exit leaf instead of the firewall.

As an aside, I want to stop here to define what VXLAN routing is, as the term can mean different things to different people. VXLAN routing is the process in which a VTEP receives a VXLAN packet destined to itself, removes the VXLAN header and then performs a layer 3 route lookup on the inner decapsulated packet. Since the VTEP has to perform two sets of lookups, first on the encapsulated VXLAN traffic and then on the decapsulated inner packet, it requires specific ASIC support to perform both lookups in a single pass, all in hardware. Now back to your regularly scheduled programming…
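The two lookups described above can be sketched in software. This is only a simplified illustration in Python, not how an ASIC implements it: the outer Ethernet/IP/UDP headers are reduced to a single outer destination IP, the routing table is a plain dictionary, and all names and addresses are hypothetical. The 8-byte header layout (a flags byte with the VNI-valid bit, then a 24-bit VNI) follows the VXLAN specification.

```python
import ipaddress
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag in the first byte of the VXLAN header


def build_vxlan_header(vni):
    """Pack an 8-byte VXLAN header: flags + reserved, then VNI + reserved."""
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header):
    """Extract the 24-bit VNI from an 8-byte VXLAN header."""
    word1, word2 = struct.unpack("!II", header[:8])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word2 >> 8


def route_lookup(routes, dst_ip):
    """Longest-prefix match of dst_ip against a {prefix: next_hop} table."""
    dst = ipaddress.ip_address(dst_ip)
    best = None
    for prefix, hop in routes.items():
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return None if best is None else best[1]


def vxlan_route(local_vtep_ip, outer_dst_ip, vxlan_header, inner_dst_ip, routes):
    """Lookup 1: is the tunnel packet addressed to this VTEP?
    Lookup 2: after decapsulation, route on the inner destination IP."""
    if outer_dst_ip != local_vtep_ip:
        return None  # not ours: it would be forwarded in the underlay instead
    vni = parse_vni(vxlan_header)                         # strip the VXLAN header
    return vni, route_lookup(routes, inner_dst_ip)        # L3 lookup on inner packet


# Hypothetical table on the routing VTEP: one SVI per subnet plus a default.
routes = {"10.1.10.0/24": "vlan10", "10.1.20.0/24": "vlan20", "0.0.0.0/0": "firewall"}
result = vxlan_route("10.0.0.41", "10.0.0.41", build_vxlan_header(10),
                     "10.1.20.103", routes)
print(result)
```

The point of the sketch is the ordering: the inner route lookup cannot start until the outer lookup has matched and the header has been removed, which is why doing both in a single pass depends on the hardware.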

As demonstrated in the diagram below, instead of existing on the external firewall, the gateways are now bound as SVIs on exit01 itself.

VXLAN designs

But we have to be mindful with this solution, as it depends on the hardware ASIC integrated into the switch. To get the full functionality of centralized routing, we need a capability called routing in and out of tunnels (RIOT). The Trident 2+, Tomahawk and second generation Spectrum ASICs all support RIOT.

There is a unique case where customers have neither a dedicated L3 routed device nor newer switch hardware, so we end up having to deploy a hyperloop solution. This solution was primarily reserved for the Trident 2 and first generation Spectrum ASICs. These ASICs did not have RIOT support, so this creative solution allowed customers to solve the routing problem without additional gear.

These first two solutions are nice because of their simplicity. Dedicated devices are responsible for each individual task, with all inter-VXLAN and inter-VLAN permissions handled by a dedicated edge routing function. This also creates a simpler tenancy environment, where each VXLAN and VLAN is isolated from the others unless traffic passes through the centralized routing or edge services routing device.

A drawback of these solutions is east-west traffic. Any inter-VXLAN routed traffic has to trombone all the way to the exit leaf before making its way back through VXLAN to the destination. If you had two servers connected to the same TOR but on two different VXLANs, that traffic would have to travel all the way across your Clos network and back. This could cause scaling issues in an environment that is primarily east-west traffic.
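A rough way to see the cost of tromboning is to count the switches a routed packet touches. The Python sketch below assumes a simple two-tier leaf/spine fabric and uses hypothetical switch names (`leaf01`, `spine01`); in a real fabric, ECMP may pick a different spine in each direction, but the hop count is the same.

```python
def centralized_path(src_leaf, dst_leaf, exit_leaf="exit01", spine="spine01"):
    """Switches traversed by inter-subnet traffic when only the
    exit leaf can route between VXLANs (two-tier fabric assumed)."""
    path = [src_leaf]
    if src_leaf != exit_leaf:
        path += [spine, exit_leaf]   # L2 overlay hop up to the routing device
    if dst_leaf != exit_leaf:
        path += [spine, dst_leaf]    # routed, then re-encapsulated back down
    return path


# Two servers on the same TOR but in different VXLANs.
trombone = centralized_path("leaf01", "leaf01")
print(trombone, len(trombone))
```

Under these assumptions, traffic between two servers on the same TOR crosses five switches, even though source and destination share a single switch.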

The foundational tenets of centralized routing can be expanded to a more complex and interesting solution known as Anycast Gateway. This solution distributes the gateway away from the centralized exit device onto each of the TORs.

Creating an environment that has the gateway on an internet edge service is just one example of a VXLAN infrastructure design. In the next post, I will take a close look at the last option for VXLAN design: using BGP EVPN with Anycast Gateways to implement our VXLAN routing solution. Stay tuned for more in Part 2!
