Is EVPN magic? Well, as Arthur C. Clarke said, any sufficiently advanced technology is indistinguishable from magic. On that premise, moving from a traditional layer 2 environment to VXLAN driven by EVPN can have much of that same hocus pocus feeling. To help demystify the sorcery, this post gives users new to EVPN a step-by-step understanding of how EVPN works and how the control plane converges. We’ll start with the basic layer 2 (L2) building blocks, then work our way up to layer 3 (L3) connectivity and the control plane.
We’ll be using the “reference topology” as our cable plan and as the foundation for understanding the traffic flow. The walkthrough covers a symmetric mode EVPN environment using distributed gateways. All the configurations are defined in this GitHub repo.
If you’d like to follow along as we go, feel free to launch your own CITC blank slate and deploy the above playbook:
EVPN message types
Like any good protocol, EVPN has a robust process for exchanging information with its peers. In EVPN this process uses message types. If you already know OSPF and the LSA messages you can think of EVPN message types very similarly. Each EVPN message type can carry a different kind of information about the EVPN traffic flow.
In total there are five main message types, but we’re going to focus on the two most popular types for now. In this blog post we’ll cover Type 2, which carries MAC and MAC/IP information, and in a later post I will discuss Type 5, which carries IP prefix route information.
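As a quick reference, the five route types can be written down as a small enumeration. This is only a sketch: the member names follow the route type names used in RFC 7432 and RFC 9136, while the `EvpnRouteType` class and the `covered_here` helper are illustrative names and not part of any EVPN library.

```python
from enum import IntEnum


class EvpnRouteType(IntEnum):
    """EVPN route types 1-5 (names per RFC 7432 and RFC 9136)."""

    ETHERNET_AUTO_DISCOVERY = 1
    MAC_IP_ADVERTISEMENT = 2
    INCLUSIVE_MULTICAST_ETHERNET_TAG = 3
    ETHERNET_SEGMENT = 4
    IP_PREFIX = 5


def covered_here(route_type: EvpnRouteType) -> bool:
    """Return True for the route type this post walks through (type 2)."""
    return route_type is EvpnRouteType.MAC_IP_ADVERTISEMENT


if __name__ == "__main__":
    for route_type in EvpnRouteType:
        marker = " (this post)" if covered_here(route_type) else ""
        print(f"Type {route_type.value}: {route_type.name}{marker}")
```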
Digging into EVPN message types: Type 2
The easiest EVPN messages to understand are type 2. As mentioned before, type 2 routes contain MAC and MAC/IP mappings. To start off, let’s inspect a type 2 entry at work by verifying basic connectivity from leaf01 to server01.
First we look at the bridge table to make sure the server’s MAC address is mapped to the correct port on the switch.
Let’s get server01’s MAC address:
cumulus@server01:~$ ip address show
…
7: uplink: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 44:38:39:00:08:01 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.101/24 brd 10.1.1.255 scope global uplink
       valid_lft forever preferred_lft forever
    inet6 fe80::4638:39ff:fe00:801/64 scope link
       valid_lft forever preferred_lft forever
Look at Leaf01’s bridge table to make sure the MAC address is mapped to the port we expect. We can cross reference it with LLDP:
cumulus@leaf01:~$ net show bridge mac
VLAN      Master  Interface   MAC                TunnelDest    State      Flags          LastSeen
--------  ------  ----------  -----------------  ------------  ---------  -------------  --------
10        bridge  SERVER01    44:38:39:00:08:01                                          00:01:15
...
cumulus@leaf01:~$ net show lldp
LocalPort  Speed  Mode           RemoteHost  RemotePort
---------  -----  -------------  ----------  ----------
swp1       1G     BondMember     server01    eth1
swp49      1G     BondMember     leaf02      swp49
swp50      1G     BondMember     leaf02      swp50
swp51      1G     NotConfigured  spine01     swp1
swp52      1G     NotConfigured  spine02     swp1
Checking the ARP table we can validate the MAC and IP addresses are mapped correctly.
cumulus@leaf01:~$ ip neighbor show
10.1.1.101 dev vlan10 lladdr 44:38:39:00:08:01 REACHABLE
...
Now that we’ve checked the basics, let’s start looking at how this gets pulled into EVPN. To begin, we validate the local VNIs that are configured:
cumulus@leaf01:~$ net show evpn vni
VNI      Type  VxLAN IF    # MACs  # ARPs  # Remote VTEPs  Tenant VRF
10010    L2    VXLAN10     4       8       1               RED
10020    L2    VXLAN20     2       6       1               BLUE
104001   L3    L3VNI_RED   1       1       n/a             RED
104002   L3    L3VNI_BLUE  0       0       n/a             BLUE
Since we validated that server01 is mapped to vlan10 as per the bridge mac table, we’ll check if the ip neighbor entries are being pulled into the EVPN cache. This cache describes the information that is being exchanged with the other EVPN speakers in the environment.
cumulus@leaf01:~$ net show evpn arp-cache vni 10010
Number of ARPs (local and remote) known for this VNI: 8
IP                        Type    MAC                Remote VTEP
fe80::4638:39ff:fe00:205  local   44:38:39:00:02:05
fe80::4638:39ff:fe00:801  local   44:38:39:00:08:01
10.1.1.2                  local   44:38:39:00:02:05
10.1.1.103                remote  44:38:39:00:0a:01  192.168.1.34
10.1.1.1                  local   00:00:00:00:00:1a
fe80::200:ff:fe00:1a      local   00:00:00:00:00:1a
10.1.1.101                local   44:38:39:00:08:01
fe80::4638:39ff:fe00:a01  remote  44:38:39:00:0a:01  192.168.1.34
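Half of the entries in this cache are IPv6 link-local addresses, and each one pairs with a MAC address that also appears next to an IPv4 address. The sketch below shows how such an address can be derived from a MAC, assuming the hosts use the default modified EUI-64 scheme (an assumption, since the post does not say how the addresses were assigned). The function name `link_local_from_mac` is illustrative.

```python
import ipaddress


def link_local_from_mac(mac: str) -> str:
    """Derive an fe80::/64 address from a MAC using modified EUI-64."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets in MAC address, got {mac!r}")
    # Flip the universal/local bit of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe between the two halves of the MAC to get 64 bits.
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    interface_id = int.from_bytes(bytes(eui64), "big")
    # Prepend the fe80::/64 link-local prefix.
    return str(ipaddress.IPv6Address((0xFE80 << 112) | interface_id))


if __name__ == "__main__":
    for mac in ("44:38:39:00:08:01", "44:38:39:00:0a:01", "00:00:00:00:00:1a"):
        print(mac, "->", link_local_from_mac(mac))
```

Under that assumption, the derived addresses line up with the link-local entries listed for server01, the remote server, and the gateway MAC.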
Let’s review what we know so far. The L2 connectivity works correctly, as the L2 bridge table and L3 neighbor table are populated locally on leaf01. Next we verified that the MAC and IP information is being properly pulled into EVPN via the EVPN ARP cache.
Using this information, we check the RD and RT mappings so we can learn more about the full VNI advertisement.
An RD is a route distinguisher and is used to disambiguate EVPN routes in different VNIs (as they may have the same MAC and/or IP address).
The RTs are route targets. They are used to describe the VPN membership for the route, specifically which VRFs are exporting and importing the different routes in the infrastructure.
cumulus@leaf01:~$ net show bgp l2vpn evpn vni
Advertise Gateway Macip: Disabled
Advertise All VNI flag: Enabled
Number of L2 VNIs: 2
Number of L3 VNIs: 2
Flags: * - Kernel
  VNI     Type  RD               Import RT     Export RT     Tenant VRF
* 10010   L2    10.255.255.11:2  65101:10010   65101:10010   RED
* 10020   L2    10.255.255.11:3  65101:10020   65101:10020   BLUE
* 104001  L3    10.1.1.2:4       65101:104001  65101:104001  RED
* 104002  L3    10.2.2.2:5       65101:104002  65101:104002  BLUE
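In this output every RT has the form ASN:VNI and every RD has the form address:index. The following sketch models that pattern and the basic import rule (a route is accepted when it carries at least one RT that the receiver imports). It is a simplified illustration, not how the routing daemon implements it, and the helper names are hypothetical. Other deployments may configure RDs and RTs by hand, so the ASN:VNI pattern should be treated as an observation about this lab.

```python
from typing import Iterable, Tuple


def split_rd(rd: str) -> Tuple[str, int]:
    """Split an RD such as '10.255.255.11:2' into (address, index)."""
    admin, _, assigned = rd.rpartition(":")
    return admin, int(assigned)


def rt_for_vni(asn: int, vni: int) -> str:
    """Build an RT in the ASN:VNI form seen in the output above."""
    return f"{asn}:{vni}"


def is_imported(import_rts: Iterable[str], route_rts: Iterable[str]) -> bool:
    """A route is imported if it carries at least one RT the receiver imports."""
    return not set(import_rts).isdisjoint(route_rts)


if __name__ == "__main__":
    print(split_rd("10.255.255.11:2"))
    # RTs attached to server01's type 2 routes: one per L2 VNI and L3 VNI.
    route_rts = [rt_for_vni(65101, 10010), rt_for_vni(65101, 104001)]
    print(is_imported(["65101:10010"], route_rts))  # VNI 10010 imports it
    print(is_imported(["65101:10020"], route_rts))  # VNI 10020 does not
```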
The local L2 VNI 10010 has RD 10.255.255.11:2, so this RD effectively identifies every route that this node originates for that VNI. When looking elsewhere in the fabric, we can use that RD to see all the routes advertised by leaf01 for VNI 10010.
cumulus@leaf01:~$ net show bgp l2vpn evpn route rd 10.255.255.11:2
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]

BGP routing table entry for 10.255.255.11:2:[2]:[0]:[0]:[48]:[44:38:39:00:08:01]
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  spine01(swp51) spine02(swp52)
  Route [2]:[0]:[0]:[48]:[44:38:39:00:08:01] VNI 10010/104001
  Local
    192.168.1.12 from 0.0.0.0 (10.255.255.11)
      Origin IGP, localpref 100, weight 32768, valid, sourced, local, bestpath-from-AS Local, best
      Extended Community: ET:8 RT:65101:10010 RT:65101:104001 Rmac:44:38:39:00:02:05
      AddPath ID: RX 0, TX 51
      Last update: Thu Sep  6 18:20:00 2018

BGP routing table entry for 10.255.255.11:2:[2]:[0]:[0]:[48]:[44:38:39:00:08:01]:[32]:[10.1.1.101]
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  spine01(swp51) spine02(swp52)
  Route [2]:[0]:[0]:[48]:[44:38:39:00:08:01]:[32]:[10.1.1.101] VNI 10010/104001
  Local
    192.168.1.12 from 0.0.0.0 (10.255.255.11)
      Origin IGP, localpref 100, weight 32768, valid, sourced, local, bestpath-from-AS Local, best
      Extended Community: ET:8 RT:65101:10010 RT:65101:104001 Rmac:44:38:39:00:02:05
      AddPath ID: RX 0, TX 83
      Last update: Thu Sep  6 18:20:06 2018
....
Displayed 6 prefixes (6 paths) with this RD
This output is important, so let’s spend some time dissecting the EVPN type 2 route. A type 2 route can take two different forms, and in this case we’re sending both of them.
- Type 2 MAC Route
- The first one is an EVPN type 2 MAC route. It only includes a 48-bit MAC entry (the [48] field is the MAC length in bits). This entry is pulled in directly from the bridge table, and hence only has L2 information in it. Any time a MAC address is learned in the bridge table, that MAC address is pulled into EVPN as a type 2 MAC route.
- Type 2 MAC/IP Route
- The second EVPN type 2 entry is a MAC/IP route. These entries are pulled into EVPN from the ARP table. Reading this entry, the first section contains the MAC address and the second is the IP address with its mask length. Notice that the mask for the IP address is a /32: because these entries come from the ARP table, every MAC/IP route is pulled in as a host route.
BGP routing table entry for 10.255.255.11:2:[2]:[0]:[0]:[48]:[44:38:39:00:08:01]
...
  Route [2]:[0]:[0]:[48]:[44:38:39:00:08:01] VNI 10010/104001
...
BGP routing table entry for 10.255.255.11:2:[2]:[0]:[0]:[48]:[44:38:39:00:08:01]:[32]:[10.1.1.101]
...
  Route [2]:[0]:[0]:[48]:[44:38:39:00:08:01]:[32]:[10.1.1.101] VNI 10010/104001
....
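To make the two forms concrete, here is a small sketch that splits a type 2 prefix string into its fields, following the `[2]:[ESI]:[EthTag]:[MAClen]:[MAC]` layout printed in the route output, with the optional `[IPlen]:[IP]` pair at the end. The `Type2Route` tuple and `parse_type2` function are illustrative names for this post, not part of any routing suite.

```python
import re
from typing import NamedTuple, Optional


class Type2Route(NamedTuple):
    esi: str
    eth_tag: int
    mac_len: int              # MAC length in bits (48)
    mac: str
    ip_len: Optional[int]     # None for a MAC-only route
    ip: Optional[str]

    @property
    def is_mac_ip(self) -> bool:
        return self.ip is not None


def parse_type2(prefix: str) -> Type2Route:
    """Parse '[2]:[ESI]:[EthTag]:[MAClen]:[MAC]' with optional ':[IPlen]:[IP]'."""
    fields = re.findall(r"\[([^\]]*)\]", prefix)
    if len(fields) not in (5, 7) or fields[0] != "2":
        raise ValueError(f"not a type 2 prefix: {prefix!r}")
    esi, eth_tag, mac_len, mac = fields[1:5]
    ip_len = int(fields[5]) if len(fields) == 7 else None
    ip = fields[6] if len(fields) == 7 else None
    return Type2Route(esi, int(eth_tag), int(mac_len), mac, ip_len, ip)


if __name__ == "__main__":
    mac_only = parse_type2("[2]:[0]:[0]:[48]:[44:38:39:00:08:01]")
    mac_ip = parse_type2("[2]:[0]:[0]:[48]:[44:38:39:00:08:01]:[32]:[10.1.1.101]")
    print(mac_only.is_mac_ip, mac_only.mac)
    print(mac_ip.is_mac_ip, f"{mac_ip.ip}/{mac_ip.ip_len}")
```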
Using this information, we should be able to validate that this /32 host route for server01 is in the routing table of leaf03 as a pure L3 route, pointing out to the L3VNI.
cumulus@leaf03:~$ net show route vrf RED
show ip route vrf RED
======================
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR,
       > - selected route, * - FIB route

VRF RED:
K * 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 00:34:43
C * 10.1.1.0/24 is directly connected, vlan10-v0, 00:33:28
C>* 10.1.1.0/24 is directly connected, vlan10, 00:33:29
B>* 10.1.1.101/32 [20/0] via 192.168.1.12, vlan4001 onlink, 00:31:18
Let’s spend some time dissecting this output. The neighbor entry on leaf01 for server01 has made it all the way to leaf03 as a /32 host route. Its next hop is leaf01’s VTEP address (192.168.1.12), and it is reached via the L3 VNI interface, vlan4001.
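The relationship between the MAC/IP route and the installed host route can be sketched as follows. This is a simplified model of the end result only. The routing daemon on the switch does the real work, and the function name, the `HostRoute` tuple, and the `svi_by_l3vni` lookup table are assumptions made for illustration.

```python
import ipaddress
from typing import Dict, NamedTuple


class HostRoute(NamedTuple):
    prefix: str
    next_hop: str
    interface: str


def host_route_from_macip(ip: str, ip_len: int, vtep_ip: str,
                          l3vni: int, svi_by_l3vni: Dict[int, str]) -> HostRoute:
    """Model the VRF route a remote leaf ends up with for a type 2 MAC/IP route."""
    prefix = ipaddress.ip_network(f"{ip}/{ip_len}")
    # The next hop is the advertising VTEP; the exit interface is the
    # SVI bound to the L3 VNI carried in the route.
    return HostRoute(str(prefix), vtep_ip, svi_by_l3vni[l3vni])


if __name__ == "__main__":
    # Values taken from the outputs in this post; the mapping is hypothetical glue.
    svi_by_l3vni = {104001: "vlan4001"}
    route = host_route_from_macip("10.1.1.101", 32, "192.168.1.12", 104001, svi_by_l3vni)
    print(f"{route.prefix} via {route.next_hop}, {route.interface} onlink")
```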
To validate that the connection between the L2 VNI and the L3 VNI is established correctly, we can examine the L3 VNI:
cumulus@leaf01:~$ net show evpn vni 104001
VNI: 104001
  Type: L3
  Tenant VRF: RED
  Local Vtep Ip: 192.168.1.12
  Vxlan-Intf: L3VNI_RED
  SVI-If: vlan4001
  State: Up
  VNI Filter: none
  Router MAC: 44:38:39:00:02:05
  L2 VNIs: 10010
Notice in this output that the L3 VNI 104001 is mapped to VRF RED, which matches what we saw in the output of net show evpn vni. We can also see that L2 VNI 10010 is mapped to L3 VNI 104001 via vlan4001. All the outputs we’re seeing line up to indicate that we have a fully working EVPN Type 2 VXLAN infrastructure.
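The same cross-check can be expressed as a short lookup: an L2 VNI and an L3 VNI belong together when they share a tenant VRF. The sketch below hard-codes the VNI table from the earlier net show evpn vni output; the `l3vni_for` helper is an illustrative name and assumes one L3 VNI per VRF, as in this lab.

```python
from typing import List, Optional, Tuple

# (VNI, type, tenant VRF), copied from the `net show evpn vni` output.
VNIS: List[Tuple[int, str, str]] = [
    (10010, "L2", "RED"),
    (10020, "L2", "BLUE"),
    (104001, "L3", "RED"),
    (104002, "L3", "BLUE"),
]


def l3vni_for(l2vni: int, vnis: List[Tuple[int, str, str]]) -> Optional[int]:
    """Return the L3 VNI that shares a tenant VRF with the given L2 VNI."""
    vrf = next((v for vni, kind, v in vnis if vni == l2vni and kind == "L2"), None)
    if vrf is None:
        return None
    return next((vni for vni, kind, v in vnis if kind == "L3" and v == vrf), None)


if __name__ == "__main__":
    for l2vni in (10010, 10020):
        print(f"L2 VNI {l2vni} -> L3 VNI {l3vni_for(l2vni, VNIS)}")
```

For VNI 10010 this yields 104001, the same pair shown as "VNI 10010/104001" in the type 2 route entries.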
There you have it. From start to finish, we looked at how EVPN works for Type 2 based routes. Specifically, we focused on the different EVPN message types and how control planes converge in an L2 extension environment. It’s not witchcraft, just good technology. Tune in for our next post, where we extend the EVPN control plane demystification and tackle the traffic flows around Type 5 messages and VXLAN routing. If you haven’t already, I highly recommend trying this out for yourself with Cumulus in the Cloud. And if you’d like to take a deeper dive, we’ve put together a hub of EVPN content, from whitepapers to videos, so you can expand your expertise (or skills in the black arts).