“Success does not consist in never making mistakes but in never making the same one a second time.” – George Bernard Shaw

Introducing PTM – Prescriptive Topology Manager

Data center networks generally follow regular topologies, but these topologies can have various unique configurations, from a simple two-tier leaf and spine to a massive multi-tier scale-out model. There usually will be one Top of Rack (ToR) switch per rack but there can also be a second one for switch redundancy. A typical data center may even have multiple topologies in use depending on application needs or migrations to new architectures. Figure 1 shows a few examples of various topologies in use today.

Figure 1. Examples of various data center topologies today

The large number of physical interconnections and the varied patterns in which they connect make the wiring plant complex to manage. Some topologies, such as very large or multi-tier leaf and spine (Clos) fabrics, can have very dense wiring that runs to many thousands of cables. In addition, the physical cable plant is usually installed and managed by personnel outside the network design team, who are generally unaware of the design requirements. Cumulus Networks created the Prescriptive Topology Manager (PTM) to give data center operators a new tool with which to perform strict wiring validation and more.

A Prescriptive Approach

PTM introduces a software abstraction layer that ensures certain wiring rules are followed by doing a simple runtime verification of connectivity as determined by an operator’s specified wiring plan. This “prescriptive” layer dynamically ensures the desired logical topology and can take some defined actions based on the results of the topology verification, including running scripts and communicating with the Quagga routing protocol suite.

PTM is written entirely in Python and C, and was developed and tested on the “Wheezy” release of Debian Linux. The source code is available under the Eclipse Public License (EPL), in keeping with the Cumulus Networks ethos of participation in the Linux community. It is available on GitHub at: https://github.com/CumulusNetworks/ptm.

Start With A Plan – The DOT File

The network-wiring port interconnections are specified in a special file called topology.dot. DOT is a plaintext graph description language. It is a simple way of describing graphs that both humans and computer programs can use. One common method of generating the DOT file is through the use of graph visualization software such as the open source Graphviz. Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks. For more information, please refer to: http://en.wikipedia.org/wiki/DOT_(graph_description_language)

At its simplest, DOT can be used to describe an undirected graph. An undirected graph shows simple relations between objects, such as connections between switches. The graph keyword starts a new graph, and nodes are described within curly braces. A double hyphen (--) shows the relationships between the nodes. Figure 2 shows a sample network expressed as an undirected graph in the corresponding DOT file.

Figure 2. A topology represented as an undirected graph
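
As a rough illustration of the syntax described above, the following Python sketch renders a list of links as an undirected DOT graph. The switch names (spine1, leaf1, leaf2) and the helper name to_undirected_dot are hypothetical and are not taken from the PTM source.

```python
def to_undirected_dot(links, name="G"):
    """Render (node_a, node_b) pairs as an undirected DOT graph."""
    lines = [f"graph {name} {{"]
    for a, b in links:
        # The double hyphen marks an undirected edge between two nodes.
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical wiring plan: one spine connected to two leaves.
links = [("spine1", "leaf1"), ("spine1", "leaf2")]
dot_text = to_undirected_dot(links)
print(dot_text)
```

The output is plain text that can be saved to a file and, if desired, rendered with a tool such as Graphviz.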

Similar to undirected graphs, DOT can describe directed graphs such as networks, flowcharts and dependency trees. The syntax is the same as for undirected graphs, except the digraph keyword starts the graph, and an arrow (->) shows the relationships between nodes. Figure 3 shows an example of a sample network expressed as a directed graph in the corresponding DOT file.

Figure 3. A topology represented as a directed graph
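
To make the directed form concrete, here is a minimal Python sketch that extracts edges from a small digraph. It assumes each endpoint is written as a quoted "node":"port" pair; that convention, the sample names and the deliberately narrow regular expression are assumptions for illustration, not a full DOT parser.

```python
import re

# Matches edges such as: "leaf1":"swp1" -> "spine1":"swp1";
EDGE_RE = re.compile(r'"([^"]+)":"([^"]+)"\s*->\s*"([^"]+)":"([^"]+)"')


def parse_digraph_edges(dot_text):
    """Return (src_node, src_port, dst_node, dst_port) tuples."""
    return [m.groups() for m in EDGE_RE.finditer(dot_text)]


# Hypothetical plan; node and port names are illustrative only.
SAMPLE = '''
digraph G {
    "leaf1":"swp1" -> "spine1":"swp1";
    "leaf2":"swp1" -> "spine1":"swp2";
}
'''

edges = parse_digraph_edges(SAMPLE)
for edge in edges:
    print(edge)
```

A real deployment would more likely rely on an existing DOT parser; the regular expression here only keeps the example self-contained.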

Meet the Neighbors – Link Discovery with LLDP

PTM dynamically communicates with the standard Link Layer Discovery Protocol (LLDP) to learn the neighbor’s system ID and local port ID. LLDP automatically discovers neighbors on the physical link and exchanges the required system information. PTM then sets a property called topology status, which indicates whether the link is connected to the correct system on the remote end.
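
The comparison described above can be sketched in a few lines of Python. This is not how ptmd is implemented; it does not talk to an LLDP daemon, and the dictionaries, port names and the "pass"/"fail" labels are hand-written assumptions used only to show the idea of checking a discovered neighbor against the planned one.

```python
def topology_status(expected, discovered):
    """Compare the planned neighbor with the LLDP-reported one per local port."""
    status = {}
    for port, planned in expected.items():
        seen = discovered.get(port)  # None when no LLDP neighbor was heard
        status[port] = "pass" if seen == planned else "fail"
    return status


# Planned neighbors for this switch: local port -> (system, remote port).
expected = {
    "swp1": ("spine1", "swp1"),
    "swp2": ("spine2", "swp1"),
    "swp3": ("spine3", "swp1"),
}
# What LLDP reported: swp2 is miscabled and swp3 has no neighbor at all.
discovered = {
    "swp1": ("spine1", "swp1"),
    "swp2": ("spine1", "swp2"),
}

result = topology_status(expected, discovered)
print(result)
```

In this sketch a port with no discovered neighbor is treated the same as a miscabled one, which is a design choice of the example rather than documented PTM behavior.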

It is important to note that most host operating systems and network devices today can also run LLDP. This enables the network operator to extend the wiring plan validation to the 30-40 hosts and network devices attached to a ToR switch within a typical rack.

Execute the Plan – PTM Daemon

The PTM daemon, or ptmd, is the Linux process responsible for PTM functions. It is included in a normal installation of Cumulus Linux. By default, ptmd expects the DOT-specified network graph to be located at /etc/cumulus/ptm.d/topology.dot (/etc/ptm.d for 2.1 or later). The DOT file can be installed on each switch manually, but a better method is to use some form of automation to distribute the file, such as Puppet, Chef, CFEngine, or Ansible. The ptmd process on each switch then extracts the port information relevant to the local switch node from the full set of switch and port information in the DOT graph file. Each relevant link in the graph is then determined to pass or fail the PTM validation, and the appropriate scripting and routing actions are taken. The PTM daemon logs all related events and error conditions in the /var/log/ptmd.log file for help in troubleshooting.
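
Because the same DOT file is distributed to every switch, each switch has to pick out only the links that involve itself. The Python sketch below shows one way that selection could work, assuming the graph has already been parsed into (node, port, node, port) tuples. The function name, the tuple layout and the host names are hypothetical and do not reflect the actual ptmd code.

```python
def local_links(edges, hostname):
    """Map each local port to its planned (neighbor, neighbor_port)."""
    plan = {}
    for a_node, a_port, b_node, b_port in edges:
        if a_node == hostname:
            plan[a_port] = (b_node, b_port)
        elif b_node == hostname:
            plan[b_port] = (a_node, a_port)
        # Links between two other switches are ignored on this switch.
    return plan


# Hypothetical fabric-wide wiring plan shared by all switches.
edges = [
    ("spine1", "swp1", "leaf1", "swp1"),
    ("spine1", "swp2", "leaf2", "swp1"),
    ("spine2", "swp1", "leaf1", "swp2"),
]

leaf1_plan = local_links(edges, "leaf1")
print(leaf1_plan)
```

Running the same function with a different host name yields that switch's view of the plan, which is what allows one file to serve the whole fabric.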

Scripting

PTM gives the operator the ability to dynamically run a script based on a pass or fail for the wiring verification. By default, ptmd will execute scripts at /etc/cumulus/ptm.d/if-topo-pass (/etc/ptm.d/if-topo-pass for 2.1 or later) and /etc/cumulus/ptm.d/if-topo-fail (/etc/ptm.d/if-topo-fail for 2.1 or later) for each interface that undergoes a change in PTM oper_state.
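
The "run a script on each state change" behavior can be illustrated with a short Python sketch. It only decides which script would apply to which interface; it does not execute anything, and it makes no claim about how ptmd passes the interface name to the script. The script paths are the 2.1-or-later defaults quoted above, while the "pass"/"fail" state labels and the function name are assumptions.

```python
PASS_SCRIPT = "/etc/ptm.d/if-topo-pass"
FAIL_SCRIPT = "/etc/ptm.d/if-topo-fail"


def scripts_to_run(previous, current):
    """Return (interface, script) pairs for interfaces whose state changed."""
    actions = []
    for ifname, state in sorted(current.items()):
        if previous.get(ifname) != state:
            script = PASS_SCRIPT if state == "pass" else FAIL_SCRIPT
            actions.append((ifname, script))
    return actions


# Hypothetical states before and after a verification run.
previous = {"swp1": "pass", "swp2": "pass"}
current = {"swp1": "pass", "swp2": "fail", "swp3": "pass"}

actions = scripts_to_run(previous, current)
for ifname, script in actions:
    print(ifname, script)
```

Here swp1 triggers nothing because its state did not change, which mirrors the point that scripts run only for interfaces that undergo a change.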

Quagga Routing Interaction

The Quagga routing suite enables additional checks to ensure that routing adjacencies are formed only on links whose connectivity conforms to the topology.dot specification, as determined by ptmd. IP unnumbered interfaces are commonly used to simplify fabric instantiation in many data centers today. On their own, they present a problem: an incorrectly connected interface will still establish a routing adjacency. Implementing PTM and unnumbered interfaces together keeps the dramatic reduction in IP numbering complexity while ensuring that cabling is verified before routing can be established.

PTMCTL

The ptmctl command shows the results of the topology verification. ptmctl is a client of ptmd. It connects to ptmd over a Unix socket and listens for notifications. You can use the -w option to retrieve the first dump of interface status from ptmd and watch for LLDP neighbor changes. Note: Link transitions cannot be watched. See “man ptmctl” from within Cumulus Linux for more information.

Cumulus Linux 2.1 Improvements

The Cumulus Linux 2.1 code release supports the use of Bidirectional Forwarding Detection (BFD) as a PTM module. It utilizes the PTM topology.dot file to configure BFD sessions. PTM now supports three parameter types in the topology file: global, per-port and templates. Cumulus Linux 2.1 also changes the default location of the script/topology file to /etc/ptm.d and adds a sample script/topology file under /usr/share/doc/ptmd/examples.

Global parameters are applied to all the nodes in the topology file. There are currently two global parameters: LLDP and BFD.

  • The LLDP parameter allows a user to configure global LLDP parameters and apply them on all ports. LLDP is always enabled; if no keyword is present, the default compiled values are used on all ports.
  • The BFD parameter allows a user to configure global BFD parameters and apply them on all ports. If the keyword is not present, the feature is considered disabled (unless there is a per-port override).

The per-port parameters allow finer-grained control over how PTM should configure a port and will override any compiled or global defaults.

Templates allow flexibility in choosing different parameter combinations without having to change parameters per port. A template is a special parameter that tells PTM to reference a “named” parameter string, rather than the default ones. There are currently two template keywords: bfdtmpl and lldptmpl.

  • The bfdtmpl template keyword allows a user to specify a custom parameter tuple for configuring BFD parameters on a port.
  • The lldptmpl template keyword allows a user to specify a custom parameter tuple for configuring LLDP parameters on a port.
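
The layering of compiled defaults, global parameters, templates and per-port parameters can be pictured as successive dictionary merges. The Python sketch below is only a model of that idea: the parameter names (tx_interval_ms, rx_interval_ms, multiplier), their values and the template name are hypothetical, and the assumption that explicit per-port values win over a referenced template is a choice of this example, not something stated above.

```python
def resolve(defaults, global_params, templates, port_cfg):
    """Merge parameter layers; later layers override earlier ones."""
    merged = dict(defaults)          # compiled defaults
    merged.update(global_params)     # global parameters
    template_name = port_cfg.get("template")
    if template_name is not None:
        merged.update(templates[template_name])  # named template
    merged.update(port_cfg.get("params", {}))    # explicit per-port values
    return merged


# All names and values below are illustrative assumptions.
defaults = {"tx_interval_ms": 300, "rx_interval_ms": 300, "multiplier": 3}
global_params = {"tx_interval_ms": 200}
templates = {"fast": {"tx_interval_ms": 100, "rx_interval_ms": 100}}

plain_port = resolve(defaults, global_params, templates, {})
tuned_port = resolve(
    defaults,
    global_params,
    templates,
    {"template": "fast", "params": {"multiplier": 5}},
)
print(plain_port)
print(tuned_port)
```

A port with no template and no per-port values ends up with the global settings layered over the compiled defaults, while the tuned port picks up the template and then its own override.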

PTM – A Great New Tool for Network Operators!

Prescriptive Topology Manager is an innovative approach to data center wiring plan validation and more. It leverages standard Linux and open source tools and protocols for ease of use, and interoperates with any network devices or server hosts that support industry-standard LLDP. PTM also has the ability to provide some basic automation around topology states. For more information on PTM and ptmd, please refer to the PTM documentation: https://cumulusnetworks.com/docs/

This is one of several exciting technologies that Cumulus Networks is bringing to the modern data center as part of the Cumulus Linux distribution for bare-metal switches. Learn more at www.cumulusnetworks.com.