Generally, in an IP Clos fabric it is recommended not to connect devices in the same layer (meaning no physical or logical connection between leaf devices and no physical or logical connection between spine devices). But I have not yet found any explanation of why we should not connect devices within the leaf layer or within the spine layer.
I have attached a network architecture for reference, which consists of spine and leaf devices. Each leaf device has an independent connection to each spine, but there is no connection between leaf devices or between spine devices. In the figure, if, for example, the 2 uplink connections from Leaf1 to Spine1 and Spine2 were to go down, all end-user stations connected to Leaf1 would see a complete outage.
On the other hand, if we had a connection between Leaf1 and Leaf2 and the 2 uplinks from Leaf1 to the spines were down, traffic from Leaf1 would be rerouted via Leaf2, saving us from a network outage.
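To make the failure scenario above concrete, here is a minimal Python sketch that models the fabric as an undirected graph and checks whether Leaf1 can still reach Leaf2 once both of its uplinks are down. The two-leaf, two-spine topology and the node names are assumptions for illustration, since the attached diagram is not reproduced here, and the check covers physical connectivity only, not what a routing protocol would actually advertise.

```python
from collections import deque

def reachable(links, src, dst):
    """Breadth-first search over an undirected set of (a, b) links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical fabric: every leaf has one link to every spine.
fabric = {("leaf1", "spine1"), ("leaf1", "spine2"),
          ("leaf2", "spine1"), ("leaf2", "spine2")}

# Both uplinks from leaf1 fail.
failed = {("leaf1", "spine1"), ("leaf1", "spine2")}

without_peer_link = fabric - failed
with_peer_link = without_peer_link | {("leaf1", "leaf2")}

print(reachable(without_peer_link, "leaf1", "leaf2"))  # False: leaf1 is isolated
print(reachable(with_peer_link, "leaf1", "leaf2"))     # True: traffic can detour via leaf2
```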
Also, in cases where proprietary protocols such as MC-LAG are used in the Leaf layer along with VRRP to provide first hop redundancy for networks routed at the Leaf layer, there is an explicit requirement to connect the leaf devices.
The text about not connecting the leaf devices to each other and not connecting the spine devices to each other can be found in the JNCIP-DC course, in the IP Fabric section. There might be something I am missing here, but if someone could shed some light on this topic, that would be great.
In an IP Clos architecture, Spine switches do not connect to other Spine switches, and Leaf switches do not connect directly to other Leaf switches. All links in a Leaf-Spine architecture are set up to forward with no looping. Leaf-Spine architectures are typically configured to implement Equal Cost Multipathing (ECMP), which lets each switch install all equal-cost routes so that it can forward through any Spine switch in the Layer 3 routing fabric.
ECMP provides numerous paths between any two points and allows traffic to flow across all available routes, offering improved redundancy while, like STP, still preventing loops.
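As a rough illustration of how ECMP spreads flows across the spines, here is a minimal Python sketch that hashes a flow's 5-tuple to choose one of several equal-cost next hops. The function name, the addresses, and the use of SHA-256 are assumptions for illustration only; real switches use their own hardware hash functions and field selections.

```python
import hashlib

def ecmp_next_hop(flow, next_hops):
    """Pick one equal-cost next hop by hashing the flow's 5-tuple.

    Packets of the same flow always hash to the same next hop, so a flow
    stays on one path while different flows spread across all paths.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

# Hypothetical leaf with two equal-cost uplinks.
spines = ["spine1", "spine2"]

# (source IP, destination IP, protocol, source port, destination port)
flow = ("10.0.1.10", "10.0.2.20", 6, 49152, 443)

print(ecmp_next_hop(flow, spines))
```

Because the choice is a pure function of the flow fields, no per-flow state is needed on the switch, and removing a failed spine from the `next_hops` list is enough to move its flows onto the remaining paths.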
My understanding is that connecting devices in the same layer, and then adding more techniques and rules to ensure loop prevention, just adds complexity.
Hope this helps 🙂
Thank you for your response. In the architecture that I have posted, the devices in the IP fabric (leaf and spine) are pure Layer 3 devices and are not running any type of xSTP protocol. Only Layer 3 routing protocols such as BGP, OSPF, and IS-IS run between the leaf and the spine.
Also, if we were to connect two leaf devices together (e.g., Leaf1 and Leaf2), we would form a closed loop in the topology, but since we are running only dynamic routing protocols between these devices and no form of spanning tree protocol, I don't think loops should be a concern here.
Check out the video below:
I created this video a couple of years ago around VXLAN and EVPN.
It's not for Juniper (I had never touched a Juniper device at that time), but the architecture part will be relevant for any vendor.
Luke, I had seen this video before you shared it, and I must say it's one of the best resources for getting familiar with VXLAN and EVPN concepts.
Thanks Biraj! It's really good to hear that.
Most enterprises that host data centers are looking to increase resiliency and also support new technologies such as VMware NSX that allow them to deploy applications, servers, and virtual networks within seconds. Layer 3 Fabrics allow them to support better uptime, performance, and newer cloud infrastructures such as VMware NSX. To maintain the large scale required to host thousands of servers, a multi-stage Clos architecture is required. Such an architecture allows the physical network to scale beyond the port density of a single switch. Layer 3 Fabrics use BGP as the control plane protocol to advertise prefixes, perform traffic engineering, and tag traffic. The most common designs in a multi-stage Clos architecture are 3-stage and 5-stage networks that use the spine-and-leaf topology.
Spine-and-leaf topology is an alternative to the traditional three-layer network architecture, which consists of an access layer, an aggregation layer, and a core. In the spine-and-leaf topology, all the leaf devices are connected to all the spine devices in a full mesh.
Typically, the spine devices are high-performance switches capable of Layer 3 switching and routing combined with high port density. Spine devices constitute the core and the leaf devices constitute the access layer in Layer 3 Fabrics. Leaf devices enable servers to connect to the Layer 3 Fabric. They also provide uplinks to spine devices.
Network Director currently supports only the 3-stage design. The 3-stage design has two roles—the spine and the leaf. It is called a 3-stage design because the traffic must traverse three switches in the worst-case scenario.
The maximum number of spine devices that you can have in your Layer 3 Fabric depends on the number of 40-Gigabit Ethernet interfaces in your leaf devices. A Layer 3 Fabric that has 8 QFX5100-24Q spine devices and 32 QFX5100-96S leaf devices (each leaf supports 96 10-Gigabit Ethernet ports) can provide 3072 usable 10-Gigabit Ethernet ports.
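The port arithmetic in the example above can be checked with a short Python sketch. The device counts and the 96 access ports per leaf come from the text; the assumption that each leaf has exactly one 40-Gigabit Ethernet uplink to each spine (so 8 uplinks per leaf) is mine, and the oversubscription figure is derived from that assumption rather than stated in the source.

```python
# Figures taken from the example in the text.
spine_count = 8
leaf_count = 32
access_ports_per_leaf = 96      # 10-Gigabit Ethernet ports per leaf
access_speed_gbps = 10

# Assumption: one 40-Gigabit Ethernet uplink from each leaf to each spine,
# so the number of uplinks per leaf caps the number of spines.
uplinks_per_leaf = spine_count
uplink_speed_gbps = 40

usable_access_ports = leaf_count * access_ports_per_leaf
downlink_gbps = access_ports_per_leaf * access_speed_gbps
uplink_gbps = uplinks_per_leaf * uplink_speed_gbps
oversubscription = downlink_gbps / uplink_gbps

print(usable_access_ports)   # 3072
print(oversubscription)      # 3.0, i.e. 3:1 under the uplink assumption
```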
There is no issue in connecting a spine to another spine or a leaf to another leaf as long as the IP Clos fabric is used only for routing and only the underlay network is at work.
However, most commonly the IP Clos routing is used as an underlay, with an overlay such as VXLAN/EVPN or another encapsulation on top. The overlay uses this fabric for Layer 2 communication between end devices in the same VLAN, or toward the gateway located on the spine or the leaf, depending on the architecture. In that case, same-layer links can cause unexpected traffic behavior, including loops. For the scenario you mentioned, the most common practice is to multihome the server to two or more leaf devices, which gives active-active redundancy in case one of the leaves fails.
Hope this helps.