We're looking for a simple way to build a data center interconnect. We are considering four EX4500s tied together as a single virtual chassis stretched over the two locations (two devices per site). According to the best-practices guide for virtual chassis, the distance between the first and the second site shouldn't be a problem; what I'm interested in is how routing behaves in a virtual chassis.
In the case described, will routing stay local at each site? Or would L3 traffic always be passed to the routing engine, meaning that, in the worst case, all traffic from one VLAN to another at site one would be passed to the routing engine at site two?
I would be glad for any advice.
Thanks a lot in advance,
Routing will be performed locally at each site. While the routing engine holds the routing table, it pushes the computed paths down to the forwarding table in each member switch, allowing traffic to be routed at the member where it ingresses the Virtual Chassis.
I have done stretched VC deployments for a number of customers and here is a list of pointers:
- one box to manage, so all VLANs, IPs, routing etc. is configured ONCE
- very fast fail-over if/when the chassis splits in two (DCs become isolated)
- no spanning-tree blocking or VRRP required if you want to stretch VLANs between DCs (the L3 interface is the "same" at both sites, but ARP response/routing is done locally)
- Recovery from a chassis split is slow (around 45 seconds for L3 and L2 to reconverge)
- Any failure that causes an RE switch-over or a VC topology change, and any software upgrade, now affects both your DCs at once. This is getting better with ISSU, but it is something to be aware of.
Needless to say, make sure your VCs are in a RING with a diverse fibre path and you'll be right.
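For what it's worth, a stretched VC like this is usually deployed preprovisioned, so member IDs and RE roles are pinned and survive a split. A rough sketch of what that looks like on EX4500 (the serial numbers and the uplink pic-slot/port are placeholders; check them against your hardware and the EX4500 VC documentation):

```
## Operational mode, on each member before cabling:
## convert an uplink port into a Virtual Chassis port
request virtual-chassis vc-port set pic-slot 1 port 0

## Configuration mode, on the master: preprovision all four members,
## one routing-engine per site plus two line-cards
set virtual-chassis preprovisioned
set virtual-chassis member 0 role routing-engine
set virtual-chassis member 0 serial-number AA0000000000
set virtual-chassis member 1 role line-card
set virtual-chassis member 1 serial-number BB0000000000
set virtual-chassis member 2 role routing-engine
set virtual-chassis member 2 serial-number CC0000000000
set virtual-chassis member 3 role line-card
set virtual-chassis member 3 serial-number DD0000000000
```

With preprovisioning, a replaced or re-attached switch can't accidentally claim an unexpected member ID or role after a split/merge.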
Thanks a lot for your answer! As we have planned two fibre paths between the two DCs, a split situation should hopefully not occur. But in the case of a total cut-off, I think the recovery time of the VC would be tolerable, because repairing a physically disrupted fibre would take much longer anyway.
To expand a bit more on the recovery though, one of the 2-member virtual chassis that gets formed will essentially restart (not reboot) when connectivity is restored, so traffic *within* one of your DCs will fail while they merge back together (eg: forwarding table will be flushed and re-built).
From memory, the VC master that has been up the longest wins, so the side that used to be primary for all four members should maintain control and continue forwarding during this process.
With a diverse fibre ring, as the two physical paths are rarely the same length, in your experience what is the maximum distance delta before problems start occurring (if there are any issues)?
To put this in context, the majority of my customers have ~2km difference between primary and secondary path, but for one customer I had to provide a 4km primary and 27km secondary.
I haven't ever come across any diverse-path-length issues with either Virtual Chassis or Aggregated Ethernet over varying fibre lengths (I have one site with 27km and 62km respectively). In the case of the Virtual Chassis, each path is considered individually, depending on the source and destination interface (more specifically, the PFE) that the traffic flow is being sent between.
Some ECMP does occur if both PFEs are the same "distance" apart, but I suspect this would be flow-based rather than packet-by-packet, so again inter-arrival time wouldn't be relevant, just as with LACP/Aggregated Ethernet.
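To illustrate what "flow-based" means here (a generic sketch of hash-based path selection, not Juniper's actual VCCP hashing, which I can't vouch for): the switch hashes header fields of each packet, so every packet of a given flow always takes the same path, and unequal path lengths can't reorder packets within a flow.

```python
import hashlib

def pick_path(src_ip, dst_ip, src_port, dst_port, proto, n_paths):
    """Flow-based path selection: hash the 5-tuple so every packet of a
    flow maps to the same path. Inter-arrival time plays no role in the
    choice, only the header fields do."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

# Every packet of this flow lands on the same path, however it is paced:
first = pick_path("10.0.1.5", "10.0.2.9", 40000, 443, "tcp", 2)
assert all(pick_path("10.0.1.5", "10.0.2.9", 40000, 443, "tcp", 2) == first
           for _ in range(100))
```

Different flows spread across the paths statistically, but a single flow is pinned to one path, which is also why a single flow can never use more than one path's worth of bandwidth.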
Back-of-a-napkin maths though (VERY rough):
Speed of light in fibre: ~200,000km/sec
Distance of fibre: 27km
Time to travel end-to-end: 135µsec (0.135ms).
Amount of data in flight at 10Gbps over 135µsec: ~1.35Mbit (~169KB).
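The same napkin maths as a tiny script, assuming light propagates in fibre at roughly two-thirds of c (~200,000 km/s):

```python
# Back-of-napkin: how much data is "in flight" on a long fibre run.
FIBRE_SPEED_KM_S = 200_000   # km/s, typical for single-mode fibre (~2/3 c)
DISTANCE_KM = 27             # longest path in the ring
LINK_BPS = 10e9              # 10 Gbps VCP link

one_way_s = DISTANCE_KM / FIBRE_SPEED_KM_S   # one-way propagation delay
bits_in_flight = LINK_BPS * one_way_s        # bits "on the wire" one-way

print(f"one-way delay : {one_way_s * 1e6:.0f} us")        # 135 us
print(f"bits in flight: {bits_in_flight / 1e6:.2f} Mbit") # 1.35 Mbit
```

So even on the 27km leg, the extra in-flight data is on the order of a megabit, which is negligible next to the buffering a switch does anyway.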
Thanks for your detailed answers. Is there any documentation on your comment that "Some ECMP does occur though if both PFEs are the same 'distance' apart, but I suspect that this would be flow-based rather than packet-by-packet"?
I'm asking as we're thinking about exactly such a scenario (using an EX VC distributed over two data centers), but we're more concerned with total available bandwidth for a single flow: We're planning on making redundant links of equal length by extending the shorter fiber, so transmit times would be symmetric.
I did a bit of hunting around but the best I could find was the Junos Enterprise Switching book which has an excellent description of VCCP (Virtual Chassis Control Protocol) - Google Books has the appropriate pages available for viewing if you don't have a copy.
On Page 209 it says "Currently Load-Balancing is not supported" referring to multiple paths between VC Member PFEs, however this book was published in 2009 so things may have changed.
There is some detail here around VCP LAGs that are formed if you increase the number of VCP interfaces, but this is not really the same thing as it would only "multi-path" between directly adjacent members:
Thanks for the hints - yes, I have the book here.