Hi, I'm hoping for some pointers and friendly advice on achieving a tricky solution.
I've read SRX Services Gateway Cluster Deployments Across Layer Two Networks and quite a bit of other stuff but I remain unsure.
We have an SRX650 cluster that works fine. The control and dual data fabric links are plumbed back-to-back – no Layer 2. The deployment model is currently a Z-mode design. This was the first SRX I implemented and it seemed like the best thing to do at the time. It’s been stable for a few years.
One design driver was the requirement for 12 or so reth interfaces with a physical child on each node. It made sense, as we had that many DMZs – each gets its own reth. A related decision made at the beginning, which may seem strange to those of you with more expertise, is that I put each of these reths in its own redundancy group. With hindsight I’m not sure that was the best idea. I’m very unsure here, as I can’t really think of a reason why it’s terrible practice other than that I have not seen it done elsewhere and it may complicate things – especially in light of what is coming up.
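For anyone following along, the per-reth redundancy-group arrangement looks roughly like this (a hypothetical sketch – interface names, group numbers, priorities, and addresses are illustrative, not my actual config; on an SRX650 the second node's interfaces appear as ge-9/0/x):

```
## Hypothetical sketch: one redundancy-group per reth
set chassis cluster redundancy-group 1 node 0 priority 200
set chassis cluster redundancy-group 1 node 1 priority 100
set interfaces ge-0/0/2 gigether-options redundant-parent reth1
set interfaces ge-9/0/2 gigether-options redundant-parent reth1
set interfaces reth1 redundant-ether-options redundancy-group 1
set interfaces reth1 unit 0 family inet address 192.0.2.1/24
```

Repeat the pattern for each DMZ, incrementing the reth and RG numbers in step.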
Now to the crux: there is a project to move one of the cluster nodes to a secondary DC while maintaining the chassis cluster. I plan to move away from Z-mode to something closer to Line mode while still using RGs. My reasoning is that most of the active DMZ hosts are at the primary site. The secondary site is gradually becoming ‘hot’ from a business-continuity standpoint, but it is not all the way there just yet.
In truth I could carry on with Z-mode, because I have secured three 1Gb CWDM links to carry each of fxp1, fab0 & fab1. The latency between sites is generally under 5 ms. So essentially I will have a 1Gb connection for each of fxp1, fab0, and fab1, just as I did when they were copper back-to-back.
A downside is that the network is looked after by a third party and it is Cisco kit. I’ll never get visibility, which will make issue resolution difficult. Nevertheless, I gave them my requirements:
- Jumbo frames
- Isolation for the HA links
- IGMP snooping disabled on the switched network (I presume only on the VLANs used for the HA)
- VLAN tagging should not be an issue as we run 12.1, and Juniper states “From 10.2R3 onwards, VLAN tagging is not enabled, by default, on the control port”
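For reference, the HA-link side of the SRX config shouldn't change much when the links ride over L2 – it's the same fabric definitions as back-to-back, with the transport requirements above pushed onto the carrier network (a sketch with illustrative port numbers; node1 ports on an SRX650 show up as ge-9/0/x):

```
## Fabric links - same config whether back-to-back or over an L2 carrier
set interfaces fab0 fabric-options member-interfaces ge-0/0/3
set interfaces fab1 fabric-options member-interfaces ge-9/0/3
## The carrier VLANs for fxp1/fab0/fab1 must pass jumbo frames and
## have IGMP snooping disabled, per the requirements listed above
```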
So my questions for the forum are:
1) Am I making a wise decision to at least do the initial implementation as Line mode but keep the RGs? (It is a vital piece of infrastructure and I need to go with the safest option first.)
2) Do I have too many RGs … should I be bundling groups of reths into a much smaller number of RGs?
3) I’ve found it difficult to get a conclusive answer to the following question: can I put more than two physical interfaces into a reth? E.g. could I put two physicals into rethX on the node remaining at the primary site, and have a third on the node to be moved? The reason for this is that the primary site has 2 inside, 2 DMZ, and 2 outside switches, as they were there to allow virtual-chassis dual homing. The secondary DC only has 1 switch at each of the inside, DMZ, and outside layers. See the attached diagram for detail on this. Since the node left at the primary site will be permanently active (until disaster) I’d like to dual-home it across the dual switches. Are reth interfaces capable of this configuration?
4) Directly related to 3) above, I’d like the failover order to go from the active interface at the primary site, then to the other at that same site (plumbed into the other switch), and only if both those failed would it use the reth member on the secondary node.
5) What would trigger a full failover to the secondary node?
Any help, advice with this greatly appreciated. I’ve been cursed by an SRX cluster which has worked so well over the last 3 years that my skills are not refined … I’ve not had to do any hard core fault finding.
The attached diagram gives some clue of the final state, at least as far as the cluster-over-L2 part goes.
Firstly congratulations! Having an SRX cluster work flawlessly for 3 years means you must have done something right back then ; )
Now then to your questions:
1. I personally believe that Z-mode actually provides better availability - especially in your case where each reth is a separate physical interface. If they were attached to individual DMZ switches and you lost one of those switches, I'd much prefer to fail over just the affected reth, than to suddenly swing existing active flows across as well.
2. There is no problem with having a redundancy-group per reth; in fact I find this the most flexible way of deployment, because it allows you to fail over a single DMZ at a time, rather than having to move every reth across at once (which may not be desirable in some cases). For those with mild OCD it also makes your config look neat when your RGs match your reths, but that might just be me ; )
3. Yes - you can add multiple physical interfaces to a reth. If you want to get really tricky, you can also enable LACP sub-LAGs, which allow the physical interfaces on each node of your reth to run LACP to their downstream switch to give you faster detection of individual link faults. This is not required for operation, but is not a bad way to go. And no, there is no problem with having mismatched member counts on your primary and secondary reths.
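To illustrate, an asymmetric reth might be sketched like this (hypothetical port numbers; node1 on an SRX650 shows up as ge-9/0/x):

```
## node0 (primary site) dual-homed with two children,
## node1 (secondary site) with a single child
set interfaces ge-0/0/4 gigether-options redundant-parent reth2
set interfaces ge-0/0/5 gigether-options redundant-parent reth2
set interfaces ge-9/0/4 gigether-options redundant-parent reth2
set interfaces reth2 redundant-ether-options redundancy-group 2
## Optional: LACP sub-LAG per node (only sensible if each node's
## children land on a single switch or a stacked pair)
set interfaces reth2 redundant-ether-options lacp active
```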
4. The way it will work is that all interfaces in the primary reth will be active simultaneously - outbound traffic will be hashed in some way, and inbound traffic will probably bias towards a particular link based on your switches' MAC learning. If either of the primary reth member interfaces fails, the other one will continue to operate and take the full load. To ensure this behaviour, set your interface-monitoring weight to 128 for each interface, so that no single failure causes the RG to fail over. If your reth members are going into multiple switches (that aren't clustered/stacked in some way) then forget about the LACP suggestion above.
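To make the weighting concrete, a sketch (same hypothetical ports as in my earlier example; an RG fails over only when the accumulated weights of failed monitored interfaces reach the threshold of 255):

```
## Each node0 child weighted 128: one link down = 128 (no failover),
## both down = 256 >= 255, so the RG fails over to node1
set chassis cluster redundancy-group 2 interface-monitor ge-0/0/4 weight 128
set chassis cluster redundancy-group 2 interface-monitor ge-0/0/5 weight 128
## node1 has only a single child, so losing it should fail the RG back
set chassis cluster redundancy-group 2 interface-monitor ge-9/0/4 weight 255
```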
5. If by full failover you mean the RE failing over - something like a software upgrade would do it, although forwarding would still happen at the primary site. If you mean just redundancy groups, then any of your interface monitors failing would cause individual RGs to fail over as per normal. If you changed over to using a single RG for all interfaces, then any interface monitor failing would cause all RGs to fail over, which is probably not desirable (at least to me).
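For testing failovers in a maintenance window, the standard Junos chassis-cluster operational commands apply:

```
## Check which node is primary for each RG
show chassis cluster status
## Manually fail RG 1 over to node 1, then clear the manual override
request chassis cluster failover redundancy-group 1 node 1
request chassis cluster failover reset redundancy-group 1
```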
Hope this helps
Hi Ben, firstly thanks very much for the reply and apologies for the delay in acknowledging it - I have been on leave.
Regarding an SRX cluster working flawlessly for 3 years - dumb luck / great design ... sometimes those two are synonymous.
To my questions / your replies:
1) On your say-so I'll keep the current design, although I will have all RGs active at the primary site, as I don't want too much data-plane traffic at least until we can get to a stage where we can capacity-manage it.
2) Nice to get the reassurance ... and I must have mild OCD; in fact my original design and implementation did not have a reth0, as RG 0 is system reserved. Reth1 is the first and is tied to RG 1, so all is right with the world.
3) Good to know. Unfortunately the switch pairs at the primary site are not clustered/stacked, so EtherChannel across them is not an option.
4) As mentioned above, no LACP is possible in this scenario. Due to available port capacity I'll not be able to dual-home every RG / reth / DMZ, but I will look to do so for the most mission-critical (inside, outside, and 2 or 3 of the DMZs) ... how do I configure the priority of these links so that I'm not dependent on hashing and switch MAC learning?
5) More reassurance for the current design with multiple RGs.
Your answers helped a lot. My only remaining question is the one in 4), i.e. how do I configure the priority of these links so that I'm not dependent on hashing and switch MAC learning?
Regarding 4) - unfortunately there is no way I know of to do this. All reth child-interfaces within a single node are treated equally.
Hashing handles fail-over automatically (the failed child interface will be physically down, so it is simply excluded). As for MAC learning, test it out, but I wouldn't be surprised if gratuitous ARP played a role here in steering traffic away from the switch with the downed interface.
I'll post any discoveries that I make.