Hi, I'm hoping for some pointers and friendly advice on achieving a tricky solution.
I've read SRX Services Gateway Cluster Deployments Across Layer Two Networks and quite a bit of other stuff but I remain unsure.
Background
We have an SRX650 cluster that works fine. The control and dual data fabric links are plumbed back-to-back, with no Layer 2 in between. The deployment model is currently a Z-mode design. This was the first SRX I implemented and it seemed like the best thing to do at the time. It’s been stable for a few years.
One design driver was the requirement for 12 or so reth interfaces, each with a physical child on each node. It made sense as we had that many DMZs, and each gets its own reth. A related decision made at the beginning, which may seem strange to those of you who are more expert, is that I put each of these reths in its own redundancy-group. With hindsight I’m not sure that was the best idea. I’m very unsure here, as I can’t really think of a reason why it’s terrible practice other than that I have not seen it done elsewhere and it may complicate things, especially in light of what is coming up.
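To illustrate the one-reth-per-RG layout described above, here is a rough sketch for the first two DMZs only. It is not pasted from our live config: the interface names, RG numbers and priorities are all placeholders chosen for illustration.

```
# Sketch only: interface names, RG numbers and priorities are placeholders
set chassis cluster reth-count 12
set chassis cluster redundancy-group 1 node 0 priority 200
set chassis cluster redundancy-group 1 node 1 priority 100
set chassis cluster redundancy-group 2 node 0 priority 200
set chassis cluster redundancy-group 2 node 1 priority 100
# DMZ 1: one physical child per node, reth1 tied to RG 1
set interfaces ge-0/0/4 gigether-options redundant-parent reth1
set interfaces ge-9/0/4 gigether-options redundant-parent reth1
set interfaces reth1 redundant-ether-options redundancy-group 1
# DMZ 2: same pattern, reth2 tied to RG 2
set interfaces ge-0/0/5 gigether-options redundant-parent reth2
set interfaces ge-9/0/5 gigether-options redundant-parent reth2
set interfaces reth2 redundant-ether-options redundancy-group 2
```

The same pattern repeats for the remaining reths, which is why the RG count ends up matching the DMZ count.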
Now to the crux: there is a project to move one of the cluster nodes to a secondary DC while keeping the chassis cluster intact. I plan to move away from Z-mode to something closer to Line-mode while still using RGs. My reasoning for this is that most of the active DMZ hosts are at the primary site. The secondary site is gradually becoming ‘hot’ from a business continuity standpoint, but it is not all the way there just yet.
In truth I could carry on with Z-mode because I have secured three 1Gb CWDM links, one each for fxp1, fab0 and fab1. The latency between sites is generally <5ms. So essentially I will have a 1Gb connection for each of fxp1, fab0, and fab1, just as I did when they were copper back-to-back.
A downside is that the network is looked after by a third party and it is Cisco. I’ll never get visibility, which will make issue resolution difficult. Nevertheless I gave them my requirements:
- jumbo frames,
- isolation for the HA links,
- Disable IGMP snooping on the switched network (I presume only on the VLANs used for the HA)
- VLAN tagging should not be an issue as we run 12.1 and Juniper states “From 10.2R3 onwards, VLAN tagging is not enabled, by default, on the control port”
So my questions for the forum are:
1) Am I making a wise decision to at least do the initial implementation in Line mode but keep the RGs? (It is an extremely vital piece of infrastructure and I need to go with the safest option first.)
2) Do I have too many RGs … should I be bundling groups of reths into a much smaller number of RGs?
3) I’ve found it difficult to get a conclusive answer to the following question: can I put more than two physical interfaces into a reth? For example, could I put 2 physicals into rethX on the node remaining at the primary site, and have a third on the node to be moved? The reason is that the primary site has 2 inside, 2 DMZ, and 2 outside switches, as they were intended to allow virtual-chassis dual homing. The secondary DC only has 1 switch at each of the inside, DMZ, and outside layers. See the attached diagram for detail on this. Since the node left at the primary site will be permanently active (until disaster), I’d like to dual-home it across the dual switches. Are reth interfaces capable of this config?
4) Directly related to 3) above, I’d like the failover order to go from the active interface at the primary site, then to the other one at that same site (plumbed into the other switch), and then, only if both of those failed, to the reth member on the secondary node.
5) What would trigger a full failover to the secondary node?
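On question 5, my current understanding (please correct me if I have this wrong) is that interface monitoring is the usual trigger for a redundancy group: each monitored interface carries a weight, and the group fails over to the other node once the weights of its failed interfaces add up to 255. A sketch of what I mean, with placeholder interface names and weights rather than our real ones:

```
# Placeholder interface names and weights, not our live config
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/4 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-9/0/4 weight 255
```

What I am less clear on is how this interacts with having a dozen RGs, and what takes RG0 (and so the whole node) over rather than just an individual reth.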
Any help or advice with this would be greatly appreciated. I’ve been cursed by an SRX cluster which has worked so well over the last 3 years that my skills are not refined … I’ve not had to do any hard-core fault finding.
The attached diagram gives some idea of the final state, at least as far as the cluster over L2 goes.
Many thanks,
anonomike
#Layer2#SRX650#cluster