Can I mix the 15.1X49-D180.2 and 15.1X49-D110.4 SRX versions to form a chassis cluster? Our spare runs 15.1X49-D180.2 and production runs 15.1X49-D110.4. One of the nodes has an issue, so we need to replace the faulty SRX with the spare. We can't find the 15.1X49-D110.4 version on the download site. We badly need some help.
Thanks in advance.
The versions in a cluster must match exactly for the cluster to become live and active.
You can create or join a cluster with different versions, but the mismatch has to be corrected before full cluster functionality and failover will work.
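As an illustration of the exact-match rule, here is a minimal Python sketch that compares the two version strings from this thread and reports which node is on the older release. The function names are hypothetical, the pattern only covers X-release strings shaped like `15.1X49-D180.2`, and the strings are assumed to have been read off each node beforehand (for example from `show version`).

```python
import re

# Assumption: only X-release strings such as "15.1X49-D180.2" are handled.
# Other Junos naming schemes (e.g. R or S releases) are out of scope here.
_X_RELEASE = re.compile(r"^(\d+)\.(\d+)X(\d+)-D(\d+)\.(\d+)$")


def parse_x_release(version: str) -> tuple:
    """Split an X-release string into integers: (major, minor, X build, D number, spin)."""
    match = _X_RELEASE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def versions_match(node0: str, node1: str) -> bool:
    """True only when both nodes report exactly the same release."""
    return parse_x_release(node0) == parse_x_release(node1)


def older_node(node0: str, node1: str):
    """Return 0 or 1 for the node on the older release, or None if they match."""
    v0, v1 = parse_x_release(node0), parse_x_release(node1)
    if v0 == v1:
        return None
    return 0 if v0 < v1 else 1


if __name__ == "__main__":
    production = "15.1X49-D110.4"  # node 0 in this illustration
    spare = "15.1X49-D180.2"       # node 1 in this illustration
    print(versions_match(production, spare))
    print(older_node(production, spare))
```

A mismatch result only says that one node needs to be upgraded or downgraded before the cluster is expected to work fully; it does not say which direction is appropriate for a given deployment.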
Thanks, Steve.

We were able to match the software versions, but forming the chassis cluster with the two SRX devices didn't go well. First, we reseated the control link SFP because the link did not start after the reboot. After the reseat, the FPC on node 1 went offline. We had to revert the control link connection; the node 1 FPC was able to go offline, but the policies on the DMZ didn't work. We could ping out to the internet, but we couldn't access the servers inside the DMZ. We had to reboot node 1 since it was the active SRX node.

What would be the best practice if we retry clustering the two nodes? Thank you so much for your time and answers.

Gerald
There are a lot of factors that go into the behavior of the cluster, so I can't be definitive for your situation. But here are a few principles to look at.

For the cluster to function properly, both the control link and the fabric link have to be up and active, if both are present on the SRX model in question. So check the cluster status and confirm both links are operating.

When a cluster is formed in the normal active/passive mode, only ONE of the physical SRX devices passes traffic, typically node 0. The other, node 1, is passive and waits to take over when a failover is detected. Again, the cluster status will show which node is active at any given time.

As a result, the physical cabling of both the upstream and downstream devices serviced by the cluster has to accommodate this node 0 to node 1 failover. A common issue with failover is physical connections that are not run in a way that lets the secondary node pass traffic between upstream and downstream.

When you say the FPC went offline above, I'm not sure whether you were experiencing a true FPC failure where the device goes offline, the FPC simply becoming inactive because it is on the backup node of a chassis cluster, or some kind of cluster error that forced the FPC offline. Logs might help clarify which scenario is at play and therefore what the remediation steps might be.
Thank you so much for the input, Steve. Last question: what will happen to the reth interfaces if the chassis cluster is in a split-brain scenario?
The communication between the two SRX devices is set up in a way that a split-brain active/active state would not occur. There is a possible state where both would be offline, with one node in hold and the other lost.

Redundancy groups should follow the active member, while the backup member or the member that lost communication would not be active.

A list of possible statuses and troubleshooting steps is in this series of documentation links, starting with the redundancy group failover article.

https://www.juniper.net/documentation/us/en/software/junos/chassis-cluster-security-devices/topics/task/troubleshoot-srx-chassis-cluster-redundancy-group-not-failing-over.html
Thank you Steve