Hi all,
I have a cluster problem and no idea what is causing it.
After some years of running I had to stop one firewall node (SRX550); this was node0. After the reboot its interfaces were down (both in fpc0 and in fpc3), so I took it offline until the hardware could be replaced.
Later I tried starting the node without any cables attached and the interfaces came up normally, so I tried to put it back into the cluster.
When it started it immediately became the active node on RG0, but the reth interfaces remained down (with all the ge interfaces up), so I turned it off again. No preemption is configured, so the interfaces remained active on the other node, node1.
After that I discovered that node1 shows an error (GR) on RG0. This is probably why node0 took mastership when I plugged it back in.
Now node0 is turned off, I have this GR (GRES monitoring) error and the firewall is working.
I would like to bring node0 back into service, but first I want to clear this GR error.
When I check "show chassis cluster information detail" I can see gres-not-ready ...
{primary:node1}
user@firewall-node1> show chassis cluster status
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring      RE  Relinquish monitoring

Cluster ID: 1
Node   Priority  Status    Preempt  Manual  Monitor-failures

Redundancy group: 0 , Failover count: 0
node0  0         lost      n/a      n/a     n/a
node1  255       primary   no       yes     GR

Redundancy group: 1 , Failover count: 0
node0  0         lost      n/a      n/a     n/a
node1  0         primary   no       no      CS

{primary:node1}
user@firewall-node1> show chassis cluster information detail
node1:
--------------------------------------------------------------------------
Redundancy mode:
    Configured mode: active-active
    Operational mode: active-active
Cluster configuration:
    Heartbeat interval: 1000 ms
    Heartbeat threshold: 3
    Control link recovery: Disabled
    Fabric link down timeout: 66 sec
Node health information:
    Local node health: Not healthy
    Remote node health: Healthy

Redundancy group: 0, Threshold: 255, Monitoring failures: gres-not-ready
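
In case it is useful for anyone checking the same thing from a script, here is a minimal Python sketch that reads the node rows of the "show chassis cluster status" output above and expands the monitor-failure codes using the legend the command prints. The function name parse_cluster_status is my own, and I am assuming that multiple failure codes, if present, are separated by whitespace or commas; neither comes from Junos documentation.

```python
import re

# Monitor failure codes, copied from the legend in the status output above.
FAILURE_CODES = {
    "CS": "Cold Sync monitoring",
    "FL": "Fabric Connection monitoring",
    "GR": "GRES monitoring",
    "HW": "Hardware monitoring",
    "IF": "Interface monitoring",
    "IP": "IP monitoring",
    "LB": "Loopback monitoring",
    "MB": "Mbuf monitoring",
    "NH": "Nexthop monitoring",
    "NP": "NPC monitoring",
    "SP": "SPU monitoring",
    "SM": "Schedule monitoring",
    "CF": "Config Sync monitoring",
    "RE": "Relinquish monitoring",
}


def parse_cluster_status(text):
    """Return one dict per node row (hypothetical helper, not a Junos API)."""
    rows = []
    group = None
    for line in text.splitlines():
        match = re.match(r"\s*Redundancy group:\s*(\d+)", line)
        if match:
            group = int(match.group(1))
            continue
        fields = line.split()
        # Node rows look like: node1 255 primary no yes GR
        is_node_row = (
            group is not None
            and len(fields) >= 6
            and re.fullmatch(r"node\d+", fields[0]) is not None
            and fields[1].isdigit()
        )
        if not is_node_row:
            continue
        # Assumption: extra codes are separated by whitespace and/or commas.
        codes = [c.strip(",") for c in fields[5:]]
        rows.append({
            "group": group,
            "node": fields[0],
            "priority": int(fields[1]),
            "status": fields[2],
            "failures": [c for c in codes if c in FAILURE_CODES],
        })
    return rows


# Node rows taken from the output pasted above.
SAMPLE = """\
Redundancy group: 0 , Failover count: 0
node0 0 lost n/a n/a n/a
node1 255 primary no yes GR
Redundancy group: 1 , Failover count: 0
node0 0 lost n/a n/a n/a
node1 0 primary no no CS
"""

for row in parse_cluster_status(SAMPLE):
    for code in row["failures"]:
        print(f"RG{row['group']} {row['node']} ({row['status']}): "
              f"{code} = {FAILURE_CODES[code]}")
```

On the rows above this reports GR (GRES monitoring) for node1 on RG0 and CS (Cold Sync monitoring) for node1 on RG1, which matches what I read from the table by hand.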
Please help me clear this GR error.
Thanks,
Balázs