Hi guys,
This is a follow-up to my earlier thread: https://forums.juniper.net/t5/SRX-Services-Gateway/Junos-upgrade-fails-on-SRX340-cluster-from-15-1X49-D170-4-to-17/td-p/467752
I was struggling to upgrade an SRX340 cluster to a newer Junos version, and finally, with the help of some gurus, I managed to upgrade both nodes to version 18.3R2.7.
Now, after the upgrade, I'm facing new issues. I can no longer SSH to the device on its only configured reth interface, although I can still log in on the console port with the same root password. Also, the HA status sometimes looks fine, but at other times the HA LED turns amber and the usual commands show the output below:
root@SPCFW-BRAVO> show chassis firmware
node0:
--------------------------------------------------------------------------
Part Type Version
FPC 0 O/S Version 18.3R2.7 by builder on 2019-05-03 09:17:52 UTC
FWDD O/S Version 18.3R2.7 by builder on 2019-05-03 09:17:52 UTC
node1:
--------------------------------------------------------------------------
Part Type Version
FPC 0 O/S Version 18.3R2.7 by builder on 2019-05-03 09:17:52 UTC
FWDD O/S Version 18.3R2.7 by builder on 2019-05-03 09:17:52 UTC
root@SPCFW-BRAVO> show chassis cluster information
node0:
--------------------------------------------------------------------------
Redundancy Group Information:
Redundancy Group 0 , Current State: primary, Weight: 255
Time From To Reason
Sep 11 20:57:13 hold secondary Hold timer expired
Sep 11 20:57:22 secondary primary Better priority (200/100)
Redundancy Group 1 , Current State: primary, Weight: 0
Time From To Reason
Sep 11 20:57:13 hold secondary Hold timer expired
Sep 11 20:57:24 secondary primary Remote yield (0/0)
Chassis cluster LED information:
Current LED color: Amber
Last LED change reason: Monitored objects are down
Control port tagging:
Disabled
Failure Information:
Coldsync Monitoring Failure Information:
Statistics:
Coldsync Total SPUs: 1
Coldsync completed SPUs: 0
Coldsync not complete SPUs: 1
Fabric-link Failure Information:
Fabric Interface: fab0
Child interface Physical / Monitored Status
ge-0/0/2 Up / Down
node1:
--------------------------------------------------------------------------
Redundancy Group Information:
Redundancy Group 0 , Current State: secondary, Weight: 0
Time From To Reason
Sep 11 20:57:21 hold secondary Hold timer expired
Redundancy Group 1 , Current State: secondary, Weight: -255
Time From To Reason
Sep 11 20:57:22 hold secondary Hold timer expired
Chassis cluster LED information:
Current LED color: Amber
Last LED change reason: Monitored objects are down
Control port tagging:
Disabled
Failure Information:
Coldsync Monitoring Failure Information:
Statistics:
Coldsync Total SPUs: 1
Coldsync completed SPUs: 0
Coldsync not complete SPUs: 1
Fabric-link Failure Information:
Fabric Interface: fab1
Child interface Physical / Monitored Status
ge-5/0/2 Up / Down
{secondary:node1}
root@SPCFW-BRAVO> show chassis cluster status
Monitor Failure codes:
CS Cold Sync monitoring FL Fabric Connection monitoring
GR GRES monitoring HW Hardware monitoring
IF Interface monitoring IP IP monitoring
LB Loopback monitoring MB Mbuf monitoring
NH Nexthop monitoring NP NPC monitoring
SP SPU monitoring SM Schedule monitoring
CF Config Sync monitoring RE Relinquish monitoring
Cluster ID: 1
Node Priority Status Preempt Manual Monitor-failures
Redundancy group: 0 , Failover count: 0
node0 200 primary no no None
node1 0 secondary no no FL
Redundancy group: 1 , Failover count: 0
node0 0 primary yes no CS
node1 0 secondary yes no CS FL
root@SPCFW-BRAVO> show chassis cluster interfaces
Control link status: Up
Control interfaces:
Index Interface Monitored-Status Internal-SA Security
0 fxp1 Up Disabled Disabled
Fabric link status: Down
Fabric interfaces:
Name Child-interface Status Security
(Physical/Monitored)
fab0 ge-0/0/2 Up / Down Disabled
fab0
fab1 ge-5/0/2 Up / Down Disabled
fab1
Redundant-ethernet Information:
Name Status Redundancy-group
reth0 Down Not configured
reth1 Up 1
reth2 Down Not configured
reth3 Down Not configured
reth4 Down Not configured
Redundant-pseudo-interface Information:
Name Status Redundancy-group
lo0 Up 0
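To make the Monitor-failures column in the `show chassis cluster status` output above easier to read, here is a rough Python sketch that expands the two-letter codes using the legend printed by the command itself. It is only an illustration, not a Juniper tool: the function name `monitor_failures`, the regular expressions, and the assumption that each row is whitespace-separated exactly as pasted here are all mine.

```python
import re

# Legend copied from the "Monitor Failure codes" block printed by the command.
FAILURE_CODES = {
    "CS": "Cold Sync monitoring",
    "FL": "Fabric Connection monitoring",
    "GR": "GRES monitoring",
    "HW": "Hardware monitoring",
    "IF": "Interface monitoring",
    "IP": "IP monitoring",
    "LB": "Loopback monitoring",
    "MB": "Mbuf monitoring",
    "NH": "Nexthop monitoring",
    "NP": "NPC monitoring",
    "SP": "SPU monitoring",
    "SM": "Schedule monitoring",
    "CF": "Config Sync monitoring",
    "RE": "Relinquish monitoring",
}

# Rows pasted from the status output in this post.
STATUS_SAMPLE = """\
Redundancy group: 0 , Failover count: 0
node0 200 primary no no None
node1 0 secondary no no FL
Redundancy group: 1 , Failover count: 0
node0 0 primary yes no CS
node1 0 secondary yes no CS FL
"""

# Assumed row layout: node, priority, status, preempt, manual, failure codes.
GROUP_RE = re.compile(r"^Redundancy group:\s*(\d+)")
NODE_RE = re.compile(r"^(node\d+)\s+(\d+)\s+(\S+)\s+(yes|no)\s+(yes|no)\s+(.*)$")


def monitor_failures(text):
    """Return {(group, node): [failure descriptions]} for rows that list failures."""
    result = {}
    group = None
    for raw in text.splitlines():
        line = raw.strip()
        match = GROUP_RE.match(line)
        if match:
            group = int(match.group(1))
            continue
        match = NODE_RE.match(line)
        if match and group is not None:
            codes = match.group(6).split()
            # "None" in the last column means no monitor failure on that node.
            if codes and codes != ["None"]:
                # Unknown codes are passed through unchanged.
                result[(group, match.group(1))] = [
                    FAILURE_CODES.get(code, code) for code in codes
                ]
    return result


if __name__ == "__main__":
    for (group, node), reasons in sorted(monitor_failures(STATUS_SAMPLE).items()):
        print(f"RG{group} {node}: {', '.join(reasons)}")
```

Run against the rows above, it reports a fabric connection failure on node1 in both redundancy groups and a cold sync failure on both nodes in redundancy group 1, which matches the "Fabric link status: Down" line.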
It seems that, for some reason I can't understand, the fabric link fab0 (ge-0/0/2) sometimes comes up and sometimes goes down.
What do you think? Should I reinstall the same Junos version, or go back to 15.1?
Any help would be much appreciated.
Thanks!