After upgrading a pair of SRX320s to 15.1X49-D210, I cannot get the cluster to re-form.
The primary node comes up OK, but I cannot get the secondary online.
I've tried the following on the secondary:
set chassis cluster cluster-id 0 node 0 reboot
...
load factory-defaults
set chassis cluster cluster-id 1 node 1 reboot
But on the primary, the status goes through "lost -> hold -> secondary -> disabled".
On the secondary, the only hint is in the chassisd log file:
LCC: send: fpc 0 pic 0 online ack
LCC: pic attach pic 0, flags 0x0, portcount 58, fpc 0
LCC: pic_set_online: i2c 0x689 pic 0 fpc 0 state 3 in_issu 0
LCC: pic_type=1673 pic_slot=0 fpc_slot=0 pic_i2c_id=1673
LCC: hwdb: entry for pic 1673 at slot 0 in fpc 0 inserted
LCC: FPC 0 PIC 0, attaching clean
LCC: not in vc mode
LCC: Forwarding pic attach to FWDD fpc 0, pic 0
LCC: Got a pic attach ack from fwdd fpc 0pic 0
LCC: FWDD pic attach ack recd fpc 0, pic 0
LCC: pic_copy_port_info:Got SFP Rev= , Pno=NON-JNPR, Sno=PG54Q4Q
LCC: SIGWINCH handler
LCC: Node entering disabled state
CHASSISD_FRU_OFFLINE_NOTICE: Taking FPC 0 offline: Chassis cluster disable
LCC: fpc_down slot 0 reason Chassis cluster disable cargs 0xfa6120
LCC: fpc_srxsme_disconnect slot is 0
LCC: fpc_offline_now - slot 0, reason: Chassis cluster disable, error OK transition state 1
CHASSISD_SNMP_TRAP3: ENTITY trap generated: entStateOperDisabled (entPhysicalIndex 7, entStateAdmin 3, entStateAlarm 0)
LCC: fpc_offline_now - slot 0, is_resync_ready cleared
LCC: mic_get_mic_slot: clp1: fpc_slot=0, pic_slot=0, i2c=0x689
LCC: hwdb: entry for fpc 1929 at slot 0 deleted
CHASSISD_FRU_OFFLINE_NOTICE: Taking FPC 1 offline: Chassis cluster disable
LCC: fpc_down slot 1 reason Removal cargs 0x0
LCC: fpc_offline_now - slot 1, reason: Chassis cluster disable, error OK transition state 1
CHASSISD_SNMP_TRAP3: ENTITY trap generated: entStateOperDisabled (entPhysicalIndex 8, entStateAdmin 1, entStateAlarm 0)
LCC: fpc_srxsme_is_mpim_present: slot 1, FPC not present
LCC: fpc_srxsme_init: slot 1, FPC not detected
CHASSISD_FRU_OFFLINE_NOTICE: Taking FPC 2 offline: Chassis cluster disable
LCC: fpc_down slot 2 reason Removal cargs 0x0
LCC: fpc_offline_now - slot 2, reason: Chassis cluster disable, error OK transition state 1
CHASSISD_SNMP_TRAP3: ENTITY trap generated: entStateOperDisabled (entPhysicalIndex 9, entStateAdmin 1, entStateAlarm 0)
LCC: fpc_srxsme_is_mpim_present: slot 2, FPC not present
LCC: fpc_srxsme_init: slot 2, FPC not detected
...
LCC: Unable to read FPC 6 ID EEPROM
LCC: I2C read error for slot 6
...
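In case it helps anyone compare against their own logs, here is a minimal Python sketch for pulling the FPC offline notices and their stated reasons out of a chassisd log excerpt. It only assumes the CHASSISD_FRU_OFFLINE_NOTICE line format shown above; the function name offline_reasons and the sample text are hypothetical and just for illustration.

```python
import re

# Matches the offline notices exactly as they appear in the excerpt above.
OFFLINE_RE = re.compile(
    r"CHASSISD_FRU_OFFLINE_NOTICE: Taking FPC (\d+) offline: (.+)"
)


def offline_reasons(log_text):
    """Return {fpc_slot: reason} for every FRU offline notice in log_text."""
    reasons = {}
    for line in log_text.splitlines():
        match = OFFLINE_RE.search(line)
        if match:
            reasons[int(match.group(1))] = match.group(2).strip()
    return reasons


# Sample lines copied from the log excerpt in this post.
sample = """\
LCC: Node entering disabled state
CHASSISD_FRU_OFFLINE_NOTICE: Taking FPC 0 offline: Chassis cluster disable
LCC: fpc_down slot 0 reason Chassis cluster disable cargs 0xfa6120
CHASSISD_FRU_OFFLINE_NOTICE: Taking FPC 1 offline: Chassis cluster disable
"""

for slot, reason in sorted(offline_reasons(sample).items()):
    print(f"FPC {slot}: {reason}")
```

In my excerpt every notice gives the same reason, "Chassis cluster disable", so the FPCs appear to go offline because the node enters the disabled state rather than the other way around.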
There's also an error in jam_chassisd, but the directory it complains about (/usr/sbin/jam) is not on either SRX:
jam_dso_find_open.776:dir: /usr/sbin/jam
jam_dso_find_open.799:Failed to Open Dir /usr/sbin/jam
jam_get_db_attribute.1013:DB Get failed for chasd.lc.modelinfo.711-062269 with ret 3
jam_get_modelnumstr.1176:Got model num str for partno: 711-062269
jam_dso_find_open.776:dir: /usr/sbin/jam
jam_dso_find_open.799:Failed to Open Dir /usr/sbin/jam
jam_get_db_attribute.1013:DB Get failed for chasd.lc.modelinfo.711-062269 with ret 3
jam_get_modelnumstr.1176:Got model num str for partno: 711-062269
jam_get_db_attribute.1011 ERR:DB Get failed for chasd.lc.modelinfo. with error 3
jam_get_modelnumstr.1176:Got model num str for partno:
jam_dso_find_open.776:dir: /usr/sbin/jam
jam_dso_find_open.799:Failed to Open Dir /usr/sbin/jam
jam_get_db_attribute.1013:DB Get failed for chasd.lc.modelinfo.711-062269 with ret 3
jam_get_modelnumstr.1176:Got model num str for partno: 711-062269
So I'm a bit confused about what to do next. Is the unit actually faulty?