SRX 1500 cluster issue: received probe count is always zero on the primary node, and fabric monitoring shows fab0 as down although it is physically up


Maroua Saidani
Posted 01-16-2022 15:10

Dear Juniper Community,
I'm facing a really strange issue with an SRX 1500 cluster. The nodes are directly connected, with no switch in between, and it seems that the two nodes have lost communication with each other. The primary node does not seem able to become the master, and when the secondary node becomes the master, it is not able to route anything. All services, such as BGP and IPsec VPN, are currently down. Even when I force the primary node to take over mastership by executing "request chassis cluster failover redundancy-group 0 node 0 force", still nothing works and no traffic is processed.
After checking, I noticed that the received probe count on the fabric link is always 0, no matter what troubleshooting I do, and "show chassis cluster interfaces" always shows fab0 as down although it is physically up. I even used the hidden command "set chassis cluster no-fabric-monitoring" and rebooted node0, but the result is exactly the same.

Software version: JUNOS Software Release [15.1X49-D180.2]

Below is the cluster configuration (I deactivated the interface monitors to troubleshoot further):

@sn-dx-node0> show configuration chassis cluster
no-fabric-monitoring;
reth-count 128;
redundancy-group 0 {
    node 0 priority 100;
    node 1 priority 1;
}
redundancy-group 1 {
    node 0 priority 100;
    node 1 priority 1;
    inactive: interface-monitor {
        xe-0/0/16 weight 255;
        xe-7/0/16 weight 255;
    }
}
redundancy-group 2 {
    node 0 priority 100;
    node 1 priority 1;
    inactive: interface-monitor {
        ge-0/0/12 weight 255;
        ge-7/0/12 weight 255;
    }
}

Below are most of the "show chassis cluster" outputs, taken after I manually forced a failover to node 0:

@sn-dx-node0> show chassis cluster status
Cluster ID: 1
Node    Priority  Status     Preempt  Manual  Monitor-failures

Redundancy group: 0 , Failover count: 1
node0   255       primary    no       yes     None
node1   1         secondary  no       yes     None

Redundancy group: 1 , Failover count: 1
node0   255       primary    no       yes     HW
node1   1         secondary  no       yes     None

Redundancy group: 2 , Failover count: 1
node0   255       primary    no       yes     HW
node1   1         secondary  no       yes     None

@sn-dx-node0> show chassis cluster statistics
Control link statistics:
    Control link 0:
        Heartbeat packets sent: 8688
        Heartbeat packets received: 8695
        Heartbeat packet errors: 0
Fabric link statistics:
    Child link 0
        Probes sent: 19323
        Probes received: 0
    Child link 1
        Probes sent: 0
        Probes received: 0
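
To make this symptom easier to compare between node0 and node1, the fabric counters can be summarized with a short script. This is only a rough sketch: the function names `parse_fabric_probes` and `one_way_links` are made up for illustration, and the parsing assumes the text layout of the output pasted in this post, which may differ on other Junos releases.

```python
import re

# Sample text copied from the node0 output in this post.
STATS = """\
Fabric link statistics:
Child link 0
Probes sent: 19323
Probes received: 0
Child link 1
Probes sent: 0
Probes received: 0
"""


def parse_fabric_probes(text):
    """Return {child_link_index: {"sent": int, "received": int}}."""
    links = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Child link (\d+)", line)
        if m:
            current = int(m.group(1))
            links[current] = {}
            continue
        m = re.match(r"Probes (sent|received):\s*(\d+)", line)
        if m and current is not None:
            links[current][m.group(1)] = int(m.group(2))
    return links


def one_way_links(links):
    """Child links that send probes but have never received one."""
    return [
        index
        for index, counts in sorted(links.items())
        if counts.get("sent", 0) > 0 and counts.get("received", 0) == 0
    ]


if __name__ == "__main__":
    links = parse_fabric_probes(STATS)
    print("one-way child links:", one_way_links(links))
```

Running the same check against the output collected on node1 would show whether the probes are lost in one direction only or in both.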

@sn-dx-node0> show chassis cluster interfaces
Control link status: Up

Control interfaces:
    Index   Interface   Monitored-Status   Internal-SA   Security
    0       em0         Up                 Disabled      Disabled

Fabric link status: Down

Fabric interfaces:
    Name    Child-interface    Status                 Security
                               (Physical/Monitored)
    fab0    ge-0/0/11          Up / Down              Disabled
    fab0
    fab1    ge-7/0/11          Up / Up                Disabled
    fab1
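
The mismatch between the physical and monitored status can also be picked out of this output programmatically. The following is a minimal sketch under the same assumptions as before: the helper names `fabric_rows` and `mismatched` are hypothetical, and the regular expression only targets the row layout shown in this post.

```python
import re

# Fabric interface rows copied from the node0 output in this post.
INTERFACES = """\
fab0 ge-0/0/11 Up / Down Disabled
fab0
fab1 ge-7/0/11 Up / Up Disabled
fab1
"""

# name, child interface, physical status, monitored status, security
ROW = re.compile(r"^(fab\d+)\s+(\S+)\s+(Up|Down)\s*/\s*(Up|Down)\s+(\S+)$")


def fabric_rows(text):
    """Return one dict per fabric child interface row found in the text."""
    rows = []
    for raw in text.splitlines():
        m = ROW.match(raw.strip())
        if m:
            name, child, physical, monitored, security = m.groups()
            rows.append(
                {
                    "name": name,
                    "child": child,
                    "physical": physical,
                    "monitored": monitored,
                    "security": security,
                }
            )
    return rows


def mismatched(rows):
    """Child interfaces that are physically Up but monitored as Down."""
    return [
        row["child"]
        for row in rows
        if row["physical"] == "Up" and row["monitored"] == "Down"
    ]


if __name__ == "__main__":
    print("physically up but monitored down:", mismatched(fabric_rows(INTERFACES)))
```

Lines that carry only the fabric name (the bare "fab0" and "fab1" rows) do not match the pattern and are skipped.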

Below is the configuration of fab0:

@sn-dx-node0> show configuration interfaces fab0
fabric-options {
    member-interfaces {
        ge-0/0/11;
    }
}
And here is the "show interfaces ge-0/0/11" output:

msaidani@sn-dx-node0> show interfaces ge-0/0/11
Physical interface: ge-0/0/11, Enabled, Physical link is Up
  Interface index: 323, SNMP ifIndex: 523
  Link-level type: 64, MTU: 9014, LAN-PHY mode, Link-mode: Full-duplex,
  Speed: 1000mbps, BPDU Error: None, MAC-REWRITE Error: None,
  Loopback: Disabled, Source filtering: Disabled, Flow control: Enabled,
  Auto-negotiation: Enabled, Remote fault: Online
  Device flags   : Present Running
  Interface flags: SNMP-Traps Internal: 0x4000
  Link flags     : None
  CoS queues     : 8 supported, 8 maximum usable queues
  Current address: c0:bf:a7:a5:30:30, Hardware address: c0:bf:a7:a5:2f:0b
  Last flapped   : 2022-01-15 20:06:49 UTC (02:27:45 ago)
  Input rate     : 0 bps (0 pps)
  Output rate    : 2264 bps (1 pps)
  Active alarms  : None
  Active defects : None
  Interface transmit statistics: Disabled

  Logical interface ge-0/0/11.0 (Index 77) (SNMP ifIndex 572)
    Flags: Up SNMP-Traps 0x4000 Encapsulation: ENET2
    Input packets : 123
    Output packets: 21292
    Security: Zone: Null
    Protocol aenet, AE bundle: fab0.0 Link Index: 0
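
The input and output counters of ge-0/0/11 show the same one-way pattern as the fabric probes. As a rough sketch (the function name `counters` is invented here, and the patterns assume the "show interfaces" layout pasted in this post), the relevant numbers can be pulled out like this:

```python
import re

# Rate and packet counter lines copied from the ge-0/0/11 output in this post.
IFACE = """\
Input rate : 0 bps (0 pps)
Output rate : 2264 bps (1 pps)
Input packets : 123
Output packets: 21292
"""


def counters(text):
    """Extract input/output bit rates and packet counts from the text."""
    found = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"(Input|Output) rate\s*:\s*(\d+) bps", line)
        if m:
            found[m.group(1).lower() + "_bps"] = int(m.group(2))
            continue
        m = re.match(r"(Input|Output) packets\s*:\s*(\d+)", line)
        if m:
            found[m.group(1).lower() + "_packets"] = int(m.group(2))
    return found


if __name__ == "__main__":
    c = counters(IFACE)
    print(c)
    if c.get("output_packets", 0) > 0 and c.get("input_bps", 0) == 0:
        print("interface is transmitting but currently receiving nothing")
```

Collecting the same counters for ge-7/0/11 on node1 would make it possible to compare both ends of the fabric link side by side.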

Troubleshooting actions taken so far:

Rebooting both devices several times.

Rebooting a single device.

Running "set chassis cluster no-fabric-monitoring", then rebooting node 0.

Running "request chassis cluster failover redundancy-group 0 node 0 force".

Logging on to the secondary node and running "request chassis cluster configuration-synchronize".

Moving the physical cable to a different port and moving the configuration accordingly.

Replacing the physical cable with a new one.

Please help me, as I'm really running out of options :(
Thank you in advance.

------------------------------
Maroua Saidani
------------------------------
