MX240 MCLAG-ICCP Link BFD Flapping

1. MX240 MCLAG-ICCP Link BFD Flapping

Erdem
Posted 12-27-2016 05:59

Hi There,

I have two Juniper MX240 routers running MC-LAG and ICCP with ICL interfaces. We have IRBs configured and connected to the LAN side of the network. Both routers are using very little CPU. For ICCP liveness detection we have the following configuration. Both routers have the same configuration except for the peer IPs.

show protocols iccp

local-ip-addr 10.10.10.1;
peer 10.10.10.2 {
    redundancy-group-id-list 1;
    backup-liveness-detection {
        backup-peer-ip 172.16.255.201;
    }
    liveness-detection {
        minimum-receive-interval 1000;
        multiplier 1;
        transmit-interval {
            minimum-interval 1000;
        }
        detection-time {
            threshold 2000;
        }
    }
}

Router1

show bfd session

                                                  Detect   Transmit
Address                  State     Interface      Time     Interval  Multiplier
10.10.10.1               Up                       1.000     0.900        1

1 sessions, 1 clients
Cumulative transmit rate 1.1 pps, cumulative receive rate 1.0 pps

 Client ICCP realm 10.10.10.2, TX interval 1.000, RX interval 1.000
 Session up time 03:59:30, previous down time 00:00:02
 Local diagnostic None, remote diagnostic None
 Remote state Up, version 1
 Session type: Multi hop BFD
 Min async interval 1.000, min slow interval 1.000
 Adaptive async TX interval 1.000, RX interval 1.000
 Local min TX interval 1.000, minimum RX interval 1.000, multiplier 1
 Remote min TX interval 1.000, min RX interval 1.000, multiplier 1
 Threshold for detection time 2.000
 Local discriminator 18, remote discriminator 18
 Echo mode disabled/inactive
 Multi-hop route table 0, local-address 10.10.10.1
 Session ID: 0x0

1 sessions, 1 clients
Cumulative transmit rate 1.1 pps, cumulative receive rate 1.0 pps

Router 2

show bfd session

                                                  Detect   Transmit
Address                  State     Interface      Time     Interval  Multiplier
10.10.10.2               Up                       1.000     0.900        1

 Client ICCP realm 10.10.10.1, TX interval 1.000, RX interval 1.000
 Session up time 04:01:29, previous down time 00:00:01
 Local diagnostic None, remote diagnostic None
 Remote state Up, version 1
 Session type: Multi hop BFD
 Min async interval 1.000, min slow interval 1.000
 Adaptive async TX interval 1.000, RX interval 1.000
 Local min TX interval 1.000, minimum RX interval 1.000, multiplier 1
 Remote min TX interval 1.000, min RX interval 1.000, multiplier 1
 Threshold for detection time 2.000
 Local discriminator 18, remote discriminator 18
 Echo mode disabled/inactive
 Multi-hop route table 0, local-address 10.10.10.2
 Session ID: 0x0

1 sessions, 1 clients
Cumulative transmit rate 1.1 pps, cumulative receive rate 1.0 pps

The BFD session above flaps frequently, with the following errors.

Router 1: same time

BFD Session 10.10.10.2 (IFL 0) state Up -> Down LD/RD(18/18) Up time:1w4d 08:08 Local diag: NbrSignal Remote diag: CtlExpire Reason: Received DOWN from PEER.

Dec 27 08:18:40 edge1.da1 bfdd[3903]: BFDD_TRAP_MHOP_STATE_DOWN: local discriminator: 18, new state: down, peer addr: 10.10.10.2

Dec 27 08:18:40 edge1.da1 lacpd[24612]: mcae_icl_event_handle_icl_up_iccp_down_from_iccp_up: for mcae ae7 preferred_active is TRUE

Dec 27 08:18:40 edge1.da1 mib2d[3892]: SNMP_TRAP_LINK_DOWN: ifIndex 563, ifAdminStatus up(1), ifOperStatus down(2), ifName ae6.0

Dec 27 08:18:41 edge1.da1 rpd[3949]: RPD_OSPF_NBRDOWN: OSPF neighbor 216.52.72.9 (realm ospf-v2 irb.3312 area 0.0.0.0) state changed from Full to Init due to 1WayRcvd (event reason: neighbor is in one-way mode)

Router 2: same time

BFD Session 10.10.10.1 (IFL 0) state Up -> Down LD/RD(18/18) Up time:1w4d 08:08 Local diag: CtlExpire Remote diag: None Reason: Detect Timer Expiry.

Any idea how to tune the BFD settings to make the BFD session stable?

#mclag
#BFD
#iccp
2. RE: MX240 MCLAG-ICCP Link BFD Flapping

sarathirao
Posted 12-27-2016 08:45

Hi,

Please note that the BFD session for ICCP is a multihop BFD session, which works in centralized mode. This means that the keepalive exchange is handled by the Routing Engine. Very aggressive BFD timers are not recommended when BFD is operating in centralized mode. For more details, please refer to these links:

Multichassis Link Aggregation Guidelines

multichassis link aggregation with ICCP

troubleshooting checklist - bfd

Hope this helps.

Thanks
3. RE: MX240 MCLAG-ICCP Link BFD Flapping

Erdem
Posted 12-27-2016 09:02

Even though it is multihop and handled in the RE, the peer is just one next hop away from each router. Do you have any idea what the recommended values are for multihop mode handled through the RE? Also, all of those documents refer to QFX platforms, not MX.
4. RE: MX240 MCLAG-ICCP Link BFD Flapping

sarathirao
Posted 12-27-2016 22:11

The documents I shared also apply to MX. Because BFD for ICCP is a multihop BFD session handled by the Routing Engine, the timers should not be very aggressive. The timers you have used should be fine; the only thing I see here is the multiplier, which is set to 1. This number specifies the maximum allowable number of liveness detection requests missed by the peer. The default value is 3, but you have set it to 1, which means a single lost packet is enough to bring the session down. If possible, can you set this value back to the default and check?
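
As a rough sketch of the change suggested above, the fragment below sets only the multiplier back to 3 and leaves the other statements from the original post untouched. With the posted 1000 ms intervals, the detection time should then be about 3 x 1000 ms = 3000 ms rather than 1000 ms, assuming the usual BFD rule that detection time is the negotiated interval times the multiplier. The "detection-time threshold 2000" statement from the original post is not discussed in the thread and may need reviewing alongside this change. This is an illustration, not a tested configuration.

```
protocols {
    iccp {
        peer 10.10.10.2 {
            liveness-detection {
                /* was 1 in the original post; 3 is the default named above */
                multiplier 3;
            }
        }
    }
}
```

After committing, the Multiplier column and the "multiplier" fields in the show bfd session output should read 3 on both routers.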

Thanks
5. RE: MX240 MCLAG-ICCP Link BFD Flapping

Erdem
Posted 12-28-2016 08:23

Based on the reading and your suggestion, can you explain the importance of session-establishment-hold-time in the ICCP BFD settings? Can I set it along with a multiplier value of around 3?
6. RE: MX240 MCLAG-ICCP Link BFD Flapping

sarathirao
Posted 12-29-2016 22:46

The multiplier value and session-establishment-hold-time do not depend on each other.

session-establishment-hold-time specifies the time during which an Inter-Chassis Control Protocol (ICCP) connection must be established between peers. Configuring a session establishment hold time results in faster ICCP connection establishment. The recommended value is 50 seconds.

http://www.juniper.net/techpubs/en_US/junos/topics/task/configuration/interfaces-configuring-multi-chassis-link-aggregation.html
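
For illustration, the recommended hold time could be added to the peer stanza from the original post roughly as follows. The peer address is taken from that post; placing the statement under the peer is an assumption here, so check the statement hierarchy in the documentation for your release before using it.

```
protocols {
    iccp {
        peer 10.10.10.2 {
            /* value in seconds, per the recommendation above */
            session-establishment-hold-time 50;
        }
    }
}
```

This statement is independent of the liveness-detection multiplier, so the two can be changed in the same commit or separately.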

Hope this helps.

Thanks
7. RE: MX240 MCLAG-ICCP Link BFD Flapping

python
Posted 12-28-2016 11:53

Hi Folks,

To add, here are some pointers on BFD timers specific to the QFX Series.

Configure the ICCP liveness-detection interval (the BFD timer) to be at least 8 seconds if you have configured ICCP connectivity through an IRB interface. A liveness-detection interval of 8 seconds or more allows graceful Routing Engine switchover (GRES) to work seamlessly. By default, ICCP liveness detection uses multihop BFD, which runs in centralized mode.

This recommendation does not apply if you have configured ICCP connectivity through a dedicated physical interface. In this case, you can configure single-hop BFD.

Use the peer loopback address to establish ICCP peering. Doing so avoids any direct link failure between MC-LAG peers. As long as the logical connection between the peers remains up, ICCP stays up.

http://www.juniper.net/documentation/en_US/junos16.1/topics/concept/lag-multichassis--guidelines.html
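
A minimal sketch of the pointers above, under two assumptions that do not come from the thread: the loopback addresses 192.0.2.1 (local) and 192.0.2.2 (peer) are hypothetical placeholders, and the "liveness-detection interval" is read as the minimum-interval statement. The guideline is written for the QFX Series with ICCP over an IRB interface, so whether it carries over to the MX240 routers in this thread is not confirmed here.

```
protocols {
    iccp {
        /* hypothetical local loopback (lo0) address */
        local-ip-addr 192.0.2.1;
        /* hypothetical peer loopback (lo0) address */
        peer 192.0.2.2 {
            liveness-detection {
                /* 8000 ms = 8 seconds, assumed reading of the guideline */
                minimum-interval 8000;
            }
        }
    }
}
```

Peering on loopbacks only helps if each router has more than one path to the other's loopback address, so that the loss of a single direct link does not take ICCP down.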

-A.Rengaramalingam
