Challenging : OSPF adjacency flapping between Full to Loading...

1. Challenging : OSPF adjacency flapping between Full to Loading...

Daboss
Posted 08-21-2014 14:30

Hi all!

I have an OSPF adjacency that keeps flapping between FULL and EXCHANGE, and I'm not able to solve it. The adjacency is between a Juniper M320 (JunOS 9.3S5.1) and a Cisco uBR7225VXR (CMTS). Both ends of the adjacency are in the global routing table (no VRF).

Hello interval and dead interval are OK => the adjacency gets past the TWO-WAY state

MTU is OK => the adjacency gets past the EXCHANGE state, and show interface and traceoptions show the same MTU

DD (Database Description) packets, LS Updates and LS Acks are OK

The Cisco device complains that it doesn't see its own router ID in the Hello packets it receives:

WTF: OSPF: Cannot see ourself in hello from x.x.x.x on Port-channel1.850, state INIT
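
For context, that Cisco message corresponds to the two-way check an OSPF router runs on every received Hello: it looks for its own router ID in the Hello's neighbor list and drops the neighbor back to Init when it is missing. Below is a minimal Python sketch of that check, simplified from the Hello handling in RFC 2328. The function name and the router IDs are hypothetical, and this is not either vendor's implementation.

```python
def state_after_hello(my_router_id, neighbors_in_hello, current_state):
    """Return the neighbor state after processing one received Hello.

    Only the two-way check is modelled; DR election and the database
    exchange are deliberately left out of this sketch.
    """
    if my_router_id in neighbors_in_hello:
        # 2-WayReceived: a neighbor that is already past Init keeps its state.
        # From Down/Init the real protocol moves to 2-Way (or ExStart if an
        # adjacency should form); the sketch just reports 2-Way.
        return "2-Way" if current_state in ("Down", "Init") else current_state
    # 1-WayReceived: the sender no longer lists us, so fall back to Init.
    return "Init"


# Hypothetical router IDs, for illustration only.
print(state_after_hello("192.0.2.1", ["192.0.2.1"], "Full"))  # Full
print(state_after_hello("192.0.2.1", [], "Full"))             # Init
```

So a single Hello from the Juniper that omits the Cisco router ID is enough to knock a Full adjacency back to Init on the Cisco side.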

The Juniper device complains that it stopped receiving Hello packets and that the dead timer expired:

OSPF neighbor 10.122.0.14 (ae0.850 area 8.0.3.0) state changed from Full to Down due to InActiveTimer (event reason: neighbor was inactive and declared dead) (nbr helped: 0)

I tried both traceoptions on the Juniper and debug on the Cisco, but I'm not able to understand what happened. No error, no warning!

So I sniffed the traffic between the Juniper and the Cisco, and I saw:

that the Cisco and the Juniper exchange Hello packets correctly, except that I suspect the Juniper (according to the Wireshark timestamps) doesn't respect the hello interval: the hello interval is set to 1 sec, but I saw many Hello packets sent less than 1 s apart.

that, for an unknown reason, the Juniper removes the Cisco router ID from the OSPF Hello packets it sends to the Cisco.

I was not able to determine whether the Juniper removes the Cisco router ID from its Hello packets AFTER it declares the neighbor down (the capture shows it did receive the Hello packets!) or BEFORE it declares it down.
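
The hello-interval suspicion can be checked mechanically rather than by eye. The Python sketch below assumes the capture timestamps of the Hellos from one sender have already been exported to a list of seconds. The function name, the tolerance and the sample timestamps are hypothetical and are not taken from the attached capture.

```python
def short_gaps(timestamps, hello_interval, tolerance=0.1):
    """Return (previous, current, gap) for Hellos sent too soon after the previous one.

    A gap counts as "too soon" when it is shorter than the configured
    hello interval minus the given fractional tolerance.
    """
    ordered = sorted(timestamps)
    flagged = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur - prev
        if gap < hello_interval * (1 - tolerance):
            flagged.append((prev, cur, round(gap, 3)))
    return flagged


# Hypothetical timestamps (seconds) of Hellos sent by one router.
juniper_hellos = [0.00, 1.00, 1.02, 1.05, 2.00, 3.01]
for prev, cur, gap in short_gaps(juniper_hellos, hello_interval=1.0):
    print(f"Hello at {cur:.2f}s came {gap:.3f}s after the previous one")
```

Lining the flagged timestamps up against the state-change log entries would show whether the extra Hellos come before or after the neighbor is declared down.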

I would appreciate your help !

Thank you !

Salah

JNCIE-ENT

JNCIE-SP

Attachment(s)

Packet_capture.zip 120 KB 1 version
2. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Erdem
Posted 08-21-2014 16:11

Hi,

It's quite an interesting problem. I am not familiar with Cisco, but I will still share my opinion.

From your snoop I noticed that the Junos device declares the adjacency down at packets 56, 112, 164, 238, etc.

Since the configured dead interval is 4s, we must assume that the last Hello from the Cisco device that was received and processed was sent 4s + epsilon earlier. If we check the snoop again, we notice that 4 seconds earlier the Cisco device always sends a unicast Hello message to the Junos device. And apparently Junos always ignores the multicast Hellos that follow.
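
The same reasoning can be written down as a small Python sketch: given the time the neighbor was declared down and the dead interval, work backwards to the Hello that last restarted the inactivity timer, and list the Hellos that were on the wire afterwards but evidently did not restart it. The function name and the sample times are hypothetical and do not come from the capture.

```python
def split_hellos(hello_times, down_time, dead_interval):
    """Split captured Hellos around the start of the final dead interval.

    Returns (last_processed, ignored): the latest Hello sent at or before
    down_time - dead_interval, and the Hellos captured after that point
    but before the neighbor was declared down.
    """
    timer_start = down_time - dead_interval
    earlier = [t for t in hello_times if t <= timer_start]
    last_processed = max(earlier) if earlier else None
    ignored = [t for t in hello_times if timer_start < t < down_time]
    return last_processed, ignored


# Hypothetical send times (seconds) of Hellos from the Cisco side.
cisco_hellos = [10.0, 11.0, 12.0, 13.0, 14.0]
last, ignored = split_hellos(cisco_hellos, down_time=14.1, dead_interval=4)
print("last processed Hello:", last)
print("Hellos that did not reset the timer:", ignored)
```

Checking the destination address of the "last processed" Hello against that of the ignored ones is exactly the unicast versus multicast comparison described above.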

From RFC 2328 (section 9.5):

        On broadcast networks and physical point-to-point networks,
        Hello packets are sent every HelloInterval seconds to the IP
        multicast address AllSPFRouters. On virtual links, Hello
        packets are sent as unicasts (addressed directly to the other
        end of the virtual link) every HelloInterval seconds. On Point-
        to-MultiPoint networks, separate Hello packets are sent to each
        attached neighbor every HelloInterval seconds.

I am assuming your LAG is a regular broadcast network. Thus I don't see the point for the Cisco device to send unicast Hello packets. How is Junos supposed to deal with that? On its side Junos never sends unicast Hellos.
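
To make the expectation concrete, here is a small Python sketch of the rule quoted above: the destination a Hello should carry for each network type. The function name and the network-type strings are hypothetical labels. Only the cases named in the quoted paragraph are covered.

```python
ALL_SPF_ROUTERS = "224.0.0.5"  # the AllSPFRouters multicast group


def hello_destinations(network_type, neighbors=()):
    """Return the destination addresses for one round of Hellos.

    neighbors is only used for the network types that send Hellos as
    unicasts (virtual links and Point-to-MultiPoint).
    """
    if network_type in ("broadcast", "point-to-point"):
        return [ALL_SPF_ROUTERS]
    if network_type in ("virtual-link", "point-to-multipoint"):
        return list(neighbors)
    raise ValueError(f"network type not covered by this sketch: {network_type}")


print(hello_destinations("broadcast"))  # ['224.0.0.5']
print(hello_destinations("point-to-multipoint", ["10.122.0.14"]))  # ['10.122.0.14']
```

On a broadcast segment, then, a periodic Hello addressed to anything other than 224.0.0.5 is what stands out in the capture.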

My idea: maybe Junos after seeing the first erroneous unicast Hellos is incorrectly discarding the following valid multicast Hellos.

As a means to test this hypothesis, we could check the configuration on the Cisco device, or we could try to configure a firewall filter on the Junos device to discard unicast Hellos. Not easy at first sight: how do we distinguish unicast Hellos from other valid unicast OSPF messages? From the snoop it seems that all unicast Hellos sent from the Cisco have a packet size of 94... Well... this sounds crazy, but why not try? 🙂
3. RE: Challenging : OSPF adjacency flapping between Full to Loading...

lyndidon
Posted 08-21-2014 16:28

Take a look at this article. It seems the hello interval is too aggressive for OSPF, even though that value can be configured. If you need more aggressive failure detection, use BFD instead.

http://www.juniper.net/documentation/en_US/junos13.3/topics/example/ospf-timers-configuring.html

Adjust the hello interval to about 5 secs and then monitor it for stability.
4. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Surya
Posted 08-21-2014 19:45

With the default dead-interval of 40 sec, this means that no Hello was received on the M320 for 40 sec. Can you confirm that there isn't a lot of host-bound traffic or interface congestion?
5. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Daboss
Posted 08-22-2014 02:06

Hi all!

Thanks for your replies. I forgot to mention that I did play with the hello timers (hello-interval and dead-interval) to check the stability. Whatever I set, I still have this flapping! To reduce the database exchange I used a stub area, but the flapping was still there...

Plus, I did check for traffic drops and packet errors: all at zero! (no drops, no packet errors)

user@JUNIPER_re0> show configuration protocols ospf
traffic-engineering;
reference-bandwidth 100g;
/* OUTPUT OMITTED */
area 8.0.3.0 {
    interface ae0.850 {
        hello-interval 5;
        dead-interval 40;
    }
}

Cisco :

CISCO#show run | section router ospf 850
router ospf 850
 router-id 89.158.252.0
 log-adjacency-changes
 passive-interface default
 no passive-interface Port-channel1.850
 network 10.122.0.0 0.0.0.15 area 8.0.3.0
 network 89.158.252.0 0.0.0.0 area 8.0.3.0
 default-metric 20
CISCO#show run int Port-channel1.850
Building configuration...

Current configuration : 229 bytes
!
interface Port-channel1.850
 description BSOD
 encapsulation dot1Q 850
 ip address 10.122.0.14 255.255.255.240
 ip mtu 1570
 ip ospf cost 10
 ip ospf hello-interval 5
 ip ospf dead-interval 40
 ip ospf priority 10
 mpls label protocol ldp
 mpls ip
end

Any more ideas?
6. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Daboss
Posted 08-22-2014 05:39

I investigated further, and I can say that after the adjacency transitions to the FULL state, the Routing Engine doesn't receive any OSPF Hello packets from the Cisco, although the Cisco sends them. Comparing traceoptions + tcpdump (monitor traffic interface) on the Juniper with debug on the Cisco over specific time intervals helped me reach that diagnosis. But the Juniper is able to see (or maybe process) them again once the session has been torn down... (sorry, I'm not sure about my English 🙂 )

There is a switch (Extreme Networks) between the Juniper and the Cisco. The configuration of this switch was double-checked and everything is OK. The last thing I have to do is sniff the traffic between the switch and the Juniper to understand a) whether the switch drops some OSPF Hello packets or b) whether the Juniper is not able to process them (they reach the Juniper PFE but not the RE...)

Is there any command that will help me see whether the OSPF packets are present on the PFE but not on the RE?

Thanks for your help!

Salah
7. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Surya
Posted 08-22-2014 07:00

The quickest way to check would be to apply a firewall filter that counts the incoming OSPF packets, and also to run the CLI command "monitor traffic interface ae0.850" to see whether the RE receives the OSPF packets. This would help you identify any mismatch between the PFE and the RE.
8. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Daboss
Posted 08-22-2014 07:34

Do you mean that the firewall filter is applied in the PFE for this kind of packet (OSPF Hello packets)?

Thanks for your reply !
9. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Surya
Posted 08-22-2014 07:36

Yes, that's correct.
10. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Daboss
Posted 08-24-2014 23:48

I tried that this morning, but it's not easy to troubleshoot this way: I can't filter on the different OSPF packet types. So I will know if I receive an OSPF packet, but I won't know whether it's a Hello or a DD packet...
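
For reference, the packet type sits in the second byte of the 24-byte OSPF header, so it can be read from captured packets even where a protocol-only filter counter cannot separate the types. The Python sketch below assumes the OSPF payload (the bytes after the IP header) has already been extracted from a capture. The function name and the sample bytes are hypothetical.

```python
import struct

# OSPFv2 packet types as defined in RFC 2328.
OSPF_TYPES = {
    1: "Hello",
    2: "Database Description",
    3: "Link State Request",
    4: "Link State Update",
    5: "Link State Acknowledgment",
}


def ospf_packet_type(ospf_payload: bytes) -> str:
    """Return the OSPF packet type name from the common packet header."""
    if len(ospf_payload) < 24:
        raise ValueError("shorter than the 24-byte OSPF header")
    # First four bytes: version (1), type (1), packet length (2), network order.
    _version, pkt_type, _length = struct.unpack("!BBH", ospf_payload[:4])
    return OSPF_TYPES.get(pkt_type, f"unknown ({pkt_type})")


# Hypothetical header: version 2, type 1, length 44, remaining fields zeroed.
sample = bytes([2, 1]) + (44).to_bytes(2, "big") + bytes(20)
print(ospf_packet_type(sample))  # Hello
```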

--

Salah
11. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Erdem
Posted 08-25-2014 01:51

Hi Salah,

At least you should be able to determine whether both types of Hello packets are received :

unicast Hellos: sent by the Cisco device to the physical IP address of the Junos device.

multicast Hellos: sent to 224.0.0.5

As I explained in my previous post, I suspect that the Junos device is only receiving and processing the unicast Hellos. For some reason it seems as if the multicast Hellos are filtered somewhere.
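
If the destination addresses of the received Hellos can be exported from a capture on each side of the switch, the two kinds can be tallied with a few lines of Python. This is only a sketch: the function name is hypothetical, and 10.122.0.1 merely stands in for the Junos interface address, which is not given in this thread.

```python
import ipaddress
from collections import Counter


def count_by_destination(hello_destinations):
    """Count Hellos by whether their destination is multicast or unicast."""
    counts = Counter()
    for dst in hello_destinations:
        kind = "multicast" if ipaddress.ip_address(dst).is_multicast else "unicast"
        counts[kind] += 1
    return counts


# Hypothetical destinations of Hellos seen on one capture point.
seen = ["224.0.0.5", "10.122.0.1", "224.0.0.5"]
print(count_by_destination(seen))
```

A multicast count that is non-zero on the switch side but zero on the Junos side would point at the path in between rather than at the Routing Engine.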
12. RE: Challenging : OSPF adjacency flapping between Full to Loading...

lyndidon
Posted 08-25-2014 10:32

You will only see the DD packets during adjacency formation. After the master/slave relationship is torn down, the routers use LSR, LSU and LSAck packets to update the OSPF database. The best option for troubleshooting OSPF is to enable OSPF traceoptions, which will record all the OSPF transactions. Then also look at the OSPF database with detail/extensive, where you can match on a RID. Can you show the OSPF config? This sounds like a situation where a GRE tunnel is being detected by OSPF as a route to a remote destination. I'm not saying this is the case, but it produces similar results.

>show ospf ?

This should help you troubleshoot:

http://www.juniper.net/documentation/en_US/junos13.2/topics/task/configuration/ospf-tracing.html

Here is some background information

http://www.juniper.net/techpubs/en_US/junos11.4/topics/concept/ospf-routing-packets-overview.html

http://www.juniper.net/techpubs/en_US/junos11.4/topics/concept/ospf-timers-overview.html
13. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Surya
Posted 08-25-2014 12:40

Did you try to include "packet-length" parameter which would help to match only Hello packets?

family inet {
    filter count_ospf_hello {
        term a {
            from {
                packet-length 80;
                protocol ospf;
            }
            then count ospf_hello;
        }
        term b {
            then accept;
        }
    }
}
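
One caveat with matching on a fixed length: the size of a Hello grows with the number of neighbors it lists, so the right value depends on the segment. The Python sketch below computes the IP-level length of an OSPFv2 Hello, assuming an IPv4 header without options and assuming the filter's packet-length refers to the IP packet length. The extra_bytes parameter is a hypothetical catch-all for anything appended after the OSPF packet, such as authentication data or an LLS block, and its value has to be read from a capture.

```python
IP_HEADER = 20     # IPv4 header without options
OSPF_HEADER = 24   # common OSPFv2 packet header
HELLO_BODY = 20    # mask, timers, options, priority, DR and BDR fields
PER_NEIGHBOR = 4   # one router ID per listed neighbor


def hello_ip_length(neighbor_count, extra_bytes=0):
    """Return the IP total length of a Hello listing neighbor_count neighbors."""
    return (IP_HEADER + OSPF_HEADER + HELLO_BODY
            + PER_NEIGHBOR * neighbor_count + extra_bytes)


for n in range(3):
    print(n, "neighbors ->", hello_ip_length(n), "bytes")
```

A Hello sent right after the neighbor list changes is therefore 4 bytes shorter or longer than the steady-state one and would slip past a single-value match.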
14. RE: Challenging : OSPF adjacency flapping between Full to Loading...

lyndidon
Posted 08-25-2014 12:42

apply it to the loopback interface
15. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Surya
Posted 08-25-2014 12:47

>>> apply it to the loopback interface

Yes, it would work, but it will count the Hellos coming in on all interfaces. That would be ideal if you have a single OSPF session.

If not, it is better to apply it at the interface level, under the respective subunit where you want to count the OSPF Hello packets.
16. RE: Challenging : OSPF adjacency flapping between Full to Loading...

lyndidon
Posted 08-25-2014 12:54

add one or two more match conditions for example:

from interface <int-name>
from address <>
from source-address <int-ip>
17. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Surya
Posted 08-25-2014 13:22

Like I said, it can be done, but isn't it too much overhead when the same thing can be achieved simply by applying the firewall filter on the interface subunit?
18. RE: Challenging : OSPF adjacency flapping between Full to Loading...

lyndidon
Posted 08-25-2014 13:58

Not too much overhead. Juniper systems are capable of handling the debug options, and besides, this is only a temporary test. In my opinion it is also a better way to see whether the packets are making it up to the Routing Engine, since they are handled by the RE. However, if you are not comfortable with applying it to the lo0 interface, that is okay.

Have you enabled traceoptions for protocol ospf? If yes, what did you find? If not, why? That would be the first place to begin troubleshooting OSPF problems.
19. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Erdem
Posted 09-10-2014 10:18

Hi Daboss,

Just out of curiosity, did you manage to fix your problem?
20. RE: Challenging : OSPF adjacency flapping between Full to Loading...

Erdem
Posted 05-16-2018 16:48

Hi

I am also facing a similar issue.

Initially it was an IP MTU mismatch, which I resolved.

OSPF usually works fine, but randomly the Juniper device sends too many Hello packets, say 4 packets within a short span of time. Of those 4 Hello packets, it sends the last 2 without listing the router ID of the ASR.

Since the ASR receives Hello packets without its own router ID, it logs "Cannot see ourself in hello from <juniper router>, state INIT".

After a few seconds this resolves itself.

I am able to capture the packets on the ASR while the issue is happening, but since it happens randomly I am not able to capture them on the Juniper device (any thoughts on how to capture when the issue is occurring?)

Since I have a Riverbed Steelhead between the ASR and the Juniper, I want to identify where the issue is happening.

Looks like you have the same issue too, so it should be the Juniper.

How did you resolve this issue?

regards

Logesh
