
EX Switch Losing ARP Requests

  • 1.  EX Switch Losing ARP Requests

    Posted 01-09-2020 15:29

    SRX <--> EX1 <--> MX <--> EX2 <--> ESXi <--> Workstation

     

    I am able to ping from the SRX to the MX interface facing the SRX and to the MX interface facing the workstation, but not to the workstation itself. The workstation is able to ping the MX interface facing it. I set up two analyzer sessions: one to view the ingress and egress traffic on the port the MX is connected to, and a second to view the ingress and egress traffic on the port the ESXi server is connected to.
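
    A minimal sketch of what two such analyzer sessions might look like on an ELS switch such as the EX4300. The analyzer names and the output ports (ge-0/0/10.0, ge-0/0/11.0) are hypothetical, and the input interfaces assume the analyzers sit on EX2, where the MX is reached through ae0 and the ESXi host through ae15 (as the switching table later in the thread suggests):

    ```junos
    # Analyzer 1: mirror both directions of the MX-facing port (assumed ae0.0)
    set forwarding-options analyzer MX-PORT input ingress interface ae0.0
    set forwarding-options analyzer MX-PORT input egress interface ae0.0
    set forwarding-options analyzer MX-PORT output interface ge-0/0/10.0

    # Analyzer 2: mirror both directions of the ESXi-facing port (assumed ae15.0)
    set forwarding-options analyzer ESXI-PORT input ingress interface ae15.0
    set forwarding-options analyzer ESXI-PORT input egress interface ae15.0
    set forwarding-options analyzer ESXI-PORT output interface ge-0/0/11.0
    ```

    A capture host attached to each output port can then filter on ARP to compare what enters and what leaves the switch.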

     

    While the SRX is pinging the workstation, on the first analyzer port I can see the MX send out an ARP request, but no response comes back from the workstation. On the second analyzer port, I never see the ARP request from the MX going to ESXi and the workstation. However, when pinging the MX from the workstation, I can see the ARP request and response on both ports.

     

    I have been beating my head against the wall for a couple of days over this issue. Does anyone have any ideas on how to fix this?

     

    The SRX has interface ae6 with two physical ports, with one port connected to each member of the EX1 virtual chassis.

     

    The MX has interface ae3 with two physical ports, with one port connected to each member of the EX1 virtual chassis. The MX also has interface ae1 with two physical ports, with one port connected to each member of the EX2 virtual chassis. Within the MX, ae3 is connected to a VRF routing instance and ae1 is connected to an EVPN virtual switch. The VRF and EVPN routing instances are then connected with an irb interface.

     

    The SRX is an SRX345 running JunOS 18.2R3.4

    EX1 is a Virtual Chassis of 2 EX4300-24T running JunOS 18.1R3.3

    MX is an MX10 running JunOS 19.2R1.8

    EX2 is a Virtual Chassis of 2 EX4300-48T running JunOS 18.1R3.3



  • 2.  RE: EX Switch Losing ARP Requests

     
    Posted 01-11-2020 11:21

    Hi,

     

    Let's clarify a few things first. I assume you have L2 only on EX1 and EX2, and that L3 routing is happening on the MX through the irb.

     

    From your description, it seems EX2 (the VC) is dropping the ARP request from the MX to the workstation. What does the EX2 bridge table look like? Does it learn the MX irb and the workstation correctly? By any chance, do you have any layer 3 configuration on EX2? If so, let's delete it.

     

    However, ARP itself should not be the problem. Given that "when trying to ping the MX from the workstation, I can see the ARP request and responses on both ports", I assume the MX and the workstation have ARP entries for each other? And does ping work between the MX and the workstation? To rule out an ARP issue, we can configure a static ARP entry on the MX IRB.

     

    ae1 on the MX is in the EVPN L2 instance. What about its corresponding irb? Is it in the default instance? If so, I suggest putting it in a vrf or virtual-router instance, which is how we usually do it.

     



  • 3.  RE: EX Switch Losing ARP Requests

    Posted 01-11-2020 12:31

    Hi Mengzhe,

     

    Thanks for the reply. Both EX1 and EX2 are L2 only, all L3 is being done by the MX and SRX.

     

    EX2 is properly learning the MX irb and workstation MACs. Here is what EX2 has, which all appears to be correct.

     

    bretth@ATL01TORSW01> show ethernet-switching table vlan-id 450

    MAC flags (S - static MAC, D - dynamic MAC, L - locally learned, P - Persistent static, C - Control MAC
    SE - statistics enabled, NM - non configured MAC, R - remote PE MAC, O - ovsdb MAC)


    Ethernet switching table : 2 entries, 2 learned
    Routing instance : default-switch
       Vlan          MAC                 MAC     Age   Logical       NH      RTR
       name          address             flags         interface     Index   ID
       vlan-450      00:00:00:00:04:50   D       -     ae0.0         0       0
       vlan-450      00:50:56:a3:5a:20   D       -     ae15.0        0       0

     

     

    Ping works from the workstation to the MX irb, but not from the MX irb to the workstation. Interestingly, ping does work from the MX irb to the workstation after the workstation has pinged the MX irb, but it still does not work from the SRX to the workstation. The MX never learns an ARP entry for the workstation when it initiates the ping, but does learn it when the workstation pings the MX.

     

    Here are the MX interface and routing instance configurations

    interfaces {
        ge-1/0/4 {
            description "ATL01PUBSW01 ge-0/0/0";
            per-unit-scheduler;
            gigether-options {
                802.3ad ae3;
            }
        }
        ge-1/0/8 {
            description "ATL01TORSW01 ge-0/0/47";
            gigether-options {
                802.3ad ae1;
            }
        }
        ge-1/1/4 {
            description "ATL01PUBSW01 ge-1/0/0";
            per-unit-scheduler;
            gigether-options {
                802.3ad ae3;
            }
        }
        ge-1/1/8 {
            description "ATL01TORSW01 ge-1/0/47";
            gigether-options {
                802.3ad ae1;
            }
        }
        ae1 {
            description "ATL01PUBSW01 ae0";
            flexible-vlan-tagging;
            encapsulation flexible-ethernet-services;
            aggregated-ether-options {
                lacp {
                    active;
                }
            }
            unit 450 {
                family bridge {
                    interface-mode trunk;
                    vlan-id-list 450;
                }
            }
            unit 451 {
                family bridge {
                    interface-mode trunk;
                    vlan-id-list 451;
                }
            }
        }
        ae3 {
            per-unit-scheduler;
            vlan-tagging;
            aggregated-ether-options {
                lacp {
                    active;
                }
            }
            unit 2450 {
                vlan-id 2450;
                family inet {
                    address 10.4.1.234/31;
                }
                family inet6 {
                    address 2620:1a:e000::3:0/127;
                }
            }
            unit 2451 {
                vlan-id 2451;
                family inet {
                    address 10.4.1.230/31;
                }
                family inet6 {
                    address 2620:1a:e000::3:4/127;
                }
            }
        }
        irb {
            unit 450 {
                family inet {
                    address 10.4.50.1/24;
                }
                family inet6 {
                    address 2620:1a:e000:32::1/64;
                }
                mac 00:00:00:00:04:50;
            }
            unit 451 {
                family inet {
                    address 10.4.51.1/24;
                }
                family inet6 {
                    address 2620:1a:e000:33::1/64;
                }
                mac 00:00:00:00:04:51;
            }
        }
    }
    routing-instances {
        EVPN_Test {
            instance-type virtual-switch;
            protocols {
                evpn {
                    remote-ip-host-routes;
                    encapsulation mpls;
                    extended-vlan-list 450-451;
                    default-gateway do-not-advertise;
                }
                rstp {
                    disable;
                }
                mstp {
                    disable;
                }
                vstp {
                    disable;
                }
            }
            interface ae1.450;
            interface ae1.451;
            bridge-domains {
                bd450 {
                    domain-type bridge;
                    vlan-id 450;
                    routing-interface irb.450;
                    bridge-options {
                        interface ae1.450;
                    }
                }
                bd451 {
                    domain-type bridge;
                    vlan-id 451;
                    routing-interface irb.451;
                    bridge-options {
                        interface ae1.451;
                    }
                }
            }
            route-distinguisher 10.255.1.4:450;
            vrf-target target:65000:450;
        }
        VRF_450 {
            instance-type vrf;
            protocols {
                bgp {
                    group fw-v4 {
                        type external;
                        local-address 10.4.1.234;
                        export bgp-direct;
                        peer-as 65001;
                        local-as 65002;
                        neighbor 10.4.1.235;
                    }
                    group fw-v6 {
                        type external;
                        local-address 2620:1a:e000::3:0;
                        export bgp-direct;
                        peer-as 65001;
                        local-as 65002;
                        neighbor 2620:1a:e000::3:1;
                    }
                }
            }
            interface ae3.2450;
            interface irb.450;
            interface lo0.2450;
            route-distinguisher 10.255.1.4:2450;
            vrf-target target:65000:2450;
            vrf-table-label;
        }
        VRF_451 {
            instance-type vrf;
            protocols {
                bgp {
                    group fw-v4 {
                        type external;
                        local-address 10.4.1.230;
                        export bgp-direct;
                        peer-as 65005;
                        local-as 65006;
                        neighbor 10.4.1.231;
                    }
                    group fw-v6 {
                        type external;
                        local-address 2620:1a:e000::3:4;
                        export bgp-direct;
                        peer-as 65005;
                        local-as 65006;
                        neighbor 2620:1a:e000::3:5;
                    }
                }
            }
            interface ae3.2451;
            interface irb.451;
            interface lo0.2451;
            route-distinguisher 10.255.1.4:2451;
            vrf-target target:65000:2451;
            vrf-table-label;
        }
    }
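
    Given the instance and bridge-domain names in the configuration above, the ARP and MAC state on the MX could be checked with operational commands along these lines. This is only a sketch of where to look, not output from this setup:

    ```junos
    # ARP entries learned on the IRB for VLAN 450 (skip reverse DNS lookups)
    show arp no-resolve interface irb.450

    # MAC addresses learned in the bridge domain behind ae1.450
    show bridge mac-table instance EVPN_Test bridge-domain bd450
    ```

    If the workstation MAC shows up in the bridge MAC table but never in the ARP table while the MX is originating the ping, the request or its reply is being lost rather than the L2 path being broken.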



  • 4.  RE: EX Switch Losing ARP Requests

     
    Posted 01-13-2020 09:28

    It seems the ARP request is lost somewhere between the MX and the workstation (WS) in the MX->WS direction.

    As I said earlier, I am positive that configuring a static ARP entry for the WS on the MX irb will make a difference.

     

    I'd recommend opening a JTAC case to dig into your setup, unless someone else can spot something obvious that is missing.



  • 5.  RE: EX Switch Losing ARP Requests

    Posted 01-13-2020 10:32

    Hi Mengzhe,

     

    Yes, I did try a static ARP entry on the MX this morning and it made things work. When I removed it, things went back to not working.
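
    For reference, a static ARP entry on the MX irb could look roughly like the sketch below. The workstation address 10.4.50.10 is hypothetical (the thread never states it), and the MAC is assumed to be the workstation's because it is the VMware-range address EX2 learned on ae15.0:

    ```junos
    # Static ARP entry for the workstation under the existing irb.450 address
    # (10.4.50.10 is a placeholder; the MAC is assumed from the EX2 switching table)
    set interfaces irb unit 450 family inet address 10.4.50.1/24 arp 10.4.50.10 mac 00:50:56:a3:5a:20

    # Remove the entry again to return to dynamic ARP
    delete interfaces irb unit 450 family inet address 10.4.50.1/24 arp 10.4.50.10
    ```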

     

    I also decided this morning to try creating another VM workstation using CentOS instead of the Windows 10 that the problematic workstation was running. The MX was able to ping the new CentOS workstation. So I moved the Windows workstation to a different VLAN that does not go through the topology I am working on and applied Windows Updates to it. I then moved it back into the new topology, and the MX had no problem pinging it. That leads me to believe that the build of Windows 10 on the ISO used to build the workstation has a problem with its network stack.

     

    So now I am down to just one issue: the SRX cannot ping either workstation, and neither workstation can ping the SRX. The SRX and the workstations can each ping all the interfaces in the VRF, just not beyond the MX.

     

    I have to move on to some other things today, but hopefully tomorrow morning something in the configs will jump out as being the issue.

     

    I would love to open a JTAC case, but unfortunately this is all lab gear that I do not have support contracts for.



  • 6.  RE: EX Switch Losing ARP Requests

    Posted 03-25-2021 09:41
    Hello Bretth,
    I found your question while investigating a very similar problem that affects our network of 250+ EX2300-C switches, and I would like to know where you ended up with this issue.

    Our 2300s also stop answering pings for 8-10 seconds every x minutes and disappear from the ARP table of the desktop trying to reach them.
    Putting a static ARP entry on the desktop solves the problem.
    We also found that if we connect a computer to a 2300 in the lab while the switch is still connected to the network, the ARP loss also occurs on the locally connected computer. But the problem disappears if we disconnect the switch from the network!
    So there seems to be a limitation somewhere in the switch's ability to handle ARP requests.
    We started noticing this some weeks ago, when our monitoring software kept issuing disconnect alarms saying it had lost connection to the 2300s for 0 sec.

    So I was curious to know if you had made any progress on that problem.
    I am about to open a case with JTAC regarding this.
    Michel Lapointe




  • 7.  RE: EX Switch Losing ARP Requests

    Posted 03-25-2021 09:47
    Check your logs. That sounds like you have a duplicate IP. The log entry would look like this:
     KERN_ARP_DUPLICATE_ADDR: duplicate IP address  ip.add.ress  sent from address: aa:aa:cc:cc:dd:11
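
    One way to look for that message from the CLI, assuming the default messages log file is in use:

    ```junos
    # Search the current messages file for the duplicate-address event
    show log messages | match KERN_ARP_DUPLICATE_ADDR

    # Broader search on either keyword
    show log messages | match "duplicate|arp"

    # Watch new log entries live while the drops are happening
    monitor start messages
    # ... and stop the live output afterwards
    monitor stop messages
    ```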



  • 8.  RE: EX Switch Losing ARP Requests

    Posted 03-25-2021 09:52
    I'll check and keep you posted. Thanks.

    ------------------------------
    Michel Lapointe
    ------------------------------



  • 9.  RE: EX Switch Losing ARP Requests

    Posted 03-25-2021 10:44

    Hello tgreaser,
    I just went through the 9 messages files of a switch that is dropping right now and could not find any mention of "duplicate" or "arp".

    My colleague also activated live monitoring of the messages log and did not see anything.

    Duplicate IPs are always possible, but we keep a pretty tight rein on those.

     

    I'll be opening the case today.
    Michel

     

     

     

     






  • 10.  RE: EX Switch Losing ARP Requests

    Posted 09-29-2021 13:01

    Hello Michel,
    did you have any luck with TAC regarding your ARP issue? My problem is almost the same as you have described.

    Thank you, and looking forward to your answer,

    Pavel



    ------------------------------
    PAVEL VALACH
    ------------------------------



  • 11.  RE: EX Switch Losing ARP Requests

    Posted 09-30-2021 11:16

    Hello Pavel,

     

    To make a long story short...

     

    The quick solution was to create fixed ARP entries on the hosts trying to reach the 2300s, so that they basically do not send ARP requests.

    The permanent solution: we ended up moving all hosts that access the 2300s behind a router and set its ARP cache duration to more than 200 minutes, which achieves the same result.
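
    The thread does not say which router platform was used. Purely as an illustration, and assuming the router runs Junos, a long ARP cache lifetime could be set with the system-wide aging timer, which is expressed in minutes:

    ```junos
    # Keep dynamic ARP entries for 240 minutes instead of the default
    # (assumes a Junos router; 240 is an example value above the 200 minutes mentioned)
    set system arp aging-timer 240
    ```

    The longer the entry lives, the less often the router has to send an ARP request toward the 2300s, which is the same effect the static host entries achieve.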

     

    JTAC was helpful in confirming that what we were seeing was a limitation of the 2300s. Here is the relevant communication from them.

    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

    I've been checking this scenario with our internal resources, I found this internal report that explains how EX2300 handles ARP requests, the ARP handling is different than EX3400 or other platforms. As per previous experiences, I think this behavior is expected on EX2300 – product limitation.

     

    In EX2300, ARP requests for VLANs where there is excessive flooding is dropping the ARP request for vlans used for management purposes due to which ARP timeout happens once in a while when there is a flood of ARP traffic.

     

    PR info: In EX2300, transit ARP requests entering a port can get trapped to the CPU even if no IRB is configured on the VLAN. This can result in unnecessary ARP requests to the CPU and in extreme cases result in drops of genuine ARP requests in the ARP queue to CPU.

    https://prsearch.juniper.net/InfoCenter/index?page=prcontent&id=PR1365642

    ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

     

    Seeing multiple consecutive pings being lost is always a concern, but in this case the traffic through the switch was not affected, which was our main concern.

    Hope that helps.

     

    Michel Lapointe

     

     

     

     

     


     

     






  • 12.  RE: EX Switch Losing ARP Requests

    Posted 09-30-2021 14:47

    Hello Michel,

    thank you for the info. This helps a lot, and I might actually just want to use the dedicated management interface for our purposes.
    If the IRB is this unreliable with some funky traffic, then I would even dispute the "extreme cases" claim in the PR, as it is not *that* hard to generate such traffic to make IRB interface drop ARPs.

    Best regards,



    ------------------------------
    PAVEL VALACH
    ------------------------------