Switching




  • 1.  QFX + arp

    Posted 10-01-2021 16:22

    Hey All

    Given that ARP seems to be a theme here: we're experimenting with fabrics right now and have a small fabric with QFX5100s as leafs and QFX5110s as spines. MXs will be the super spines for CRB, but we're below that level for now, so think 2 leafs, 2 spines, underlay, overlay (wombling free, for those in the UK), and an RR mesh peering between the spines. No IRBs as yet.

    Issue: we seem to lose ARP from the leaf QFXs to the connected hosts. The QFXs are all running Juniper's recommended 20.3R3.8.

    Servers (all Ubuntu hosts, running nothing other than ping): servers 1 and 3 are on leaf 1 in the same VLAN/VNI; server 2 is on leaf 2, in that same VNI.

     Symptoms:

    Server 1 -> Server 3 (same leaf): ping works, but we see a huge latency spike of around 200-900 ms every 15 seconds or so. Both the replies and the spike are really consistent.

    Server 1 or 3 -> Server 2 (across the fabric): ping is sporadic. We get maybe 40-50 replies, then it just dies, and it may or may not come back at random. On the Linux side, the host cannot find an ARP entry for the other end.

    If we add a static ARP entry on the 2 servers, ping is rock solid at 0.1 ms and never drops, and iperf shows a steady 958 Mbit/s.

    Take off those static ARP entries and the problem returns.
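
To put numbers on the two symptoms above (the periodic latency spike and the runs of lost replies), a captured ping log can be post-processed. The following is a minimal sketch, assuming the log comes from Linux iputils `ping` with its usual `icmp_seq=N ... time=X ms` reply lines. The function names, the 10x-median spike threshold, and the sample lines are illustrative assumptions, not taken from the thread.

```python
import re
from statistics import median

# Matches the reply lines printed by Linux iputils ping (assumed format).
RTT_RE = re.compile(r"icmp_seq=(\d+).*?time=([\d.]+) ms")


def parse_ping(lines):
    """Return a list of (icmp_seq, rtt_ms) tuples from ping reply lines."""
    samples = []
    for line in lines:
        match = RTT_RE.search(line)
        if match:
            samples.append((int(match.group(1)), float(match.group(2))))
    return samples


def find_spikes(samples, factor=10.0):
    """Return samples whose RTT exceeds `factor` times the median RTT."""
    if not samples:
        return []
    baseline = median(rtt for _, rtt in samples)
    return [(seq, rtt) for seq, rtt in samples if rtt > factor * baseline]


def find_gaps(samples):
    """Return (last_seq_seen, next_seq_seen) pairs where replies are missing."""
    return [
        (prev_seq, next_seq)
        for (prev_seq, _), (next_seq, _) in zip(samples, samples[1:])
        if next_seq - prev_seq > 1
    ]


if __name__ == "__main__":
    # Made-up sample lines for illustration only; not real output from the thread.
    sample_log = [
        "64 bytes from 172.16.20.202: icmp_seq=1 ttl=64 time=0.101 ms",
        "64 bytes from 172.16.20.202: icmp_seq=2 ttl=64 time=0.102 ms",
        "64 bytes from 172.16.20.202: icmp_seq=3 ttl=64 time=450 ms",
        "64 bytes from 172.16.20.202: icmp_seq=4 ttl=64 time=0.100 ms",
        "64 bytes from 172.16.20.202: icmp_seq=6 ttl=64 time=0.103 ms",
    ]
    samples = parse_ping(sample_log)
    print("spikes:", find_spikes(samples))
    print("gaps:", find_gaps(samples))
```

Comparing the sequence numbers of the spikes against a timestamped log (for example from `ping -D`) would show whether the spike interval really is a fixed 15 seconds, which can help match it against a timer on the switch or the host.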

    We've checked the EVPN MAC tables, default switching tables, VTEPs, and FPC L2 MAC tables, and the MACs are all there on both leafs, constantly, even when the Linux boxes cannot find the ARP entries.

    We're also seeing the following on each of the ports attached to the servers:

    Leaf-2 fpc0 Tx VxLAN UCAST:ifd_out = xe-0/0/0 dst_gport is (c000002) so do not process pkt further

      
    There is literally nothing, nada, zilch on Google for this. We've traditionally been a Cisco house, so I'm now stumped. Any ideas welcome!

    Thanks

    Chris



    ------------------------------
    CHRIS RUSSELL
    ------------------------------


  • 2.  RE: QFX + arp

    Posted 10-01-2021 16:54

    Update: we're now seeing this even on the same leaf.

    Server 3 on leaf1, xe-0/0/16:  

    root@server3:/home/pulsant# arp
    Address          HWtype   HWaddress      Flags Mask   Iface
    172.16.20.202             (incomplete)                eno2   <-- this is the host we're trying to ping, on the same leaf

    From leaf 1:

    Ethernet switching table : 3 entries, 3 learned
    Routing instance : default-switch
    Vlan     MAC                 MAC     Logical       SVLBNH/      Active
    name     address             flags   interface     VENH Index   source
    bd9105   78:2b:cb:42:7e:a9   D       xe-0/0/0.0
    bd9105   78:2b:cb:62:cd:76   D       xe-0/0/16.0

    netconf@Leaf-1> request pfe execute command "show l2 manager mac-table" target fpc0
    SENT: Ukern command: show l2 manager mac-table


    route table name : default-switch.4

    mac counters
        maximum   count
        0         3
    mac table information
    mac address         BD      learn   Entry    entry         hal           hardware info
                        Index   vlan    Flags    ifl           ifl           pfe   mask   ifl
    -----------------------------------------------------------------------------------------
    78:2b:cb:42:7e:a9   3       0       0x0814   xe-0/0/0.0    xe-0/0/0.0    0     0x1    0
    78:2b:cb:62:cd:76   3       0       0x0814   xe-0/0/16.0   xe-0/0/16.0   0     0x1    0
    d4:ae:52:65:40:35   3       0       0x0014   vtep.32769    vtep.32769    0     0x1    0
    Displayed 3 entries for routing instance default-switch.4

    So the MACs are in the switching table and in the PFE, but the Linux box still cannot learn the ARP entry from the QFX.
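
For catching this state while a ping test is running, the `(incomplete)` entries in the host's `arp` output can be picked out with a small script. This is a minimal sketch, assuming the whitespace-separated layout printed by the net-tools `arp` command, where an unresolved entry has `(incomplete)` immediately after the address. The function name is hypothetical.

```python
def incomplete_arp_entries(arp_output):
    """Return (address, interface) pairs for unresolved entries in `arp` output."""
    entries = []
    for line in arp_output.splitlines():
        fields = line.split()
        # Unresolved entries have no HWtype, so "(incomplete)" is the second field
        # and the interface name is the last field.
        if len(fields) >= 3 and fields[1] == "(incomplete)":
            entries.append((fields[0], fields[-1]))
    return entries


if __name__ == "__main__":
    # The two lines below are the arp output quoted in this post.
    sample = (
        "Address HWtype HWaddress Flags Mask Iface\n"
        "172.16.20.202 (incomplete) eno2\n"
    )
    for address, iface in incomplete_arp_entries(sample):
        print(f"no ARP entry resolved for {address} on {iface}")
```

Feeding it the output of `arp -n` once a second and logging a timestamp whenever the list is non-empty would show whether the unresolved periods line up with the ping drops.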

    Chris



    ------------------------------
    CHRIS RUSSELL
    ------------------------------



  • 3.  RE: QFX + arp

     
    Posted 10-04-2021 05:33
    Hi Chris

    Please check whether you are hitting the following PR:

    https://prsearch.juniper.net/InfoCenter/index?page=prcontent&id=PR1560173