Original Message:
Sent: 10-01-2021 16:38
From: CHRIS RUSSELL
Subject: QFX + arp
Update: we're now seeing this even on the same leaf:
Server 3 on leaf1, xe-0/0/16:
root@server3:/home/pulsant# arp
Address                  HWtype  HWaddress           Flags Mask            Iface
172.16.20.202                    (incomplete)                              eno2   <-- this is what we're trying to ping, on the same leaf
From leaf 1:
Ethernet switching table : 3 entries, 3 learned
Routing instance : default-switch
Vlan      MAC                 MAC      Logical        SVLBNH/      Active
name      address             flags    interface      VENH Index   source
bd9105    78:2b:cb:42:7e:a9   D        xe-0/0/0.0
bd9105    78:2b:cb:62:cd:76   D        xe-0/0/16.0
netconf@Leaf-1> request pfe execute command "show l2 manager mac-table" target fpc0
SENT: Ukern command: show l2 manager mac-table
route table name : default-switch.4
mac counters
    maximum   count
    0         3
mac table information
mac address        BD     learn  Entry   entry        hal          hardware info
                   Index  vlan   Flags   ifl          ifl          pfe  mask  ifl
----------------------------------------------------------------------------------
78:2b:cb:42:7e:a9  3      0      0x0814  xe-0/0/0.0   xe-0/0/0.0   0    0x1   0
78:2b:cb:62:cd:76  3      0      0x0814  xe-0/0/16.0  xe-0/0/16.0  0    0x1   0
d4:ae:52:65:40:35  3      0      0x0014  vtep.32769   vtep.32769   0    0x1   0
Displayed 3 entries for routing instance default-switch.4
So the MACs are in the switching table and in the PFE, but the Linux box cannot learn the ARP entry through the QFX.
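For anyone wanting to watch the host side of this over time, here is a minimal Python sketch that lists neighbour entries stuck without a MAC. It assumes the usual iproute2 `ip neigh show` line layout (`<ip> dev <iface> [lladdr <mac>] <STATE>`); the function names are made up for illustration and are not from any tool used in this thread.

```python
import subprocess


def unresolved_neighbours(ip_neigh_output):
    """Return (ip, device, state) for entries that have no usable MAC.

    Assumes lines shaped like the iproute2 output, for example:
      172.16.20.202 dev eno2  INCOMPLETE
    """
    bad_states = {"INCOMPLETE", "FAILED"}
    result = []
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        # Skip anything that does not look like "<ip> dev <iface> ... <STATE>".
        if len(fields) < 4 or fields[1] != "dev":
            continue
        state = fields[-1]
        if state in bad_states:
            result.append((fields[0], fields[2], state))
    return result


def read_neighbour_table():
    """Run `ip neigh show` on the local host and return its text output."""
    completed = subprocess.run(
        ["ip", "neigh", "show"], capture_output=True, text=True, check=True
    )
    return completed.stdout
```

Calling `unresolved_neighbours(read_neighbour_table())` in a loop on server 3 would show when 172.16.20.202 drops to INCOMPLETE or FAILED, which can then be lined up against what the leaf reports at the same moment.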
Chris
------------------------------
CHRIS RUSSELL
Original Message:
Sent: 10-01-2021 15:49
From: CHRIS RUSSELL
Subject: QFX + arp
Hey All
Given ARP seems to be a theme: we're playing with fabrics right now and have a small fabric with QFX5100s as leafs and QFX5110s as spines. MXs will be super spines for CRB, but this is below that level as yet. So think 2 leafs, 2 spines, underlay, overlay (wombling free, for those in the UK), and an RR mesh peer between the spines. No IRBs as yet.
Issue: we seem to lose ARP from the leaf QFXs -> connected hosts. The QFXs are all running Juniper's recommended 20.3R3.8.
Servers (all Ubuntu hosts doing nothing other than ping): servers 1 and 3 are on leaf 1 in the same VLAN/VNI; server 2 is on leaf 2, in the same VNI as server 1.
Symptoms:
Server 1 -> Server 3 (same leaf): ping works, but we see a huge spike of around 200-900 ms every 15 seconds or so. Both the replies and the spike are really consistent.
Server 1 or 3 -> Server 2 (across the fabric): ping is sporadic. We get maybe 40-50 replies, then it just dies, and it may or may not come back. Looking at Linux, it can't find an ARP entry for the other side.
If we throw static ARP entries on the 2 servers, ping is rock solid at 0.1 ms, never drops, and we see 958 Mbit/s on iperf.
Take off those static arps and problem returns.
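To put numbers on the "every 15 seconds or so" spike, a minimal Python sketch like the one below can pull the slow replies out of saved ping output and report the spacing between them. It assumes the reply-line format printed by Linux iputils ping (`icmp_seq=N ... time=X ms`); the function names and the 100 ms threshold are illustrative choices, not anything from our setup.

```python
import re

# Matches reply lines such as:
#   64 bytes from 172.16.20.202: icmp_seq=7 ttl=64 time=0.112 ms
REPLY = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def find_spikes(ping_output, threshold_ms=100.0):
    """Return (icmp_seq, rtt_ms) for every reply slower than threshold_ms."""
    spikes = []
    for line in ping_output.splitlines():
        match = REPLY.search(line)
        if match and float(match.group(2)) > threshold_ms:
            spikes.append((int(match.group(1)), float(match.group(2))))
    return spikes


def spike_gaps(spikes):
    """Gaps between consecutive spikes, counted in icmp_seq steps.

    With ping's default 1 s interval a gap of N is roughly N seconds.
    """
    seqs = [seq for seq, _ in spikes]
    return [later - earlier for earlier, later in zip(seqs, seqs[1:])]
```

Feeding it a few minutes of saved ping output and looking at `spike_gaps(find_spikes(text))` should show whether the gap really is fixed. A steady value would point at a timer (ARP/neighbour refresh or MAC ageing, for instance) rather than random loss.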
We've checked the EVPN MAC tables, default switching tables, VTEPs and FPC L2 MAC tables, and the MACs are all there on both leafs, constantly, even when the Linux boxes cannot resolve ARP.
We're also seeing the following on each of the ports attached to the servers:
Leaf-2 fpc0 Tx VxLAN UCAST:ifd_out = xe-0/0/0 dst_gport is (c000002) so do not process pkt further
There is literally nothing, nada, zilch on Google for this. We've traditionally been a Cisco house, hence I'm now stumped. Any ideas welcome!
Thanks
Chris
------------------------------
CHRIS RUSSELL
------------------------------