Good morning,
I posted this to the DataCenter community about a month ago and have had no response, so I thought this community might be a better bet. (I couldn't see how to cross-post once a post had already been published, so I have copied the original post here.)
We currently have a VXLAN/EVPN backbone in our datacenter, using Juniper MX480s as spine devices and Juniper QFX5100s as leaf devices.
The MX480s currently run version 21.4R3-S6.5.
The QFX5100s currently run version 21.4R3-S5.17.
When designing our EVPN network, we used OSPF as the underlay IGP and BGP with the EVPN family as the overlay, in order to extend our VXLANs across the different devices.
We then have Proxmox servers connected to our leaf devices using ESI, such that each server is dual-homed to two leaf devices in the same rack.
We followed the main EVPN design from https://www.juniper.net/documentation/us/en/software/junos/evpn-vxlan/topics/example/evpn-vxlan-mx-qfx-configuring.html (although in our case, CORE and SPINE are the same device).
The spine devices hold the irb interfaces (it is a centrally-routed bridging (CRB) EVPN architecture), and the leaf devices simply place interfaces/VLANs into the different VNIs.
On these irb interfaces, we work with both IPv4 and IPv6 for our end-VMs on our Proxmox servers.
The issue we are encountering is that, for reasons we cannot determine, the EVPN Type-2 MAC+IP routes for certain VMs suddenly disappear, causing a service disruption. The curious thing is that it only affects IPv4, not IPv6.
It doesn't seem that the MAC is being removed from the switching table, since the IPv6 address is apparently still being announced via EVPN BGP.
In our EVPN MAC database there are no signs of MAC duplication or ARP issues, and the logs on all of our devices give no indication of why the MAC+IP routes are suddenly deleted.
The most curious thing, though, is that it seems to happen only on TRUNK interfaces. We have a different VLAN (747) used only as an ACCESS VLAN, and the VMs on this network never lose connectivity, but VLAN 305, which is carried as a TRUNK VLAN on different links, loses communication randomly.
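To make the symptom concrete, here is a minimal sketch of how two snapshots of the Type-2 MAC+IP routes could be compared to find MACs that lost their IPv4 binding while keeping IPv6. It is only an illustration: the `(mac, ip)` tuple format, the function names and the sample addresses are assumptions made for this example, and it does not parse any real Junos output.

```python
import ipaddress
from collections import defaultdict


def group_by_mac(routes):
    """Map each MAC to the IPv4 and IPv6 addresses advertised with it.

    `routes` is an iterable of (mac, ip) string tuples. This input format
    is an assumption for the sketch, not a Junos data structure.
    """
    grouped = defaultdict(lambda: {4: set(), 6: set()})
    for mac, ip in routes:
        addr = ipaddress.ip_address(ip)
        grouped[mac.lower()][addr.version].add(str(addr))
    return grouped


def lost_ipv4_only(before, after):
    """Return {mac: [missing IPv4 addresses]} for MACs that lost IPv4
    MAC+IP entries between snapshots but still have an IPv6 entry."""
    old = group_by_mac(before)
    new = group_by_mac(after)
    affected = {}
    for mac, ips in old.items():
        missing_v4 = ips[4] - new[mac][4]
        if missing_v4 and new[mac][6]:
            affected[mac] = sorted(missing_v4)
    return affected


# Hypothetical snapshots (documentation address ranges, not our real hosts).
before = [
    ("aa:bb:cc:00:00:01", "192.0.2.10"),
    ("aa:bb:cc:00:00:01", "2001:db8::10"),
    ("aa:bb:cc:00:00:02", "192.0.2.11"),
    ("aa:bb:cc:00:00:02", "2001:db8::11"),
]
after = [
    ("aa:bb:cc:00:00:01", "2001:db8::10"),  # IPv4 entry has vanished
    ("aa:bb:cc:00:00:02", "192.0.2.11"),
    ("aa:bb:cc:00:00:02", "2001:db8::11"),
]

result = lost_ipv4_only(before, after)
print(result)
```

Running something like this on periodic snapshots would at least give timestamps for when each IPv4 binding disappears, which could then be matched against the device logs.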
Does anyone have any idea as to what might be happening internally?
We can share configuration and logs if necessary.
- James
------------------------------
JAMES HOPKINS
------------------------------