Switching


Virtual chassis tail drop packets

  • 1.  Virtual chassis tail drop packets

    Posted 11-30-2020 12:35
    Hello

    We have a mixed Virtual Chassis made up of 4x EX4200 and 2x EX4550 switches running Junos 15.1R5.5, and we are seeing tail-dropped packets.

    For example:
    admin@inter-BB> show interfaces queue ae3    
    Physical interface: ae3, Enabled, Physical link is Up
      Interface index: 132, SNMP ifIndex: 770
      Description: Uplink UCS-2
    Forwarding classes: 16 supported, 4 in use
    Egress queues: 8 supported, 4 in use
    Queue: 0, Forwarding classes: best-effort
      Queued:
      Transmitted:
        Packets              :           23692449417
        Bytes                :        16787250219509
        Tail-dropped packets :               1173192
        RL-dropped packets   :                     0
        RL-dropped bytes     :                     0
    Queue: 1, Forwarding classes: assured-forwarding
      Queued:
      Transmitted:
        Packets              :                     0
        Bytes                :                     0
        Tail-dropped packets :                     0
        RL-dropped packets   :                     0
        RL-dropped bytes     :                     0
    Queue: 5, Forwarding classes: expedited-forwarding
      Queued:
      Transmitted:
        Packets              :                     0
        Bytes                :                     0
        Tail-dropped packets :                     0
        RL-dropped packets   :                     0
        RL-dropped bytes     :                     0
    Queue: 7, Forwarding classes: network-control
      Queued:
      Transmitted:
        Packets              :                 61618
        Bytes                :               6935602
        Tail-dropped packets :                     0
        RL-dropped packets   :                     0
        RL-dropped bytes     :                     0
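
To put the counters above in proportion, here is a minimal Python sketch that pulls the per-queue "Packets" and "Tail-dropped packets" values out of pasted `show interfaces queue` text and reports dropped packets relative to transmitted packets. The function name `tail_drop_ratios` and the parsing rules are assumptions made for illustration, based only on the output layout shown in this post; other Junos releases may format the output differently.

```python
import re

def tail_drop_ratios(cli_output):
    """Return {queue number: tail-dropped packets / transmitted packets}."""
    ratios = {}
    queue = None
    packets = None
    for line in cli_output.splitlines():
        m = re.match(r"\s*Queue:\s*(\d+)", line)
        if m:
            # A new queue section starts; forget the previous packet count.
            queue, packets = int(m.group(1)), None
            continue
        # Anchored at line start so "Tail-dropped packets" does not match here.
        m = re.match(r"\s*Packets\s*:\s*(\d+)", line)
        if m and queue is not None:
            packets = int(m.group(1))
            continue
        m = re.match(r"\s*Tail-dropped packets\s*:\s*(\d+)", line)
        if m and queue is not None and packets is not None:
            dropped = int(m.group(1))
            ratios[queue] = dropped / packets if packets else 0.0
    return ratios

# Counters copied from the ae3 output above (queues 0 and 7 only).
SAMPLE = """\
Queue: 0, Forwarding classes: best-effort
  Queued:
  Transmitted:
    Packets              :           23692449417
    Bytes                :        16787250219509
    Tail-dropped packets :               1173192
Queue: 7, Forwarding classes: network-control
  Queued:
  Transmitted:
    Packets              :                 61618
    Bytes                :               6935602
    Tail-dropped packets :                     0
"""

ratios = tail_drop_ratios(SAMPLE)
for q, r in sorted(ratios.items()):
    print(f"queue {q}: {r:.6%} tail-dropped relative to transmitted")
```

For queue 0 this works out to roughly 0.005%, or about one dropped packet per 20,000 transmitted since the counters were last cleared. A lifetime ratio says nothing about when the drops happened, so comparing two snapshots taken some time apart is more informative.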


    We enabled this setting, but it didn't help:

    set class-of-service shared-buffer percent 100


    We moved some interfaces from the EX4200 switches to the EX4550 switches, but that didn't help.
    We also tried the following settings, which didn't help either:

    set class-of-service drop-profiles terminal fill-level 100 drop-probability 100
    set class-of-service schedulers best-effort transmit-rate percent 100
    set class-of-service schedulers best-effort buffer-size percent 100
    set class-of-service schedulers best-effort priority strict-high


    We opened a ticket with TAC support, but there has been no progress so far.

    Has anyone encountered this problem and managed to solve it?



    ------------------------------
    Abed AL-Rahman Bishara
    ------------------------------


  • 2.  RE: Virtual chassis tail drop packets

    Posted 11-30-2020 13:01

    Hello Abed,

    - First, make sure this is not a Layer 1 issue (a physical problem on any of the child interfaces of this AE).
    - Check the duplex setting of the port.
    - Confirm whether these drops are the result of congestion (traffic exceeding the interface bandwidth).

    If none of the above applies, then these drops are most likely caused by micro-bursts (short spikes of packets received in a small interval at a rate much higher than the configured guaranteed bandwidth for a given queue).

    You will have to capture some sample packets on that interface and analyze what type of traffic is causing the micro-bursts, so that you can identify their source.
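
As a rough illustration of the analysis step, the Python sketch below bins captured packets into 1 ms windows and flags windows where the offered load reaches or exceeds the egress line rate. It assumes you have already exported `(timestamp in microseconds, size in bytes)` pairs for the traffic headed to the congested port; `find_bursts`, its parameters, and the synthetic trace are all made up for this example and are not taken from any Juniper tool.

```python
from collections import defaultdict

def find_bursts(packets, link_bps, window_us=1000, threshold=1.0):
    """packets: iterable of (timestamp_us, size_bytes), timestamps as integers.

    Returns [(window_start_us, utilization)] for every window whose offered
    load is at least `threshold` times what the link can send in that window.
    """
    bits_per_window = defaultdict(int)
    for ts_us, size_bytes in packets:
        bits_per_window[ts_us // window_us] += size_bytes * 8
    capacity_bits = link_bps * window_us / 1_000_000
    return [
        (idx * window_us, bits / capacity_bits)
        for idx, bits in sorted(bits_per_window.items())
        if bits / capacity_bits >= threshold
    ]

# Synthetic one-second trace (assumed data, not a real capture):
# steady background of one 1500-byte packet every 100 us ...
background = [(t, 1500) for t in range(0, 1_000_000, 100)]
# ... plus 1500 extra packets squeezed into the millisecond starting at 0.5 s.
burst = [(500_000 + i % 1000, 1500) for i in range(1500)]
trace = background + burst

average = sum(size * 8 for _, size in trace) / 10e9  # share of 10 Gbps over 1 s
bursts = find_bursts(trace, link_bps=10e9)

print(f"average utilization over 1 s: {average:.2%}")
for start_us, util in bursts:
    print(f"window at {start_us} us: {util:.0%} of line rate")
```

In this made-up trace the one-second average is under 2% of a 10 Gbps link, yet a single millisecond is offered well over line rate. That is why interface utilization graphs based on multi-second polling can look idle while the queue still tail-drops.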

    Regards,




  • 3.  RE: Virtual chassis tail drop packets

    Posted 11-30-2020 13:22
    Hi

    Thanks for your answer.
    It is definitely not a physical problem.

    If it is a micro-burst, what are the right commands to increase the bandwidth of queue 0 and avoid the tail drops?
    Do you have a sample configuration for an EX4550 switch?

    Thanks


    ------------------------------
    Abed AL-Rahman Bishara
    ------------------------------



  • 4.  RE: Virtual chassis tail drop packets

    Posted 11-30-2020 14:41
    It seems you have already increased the shared buffer to 100 percent, so to stop the drops you will need to find the source of the traffic. Unlike sustained congestion, which CoS can mitigate, micro-bursts can't be fixed with CoS.

    Regards,


  • 5.  RE: Virtual chassis tail drop packets

    Posted 12-03-2020 02:23
    Hi

    I think the best way to find the micro-bursts is port mirroring to a different port.
    Right now we cannot mirror the 20G LAG interfaces to another device, since we would need something powerful enough to handle that amount of traffic.

    Do you have any other ideas for accomplishing this?

    Thank you!

    ------------------------------
    Abed AL-Rahman Bishara
    ------------------------------



  • 6.  RE: Virtual chassis tail drop packets

    Posted 12-07-2020 07:44
    Do you happen to have flow control enabled?

    > show interfaces xe-1/1/0
    Physical interface: xe-1/1/0, Enabled, Physical link is Up
    Interface index: 187, SNMP ifIndex: 625
    Description: Uplink to QFX 2/2 AE30
    Link-level type: Ethernet, MTU: 1522, Speed: 10Gbps, Duplex: Full-Duplex, BPDU Error: None, MAC-REWRITE Error: None,
    Loopback: Disabled, Source filtering: Disabled, Flow control: Disabled

    Flow control can cause massive tail drops.

    Micro-bursts are an unavoidable evil of networking. Some solve them with larger buffers, but then you add delay instead, and who wants a packet that is half a second old? In some cases it may be better to receive packets late than to drop them, but not always.

    I think your problem is that you receive traffic on a high-speed interface (10G?) and try to send it out on a slower one (1G?). More interfaces in the LAG may solve the issue, but not always, as the packet distribution across LAG members is not always even. Your best bet is to upgrade the link if at all possible. On which switch do you see the drops? I guess on the EX4550, as that should be your central aggregation given that the EX4200s are 1G switches. Where does the traffic come from?
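
The buffer-versus-delay trade-off mentioned above can be put into numbers with a short Python sketch. It assumes a single FIFO queue that owns the whole buffer and drains at the full egress rate, and it treats 1 MB as 10^6 bytes; the function names and the sample buffer sizes are illustrative assumptions, not vendor figures.

```python
def drain_delay_ms(buffer_bytes, egress_bps):
    """Worst-case queueing delay: time to drain a completely full buffer."""
    return buffer_bytes * 8 / egress_bps * 1000

def buffer_for_delay_bytes(delay_s, egress_bps):
    """Buffer needed before a packet can sit in the queue for delay_s seconds."""
    return delay_s * egress_bps / 8

# Illustrative buffer sizes on 1G and 10G egress ports.
for size_mb, rate_bps in [(2.5, 1e9), (4, 10e9)]:
    delay = drain_delay_ms(size_mb * 1_000_000, rate_bps)
    print(f"{size_mb} MB at {rate_bps / 1e9:.0f} Gbps: up to {delay:.1f} ms of delay")

# How much buffer a "half a second old" packet would imply on a 1G port.
half_second_mb = buffer_for_delay_bytes(0.5, 1e9) / 1_000_000
print(f"0.5 s of delay at 1 Gbps needs about {half_second_mb:.1f} MB of buffer")
```

Under these assumptions a few megabytes of buffer add milliseconds of delay, not half a second; the half-second case would need tens of megabytes on a 1G port.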

    Another way of mitigating tail drops is to split the connection into VLAN groups. If you have, say, 2 x 1G carrying all your VLANs now, moving the worst VLAN to a LAG of its own will reduce or eliminate packet loss for the other VLANs.

    The EX4200 only has 2.5 MB of packet buffer per PFE. PFE 0 serves the first 24 interfaces, PFE 1 serves the next 24 interfaces on a 48-port switch, and the last PFE serves the expansion slot (PIC 1). This means that on a 48-port switch you may well see an improvement by splitting the LAG ports between the port groups: say ae0 has members ge-0/0/0 and ge-0/0/24, ae1 has ge-0/0/1 and ge-0/0/25, and so on. This uses the buffers of both PFEs for all LAGs. If you have the EX4200-24F, splitting LAGs between the ge-0/0/x and ge-0/1/x interfaces may work as well. Depending on your traffic patterns this may or may not be optimal, but it's worth thinking about.
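
To check an existing LAG layout against the port-to-PFE split described above, something like the following Python sketch could help. The mapping (ports 0-23 on PFE 0, ports 24-47 on PFE 1, PIC 1 on PFE 2) is taken from this post for a 48-port EX4200 and has not been verified against Juniper documentation; `pfe_of` and `lag_pfe_spread` are hypothetical helper names, and the first number of the interface name is treated as the Virtual Chassis member.

```python
import re

def pfe_of(ifname):
    """Return (member, pfe) for a 48-port EX4200 interface such as ge-0/0/24.

    Assumed layout (from the post above): PIC 0 ports 0-23 -> PFE 0,
    PIC 0 ports 24-47 -> PFE 1, PIC 1 (expansion slot) -> PFE 2.
    """
    m = re.fullmatch(r"(?:ge|xe)-(\d+)/(\d+)/(\d+)", ifname)
    if not m:
        raise ValueError(f"unrecognized interface name: {ifname}")
    member, pic, port = map(int, m.groups())
    if pic == 1:
        return (member, 2)
    if pic == 0 and 0 <= port <= 23:
        return (member, 0)
    if pic == 0 and 24 <= port <= 47:
        return (member, 1)
    raise ValueError(f"port outside the assumed layout: {ifname}")

def lag_pfe_spread(lags):
    """Map each LAG name to the sorted list of (member, pfe) pairs it touches."""
    return {
        name: sorted({pfe_of(ifname) for ifname in members})
        for name, members in lags.items()
    }

lags = {
    "ae0": ["ge-0/0/0", "ge-0/0/24"],  # split across both port groups
    "ae1": ["ge-0/0/1", "ge-0/0/25"],  # split across both port groups
    "ae2": ["ge-0/0/2", "ge-0/0/3"],   # both members on the same PFE
}
for name, pfes in lag_pfe_spread(lags).items():
    note = "" if len(pfes) > 1 else "  <- all members share one PFE buffer"
    print(name, pfes, note)
```

A LAG that reports a single `(member, pfe)` pair is drawing on only one PFE's buffer, which is the situation the suggested member split tries to avoid.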

    Buffer memory and tail drops are only relevant for traffic sent out of an interface, not for traffic received on it.


  • 7.  RE: Virtual chassis tail drop packets

    Posted 12-07-2020 08:13
    Hi

    Thanks for your answer.
    We don't have flow control enabled on the affected interfaces.
    Both 10G interfaces in the LAG are connected to Cisco UCS, with 10G interfaces all the way and no 1G interfaces in the middle.
    We also moved all the interfaces from the EX4200 to the EX4550 switches, so there is no EX4200 in the path either.

    Regarding splitting VLANs onto other interfaces, I don't think this is an option for now, since we only have 2 LAGs connected to the UCS and both of them are problematic.

    How large is the buffer for the EX4200 10GBASE uplink module PIC, and how large is it for the 32-port FPC of the EX4550-32F?

    Thanks

    ------------------------------
    Abed AL-Rahman Bishara
    ------------------------------



  • 8.  RE: Virtual chassis tail drop packets

    Posted 12-07-2020 09:05
    Hi!

    In previous searches, I have found this:

    EX2200 has 1.5 MB per PFE (I don't know how the PFEs are distributed here)

    EX3200/4200 has 2.5 MB (some say 3 MB) of shared buffer per PFE. The EX4200 can have 2 or 3 PFEs (2 in the 24-port and 3 in the 48-port versions; the separate 4x1G / 2x10G uplink ports have their own PFE). Edit: the EX3200 has 1 or 2 PFEs, I think (depending on the 24/48-port model), as the optional SFP ports are shared with the last ports (0/0/20-23 or 44-47).
    EX4500 has a 4 MB shared buffer. The shared buffer on the EX4550 is also said to be 4 MB.

    EX4600 has 12 MB of total packet buffer (TX dynamic up to 8 MB)
    Official from Junos Release notes: "EX4650 switches provide significantly more buffer memory (32 MB)"

    So the uplink module of the EX4200 has 2.5 MB (or 3 MB) of buffer.
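
For a feel of how little burst these buffers absorb, the Python sketch below estimates how long each one lasts when traffic arrives faster than the port can send it. The sizes are the unofficial figures quoted in this post (only the EX4650 number comes from release notes); the calculation assumes the whole shared buffer is available to one egress queue and that 1 MB is 10^6 bytes, both simplifications. `fill_time_ms` is a name invented for this example.

```python
import math

def fill_time_ms(buffer_bytes, ingress_bps, egress_bps):
    """Time until an empty buffer overflows when ingress exceeds egress."""
    surplus_bps = ingress_bps - egress_bps
    if surplus_bps <= 0:
        return math.inf  # the queue never builds up
    return buffer_bytes * 8 / surplus_bps * 1000

# Buffer sizes in MB as quoted in this post (unofficial unless noted there).
BUFFER_MB = {
    "EX2200": 1.5,
    "EX3200/EX4200": 2.5,
    "EX4500/EX4550": 4,
    "EX4600 (total)": 12,
    "EX4650": 32,
}

# Example overload: 20 Gbps offered to a 10 Gbps egress port.
for model, size_mb in BUFFER_MB.items():
    t = fill_time_ms(size_mb * 1_000_000, ingress_bps=20e9, egress_bps=10e9)
    print(f"{model:16s} {size_mb:>5} MB fills in about {t:.1f} ms")
```

With these assumptions a 4 MB buffer survives a 2:1 overload on a 10G port for only a few milliseconds, which fits the earlier point that such bursts are invisible in ordinary utilization graphs.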

    /Fredrik