Switching


Ask questions and share experiences about EX and QFX portfolios and all switching solutions across your data center, campus, and branch locations.
  • 1.  Packet Drops - output error

    Posted 07-03-2023 07:29

    Hi all,

    After receiving user complaints, I took it upon myself to investigate the issue. Upon inspection of my entire network, including EX2300 (two or three units in a cluster) and EX3400 (two units in a cluster), I discovered packet drops occurring on the egress of the users' 1Gb ports. I promptly opened a ticket with Juniper for assistance, but unfortunately, they were unable to provide a solution. They did acknowledge the presence of microbursts, but beyond that, their support was limited.

    In an attempt to mitigate the issue, I configured the following:

    set chassis fpc 0 pic 0 q-pic-large-buffer
    set chassis fpc 1 pic 0 q-pic-large-buffer
    

    Regrettably, this configuration adjustment did not yield the desired results.

    If anyone has any further insights or suggestions on how to address this problem of packet drops on the egress ports, I would greatly appreciate your assistance.

    Thanks



    ------------------------------
    TOMAS
    ------------------------------


  • 2.  RE: Packet Drops - output error

    Posted 07-10-2023 03:59

    Hi, does nobody have any suggestions?



    ------------------------------
    TOMAS
    ------------------------------



  • 3.  RE: Packet Drops - output error

    Posted 07-11-2023 09:45

    Have you tried this?

    set class-of-service shared-buffer percent 100

    I recommend this setting for all deployments. I have seen drastic improvements by using this.




  • 4.  RE: Packet Drops - output error

    Posted 07-12-2023 09:10

    Hi, 

    Yes, I forgot to mention it here. I am using this command on every switch. I was wondering whether 100 is the appropriate number. There isn't a specific "magic number", so what is considered best practice? It's not always beneficial to use the maximum value. And if I don't use this command, what is the default value?

    Thanks



    ------------------------------
    TOMAS
    ------------------------------



  • 5.  RE: Packet Drops - output error

    Posted 07-12-2023 10:13
    The default setting is platform specific, I think. Other vendors use around 25%, and I suspect 25-40% is what Juniper uses. 100% has no negative impact in normal and even odd use cases; it only matters in crazy corner cases. If multiple interfaces need the shared pool, they will contend for it, resulting in semi-fair access to the pool anyway. If that happens often, you have too little capacity anyway, so you really should upgrade the links.





  • 6.  RE: Packet Drops - output error

    Posted 2 days ago

    Hi Tomas,

    Can you please advise if you have found some solution for this problem?
    Would be much appreciated if you can share how (if) you fixed the issue.

    BR,
    Andrei



    ------------------------------
    Andrei Cebotareanu
    ------------------------------



  • 7.  RE: Packet Drops - output error

    Posted yesterday

    Hi!

    Output drops generally occur if the uplink is faster than the problematic interface, or if you have multiple uplinks of the same capacity (multiple 1 G feeding a single 1 G). This is called rate conversion. The higher the rate-conversion factor, the more likely you are to see tail drops. Having multiple users behind an interface, as in a distribution switch, also drives this, but single-user interfaces can see it too.

    What happens is that one or more users request data at the same time (within milliseconds). Statistically, this will happen once in a while, and the likelihood increases with the number of users. If the sources of the data can push the replies at a high data rate, and the links into the switch have more capacity than the problematic one, you get a very temporary queue of packets that all want to exit a specific interface. Say you have 2 x 10 G uplinks and a 1 G access interface. Even with only a single user on that interface, it is possible to request so much data that the buffer is overrun. That said, the problem is much, much more likely to occur in a distribution switch where you have lots of users behind one interface, like a stack of 4 x 48 ports.
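    The rate-conversion effect can be sketched with some back-of-the-envelope arithmetic. The 5 ms burst length and 4 MB buffer below are assumed numbers for illustration, not measured values from any EX platform:

```python
# Illustrative microburst sketch (assumed numbers, not Junos behavior):
# a 10 Gb/s burst arrives for 5 ms while the egress drains at 1 Gb/s,
# and a shallow egress buffer tail-drops everything that does not fit.

INGRESS_BPS = 10e9          # burst arrival rate (10 G uplink)
EGRESS_BPS = 1e9            # drain rate (1 G access port)
BUFFER_BYTES = 4 * 2**20    # assumed 4 MB packet buffer
BURST_SECONDS = 0.005       # assumed 5 ms microburst

arrived = INGRESS_BPS / 8 * BURST_SECONDS   # bytes offered during the burst
drained = EGRESS_BPS / 8 * BURST_SECONDS    # bytes the egress can send meanwhile
backlog = arrived - drained                 # bytes that must be buffered
dropped = max(0.0, backlog - BUFFER_BYTES)  # tail drops once the buffer is full

print(f"offered {arrived/1e6:.2f} MB, backlog {backlog/1e6:.2f} MB, "
      f"dropped {dropped/1e6:.2f} MB")
```

    Even this short a burst from a single fast source overruns the buffer, which is why the drops show up without any sustained congestion on the port.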

    Sadly, there's little one can do about this apart from the 100% shared buffer setting, which has proved extremely successful for me in multiple scenarios with no negative side effects.

    One thing that can make things better, if the distribution switches see this, is to increase the number of links from distribution to access (as I assume that's where the issue is seen). If you have 2 x 1 G there, increasing to 4 x 1 G could make a difference, but going to 2 x 10 G certainly will. Splitting a VC/stack of 4 x switches into two VCs, or even into single switches, will, in my experience, make even more of a difference than going to 4 x 1 G, as the number of potential users that can "collide" is reduced.

    Juniper's EX2300 and EX3400 switches have, like other vendors' products in that segment, very shallow buffers; they are in the 2-4 MB range. The old EX2200 had 1.5 MB per PFE, meaning the first 24 ports shared 1.5 MB; on a 48-port switch, the next 24 ports had another 1.5 MB, and the SFPs had 1.5 MB. The EX4200 had 3 MB per PFE, so double the amount. I don't have the numbers for the EX2300 and up. Surely the EX4100 series has more buffer, and the EX4400 presumably even more. That brings up the question: for how long do you want to wait for a packet? If the buffer can hold a second's worth of traffic and uses it from time to time, you have a serious design issue!
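    To put those buffer sizes in perspective, here is a quick illustrative calculation of how many milliseconds of line-rate traffic such a buffer can absorb on a 1 G port (decimal MB assumed for simplicity):

```python
# How long can a shallow buffer delay traffic at line rate?
# Buffer sizes taken from the post: 1.5 MB per PFE on the EX2200,
# roughly 2-4 MB on EX2300/EX3400.

def drain_ms(buffer_mb: float, rate_gbps: float) -> float:
    """Milliseconds needed to drain a full buffer at rate_gbps."""
    # MB * 8 bits/byte / (Gb/s) conveniently comes out in milliseconds
    return buffer_mb * 8 / rate_gbps

for mb in (1.5, 3.0, 4.0):
    print(f"{mb} MB buffer at 1 Gb/s = {drain_ms(mb, 1.0):.0f} ms of queueing delay")
```

    So even the shallowest of these buffers corresponds to roughly 12-32 ms of delay at 1 G, which is why "more buffer" quickly turns into "more latency" rather than a real fix.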

    Sorry, not a solution, but perhaps some insight. The more info you can provide, the better replies you get!




  • 8.  RE: Packet Drops - output error

    Posted yesterday

    Hi, Juniper didn't give me any reliable solution. I have stopped monitoring this parameter, and thank God the users stopped complaining about connectivity.



    ------------------------------
    TOMAS
    ------------------------------



  • 9.  RE: Packet Drops - output error

    Posted yesterday

    I'm still relatively new to Juniper and am always looking to learn more.  In trying to find packet drops, is this a good command to do so?

    show interfaces extensive | match drops

    Output of this command looks like this:

    Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Giants: 0, Policed discards: 0, Resource errors: 0
        Carrier transitions: 3, Errors: 0, Drops: 0, MTU errors: 0, Resource errors: 0
        Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Giants: 0, Policed discards: 0, Resource errors: 0
        Carrier transitions: 3, Errors: 0, Drops: 0, MTU errors: 0, Resource errors: 0
        Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Policed discards: 0, L3 incompletes: 0, L2 channel errors: 0, L2 mismatch timeouts: 0, FIFO errors: 0, Resource errors: 0
        Carrier transitions: 3, Errors: 0, Drops: 3574, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
        Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Giants: 0, Policed discards: 0, Resource errors: 0
        Carrier transitions: 0, Errors: 0, Drops: 0, MTU errors: 0, Resource errors: 0
        Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Giants: 0, Policed discards: 0, Resource errors: 0
        Carrier transitions: 0, Errors: 0, Drops: 0, MTU errors: 0, Resource errors: 0

    From what I'm seeing, the sixth line with the 3574 drops would be an issue, correct?  My problem is: great, I found something, but a) is it really anything, and b) now that I have discovered that, how in the world do I narrow down WHICH interface it is?  Is it the "third" configured interface when I view show interfaces terse?

    Thanks everyone!



    ------------------------------
    JIM VADEN
    ------------------------------



  • 10.  RE: Packet Drops - output error

    Posted yesterday

    Here is a rather advanced expression for you that prints the interface names and then the counters for interfaces with drops. All other lines with no drops (last column is a single 0) will be hidden. The heading with "Dropped packets" is also shown for clarity.

    This works on the EX4100-F at least. Older versions have slightly different headings, so it is difficult to create an exact match expression that will cover all versions.

    show interfaces "[xg]e*" extensive | match "Physical|Dropped packets| 0 +[0-9]+ +[0-9]+ +[1-9]"

    To see that the expression actually does what it's supposed to, change the last [1-9] to [0-9], so even those lines that end with a single 0 are shown:

    show interfaces "[xg]e*" extensive | match "Physical|Dropped packets| 0 +[0-9]+ +[0-9]+ +[0-9]"

    Remove the heading to compress things:

    show interfaces "[xg]e*" extensive | match "Physical| 0 +[0-9]+ +[0-9]+ +[1-9]"

    Look at all the queues, not just best effort (0):

    show interfaces "[xg]e*" extensive | match "Physical| [0-9] +[0-9]+ +[0-9]+ +[1-9]"

    If the above looks weird, read up on regex (regular expressions) and you'll understand (some day).
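    If you want to see what the pattern actually matches, the same regex can be tested offline, for example in Python, against sample lines of queue-counter output (the sample lines below are made up for illustration):

```python
# The "| match" filter in Junos is a regex; testing the same pattern
# offline shows which lines it keeps. Sample lines are fabricated.
import re

pattern = re.compile(r"Physical| 0 +[0-9]+ +[0-9]+ +[1-9]")

sample = [
    "Physical interface: ge-0/0/7, Enabled, Physical link is Up",
    "    0                        703760236            703760236                    0",
    "    0                           123456               120000                 3456",
]

for line in sample:
    if pattern.search(line):
        print(line)
```

    The interface heading and the queue-0 line with nonzero drops are printed; the queue-0 line whose last column is a single 0 is filtered out, which is exactly the behavior described above.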

    You can also look in this section of the output from show interfaces extensive:

      Queue counters:       Queued packets  Transmitted packets      Dropped packets
        0                        703760236            703760236                    0
        1                                0                    0                    0
        2                                0                    0                    0
        3                         10682623             10682623                    0
        8                         57193202             48991457              8201745
        9                                0                    0                    0
        10                               0                    0                    0
        11                               0                    0                    0

    Here, queue 8 (multicast) has taken a beating due to a simulated loop. For normal tail drops, you will see drops in best effort, queue 0, which is what I focus on in the expressions above.
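    If you prefer to post-process that queue-counter table off-box, a small sketch like this (plain Python, not a Junos feature) flags the queues with drops:

```python
# Parse "Queue counters" rows (queue, queued, transmitted, dropped)
# and report any queue with a nonzero Dropped packets column.
# The rows below are copied from the sample output above.
lines = """\
  0                        703760236            703760236                    0
  1                                0                    0                    0
  8                         57193202             48991457              8201745
""".splitlines()

for line in lines:
    queue, queued, transmitted, dropped = line.split()
    if int(dropped) > 0:
        print(f"queue {queue}: {dropped} dropped of {queued} queued")
```

    Only queue 8 is reported here, matching the simulated multicast loop; in the normal tail-drop case you would see queue 0 flagged instead.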