SRX

Why traffic is very slow over ipsec

  • 1.  Why traffic is very slow over ipsec

     
    Posted 04-29-2019 00:25

    Hi all,

    I am having a performance issue with IPsec traffic. I checked the onsite devices (SRX/EX switches) and they are fine. Can I please ask for any ideas about why traffic through the datacentre is very slow? Also, how can I verify whether resizing or fragmentation is happening between the two ends of the tunnel? I appreciate your help.

     

    In short, the topology is:
    ex2200---->SRXbranch1------>3rd party ISP(mpls)------->650SRX(high number of IPSec tunnels are being terminated here)----->core mpls network(all resources here).

     

    SRXbranch>show configuration security | display set | match mss
    set security flow tcp-mss all-tcp mss 1450


    {master:0}
    ex2200> traceroute y.y.y.y source Z.Z.Z.Z ------------>Z.Z.Z.Z is a WIFI l3 vlan interface that sits on ex2200, y.y.y.y is a remote tunnel on 650SRX.
    traceroute to y.y.y.y (y.y.y.y) from Z.Z.Z.Z, 30 hops max, 40 byte packets
    1 c.c.c.c (c.c.c.c) 3.888 ms 3.290 ms 3.860 ms
    2 y.y.y.y(y.y.y.y) 25.513 ms 25.991 ms 37.813 ms
    {master:0}
    ex2200>

    {master:0}
    ex2200> ping y.y.y.y source Z.Z.Z.Z do-not-fragment size 1473
    PING y.y.y.y (y.y.y.y): 1473 data bytes
    ping: sendto: Message too long
    ping: sendto: Message too long
    ping: sendto: Message too long
    ....

    ......
    --- y.y.y.y ping statistics ---
    5 packets transmitted, 0 packets received, 100% packet loss

    {master:0}
    ex2200>


    {master:0}
    ex2200> ping y.y.y.y source Z.Z.Z.Z do-not-fragment size 1472
    PING y.y.y.y (y.y.y.y): 1472 data bytes
    1480 bytes from y.y.y.y: icmp_seq=1 ttl=63 time=30.230 ms
    1480 bytes from y.y.y.y: icmp_seq=2 ttl=63 time=29.576 ms
    1480 bytes from y.y.y.y: icmp_seq=3 ttl=63 time=28.392 ms
    ..
    ...
    ....
    --- y.y.y.y ping statistics ---
    100 packets transmitted, 83 packets received, 17% packet loss
    round-trip min/avg/max/stddev = 28.392/41.628/82.196/15.626 ms
    {master:0}
    ex2200>


    {master:0}
    ex2200> ping y.y.y.y source z.z.z.z size 1400
    PING y.y.y.y (y.y.y.y): 1400 data bytes
    1408 bytes from y.y.y.y: icmp_seq=0 ttl=63 time=42.407 ms
    1408 bytes from y.y.y.y: icmp_seq=1 ttl=63 time=44.052 ms
    1408 bytes from y.y.y.y: icmp_seq=2 ttl=63 time=35.322 ms
    1408 bytes from y.y.y.y: icmp_seq=3 ttl=63 time=29.590 ms
    .
    ..
    ...
    1408 bytes from y.y.y.y: icmp_seq=47 ttl=63 time=29.715 ms
    ^C
    --- y.y.y.y ping statistics ---
    48 packets transmitted, 46 packets received, 4% packet loss
    round-trip min/avg/max/stddev = 29.106/41.144/73.689/13.243 ms

    {master:0}

     

    Erux..



  • 2.  RE: Why traffic is very slow over ipsec

     
    Posted 04-29-2019 10:10

    I'd suggest setting the MSS for IPsec traffic to 1328 to account for the various sources of overhead. This makes the TCP endpoints send smaller segments, so the packets are already small enough before encryption, which should prevent fragmentation outside of the tunnel.

     

    set security flow tcp-mss ipsec-vpn mss 1328

     

    https://packetpushers.net/ipsec-bandwidth-overhead-using-aes/



  • 3.  RE: Why traffic is very slow over ipsec

    Posted 04-30-2019 12:12

    Hello Arix,

     

    Here is a breakdown of packet size in your network shown in the post.

     

    Assuming your traffic is TCP over IPv4:

     

    TCP Header (20 bytes) + IP Header (20 bytes) + ESP Header (38 bytes) + External IPv4 header (20 bytes) + Ethernet Switching including VLAN (18 bytes) + MPLS header (4 bytes) =  120 bytes

     

    In this case, an MTU of 1518 on SRX allows you to have 1398 bytes of payload.
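
    The arithmetic above can be written out as a small Python sketch. The header sizes are simply the ones quoted in this post (the 38-byte ESP figure in particular depends on the cipher and is taken as given), and the names used are illustrative only.

```python
# Per-packet overhead as itemised in the post above (bytes).
# These figures are assumptions taken from the thread; ESP overhead
# in particular varies with the cipher, hash and padding in use.
OVERHEAD = {
    "tcp_header": 20,
    "inner_ip_header": 20,
    "esp": 38,
    "outer_ip_header": 20,
    "ethernet_with_vlan": 18,
    "mpls_label": 4,
}


def max_tcp_payload(mtu: int, overhead: dict[str, int] = OVERHEAD) -> int:
    """Return the TCP payload that fits once every header is subtracted."""
    return mtu - sum(overhead.values())


if __name__ == "__main__":
    print(sum(OVERHEAD.values()))  # total overhead in bytes
    print(max_tcp_payload(1518))   # payload left for an MTU of 1518
```

    Changing one entry (for example a different ESP overhead) shows how much headroom a given MSS leaves.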

     

    Note that the SRX MTU includes Ethernet switching header whereas other devices may only calculate it without Ethernet header and hence have a lower number.

     

    I would suggest setting the MSS to around 1350 bytes.

     

    If you simply want to see whether fragmentation is occurring or not, you can do a capture before the SRX and see if any of the ESP packets has the "More Fragments" flag set.
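
    As a rough illustration of what to look for in such a capture, here is a Python sketch that decodes the fragmentation fields of a raw IPv4 header. It assumes you already have the header bytes from a pcap; `build_test_header` is a hypothetical helper that only fabricates headers for the demonstration.

```python
import struct

MF_FLAG = 0x2000      # "More Fragments" bit in the flags/offset field
OFFSET_MASK = 0x1FFF  # fragment offset, counted in units of 8 bytes
PROTO_ESP = 50        # IP protocol number for ESP


def describe_ipv4_header(header: bytes) -> dict:
    """Decode the fields of a raw IPv4 header that relate to fragmentation."""
    if len(header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes")
    flags_and_offset = struct.unpack("!H", header[6:8])[0]
    more_fragments = bool(flags_and_offset & MF_FLAG)
    offset_bytes = (flags_and_offset & OFFSET_MASK) * 8
    return {
        "protocol": header[9],
        "more_fragments": more_fragments,
        "fragment_offset": offset_bytes,
        # A packet is a fragment if MF is set or its offset is non-zero.
        "is_fragment": more_fragments or offset_bytes > 0,
    }


def build_test_header(total_len: int, flags_and_offset: int,
                      proto: int = PROTO_ESP) -> bytes:
    """Hypothetical helper: build a minimal IPv4 header (checksum left at 0)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_len, 1, flags_and_offset, 64, proto, 0,
        bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]),
    )


if __name__ == "__main__":
    first_fragment = build_test_header(1500, MF_FLAG)
    print(describe_ipv4_header(first_fragment))
```

    Note that the last fragment of a packet has the flag cleared but a non-zero offset, which is why the sketch checks both.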

     

    Hopefully this helps!

     

    Thanks!



  • 4.  RE: Why traffic is very slow over ipsec

     
    Posted 05-06-2019 17:02

    Hi all,

    Thanks for the replies. I have some questions relating to the same case:

    • Why do we need to capture packets on the EX that is directly connected to the SRX? Why not on the SRX itself?
    • How can packets be captured easily on the EX or the SRX?
    • The 3rd-party ISP runs MPLS; how can we find out the ISP's MSS value?
    • After the ISP, how can we verify the packet size and MSS when a packet arrives at the other end, the SRX3400 in the datacentre (end-to-end MSS verification for IPsec traffic)?


  • 5.  RE: Why traffic is very slow over ipsec

     
    Posted 05-08-2019 16:54

    Hi all,

    Any chance someone could address the questions I posted previously?

     

    I would really appreciate your ideas, techniques and approaches.

    I look forward to your replies.

    Thanks



  • 6.  RE: Why traffic is very slow over ipsec

     
    Posted 05-10-2019 03:59

    Hello,

     

    This is a very common issue we see with performance over IPSec VPN. I would therefore first try to set the tcp-mss value for VPN traffic as suggested by "CRM" earlier and check for any performance improvement.

     

    set security flow tcp-mss ipsec-vpn mss 1328

     

    Please make sure this is set on both sides of the VPN tunnel: on the branch and at the hub location.

     

    Getting into packet captures can get messy and time-consuming. Fragmentation may not necessarily be happening on the firewall. Frag and de-frag anywhere along the path is a costly operation and can impact latency.

     

    Regards,

     

    Vikas



  • 7.  RE: Why traffic is very slow over ipsec

     
    Posted 06-05-2019 19:48

    Hi all,

    Just following up my previous post....

     

    Digging further into the case, the "Packets dropped" and "Fragment packets" counters are increasing rapidly on the branch and hub SRX devices. After clearing the flow statistics, I collected the following output over a 10-minute window from one of the branches and from the hub. More than 500 sites are connected to the hub over IPsec VPN. Only the branches are configured with an MSS of 1450.

    Two things concern me here: the fragment packets and the dropped packets. Are these the same issue or two different ones? At the branch site they also increase at nearly the same rate. If a packet is fragmented, why would it be dropped? Or is the drop caused by something else? How can I determine the cause?

     

    Before adding MSS 1328 to the current configuration, I need evidence from proper troubleshooting that shows fragmentation and drops are actually happening. Also, what is the impact of changing the MSS, starting at 1328, during business hours?

    I look forward to your replies.

    Note: I appreciate all the valuable ideas, techniques and approaches you have already shared, but this time I want to investigate more thoroughly.

       

    Branch site:

    >show security flow statistics
    Current sessions: 877
    Packets forwarded: 805455
    Packets dropped: 18626
    Fragment packets: 26961

     

    set security flow tcp-mss all-tcp mss 1450

     

    Hub site:

    >show security flow statistics
    Current sessions: 20662
    Packets forwarded: 14079819
    Packets dropped: 3851
    Fragment packets: 258276

     

     

    Thanks

    Ar



  • 8.  RE: Why traffic is very slow over ipsec

     
    Posted 06-06-2019 21:42

    Any reply to my previous post?



  • 9.  RE: Why traffic is very slow over ipsec

    Posted 06-12-2019 00:20

    Arix, can you verify whether the "Replay errors" counter is incrementing by running the command "show security ipsec statistics" twice?
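
    If it helps, comparing two snapshots of the command output can be scripted. The following Python sketch assumes the output has been saved as text and that counters appear as "name: value" pairs, as in the Errors section of that command; the function names and the sample values are made up for illustration.

```python
import re

# Matches "Counter name: 123" pairs, whether they are separated by
# commas or by newlines. Section titles such as "Errors:" are skipped
# because no number follows them on the same line.
COUNTER_RE = re.compile(r"([A-Za-z][A-Za-z ]*?):[ \t]*(\d+)")


def parse_counters(cli_output: str) -> dict[str, int]:
    """Extract every "name: value" counter from saved command output."""
    return {name.strip(): int(value)
            for name, value in COUNTER_RE.findall(cli_output)}


def incremented(before: str, after: str) -> dict[str, int]:
    """Return the counters whose value grew between two snapshots."""
    old, new = parse_counters(before), parse_counters(after)
    return {name: new[name] - old.get(name, 0)
            for name in new if new[name] > old.get(name, 0)}


if __name__ == "__main__":
    first = "AH authentication failures: 0, Replay errors: 0"
    second = "AH authentication failures: 0, Replay errors: 7"  # made-up values
    print(incremented(first, second))
```

    The same comparison works for any other counter in the output, so it can also show which error types stay flat.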



  • 10.  RE: Why traffic is very slow over ipsec

     
    Posted 06-12-2019 19:36

    Hi all

     

    I am not sure what the idea behind checking Replay errors is, but I ran the command for you.

     

    >show security ipsec statistics
    ESP Statistics:
        Encrypted bytes: 258828544
        Decrypted bytes: 323126770
        Encrypted packets: 842164
        Decrypted packets: 800696
    AH Statistics:
        Input bytes: 0
        Output bytes: 0
        Input packets: 0
        Output packets: 0
    Errors:
    AH authentication failures: 0, Replay errors: 0
    ESP authentication failures: 0, ESP decryption failures: 0
    Bad headers: 0, Bad trailers: 0

     

    My aim here is to find a sign or evidence, from traceoptions or firewall filter logs, that fragmentation is happening.

    I recently configured the traceoptions shown at the end of this post on the SRX. I couldn't see any sign that fragmentation is happening; I only saw the "pre-frag not needed" lines quoted below.

    Why can't we see fragmentation of the IPsec traffic, given that the only MSS configuration is "set security flow tcp-mss all-tcp mss 1450" and the fragment packet count from "show security flow statistics" is huge and increasing rapidly?

     

    >show security flow statistics
          Current sessions: 225
         Packets forwarded: 14444351807
         Packets dropped: 162144762
         Fragment packets: 864461746

     

    1-) Is the SRX the correct place to capture packets with traceoptions, or must the capture be done on the EX switch?

    2-) If we can't see fragmentation in the traceoptions files from the flow module on the SRX, is the fragmentation happening before the packets reach the flow module? If so, where is it happening? On the physical interface? Which tool should be used: traceoptions or a firewall filter?

    ..

    ....

    Jun 12 14:10:11 14:10:11.054473:CID-0:RT:pre-frag not needed: ipsize: 783, mtu: 9188, nsp2->pmtu: 9188

    Jun 12 14:10:11 14:10:11.085986:CID-0:RT:pre-frag not needed: ipsize: 844, mtu: 1422, nsp2->pmtu: 1422

    ....

    ........

     

    set security flow traceoptions file Fregmentation_Check files 3 size 5m world-readable
    set security flow traceoptions flag basic-datapath
    set security flow traceoptions packet-filter packet-filter1 source-prefix 10.108.103.246
    set security flow traceoptions packet-filter packet-filter2 destination-prefix 10.108.103.246

    Note: all traffic is routed to the IP address 10.108.103.246 on the SRX before it goes into the IPsec tunnel.

    Thx.

    Ar



  • 11.  RE: Why traffic is very slow over ipsec

    Posted 06-14-2019 00:44

    Regarding:

     

    Branch site:
    
    >show security flow statistics
    Current sessions: 877
    Packets forwarded: 805455
    Packets dropped: 18626
    Fragment packets: 26961

     

    The fragment counter only means that IP fragments were received at the flow module; it doesn't necessarily mean that those fragments came over the tunnel.

     

    In Junos 15.1X49-90 two new fields were added to that output: "Pre fragments" and "Post fragments". This is because when the SRX is about to send packets over a VPN, it can either fragment the packets prior to encapsulating them (because of a low MTU value on the st0 interface) or fragment them after encrypting them (based on the physical interface MTU). Please check motd's reply in the following post:

     

    https://forums.juniper.net/t5/SRX-Services-Gateway/VPN-Fragmentation/td-p/87642

     

    Please share the MTU configured on your st0 interfaces. If it is something like ~9000, then it is more likely that the packets are fragmented after being encrypted, in which case a pcap confirming fragmented ESP packets will help you. In fact, the traces tell you that pre-fragmentation wasn't needed:

     

    Jun 12 14:10:11 14:10:11.054473:CID-0:RT:pre-frag not needed: ipsize: 783, mtu: 9188, nsp2->pmtu: 9188
    
    Jun 12 14:10:11 14:10:11.085986:CID-0:RT:pre-frag not needed: ipsize: 844, mtu: 1422, nsp2->pmtu: 1422

     

    Again, the fragmentation may be happening after the packets are encrypted. If you lower the MSS to 1350, smaller packets are encrypted, so the encrypted packets are smaller and less likely to be fragmented.
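
    A rough model of the two cases described above, as a Python sketch. It is not how the SRX flow module is implemented; the 58-byte tunnel overhead (38 bytes of ESP plus a 20-byte outer IPv4 header) is borrowed from figures quoted elsewhere in this thread, and the function name is made up for illustration.

```python
# Assumed tunnel overhead: 38 B of ESP + 20 B of outer IPv4 header.
TUNNEL_OVERHEAD = 58


def where_fragmented(ip_size: int, st0_mtu: int, phys_mtu: int,
                     overhead: int = TUNNEL_OVERHEAD) -> str:
    """Classify where a clear-text packet of ip_size bytes would be split."""
    if ip_size > st0_mtu:
        # Too big for the tunnel interface: split before encryption.
        return "pre-fragmentation (before encryption)"
    if ip_size + overhead > phys_mtu:
        # Fits st0, but the encrypted packet exceeds the physical MTU.
        return "post-fragmentation (after encryption)"
    return "no fragmentation on this device"


if __name__ == "__main__":
    # Sizes taken from the two trace lines quoted above, with an
    # assumed physical MTU of 1500.
    print(where_fragmented(783, 9188, 1500))
    print(where_fragmented(844, 1422, 1500))
    # A full-size packet with a large st0 MTU.
    print(where_fragmented(1500, 9188, 1500))
```

    With an st0 MTU around 9000, a 1500-byte packet falls into the second case, which matches the suspicion that fragmentation happens after encryption.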

     

    +And what is the impact when playing mss value start point of 1328 during the business hours?

    R/ The MSS is advertised during the TCP 3-way handshake, hence the change will only affect new TCP connections established over the tunnel.

     

    I really hope this information helps you.

     



  • 12.  RE: Why traffic is very slow over ipsec

     
    Posted 06-18-2019 03:46

    Hi stwardlp,

    Thanks for your replies. I have read your posts. I will review them again and get back to you. There are some interesting tips you pointed out. I need some time to work through them.

    Much appreciated....

     

    Thanks



  • 13.  RE: Why traffic is very slow over ipsec

    Posted 06-23-2019 09:09

    Hi Arix,

     

    Here are two interesting documents you might want to look at as well, covering the DF bit and fragmentation issues for traffic over VPN.

     

    https://rtodto.net/ipsec-tcp-mss-df-bit-and-fragmentation-in-srx/

    https://kb.juniper.net/InfoCenter/index?page=content&id=KB25625&cat=OBSOLETE&actp=LIST

     

    Thanks

    Mahesh



  • 14.  RE: Why traffic is very slow over ipsec

     
    Posted 07-05-2019 05:22

    hi all,

     

    Can anyone explain what the final packet size will be when sending ICMP via ping through an IPsec tunnel?

    There is only the following statement: set security flow tcp-mss all-tcp mss 1460

    The st0 interface sits on the external physical interface (VDSL), whose protocol MTU is 1500.

     

    srx345>ping 10.10.10.10 source 20.20.20.1 ---->When pinging, what is the final packet size?

     

    1460B + 20B inner IP + 20B inner TCP header + 38B ESP + 20B external IP + 20B ICMP IP + 8B ICMP header? Is this correct or not?

     

     



  • 15.  RE: Why traffic is very slow over ipsec

     
    Posted 07-06-2019 20:17

    Any idea about my previous post?



  • 16.  RE: Why traffic is very slow over ipsec

     
    Posted 03-28-2020 04:32

    Hi there.

     

    So far I do not have a resolution, but I am struggling with a similar issue.

    Upload performance from customer sites to their central HQ is bad, and so is download at the HQ from the sites; in other words, it is bad whenever the traffic flows towards the HQ.

    HQ: SRX1500 Cluster 15.1X49-D190

    Sites: SRX340 Cluster 15.1X49-D190 (D210 has been tested with no effect)

     

    IP gateways for the VLANs are generally on the Core switch.

    Edge Switches<L2>CORE<L3>SRX340<VPN>SRX1500<L3>CORE<L2>Edge Switches

     

    I have tested VPN to the HQ SRX1500 on SRX210 (12.1X46-D86) and SRX3400 (12.3X48-D95) and here performance is good. There is however no CORE switch routing first in this setup.

     

    We do have a TAC case, but despite a lot of investigation there is still no resolution after two months.

     

    I will keep you posted.

     

    Any input on this is appreciated.

     

    Stay safe and healthy.

     

     



  • 17.  RE: Why traffic is very slow over ipsec

    Posted 06-13-2019 23:30

    Hi Arix,

     

    I can see that you have the following command: set security flow tcp-mss all-tcp mss 1450

     

    Note that if you care about the traffic passing over the VPN, then the option you need is "set security flow tcp-mss ipsec-vpn mss [value]". This way you only affect VPN traffic, which is the traffic carrying extra overhead due to the ESP and new IP headers being added.

     

                   If TCP packet enters an IPsec VPN tunnel, then an ipsec-vpn mss value has high priority over all-tcp mss value, hence ipsec-vpn mss value is set.

                  Ref: https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/security-edit-tcp-mss.html

     

    Note the caveat of this command: it only affects traffic (TCP SYN messages) entering a VPN, not traffic arriving via a VPN. This is very important because, to affect traffic in both directions, the command needs to be set on both ends of the tunnel.
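
    The precedence rule and the caveat above can be modelled in a few lines of Python. This is only a sketch of the behaviour as described in this post, not Juniper's implementation: the function names are hypothetical, and the assumption that the clamp only ever lowers the advertised MSS is mine.

```python
from typing import Optional


def configured_limit(enters_ipsec_tunnel: bool,
                     all_tcp: Optional[int],
                     ipsec_vpn: Optional[int]) -> Optional[int]:
    """Pick the MSS limit that applies to a TCP SYN on this device."""
    if enters_ipsec_tunnel and ipsec_vpn is not None:
        # ipsec-vpn takes priority over all-tcp for traffic entering a tunnel.
        return ipsec_vpn
    return all_tcp


def clamp_syn_mss(advertised: int, limit: Optional[int]) -> int:
    """Lower the MSS advertised in a SYN to the limit (assumed: never raise it)."""
    if limit is None:
        return advertised
    return min(advertised, limit)


if __name__ == "__main__":
    # SYN entering the tunnel on a device with both statements configured.
    limit = configured_limit(True, all_tcp=1450, ipsec_vpn=1328)
    print(clamp_syn_mss(1460, limit))
    # SYN arriving from the tunnel: only all-tcp applies on this device.
    limit = configured_limit(False, all_tcp=1450, ipsec_vpn=1328)
    print(clamp_syn_mss(1460, limit))
```

    The second case is the reason the statement is needed on both ends: each device only rewrites the SYNs it sends into the tunnel.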

     

    Another consideration is that the regular MSS value for a TCP segment is 1460 bytes, so a value of 1450 does not change the final packet size much. Please review epaniagua's explanation of MSS in the following forum thread:

     

               https://forums.juniper.net/t5/SRX-Services-Gateway/Site-to-Site-VPN-TCP-MMS-Issue/td-p/444842

     

    1460B of data + 20B of TCP header + 20B of IP header = an IP packet of 1500 bytes (the common MTU value from which the common 1460B MSS is derived)
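
    The same relationship for the MSS values mentioned in this thread, as a short Python sketch. It assumes 20-byte IPv4 and TCP headers with no options; the function name is illustrative.

```python
IPV4_HEADER = 20  # bytes, no IP options
TCP_HEADER = 20   # bytes, no TCP options


def ip_packet_size(mss: int) -> int:
    """Size of the clear-text IP packet carrying a full-size TCP segment."""
    return mss + TCP_HEADER + IPV4_HEADER


if __name__ == "__main__":
    # MSS values that appear in this thread.
    for mss in (1460, 1450, 1350, 1328):
        print(mss, ip_packet_size(mss))
```

    The difference between 1460 and 1450 is only 10 bytes per packet, which is far less than the tunnel overhead that has to be absorbed.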

     

    As stated in that thread, it's a best practice to use an MSS of 1350 for VPN tunnels on both ends. Maybe you can try it and let us know the results.

     



  • 18.  RE: Why traffic is very slow over ipsec

    Posted 06-13-2019 23:37

    Regarding the ping test#1:

     

    ex2200> ping y.y.y.y source Z.Z.Z.Z do-not-fragment size 1473
    PING y.y.y.y (y.y.y.y): 1473 data bytes
    ping: sendto: Message too long
    ping: sendto: Message too long
    ping: sendto: Message too long
    

     

    This packet won't leave your EX device. By default, Juniper uses an MTU of 1500 on its interfaces (logical interfaces/units), meaning that the maximum size of a packet that the interface can send is 1500 bytes. The packet you are trying to send has this size:

    1473B of data + 8B of ICMP header + 20B of IP header = 1501B, hence exceeding the MTU of the sending interface and getting dropped because the DF bit is set.
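
    The check behind the "Message too long" error can be sketched in Python as follows. It assumes a 20-byte IPv4 header without options and the default protocol MTU of 1500 mentioned above; the function names are made up.

```python
ICMP_HEADER = 8   # bytes
IPV4_HEADER = 20  # bytes, no IP options


def ping_ip_size(data_bytes: int) -> int:
    """IP packet size produced by a ping with the given data size."""
    return data_bytes + ICMP_HEADER + IPV4_HEADER


def df_ping_leaves_interface(data_bytes: int, mtu: int = 1500) -> bool:
    """With DF set, the packet is only sent if it fits the MTU unfragmented."""
    return ping_ip_size(data_bytes) <= mtu


if __name__ == "__main__":
    for size in (1472, 1473):
        print(size, ping_ip_size(size), df_ping_leaves_interface(size))
```

    This only models the local interface; it says nothing about whether the encrypted packet fits the MTU of later hops.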

     

    However, ping test #2 gives us more valuable information:

     

    {master:0}
    ex2200> ping y.y.y.y source Z.Z.Z.Z do-not-fragment size 1472
    PING y.y.y.y (y.y.y.y): 1472 data bytes
    1480 bytes from y.y.y.y: icmp_seq=1 ttl=63 time=30.230 ms
    1480 bytes from y.y.y.y: icmp_seq=2 ttl=63 time=29.576 ms
    1480 bytes from y.y.y.y: icmp_seq=3 ttl=63 time=28.392 ms
    ..
    ...
    ....
    --- y.y.y.y ping statistics ---
    100 packets transmitted, 83 packets received, 17% packet loss
    round-trip min/avg/max/stddev = 28.392/41.628/82.196/15.626 ms
    {master:0}

     

    The first thing to note is the 17% packet loss, which I believe could be the source of your slowness. The amount of data being sent (1472B) results in a final IP packet size of 1500B, which is the standard MTU size (no problem), but after being encrypted its size will increase. This increase in packet size can cause fragmentation on subsequent hops/routers carrying those encrypted packets to the remote end of the VPN, where they will be decrypted and the packet will return to its normal size of 1500B. If you take a packet capture on the remote SRX's external interface, you can confirm whether you are receiving fragmented ESP packets.

     

    Besides the fact that processing fragments is CPU intensive, if a single fragment is lost then the whole original packet has to be retransmitted, generating even more fragments. This is why it is a best practice to avoid fragmentation.
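
    To put a number on that effect, here is a small Python sketch. It assumes each fragment is lost independently with the same probability, which is a simplification and not something measured in this thread.

```python
def packet_loss_probability(fragment_loss: float, fragments: int) -> float:
    """Chance that a packet is lost when any one of its fragments is lost.

    Assumes every fragment is dropped independently with the same
    probability, so the packet survives only if all fragments do.
    """
    if not 0.0 <= fragment_loss <= 1.0:
        raise ValueError("fragment_loss must be between 0 and 1")
    if fragments < 1:
        raise ValueError("a packet consists of at least one fragment")
    return 1.0 - (1.0 - fragment_loss) ** fragments


if __name__ == "__main__":
    # Hypothetical per-fragment loss rate of 2%.
    for n in (1, 2, 3):
        print(n, round(packet_loss_probability(0.02, n), 4))
```

    Under that assumption, splitting every packet in two nearly doubles the effective loss rate seen by TCP.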

     



  • 19.  RE: Why traffic is very slow over ipsec

    Posted 06-14-2019 00:07

    I also would like to clarify some points:

     

    1) "Note that the SRX MTU includes Ethernet switching header whereas other devices may only calculate it without Ethernet header and hence have a lower number."

     

    This statement is not entirely true and I would like to avoid any confusion. Juniper handles two types of MTU values:

     

    Protocol MTU (layer 3): this is the maximum size of an IP packet that can be sent/received on a logical interface/unit. Default value: 1500 bytes
    Interface MTU (layer 2): this is the maximum size of an Ethernet frame that can be sent/received on a physical interface. Default value: 1514 bytes (IP packet + 14 bytes of Ethernet header):

     

    user@host> show interfaces fe-0/2/0 extensive
    Physical interface: fe-0/2/0, Enabled, Physical link is Up
    Interface index: 129, SNMP ifIndex: 23, Generation: 130
    Link-level type: Ethernet, MTU: 1514, Speed: 100mbps, Loopback: Disabled,
    Source filtering: Disabled, Flow control: Enabled
    .
    .
    .
    Logical interface fe-0/2/0.0 (Index 66) (SNMP ifIndex 46) (Generation 133)
    Flags: SNMP-Traps Encapsulation: ENET2
    Protocol inet, MTU: 1500, Generation: 142, Route table: 0
    Flags: DCU, SCU-out


    Fragmentation happens at layer 3: the IP header is the header with the fields used for fragmentation. Because of this, we care about the MTU at layer 3, which is 1500B by default. We need to make sure that the packets won't exceed 1500B in size, otherwise the sending interface will fragment them.
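
    To make the layer-3 behaviour concrete, the Python sketch below splits an IPv4 packet the way standard IP fragmentation does (every fragment except the last carries a multiple of 8 bytes of data). It assumes a 20-byte IP header without options, and the function name is made up.

```python
def fragment_sizes(ip_packet_size: int, mtu: int = 1500,
                   header: int = 20) -> list[int]:
    """Return the sizes of the IPv4 packets sent for one original packet."""
    if ip_packet_size <= mtu:
        return [ip_packet_size]  # fits as-is, no fragmentation
    # Data per fragment must be a multiple of 8 bytes (except the last).
    per_fragment = (mtu - header) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any fragment data")
    payload = ip_packet_size - header
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + header)  # each fragment gets its own IP header
        payload -= chunk
    return sizes


if __name__ == "__main__":
    print(fragment_sizes(1500))  # fits the default protocol MTU
    print(fragment_sizes(1558))  # e.g. 1500 B plus an assumed 58 B of tunnel overhead
```

    A packet that is only slightly over the MTU still becomes two packets, the second one tiny, which is why a small MSS reduction is usually cheaper than fragmentation.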

     

    Regarding your questions:

     

    +Why do we need to capture packets on the EX that is directly connected to the SRX? Why not on the SRX itself?
    R/ This is not needed. As stated, the pcap is needed on the remote SRX to determine whether we are receiving fragmented ESP packets.

     

    +How can packets be captured easily on the EX or the SRX?
    R/ Not needed.

     

    +The 3rd-party ISP runs MPLS; how can we find out the ISP's MSS value?
    R/ MSS is a TCP concept (the amount of data that can be carried in a TCP segment). Before the data reaches the MPLS cloud it has to be encapsulated in TCP, then IP, then ESP, then IP again. MSS is relevant on the sending host side, where we need to lower it if we want smaller packets to reach the MPLS cloud, where they will be encapsulated in MPLS and hence end up bigger in size.


    +After the ISP, how can we verify the packet size and MSS when a packet arrives at the other end, the SRX3400 in the datacentre (end-to-end MSS verification for IPsec traffic)?
    R/ You can take a pcap on the external interface of SRXbranch1, where we will be able to see the size of the packet. Then we can add the 4 bytes of MPLS header added by your ISP. I'm not sure what you meant by SRX3400; I thought the remote SRX was an SRX650 as per the topology.