Strange problem I'm seeing. Customer currently runs an IPsec VPN over an MPLS link, but they would like to save some money and move to an internet-based IPsec VPN. Problem is, even though the internet-based VPN comes up fine, some application traffic is failing. Switch traffic back to the MPLS-based IPsec VPN and everything comes good.
We've narrowed the issue to MTU/MSS. For the sake of clarity, the two setups look like this:
App client --> SRX1(st0.x) --> VPN --> SRX2 --> F5 --> App server (This works)
App client --> SRX1(st0.y) --> VPN --> ??? (Cisco device I think) --> F5 --> App server (TCP/SSL setup correctly, but some app traffic fails)
I have control over SRX1 and SRX2, but nothing else. SRX1 and SRX2 both have a tunnel interface MTU of 1400 and an IPsec tcp-mss value of 1350. The traffic always originates from the left of the flow above, transiting SRX1 first.
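For reference, the relevant Junos configuration on SRX1 and SRX2 looks roughly like this (the st0 unit number is a placeholder; only the two values mentioned above are taken from the actual setup):

```
set security flow tcp-mss ipsec-vpn mss 1350
set interfaces st0 unit 0 family inet mtu 1400
```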
I have been given a packet capture taken from the app client computer and can see:
- When the traffic uses the MPLS link, the SYN/ACK in the TCP handshake has the MSS value set to 1350, which matches the setting on SRX1 and SRX2
- When the traffic uses the internet-based VPN, the SYN/ACK in the TCP handshake keeps the original MSS value set by the app client of 1460
After some reading up on the subject, my understanding is that, with the TCP MSS configuration that we have, SRX1 should ALWAYS replace the MSS value in the SYN packet coming from the app client before sending the SYN to the destination through the VPN tunnel. This would mean that the SYN packet would have an MSS value of 1350 when arriving at the app server. Given what I've seen in the packet capture I mentioned, I am assuming this is working properly when the traffic is using the MPLS-based IPsec VPN, but for some reason not when using the internet-based VPN.
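To make the expected behaviour concrete, here is a minimal sketch (plain Python, purely illustrative, not SRX code) of the clamping that a middlebox configured with `tcp-mss ipsec-vpn mss 1350` should perform on the MSS option of a transiting SYN:

```python
import struct

TCP_OPT_EOL = 0
TCP_OPT_NOP = 1
TCP_OPT_MSS = 2

def clamp_mss(options: bytes, max_mss: int) -> bytes:
    """Rewrite the MSS option in a TCP options blob if it exceeds max_mss.

    Mimics what an MSS-clamping device does to a transiting SYN/SYN-ACK.
    """
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == TCP_OPT_EOL:
            break
        if kind == TCP_OPT_NOP:        # single-byte padding option
            i += 1
            continue
        length = out[i + 1]
        if kind == TCP_OPT_MSS and length == 4:
            (mss,) = struct.unpack_from("!H", out, i + 2)
            if mss > max_mss:
                struct.pack_into("!H", out, i + 2, max_mss)
        i += length
    return bytes(out)

# A SYN carrying the client's default MSS of 1460 (kind=2, len=4, value=1460):
syn_opts = struct.pack("!BBH", 2, 4, 1460)
print(clamp_mss(syn_opts, 1350).hex())  # MSS field now reads 1350 (0x0546)
```

In the working MPLS scenario, the capture shows exactly this rewrite happening; in the internet-based scenario it apparently does not.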
- Both IPsec VPNs originate on SRX1
- Both VPN configs leave the DF-bit setting as default (ie. clear)
- Both the st0.x and st0.y interfaces are in the same security zone and routing instance
- Nothing changes in the path from "app client" to SRX1 regardless of which VPN is used
- The only change I make to have the app traffic use the internet-based VPN is to add a static route, pointing the traffic at st0.y
- When using the MPLS-based VPN, the route is learned via BGP
- The only difference between the two tunnel interfaces is that st0.x (MPLS) has an IP address, whereas st0.y (internet) does not
- Routing, security zones and security policies are all set up correctly. In the internet-based VPN scenario we still get a successful TCP handshake, followed by a successful SSL handshake, for the app client to app server connection. It's only once the app client tries to send data that the communication fails. In pcaps, I can see that the data packets sent by the app client are noticeably larger when the route is switched to the internet-based VPN. Hence the MSS conclusion.
I've done datapath-debug and flow traces on SRX1 and cannot find anything that helps me understand where the problem lies. I could assume that SRX1 isn't changing the MSS value in the TCP SYN when using the internet-based VPN, but that would seem like a bug and I'd rather exhaust all other possibilities first.
Happy to hear thoughts and suggestions 🙂
As I understand it, when the server sees the SYN packet the TCP MSS should already have been adjusted to 1350 by SRX1; however, the server may not reply with an MSS of 1350. Its advertised MSS could be higher, depending on its own MTU:
This Maximum Segment Size (MSS) announcement (often mistakenly called a negotiation) is sent from the data receiver to the data sender and says "I can accept TCP segments up to size X". The size (X) may be larger or smaller than the default. The MSS can be used completely independently in each direction of data flow. The result may be quite different maximum sizes in the two directions.
Devices may wish to use a larger MSS if they know for a fact that the MTUs of the networks the segments will pass over are larger than the IP minimum of 576. This is most commonly the case when large amounts of data are sent on a local network; PMTUD is used to determine the appropriate MSS.
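The usual arithmetic behind these numbers, as a quick sketch: the MSS is the MTU minus the IP and TCP headers (20 bytes each without options), so a 1400-byte tunnel MTU supports an MSS of 1360, and clamping to 1350 leaves a little headroom:

```python
def mss_for_mtu(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """Largest TCP payload that fits in one IP packet of the given MTU."""
    return mtu - ip_header - tcp_header

print(mss_for_mtu(1400))  # 1360 -- so the 1350 setting leaves headroom for TCP options
print(mss_for_mtu(1500))  # 1460 -- the app client's default on a standard Ethernet path
```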
I understand that when SRX1 receives a packet that is too large for the tunnel and has the DF bit set, it will send an ICMP Destination Unreachable / Fragmentation Needed message [ICMP Type 3, Code 4] back to the source. If ICMP is filtered at the other end of the tunnel, PMTUD is not possible and the packet size will not be adjusted.
Maybe you could check on SRX2 whether you see any ICMP Type 3, Code 4 packets.
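One way to look for these on the SRX CLI is with a capture filter (interface name is a placeholder; the match expression uses standard pcap-filter syntax):

```
monitor traffic interface ge-0/0/0 matching "icmp[icmptype] == 3 and icmp[icmpcode] == 4"
```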
Thanks for the response, Ashvin.
I definitely agree with your first point about the SRX adjusting the MSS in the TCP SYN packets. Thanks for the reference to RFC879 as well. I've now had a read of that and I think I understand what the problem is likely to be.
Based on the reading that I've done so far, my understanding of the SRX1 behaviour when the df-bit setting on an IPsec VPN is left at default (ie. "clear") is that the SRX won't send the ICMP Type 3, Code 4 response to the sender; it will simply fragment the packet as needed before putting the encrypted packet into the VPN tunnel. This setting is identical on both SRX1 and SRX2, therefore I would be very surprised to see any ICMP Type 3, Code 4 packets being used.
What I now think is happening is that SRX2 is picking up the SYN-ACK packet being returned to the app client and adjusting the MSS in that packet to 1350, since that is what the tcp-mss ipsec-vpn setting is on that SRX. This is why the SYN-ACK packet in the capture I've seen shows the adjusted MSS value, rather than the typical value of 1460.
When traffic is shifted to the internet-based VPN, the terminating device at the other end (??? Cisco) is not adjusting the MSS value on the SYN-ACK packet coming back from the app server, therefore the app client receives an MSS of 1460 and thinks it can send larger packets. This fits with the behaviour and the packet captures that I've seen.
I could possibly affect this behaviour by changing the df-bit setting on the specific IPsec tunnel to "copy", forcing SRX1 to send the ICMP Type 3, Code 4 response back to the app client. That may be one solution that could work. However, I'm hoping to convince the other end to do the MSS adjust on the SYN-ACK packet.
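For reference, the change I'm considering would be a one-liner (the VPN name here is a placeholder):

```
set security ipsec vpn internet-vpn df-bit copy
```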
Thanks again for your response - it provided me a new track to investigate.
This looks like an "MTU blackhole" problem to me. It happens when a transit network is SILENTLY dropping packets above a certain size X. Normally this should never happen: packets that are too large must either be fragmented or dropped with an ICMP message sent back to the source.
I suggest that you try pinging with different packet sizes (with the don't-fragment option set) to figure out whether this is happening, and then try to work out where. TCP-MSS adjustment is good for improving VPN performance, but it is not meant to work around MTU blackholes. In particular, it only works for TCP traffic. Also, to get the performance improvement, you need to adjust TCP-MSS on both ends.
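That ping sweep can be automated as a binary search for the blackhole threshold. A sketch in Python, where `probe()` stands in for a real don't-fragment ping (e.g. `ping -M do -s <size> <host>` on Linux); here it is stubbed with a fake 1400-byte path so the search logic itself can be demonstrated:

```python
def find_max_passing_size(probe, lo: int = 68, hi: int = 1500) -> int:
    """Binary-search the largest DF-set packet size that gets a reply.

    'probe(size)' should send one don't-fragment ping of that size and
    return True on a reply -- e.g. by shelling out to 'ping -M do -s size'.
    Assumes the path behaves monotonically (all sizes above the limit fail).
    """
    best = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        if probe(mid):
            best = mid       # this size got through; try larger
            lo = mid + 1
        else:
            hi = mid - 1     # dropped; try smaller
    return best

# Stub standing in for a path that silently drops anything over 1400 bytes:
fake_path = lambda size: size <= 1400
print(find_max_passing_size(fake_path))  # 1400
```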
I won't type out my whole response again (see above), but what you have said makes good sense. Perhaps I will be better served by changing the df-bit setting on my SRX1 rather than relying on a configuration change at the other end to make this work. However, if I can get them to update their settings as well, then perhaps that is the best result.
I had thought of the ping-of-different-sizes test but, as I don't have control over the app client, nor do I have permission to let a test host ping the app server, this option isn't available to me. So far, trying to marshal the application guys to run tests like this all at the same time has been somewhat difficult.
Thanks for your response.
-> The sender here is the app client, and the ingress packets can be fragmented by the SRX, so an ICMP Type 3, Code 4 does not need to be sent back to the sender
-> Agree. Also, SRX2 may be fragmenting return ingress packets from the app server [default df-bit clear behaviour], as both SRXs have a tunnel MTU of 1400.
-> Not sure this would definitely resolve the issue, as it seems to me it's large return packets that are probably being dropped. If the IPsec tunnel MTUs are asymmetrical on the Cisco vs. the SRX [Cisco being larger than SRX], and large return encapsulated packets from the app server have the outer DF bit set, the packet would be dropped by the SRX and I think an ICMP Type 3, Code 4 would need to be sent back to the app server.
Two additional things that you could verify/change:
1. Use the same tunnel MTU on both sides of the tunnel
2. Ensure ICMP Type 3, Code 4 is allowed through to the app server for PMTUD
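In Junos terms, those two checks might look something like this (the st0 unit, zone and policy names are placeholders for your environment):

```
set interfaces st0 unit 1 family inet mtu 1400
set security policies from-zone vpn to-zone trust policy allow-pmtud match source-address any
set security policies from-zone vpn to-zone trust policy allow-pmtud match destination-address any
set security policies from-zone vpn to-zone trust policy allow-pmtud match application junos-icmp-all
set security policies from-zone vpn to-zone trust policy allow-pmtud then permit
```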
Hope this helps.