So we've got some strange behaviour with SMB/CIFS through an IPSEC VPN tunnel. I suspect an MTU/MSS issue; however, I'm unable to pinpoint the root cause. The situation is:
server#1 <> SRX650 <internet/IPSEC VPN> Cisco RV320 <> server#2
On the SRX650 I've lowered the MSS:
set security flow tcp-mss ipsec-vpn mss 1350
set security flow tcp-session no-syn-check (this was set for issues with another customer's VPN)
When I log in to server#1 and open a share on server#2 (both are Windows servers, share opened in Explorer as \\server#2\share), I get the following speeds:
Download: ~ 5 MB/s (copy file from server#2 \\server#2\share towards server#1)
Upload: 355 KB/s (copy file from server#1 towards server#2 \\server#2\share)
As you can see, something is really wrong when transferring files from server#1 to server#2. When I do the same test from server#2, the speeds follow the same direction of traffic (upload from server#2 is ~5 MB/s and download to server#2 is 355 KB/s).
I've run Wireshark on server#1 and saw lots of retransmissions when uploading to server#2. I'm almost certain that it has something to do with the MTU/MSS of IPSEC traffic; however, the tcp-mss ipsec-vpn statement should fix this. But it doesn't.
So any suggestions are welcome!
I do not believe the MSS setting should have an impact here.
IPSEC MSS would apply in both directions from Server1 to Server2 and from Server2 to Server1
Say Server 2 is sending a large MSS value: the SRX will modify it to 1350 before sending it to Server 1. Server 1 would then limit its segment size when sending to Server 2, which would explain the reduced throughput.
However, this should apply in the same way in the other direction too, when data is being sent from Server 2 to Server 1.
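To make the reasoning above concrete, here is a minimal Python sketch of how a clamp would act on each side of a connection. The function names are made up for illustration, and the advertised value of 1460 is an assumed example, not something read from the devices; only the clamp of 1350 comes from the configuration in this thread.

```python
from typing import Optional


def clamp_mss(advertised_mss: int, clamp: Optional[int]) -> int:
    """MSS forwarded by a gateway: the advertised value, lowered to the clamp if one is set."""
    if clamp is None:
        return advertised_mss
    return min(advertised_mss, clamp)


def max_segment(own_mss: int, mss_received_from_peer: int) -> int:
    """Largest TCP payload a sender will use: never more than the MSS it received."""
    return min(own_mss, mss_received_from_peer)


CLAMP = 1350          # from "set security flow tcp-mss ipsec-vpn mss 1350"
ASSUMED_MSS = 1460    # assumption: both servers advertise 1460

# If the clamp is applied to the SYN and to the SYN-ACK, both senders end up at 1350.
server1_receives = clamp_mss(ASSUMED_MSS, CLAMP)
server2_receives = clamp_mss(ASSUMED_MSS, CLAMP)
print("Server 1 -> Server 2 segment size:", max_segment(ASSUMED_MSS, server1_receives))
print("Server 2 -> Server 1 segment size:", max_segment(ASSUMED_MSS, server2_receives))
```

If a capture shows one side sending segments larger than the clamp, that would suggest the MSS it received was not rewritten in that direction.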
> Did you get a chance to check the actual MSS advertised by Server1 and Server2 in the captures?
> Since you say you see a lot of re-transmissions from Server 1 to Server 2 it could be something with the ISP circuit either on SRX side or Cisco side.
> Do you have another VPN where the same test can be done for reference?
Hi Vikas,

Thanks for taking the time to respond. Sorry for the delay in responding. We've tested the same VPN through another SRX650, same results. I'd created some additional captures. One thing I noticed is:

When "downloading" from server2 to server1, which is fast, I see that the TCP data segments have the following values:
Len=1350

However, when I "upload" from server1 to server2, which is slow, I see the following values:
Len=1400

I believe the clue might lie somewhere in those values.

I did a single capture of a "fast" download from server2 (192.168.1.212) to server1 (10.39.77.35). Looking at the capture (@server1), I see several SYN packets, these are from the data stream:

4779 60.398813 10.39.77.35 192.168.1.212 TCP 66 61055 → 445 [SYN, ECN, CWR] Seq=0 Win=8192 Len=0 MSS=1460 WS=256 SACK_PERM=1
4782 60.443103 192.168.1.212 10.39.77.35 TCP 66 445 → 61055 [SYN, ACK, ECN] Seq=0 Ack=1 Win=8192 Len=0 MSS=1400 WS=256 SACK_PERM=1
4783 60.443159 10.39.77.35 192.168.1.212 TCP 54 61055 → 445 [ACK] Seq=1 Ack=1 Win=131584 Len=0

This results in data packets like:

6222 79.718252 192.168.1.212 10.39.77.35 TCP 1404 445 → 61055 [ACK] Seq=25223 Ack=6336 Win=64768 Len=1350 [TCP segment of a reassembled PDU]

>>>>>FULL DETAILS
Frame 6336: 1404 bytes on wire (11232 bits), 1404 bytes captured (11232 bits) on interface 0
Internet Protocol Version 4, Src: 192.168.1.212, Dst: 10.39.77.35
Transmission Control Protocol, Src Port: 445, Dst Port: 61055, Seq: 117023, Ack: 6336, Len: 1350
    Source Port: 445
    Destination Port: 61055
    [Stream index: 79]
    [TCP Segment Len: 1350]
    Sequence number: 117023 (relative sequence number)
    [Next sequence number: 118373 (relative sequence number)]
    Acknowledgment number: 6336 (relative ack number)
    0101 .... = Header Length: 20 bytes (5)
    Flags: 0x010 (ACK)
    Window size value: 253
    [Calculated window size: 64768]
    [Window size scaling factor: 256]
    Checksum: 0x7f67 [unverified]
    [Checksum Status: Unverified]
    Urgent pointer: 0
    [SEQ/ACK analysis]
        [iRTT: 0.044346000 seconds]
        [Bytes in flight: 1350]
        [Bytes sent since last PSH flag: 85050]
    TCP payload (1350 bytes)
    [Reassembled PDU in frame: 7435]
    TCP segment data (1350 bytes)
<<<<<<<<<<<<<<<<<

Then I did a completely new single capture of a "slow" upload from server1 (10.39.77.35) to server2 (192.168.1.212). Below the packets:

4250 87.604773 10.39.77.35 192.168.1.212 TCP 66 61129 → 445 [SYN, ECN, CWR] Seq=0 Win=8192 Len=0 MSS=1460 WS=256 SACK_PERM=1
4254 87.648893 192.168.1.212 10.39.77.35 TCP 66 445 → 61129 [SYN, ACK, ECN] Seq=0 Ack=1 Win=8192 Len=0 MSS=1400 WS=256 SACK_PERM=1
4255 87.648961 10.39.77.35 192.168.1.212 TCP 54 61129 → 445 [ACK] Seq=1 Ack=1 Win=131584 Len=0

This results in data packets like:

4860 94.929241 10.39.77.35 192.168.1.212 TCP 1454 61129 → 445 [ACK] Seq=43040 Ack=6075 Win=131584 Len=1400 [TCP segment of a reassembled PDU]

>>>>>FULL DETAILS
Frame 4860: 1454 bytes on wire (11632 bits), 1454 bytes captured (11632 bits) on interface 0
Internet Protocol Version 4, Src: 10.39.77.35, Dst: 192.168.1.212
Transmission Control Protocol, Src Port: 61129, Dst Port: 445, Seq: 43040, Ack: 6075, Len: 1400
    Source Port: 61129
    Destination Port: 445
    [Stream index: 108]
    [TCP Segment Len: 1400]
    Sequence number: 43040 (relative sequence number)
    [Next sequence number: 44440 (relative sequence number)]
    Acknowledgment number: 6075 (relative ack number)
    0101 .... = Header Length: 20 bytes (5)
    Flags: 0x010 (ACK)
    Window size value: 514
    [Calculated window size: 131584]
    [Window size scaling factor: 256]
    Checksum: 0x1f59 [unverified]
    [Checksum Status: Unverified]
    Urgent pointer: 0
    [SEQ/ACK analysis]
        [iRTT: 0.044188000 seconds]
        [Bytes in flight: 23800]
        [Bytes sent since last PSH flag: 42000]
    TCP payload (1400 bytes)
    TCP segment data (1400 bytes)
<<<<<<<<<<<<<<<<<

I also checked ping with different sizes and do-not-fragment set, limit is 1472 bytes:

>ping -f -l 1472 192.168.1.212
Pinging 192.168.1.212 with 1472 bytes of data:
Reply from 192.168.1.212: bytes=1472 time=44ms TTL=126
Ping statistics for 192.168.1.212:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 44ms, Maximum = 44ms, Average = 44ms
Control-C
^C

>ping -f -l 1473 192.168.1.212
Pinging 192.168.1.212 with 1473 bytes of data:
Packet needs to be fragmented but DF set.
Ping statistics for 192.168.1.212:
    Packets: Sent = 1, Received = 0, Lost = 1 (100% loss),
Control-C
^C
Interesting detail: when I run the same 'ping test' from server2 towards server1, the allowed size is much lower. In this direction the limit is only 1410 bytes:
server2>ping -n 1 -f -l 1411 10.39.77.35
Pinging 10.39.77.35 with 1411 bytes of data:
Packet needs to be fragmented but DF set.
Ping statistics for 10.39.77.35:
    Packets: Sent = 1, Received = 0, Lost = 1 (100% loss),

server2>ping -n 1 -f -l 1410 10.39.77.35
Pinging 10.39.77.35 with 1410 bytes of data:
Reply from 10.39.77.35: bytes=1410 time=44ms TTL=126
Ping statistics for 10.39.77.35:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 44ms, Maximum = 44ms, Average = 44ms
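For reference, the ping sizes and segment lengths above translate into IP packet sizes as in this small Python sketch. It assumes plain IPv4 headers without options (20 bytes), an 8-byte ICMP echo header and a 20-byte TCP header, which matches the "Header Length: 20 bytes" shown in the captures. The function names are just for illustration, and this is only arithmetic, not a diagnosis.

```python
IP_HEADER = 20    # IPv4 header without options (assumed)
ICMP_HEADER = 8   # ICMP echo header
TCP_HEADER = 20   # TCP header without options, as shown in the captures


def ip_size_from_ping(payload: int) -> int:
    """IP packet size produced by 'ping -l <payload>'."""
    return payload + ICMP_HEADER + IP_HEADER


def ip_size_from_segment(tcp_len: int) -> int:
    """IP packet size of a TCP segment carrying tcp_len bytes of data."""
    return tcp_len + TCP_HEADER + IP_HEADER


def max_mss_for(ip_size: int) -> int:
    """Largest TCP payload that fits into an IP packet of the given size."""
    return ip_size - IP_HEADER - TCP_HEADER


limit_1_to_2 = ip_size_from_ping(1472)  # largest DF ping from server1 to server2
limit_2_to_1 = ip_size_from_ping(1410)  # largest DF ping from server2 to server1

print("server1 -> server2 DF limit:", limit_1_to_2, "bytes, fits MSS", max_mss_for(limit_1_to_2))
print("server2 -> server1 DF limit:", limit_2_to_1, "bytes, fits MSS", max_mss_for(limit_2_to_1))
print("download segments (Len=1350):", ip_size_from_segment(1350), "byte IP packets")
print("upload segments (Len=1400):", ip_size_from_segment(1400), "byte IP packets")
```

This gives 1500 and 1438 bytes for the two ping limits, and 1390 and 1440 bytes for the download and upload data packets. Note that a DF ping only shows where a packet is rejected; it would not reveal a gateway that fragments after encryption instead, so the 1500-byte result does not by itself prove the tunnel carries full-size packets unfragmented.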
I'm going to try to set up a simple SRX for this VPN, just to rule out other configurations. If you have any suggestions, please let me know.

Regards,
What about setting the MSS to 1250, and setting the same value on both sides?
Sorry for the late reply 😉 Unfortunately I can't really change anything on the other side (like mss etc) because it's a Cisco RV320.
I still find it really strange that I get ~ 5 MB/s one way, and only ~ 900 KB/s the other way ...
I'm still exploring possible solutions ...
Did you manage to solve this?