Hi all, let me begin by saying my networking skills are not very good. I completed my JNCIA certification many years ago but due to the nature of my work since then I have never really been tasked with any network administration, so forgive me if I do not make much sense - if you need any further clarification please let me know and I will do my best to answer.
We have 3 sites, 1 in Australia, 1 in Malaysia, 1 in China. We also have a cloud environment (AWS).
These 4 sites are connected by site-to-site VPNs and everything was working fine up until about 2 weeks ago, when all of a sudden our China office could no longer establish a VPN with AWS. The VPNs from China to Malaysia and China to Australia continue to operate, but we cannot establish IPsec tunnels with AWS.
I have gone through various troubleshooting methods with AWS and the only answer they could give me was that their endpoint was receiving the Phase 1 proposal and responding, but our Juniper SSG140 on-site in China did not receive the response and eventually timed out. We went to the ISP, who suggested it may be ChinaNet (their backbone) dropping IKE traffic as part of the government crackdown on VPNs.
I am hoping to understand if it is possible to route traffic from our site in China to AWS via Malaysia as the tunnel between China/Malaysia remains up and tunnel between Malaysia/AWS remains up?
I figure if this is possible I would need to set up some sort of static route, however this is where I get a bit lost. If I set up a static route on the China firewall for all traffic to AWS (10.2.x.x) to be delivered to the tunnel interface of the China-Malaysia VPN, it just times out, and I can't figure out a way to do this via source routing on the Malaysia firewall.
I assume I may be way off, but if someone can point me in the right direction I would be eternally grateful!
You are on the right track.
As you mention, first you need to route the AWS prefixes from China to Malaysia.
Next the VPN at Malaysia to AWS will need to include the China subnet. This will require changes on both nodes of this VPN.
I assume the VPN between China and Malaysia already includes the China subnet that accesses AWS, so no changes would be needed there if this is a Juniper route-based VPN. But if it is policy-based you also need to add the AWS subnet to the traffic selectors here.
Finally, you will need to look at the security policies in place at both sites. Policies are written from zone to zone in the direction that traffic is initiated. So if AWS pushes information to China, the policy from the AWS zone to Malaysia will need to allow the China subnets, and the opposite direction is needed if China pushes to AWS.
Likewise, on both China and Malaysia the VPN traffic security policy needs to allow the AWS subnets between the two sites, on both sides, in the correct zone-to-zone direction.
Thanks for your response.
I'm still not having much luck. For further clarification, all the VPNs are route-based.
From within AWS I deleted the existing VPN tunnel to China, as it has been down for quite some time, and I put in a route to our China network (10.47.0.0/16) pointing to the virtual gateway which the VPN from Malaysia connects to. There is a similar entry here for the Malaysia (10.3.0.0/16) network, which is propagated from BGP I assume, and since Malaysia can reach AWS I assume this is where the change on the AWS side needs to be made.
From China I have added a route sending all traffic for AWS (10.2.0.0/16) via the Malaysia-China tunnel interface, with the Malaysia side as the gateway. I assume this is working, because when I run a traceroute from a domain controller in China to a domain controller in AWS I can see the first hop is the switch, the second is the China firewall, and the third hop reaches this gateway with some latency, which I would expect given the physical distance between the two. Then it times out.
I've made sure that the route maps we have set up, which are attached to various interfaces on the Malaysia firewall (for example the BGP route to AWS), allow for 10.3.x.x and 10.47.x.x, and I am pretty sure I am not missing something here.
Trying to map it out logically, I still can't understand how the SSG in Malaysia differentiates between traffic between it and China and traffic from China meant for AWS, and I assume that this is why the traceroute times out: it just doesn't know where to send it? Do I need to configure something to allow this? You mentioned I could probably do it via policy, however as these are route-based there isn't a policy assigned to them?
Thanks again for your response, and your patience, after this I fully intend to go back over some networking basics because I am drawing blanks when it comes to this!
Sounds like your routing is good now. Routing only cares about the destination address of the packet. So in Malaysia any source address with a destination of AWS will be routed to the tunnel. And any source address with a destination of China will be routed to that tunnel. At AWS you have verified the China route goes to the Malaysia VPN so we are good there.
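To make the "destination only" point concrete, here is a minimal Python sketch of a longest-prefix-match lookup. It is only a model of the idea, not how the SSG implements it; the interface names are hypothetical, and the prefixes are the ones mentioned in this thread. Note that the source address is never an input to the lookup.

```python
import ipaddress

# Hypothetical routing table for the Malaysia firewall.
# Prefixes come from this thread; interface names are made up for illustration.
ROUTES = {
    "10.2.0.0/16": "tunnel-to-aws",
    "10.47.0.0/16": "tunnel-to-china",
    "10.3.0.0/16": "local-lan",
}

def next_hop(dst, routes=ROUTES):
    """Return the egress interface for a destination IP, or None if no route matches.

    Only the destination is considered; the most specific (longest) prefix wins.
    """
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, iface in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None

# A packet from China (10.47.x.x) to AWS (10.2.x.x) is routed purely on 10.2.x.x.
print(next_hop("10.2.1.5"))
```

So a packet arriving from the China tunnel with an AWS destination is simply handed to whatever interface the AWS prefix points at, the same as traffic that originated in Malaysia.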
The traceroute verifies that the VPN between China and Malaysia is good for the traffic.
We need to verify that the vpn from Malaysia to AWS will accept the China subnet. Check the phase 2 configuration and make sure there are no proxy-id pairs configured that limit the tunnel. This is likely ok.
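As a rough illustration of why a proxy-id pair can limit the tunnel, here is a simplified Python model. It assumes a tunnel with no pairs configured is unrestricted and that a configured pair only admits traffic whose source and destination fall inside it; the function name and the example pairs are hypothetical, and real Phase 2 negotiation is more involved than this.

```python
import ipaddress

def tunnel_accepts(src, dst, proxy_ids):
    """Simplified check: does a (local, remote) proxy-id pair cover this flow?

    proxy_ids is a list of (local_prefix, remote_prefix) strings.
    An empty list is treated as unrestricted (an assumption of this model).
    """
    if not proxy_ids:
        return True
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    return any(
        s in ipaddress.ip_network(local) and d in ipaddress.ip_network(remote)
        for local, remote in proxy_ids
    )

# Hypothetical Malaysia-to-AWS tunnel limited to the Malaysia subnet only.
malaysia_only = [("10.3.0.0/16", "10.2.0.0/16")]
print(tunnel_accepts("10.47.1.10", "10.2.1.10", malaysia_only))
```

In this model the China source is rejected until a pair covering 10.47.0.0/16 is added or the restriction is removed, which is the situation to rule out on the real tunnel.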
Next we confirm the security policy allows the traffic. You can manually check the policy from Malaysia to AWS by looking at the two interfaces on the VPN. The VPN tunnel interface from China is the source zone and the VPN tunnel interface to AWS is the destination zone. You need a policy to permit traffic between these zones for China to ping AWS, and a reverse zone policy to allow AWS to ping China.
If you cannot see this manually, set up debug flow basic on the Malaysia SSG and run the ping again from China.
The results will tell us what policies are in place and probably blocking the traffic. You can interpret the results here or post the file.
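For anyone following along, the zone-to-zone policy check described above can be sketched in Python like this. It is a toy model with hypothetical zone names and a default deny when nothing matches, not the firewall's actual lookup, but it shows why each direction of initiation needs its own policy.

```python
import ipaddress

# Hypothetical policies on the Malaysia firewall; zone names are made up.
POLICIES = [
    {"from": "china-vpn", "to": "aws-vpn",
     "src": "10.47.0.0/16", "dst": "10.2.0.0/16", "action": "permit"},
    {"from": "aws-vpn", "to": "china-vpn",
     "src": "10.2.0.0/16", "dst": "10.47.0.0/16", "action": "permit"},
]

def evaluate(from_zone, to_zone, src, dst, policies=POLICIES):
    """Return the action of the first policy matching the zone pair and addresses.

    Falls back to "deny" when nothing matches (assumed default for this model).
    """
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    for p in policies:
        if (p["from"], p["to"]) != (from_zone, to_zone):
            continue
        if s in ipaddress.ip_network(p["src"]) and d in ipaddress.ip_network(p["dst"]):
            return p["action"]
    return "deny"

# China initiating a ping to AWS is matched by the first policy.
print(evaluate("china-vpn", "aws-vpn", "10.47.1.10", "10.2.1.10"))
```

If only the first policy existed, a session initiated from AWS towards China would fall through to the default deny, which is the kind of result the debug output should make visible.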
Not long after my response I noticed the routing working correctly, as I needed it to. I believe this may have been because I deleted the VPN configuration on the AWS side and instead added a route for the China network's subnet pointing to the virtual gateway, in effect forcing AWS to look for that network through one of the other endpoints (Malaysia specifically). It didn't happen straight away, but after spending all day on it I had an early night, and when I woke up the next morning to continue I found it working (perhaps the route tables took a while to propagate?).
Anyway, sincere thanks for your help - really appreciate it!