SRX

Is ECMP supposed to work on SRX cluster?

  • 1.  Is ECMP supposed to work on SRX cluster?

    Posted 07-03-2011 00:20

    Hi All,

    I'm trying to set up ECMP (per-flow load balancing) to 2 different uplinks in
    a chassis cluster in my lab. The links are

    se-1/0/0 - on node0
    se-5/0/1 - on node1

    I have a static route with two next hops and a policy that enables load balancing:

    set routing-options static route 0.0.0.0/0 next-hop 172.18.1.1
    set routing-options static route 0.0.0.0/0 next-hop 172.18.2.1
    set routing-options forwarding-table export balance
    set policy-options policy-statement balance then load-balance per-packet

    In the forwarding table, both next hops are present:

    lab@jsrx# run show route forwarding-table
    Routing table: default.inet
    Internet:
    Destination        Type RtRef Next hop           Type Index NhRef Netif
    default            user     0                    ulst 262142     2
                                  ff.3.0.21          ucst   549     2 se-1/0/0.0
                                  ff.3.0.21          ucst   564     3 se-5/0/1.0

    However, all my transit sessions go out through se-1/0/0 (tested with ping and telnet):

    Session ID: 296, Policy name: default-policy/2, State: Active, Timeout: 1794, Valid
      In: 172.20.100.10/54872 --> 172.31.15.1/23;tcp, If: reth0.100, Pkts: 53, Bytes: 2945
      Out: 172.31.15.1/23 --> 172.20.100.10/54872;tcp, If: se-1/0/0.0, Pkts: 41, Bytes: 2757

    Session ID: 300, Policy name: default-policy/2, State: Active, Timeout: 1792, Valid
      In: 172.20.100.10/61428 --> 172.31.15.1/23;tcp, If: reth0.100, Pkts: 36, Bytes: 2046
      Out: 172.31.15.1/23 --> 172.20.100.10/61428;tcp, If: se-1/0/0.0, Pkts: 29, Bytes: 2055

    Session ID: 329, Policy name: default-policy/2, State: Active, Timeout: 1784, Valid
      In: 172.20.100.10/62054 --> 172.31.15.1/23;tcp, If: reth0.100, Pkts: 10, Bytes: 547
      Out: 172.31.15.1/23 --> 172.20.100.10/62054;tcp, If: se-1/0/0.0, Pkts: 9, Bytes: 529

    By the way, show route shows

    0.0.0.0/0          *[Static/5] 00:57:57
                          to 172.18.1.1 via se-1/0/0.0
                        > to 172.18.2.1 via se-5/0/1.0

    Does this mean that ECMP is not supported on a chassis cluster? I haven't found
    any indication of that in the documentation, though. This is Junos 10.2R3.10.


    #ECMP


  • 2.  RE: Is ECMP supposed to work on SRX cluster?
    Best Answer

    Posted 07-03-2011 02:12

    ECMP on a cluster is indeed a special case. I don't know of any documentation about this, but when I tried this, I observed similar behavior. As far as I can remember, if there are two active routes for the same destination, the SRX will prefer the one which has a local interface, rather than crossing the FAB link.

    This would make sense for A/A deployments, otherwise the FAB link would become the bottleneck.

     

    You could test this theory by performing a failover; if I'm right, the sessions should then use se-5/0/1 instead.



  • 3.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 07-03-2011 02:37

    Hi motd,

    Thanks a lot for your reply. I tried the RG failover, however the routing did not
    fail over and traffic is now traversing the fabric link, leaving through se-1/0/0.

    I failed over both RGs:

    lab@jsrx> show chassis cluster status   
    Cluster ID: 1
    Node                  Priority          Status    Preempt  Manual failover

    Redundancy group: 0 , Failover count: 1
        node0                   200         secondary      no       yes
        node1                   255         primary        no       yes

    Redundancy group: 1 , Failover count: 5
        node0                   200         secondary      yes      yes
        node1                   255         primary        yes      yes


    And sessions are still leaving through node0:

    {primary:node1}
    lab@jsrx> show security flow session   
    node0:
    --------------------------------------------------------------------------

    Session ID: 2425, Policy name: default-policy/2, State: Active, Timeout: 1624, Valid
      In: 172.20.100.10/50561 --> 172.31.15.1/23;tcp, If: reth0.100, Pkts: 0, Bytes: 0
      Out: 172.31.15.1/23 --> 172.20.100.10/50561;tcp, If: se-1/0/0.0, Pkts: 0, Bytes: 0

    Session ID: 2437, Policy name: default-policy/2, State: Active, Timeout: 1790, Valid
      In: 172.20.100.10/55822 --> 172.31.15.1/23;tcp, If: reth0.100, Pkts: 0, Bytes: 0
      Out: 172.31.15.1/23 --> 172.20.100.10/55822;tcp, If: se-1/0/0.0, Pkts: 0, Bytes: 0
    Total sessions: 2

    node1:
    --------------------------------------------------------------------------

    Session ID: 548, Policy name: default-policy/2, State: Backup, Timeout: 1622, Valid
      In: 172.20.100.10/50561 --> 172.31.15.1/23;tcp, If: reth0.100, Pkts: 28, Bytes: 1611
      Out: 172.31.15.1/23 --> 172.20.100.10/50561;tcp, If: se-1/0/0.0, Pkts: 23, Bytes: 1664

    Session ID: 593, Policy name: default-policy/2, State: Backup, Timeout: 1790, Valid
      In: 172.20.100.10/55822 --> 172.31.15.1/23;tcp, If: reth0.100, Pkts: 24, Bytes: 1399
      Out: 172.31.15.1/23 --> 172.20.100.10/55822;tcp, If: se-1/0/0.0, Pkts: 20, Bytes: 1490
    Total sessions: 2

    It looks like the SRX cluster always uses the first entry from the forwarding
    table, and ECMP is not working:

    lab@jsrx> show route forwarding-table   
    Routing table: default.inet
    Internet:
    Destination        Type RtRef Next hop           Type Index NhRef Netif
    default            user     0                    ulst 262142     2
                                  ff.3.0.21          ucst   549     2 se-1/0/0.0
                                  ff.3.0.21          ucst   564     2 se-5/0/1.0

    Any other ideas (except Junos upgrade)?



  • 4.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 07-03-2011 03:24

    The sessions still appear to be active on node0 and backup on node1, so it looks like those sessions were created when node0 was primary. Have you tried setting up new telnet sessions after the failover?

     

    I'll see if I can find some of my old lab notes. We never implemented this because the customer wanted more granular control over which link was used, and we ended up with FBF+VRs.



  • 5.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 07-03-2011 05:15

    Hi

    These were new sessions, and they always go to node0's se-1/0/0.

    I've now tried upgrading to 11.1R3 and it is the same: ECMP is not working,
    everything goes to se-1/0/0.

    I'm adding another record to my list of things I don't understand about
    SRX routing. Some other things, by the way, are:

    - How is the routing table consulted for the reverse route during session init (routing instances, zones, etc - what matters?)
    - How exactly do routing changes propagate into the session

    In many cases it "just works", but it would be good to have a detailed
    explanation, because you never know in which case it "just won't work"
    (as with this ECMP question)...



  • 6.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 07-03-2011 06:25

    That's interesting. I'll see what I can find tomorrow; I'm quite sure this was working in my lab setup. But that could have been with reth interfaces (we didn't use non-reth because before 10.2R3 we experienced issues with packet loss on the FAB link).

     

     

    All good questions; Juniper should write some KB articles about all this. This is where the SRX differs from the routers: it is stateful, and you can also use per-packet filters on the SRX (for example for FBF), so this can get really confusing.

     

     


    How is the routing table consulted for the reverse route during session init (routing instances, zones, etc - what matters?)

    Each interface is in exactly one virtual-router. Either the default one or a routing-instance of type virtual-router. When the first packet of a session arrives on an interface, a route lookup is performed in the corresponding table, for both source and destination.

     

    Say you have something like this:

       Client ---- [reth1] (Client-VR) (Server-VR) [reth2] ---- Server

     

    If the client sets up a connection to the server, a destination route lookup is performed in Client-VR and a session is created. When the server responds, the SRX performs a session lookup, finds the existing session, and looks up the reverse route in Client-VR again. Server-VR doesn't even need a route back to the client. Any FBF filters you may have applied on reth2 are ignored as well!

     

    I use this behavior a lot to connect multiple ISPs to the SRX, each with its own PA address space. If a client connects to an IP from ISP1, it is important that the response is routed back through that same ISP because ISP2 would simply drop the traffic. So place each ISP in its own virtual-router and everything will work fine.
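
    A minimal sketch of that layout in set-command form, offered as an illustration rather than a tested configuration. The instance names ISP1-VR and ISP2-VR, the zone names, the ge-0/0/x interfaces, and the documentation-range addresses are all hypothetical. How internal clients reach the two VRs (for example via FBF or inter-instance routes) is deliberately left out.

    ```
    # hypothetical names and addresses throughout; adapt to the real uplinks
    set interfaces ge-0/0/1 unit 0 family inet address 192.0.2.2/30
    set interfaces ge-0/0/2 unit 0 family inet address 198.51.100.2/30

    # one virtual-router per ISP, each with its own default route
    set routing-instances ISP1-VR instance-type virtual-router
    set routing-instances ISP1-VR interface ge-0/0/1.0
    set routing-instances ISP1-VR routing-options static route 0.0.0.0/0 next-hop 192.0.2.1
    set routing-instances ISP2-VR instance-type virtual-router
    set routing-instances ISP2-VR interface ge-0/0/2.0
    set routing-instances ISP2-VR routing-options static route 0.0.0.0/0 next-hop 198.51.100.1

    # each uplink in its own zone
    set security zones security-zone isp1 interfaces ge-0/0/1.0
    set security zones security-zone isp2 interfaces ge-0/0/2.0
    ```

    With a layout like this, a connection arriving on the ISP1 interface has its reverse route looked up in ISP1-VR, so the reply should leave through ISP1 again.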

     

     


    How exactly do routing changes propagate into the session

    We have had quite a few discussions about this and it is on my list of things to test. Here are some things I do know:

    - the route can't change to another egress zone. That would require a new policy lookup which simply doesn't happen.

    - if the original route is still valid, nothing changes and the traffic is still routed that way. I often change route preferences to redirect traffic and this only affects new sessions, existing sessions still use the old path.

    Sometimes this can be a problem. This is a good example: http://geogeeks.net/blog/2010/12/juniper-srx-udp-problem/

     

    Questions still to be answered:

    - Can a session failover be done from one interface to another if both interfaces are in the same zone?

    - Can a session failover be done from an ethernet to a VPN interface? ScreenOS always had a problem with this but it's possible in the more recent versions.

     

     

     

    I need more time for labs 🙂



  • 7.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 07-03-2011 11:33

    Hi

     

    Thanks for the info. Yes, a lot of lab testing is still needed 🙂

    Please update the thread if you have more info.

     

    And I forgot to add another "don't understand" point. The book "Junos Security" says that
    asymmetric routing is 100% supported on SRX, but gives no details. Again, is that true
    in the general case (different zones/routing instances)? In exactly which cases will it
    work, and in which will it not?



  • 8.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 07-03-2011 12:21

    And I forgot to add another "don't understand" point. The book "Junos Security" says that
    asymmetric routing is 100% supported on SRX, but gives no details. Again, is that true
    in the general case (different zones/routing instances)? In exactly which cases will it
    work, and in which will it not?


    The main restriction is that the zones need to be the same. Asymmetric routing sometimes occurs if you have two BGP peers. You never know which route is used by the client so a request may be received on interface1 but your reverse route may point to interface2. If both these interfaces are in the same zone, that will work. If they are in different zones, the response packets will be dropped with a "zone mismatch" error.

    This also implies that both interfaces need to be in the same routing instance as a zone can only be used in a single instance (I really wish they would remove that limitation).
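
    For illustration, a sketch of the "same zone" case in set-command form. The zone name untrust and the interface names are hypothetical, and this is not a tested configuration. Both uplinks sit in one zone and, because neither is listed under a routing-instance, in the same (default) routing instance:

    ```
    # hypothetical interfaces: both uplinks in one zone, so asymmetric
    # return traffic still matches the existing session
    set security zones security-zone untrust interfaces ge-0/0/1.0
    set security zones security-zone untrust interfaces ge-0/0/2.0
    ```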



  • 9.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 07-05-2011 02:20

    @motd wrote:

    And I forgot to add another "don't understand" point. The book "Junos Security" says that
    asymmetric routing is 100% supported on SRX, but gives no details. Again, is that true
    in the general case (different zones/routing instances)? In exactly which cases will it
    work, and in which will it not?


    The main restriction is that the zones need to be the same. Asymmetric routing sometimes occurs if you have two BGP peers. You never know which route is used by the client so a request may be received on interface1 but your reverse route may point to interface2. If both these interfaces are in the same zone, that will work. If they are in different zones, the response packets will be dropped with a "zone mismatch" error.

    This also implies that both interfaces need to be in the same routing instance as a zone can only be used in a single instance (I really wish they would remove that limitation).


    And to think that this limitation wasn't there in the first junos-es releases... still wondering why they took that flexibility away.

     

    Grtz,

    Frac



  • 10.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 07-05-2011 12:44

    Hi motd, Hi Frac,

    Thanks for your replies. Regarding asymmetric routing, do you know if it
    works on high-end SRX (3k/5k)? For example, a packet comes in on NPC1 and leaves
    through NPC2. The return packet is routed back through NPC3. Assume the zone is
    the same. The doc says that the session is installed in the incoming and outgoing
    NPCs, so that will be 1 and 2, and NPC3 will not know about the session and will
    drop the traffic. Right? Unfortunately I can't check this in the lab...



  • 11.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 07-09-2011 06:17
    After performing some tests, I'll have to correct what I wrote earlier. Route changes do affect sessions but only if no routing-instances are involved.
    If a session is established over interface X and a routing change occurs sending the traffic out via interface Y, the following happens:
    - All traffic is now sent out via interface Y. If X and Y are in the same zone => no problem. As expected, the session can still be matched. Note that in "show security flow session", the egress interface still shows X even though the traffic is now routed through interface Y.
    - If X and Y are in different zones => the egress packets are dropped with the message "packet dropped, pak dropped since re-route failed". From then on, the session remains in the session table and all subsequent packets are dropped with the same error message. If the "route-change-timeout" setting is configured, the timeout of the session is reduced to that value as soon as this error is encountered. The timeout doesn't change when the routing changes, but only when traffic is seen.
    - If the sessions exit through another routing-instance, route changes do NOT affect the session. This is why in our setup I can change the default route to another ISP without losing the already established sessions. New sessions follow the new route, the existing ones still use the original route.
    I couldn't find a way to cause the session to be re-routed by changing routes in the client-side or server-side routing-instances.
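
    For reference, a sketch of how the route-change-timeout mentioned above can be set. The value of 10 seconds is only an example (check the allowed range for your release), and this has not been verified against the releases used in this thread.

    ```
    # example value only: shorten the timeout of sessions whose re-route failed
    set security flow route-change-timeout 10
    ```

    The effect should then be visible in the Timeout field of "show security flow session" once traffic hits an affected session.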

    Regarding asymmetric routing, do you know if it
    works on high-end SRX (3k/5k)? For example, a packet comes in on NPC1 and leaves
    through NPC2. The return packet is routed back through NPC3. Assume the zone is
    the same. The doc says that the session is installed in the incoming and outgoing
    NPCs, so that will be 1 and 2, and NPC3 will not know about the session and will
    drop the traffic. Right? Unfortunately I can't check this in the lab...


    Don't know about that one. I would assume that because NPC3 doesn't know about the session, the packet is sent to the CP, which knows about the session and installs it in NPC3 as well. But I don't have the lab equipment available to test this either. I know there is a detailed explanation about this in the courseware (I think in AJSEC), but I don't have access to it right now; I wish I could get it in PDF.



  • 12.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 07-10-2011 03:43

    Hi motd,

     

    Thanks a lot for those lab results, this is what I really wanted to know but was too lazy (or busy?)
    to test myself...

    I've got the AJSEC book. It has some details about packet processing in the Appendix, but
    I could not find a clear answer to the asymmetric routing question. It is not obvious to me
    whether the return packet from a different NPC will be sent to the CP, and whether the CP will be able to
    understand that it belongs to an existing session and that the new NPC should be programmed
    for the same session...

     

     



  • 13.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 01-14-2012 22:42

    pk, really nice and informative discussion ...

     

    regards



  • 14.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 04-17-2012 21:23

    http://kb.juniper.net/InfoCenter/index?page=content&id=KB23417

     

    SRX supports ECMP flow-based forwarding starting with 12.1.



  • 15.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 04-18-2012 03:06

    Hi JJJ

     

    Thanks for updating this old thread. Can you confirm that this also
    works on a cluster? Sorry, but I lost too much blood over this problem,
    so I will not mark the thread as "solved" until I'm absolutely sure it works...



  • 16.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 03-12-2013 02:45

    Hi All

     

    Just an update: I tested ECMP with a cluster running 12.1X44-D10.4. It works if I have several
    uplinks on one node, but for uplinks connected to different nodes there is no load balancing.

    However, if load balancing is turned on (export policy to forwarding table, etc.), outgoing traffic seems to
    choose the incoming node's uplink, so fabric link forwarding is minimized this way.

    So I will mark Motd's first answer as a solution for now.



  • 17.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 05-10-2013 18:37

    Hi, All.

     

    Given that the SRX creates the session ingress/egress interfaces based on the forwarding table, as shown in another thread, how does ECMP influence this now?

    If reth0 and reth1 both carry ECMP default routes, is there guaranteed behavior that sessions created for incoming connections on reth0 would not use reth1 for return traffic because of ECMP?

    I am migrating our design to multiple VRs for ISPs and came across the KB for ECMP support on 12.1. It would be great if someone could confirm the SRX behavior in this use case.

     

     

    - asbestos-muffin



  • 18.  RE: Is ECMP supposed to work on SRX cluster?

    Posted 10-10-2019 11:19

    I know this is a very old thread but have recently been facing the very same problem on clustered SRX1500 running JTAC recommended 15.1X49-D170.4. I am updating the thread as I may have found a workable solution which I have been testing.

     

    The topology in the thread aims to load-balance traffic via ECMP over dual active ISP interfaces on an A/A SRX cluster. With physical interfaces, the traffic egresses through the primary route's interface even though both next-hop addresses are available in the forwarding table.

     

    After moving the physical interfaces to ethernet-switching as members of a VLAN, with the VLAN l3-interface as the ISP IP end-point, ECMP load-balancing does work. I assume this is because the VLAN exists on both units regardless of the location of the physical link, and routing over the VLAN inherently sends the ECMP traffic over the FAB link.

     

    #show vlans | display set

    set vlans ISP1 vlan-id 901
    set vlans ISP1 l3-interface irb.901
    set vlans ISP2 vlan-id 902
    set vlans ISP2 l3-interface irb.902

     

    #show routing-instances TRAFFIC | display set

    set routing-instances TRAFFIC instance-type virtual-router
    set routing-instances TRAFFIC interface irb.901
    set routing-instances TRAFFIC interface irb.902
    set routing-instances TRAFFIC routing-options graceful-restart
    set routing-instances TRAFFIC routing-options static route 0.0.0.0/0 next-hop 1.1.1.1
    set routing-instances TRAFFIC routing-options static route 0.0.0.0/0 next-hop 2.2.2.2
    set routing-instances TRAFFIC routing-options static route 0.0.0.0/0 preference 10

     

    #show interfaces | display set

    set interfaces ge-0/0/1 unit 0 family ethernet-switching vlan members ISP1
    set interfaces ge-7/0/12 unit 0 family ethernet-switching vlan members ISP2
    set interfaces irb unit 901 family inet address 1.1.1.2/30
    set interfaces irb unit 902 family inet address 2.2.2.1/30