ethernet-switching table issue QFX10002

  • 1.  ethernet-switching table issue QFX10002

    Posted 06-14-2020 08:28

    I am seeing strange behavior with MAC address learning.

    A diagram is attached.

    Facts:

    When any host on the Internet (for example 2.2.2.2) sets up a TCP session with my host (1.1.1.1), my switches (QFX10002-1 and QFX10002-2) do not learn the host's MAC address from the downstream EX4550 switch.

    As a result, the "switch" degrades to a "hub" and floods ("broadcasts") the 2.2.2.2 -> 1.1.1.1 traffic to all ports.

    If I do either of the following on my MX480 router:

    1) ping 1.1.1.1 from the router

    or

    2) clear arp hostname 1.1.1.1

    admin@QFX10002-nl-1> show ethernet-switching table | grep 90:b1:1c:30:3b:1e
    vlan_350 90:b1:1c:30:3b:1e DR - ae3.0 0 0

    admin@QFX10002-nl-2> show ethernet-switching table | grep 90:b1:1c:30:3b:1e
    vlan_350 90:b1:1c:30:3b:1e DL - ae3.0 0 0

    After either of these, the MAC address is present in the table on both switches.

     

    admin@QFX10002-nl-2> show configuration protocols l2-learning

    {master:0}

     

    admin@QFX10002-nl-1> show configuration protocols l2-learning

    {master:0}

     

    MX480:

    start shell
    % sysctl -a | grep arp_cache
    net.link.ether.inet.arp_cache_size: 2565
    net.link.ether.inet.arp_cache_perm_size: -365
    net.link.ether.inet.arp_cache_size_threshold: 0
    net.link.ether.inet.arp_cache_timeout_size: 2034
    net.link.ether.inet.arp_cache_rearp_size: 0
    net.link.ether.inet.arp_cache_retry_size: 170
    %

     

    If any other details are needed, I will provide them. How do I properly troubleshoot and fix this?

     

    https://imgur.com/u9Cuvf7

    photo_QFX10002-issue.jpg



  • 2.  RE: ethernet-switching table issue QFX10002

     
    Posted 06-14-2020 09:59

    Hello A.leb,

     

    Have you tried the standard troubleshooting procedures? These would be:

    *) Look at the "/var/log/messages" file for errors

    *) Look at "show ethernet-switching mac-learning-log"

    *) Configure l2-learning traceoptions to check the detailed debugging output:

    https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/traceoptions-edit-protocols-l2-learning-qfx-series.html
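
    A minimal sketch of the third item, for illustration only. The trace file name (l2-learning-trace), size and file count are assumed values, and the flags available differ between releases, so check the linked page before using it:

    ```
    # Illustrative only: file name, size and file count are assumptions.
    set protocols l2-learning traceoptions file l2-learning-trace size 10m files 3
    set protocols l2-learning traceoptions flag all
    ```

    After the commit, "show log l2-learning-trace" should display the trace output. "flag all" can be verbose, so the traceoptions stanza is usually deleted again once the data has been collected.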



  • 3.  RE: ethernet-switching table issue QFX10002

     
    Posted 06-14-2020 10:21

    @A.leb this is very much expected behavior, because your EX4550 is not 'talking' to your QFX10002.  Your hosts are talking to each other via some router.  I am not sure which device is the default gateway for 1.1.1.1 and 2.2.2.2, but for these hosts to communicate, no MAC address of the EX4550 or the QFX10002 is used if both of them operate as pure L2 switches.

     

    Now, if there is some direct communication between the EX4550 and the QFX10002, such as LLDP, then each should learn a MAC address of the other, which would be one associated with the RE.  All MAC addresses on these switches are derived from the base MAC, which can be found via

    show chassis mac-addresses (I think this should work on EX as well)

    https://www.juniper.net/documentation/en_US/junos/topics/reference/command-summary/show-chassis-mac-addresses.html

     

    I am not sure whether this is what you are asking about.



  • 4.  RE: ethernet-switching table issue QFX10002

    Posted 06-14-2020 11:07

    I added a picture to make this clearer.

    TL;DR:

    The QFX10002 does not RECEIVE any packet with the source MAC of host 1.1.1.1, so it does not learn it.

    Question 1: why not? Under the ARP protocol, host 1.1.1.1 MUST send an ARP reply back to 1.1.1.10 (the MX480).

    Yes, the TCP return traffic avoids the L2 path host -> EX4550 -> QFX10002 -> MX480,

    but the packets of the ARP reply should be enough for MAC learning,

    and the test with clear arp confirms this.

    Why does this mechanism not work "in the wild"?

    Question 2: how can the MAC address table be transferred to the QFX10002?

     

     

    1) Diagnostics

    {master:0}

    On the QFX10002:
    admin@QFX10002-nl-1> show lldp neighbors | grep ae3
    xe-0/0/9:0 ae3 3c:8a:b0:d3:83:00 to QFX10002-1
    xe-0/0/9:1 ae3 3c:8a:b0:d3:83:00 to QFX10002-1
    xe-0/0/9:2 ae3 3c:8a:b0:d3:83:00 to QFX10002-1
    xe-0/0/9:3 ae3 3c:8a:b0:d3:83:00 to QFX10002-1

    {master:0}

     

    admin@QFX10002-nl-1> show configuration protocols lldp
    interface all;


    admin@QFX10002-nl-1> show chassis mac-addresses
    FPC 0
    Base address 0c:86:10:d3:96:00
    Count 4096

     

    admin@QFX10002-nl-1> show configuration interfaces ae3
    description "downlink to EX4550-VC";
    native-vlan-id 1;
    aggregated-ether-options {
        minimum-links 1;
        link-speed 10g;
        lacp {
            system-id 00:00:00:00:00:03;
            admin-key 1;
        }
        mc-ae {
            mc-ae-id 3;
            redundancy-group 1;
            chassis-id 0;
            mode active-active;
            status-control active;
        }
    }
    unit 0 {
        family ethernet-switching {
            interface-mode trunk;
            vlan {
                members [ vlan_800 vlan_300 vlan_350 vlan_400 vlan_450 vlan_500 vlan_550 ];
            }
        }
    }

     

    On the EX4550:

    admin> show lldp neighbors | grep ae10
    xe-2/0/16.0 ae10.0 0c:86:10:d3:96:00 to EX4550-VC QFX10002-nl-1
    xe-2/0/17.0 ae10.0 0c:86:10:d3:96:00 xe-0/0/9:1 QFX10002-nl-1
    xe-2/0/18.0 ae10.0 0c:86:10:d3:96:00 xe-0/0/9:2 QFX10002-nl-1
    xe-2/0/23.0 ae10.0 0c:86:10:d3:96:00 xe-0/0/9:3 QFX10002-nl-1
    xe-3/0/17.0 ae10.0 54:4b:8c:cf:d4:32 to EX4550-VC; QFX10002-nl-2
    xe-3/0/18.0 ae10.0 54:4b:8c:cf:d4:32 to EX4550-VC; QFX10002-nl-2
    xe-3/0/19.0 ae10.0 54:4b:8c:cf:d4:32 to EX4550-VC; QFX10002-nl-2
    xe-3/0/23.0 ae10.0 54:4b:8c:cf:d4:32 to EX4550-VC; QFX10002-nl-2

     

    admin> show configuration protocols lldp
    interface all;

     

     

    And how can I transfer the entire MAC address table from the EX4550 to the QFX10002?

     

     

     

    photo_QFX10002-issue-2.jpg



  • 5.  RE: ethernet-switching table issue QFX10002

     
    Posted 06-14-2020 14:21

    @A.leb let's start with #2.  Why would you need or want the MAC table of the EX4550 to be 'transferred' to the QFX10002?

    A MAC table entry consists of 2 parts: the MAC address and the interface it was learned on.  Very likely, a MAC learned on a given interface of the EX4550 would be learned on a different interface on the QFX10002.

     

    I have no idea why you would even ask this?

     

    Before we go deeper, do you actually have a communication issue in your network?  If yes between where and where?  If no, are you concerned that some switch is not learning MAC properly?  For #2, what specific MAC (associated with what IP) are you learning, say on EX4550, that you think is not being learned properly on the QFX10002?

     

    BTW, in the diagram you have listed MX480 1.1.1.2 as the default gateway for 1.1.1.1.  So why would 1.1.1.1 ARP for 1.1.1.10?  1.1.1.1 likely does not even know that 1.1.1.10 is a router.  For 1.1.1.1 to communicate with 2.2.2.2, it will ARP for 1.1.1.2, at which point the EX4550 should learn the MAC of 1.1.1.1 on the directly connected interface, and the QFX10002 will learn it on the link to the EX4550.  From then on, 1.1.1.1 sends packets to the MAC of 1.1.1.2, which the QFX never sees.  1.1.1.2 uses its router MAC to talk to 1.1.1.10 (the next hop to get to 2.2.2.2), so the QFX should learn the MAC of 1.1.1.2, but will eventually time out the MAC of 1.1.1.1.

     

    All I can say is that L2 MAC learning works properly on both the EX4550 and the QFX10002, or LOTS of people would have issues.  With 99% certainty, I suspect that whatever you are seeing is correct and that there is some good explanation for it.  I am still confused as to what your basic concern is, or whether you have an actual communication issue.  I assume there are no issues, as you said 1.1.1.1 can talk to 2.2.2.2.

     

    BTW, the QFX (and the EX4550) will never learn the MAC of 2.2.2.2.  At the 1.1.1.10 router that MAC is always stripped and replaced by a router MAC of 1.1.1.10.  The EX4550 and the QFX (if pure L2) can only learn MACs from local VLANs (each of which corresponds to some IP subnet).



  • 6.  RE: ethernet-switching table issue QFX10002

    Posted 06-14-2020 16:53

    TL;DR: making the QFX10002 the L2 "core switch" is the most correct solution to the problem.

     

    The whole problem is asymmetric routing.
    By design, traffic from the Internet arrives (this is one of the non-primary paths) through MX480-2, behind which host 2.2.2.2 is "connected". The host is somewhere on the Internet; this is not a direct connection:
    2.2.2.2 -> ... hop(s) ... -> MX480 1.1.1.10 -> L2 -> QFX10002 -> EX4550 -> host 1.1.1.1
    But traffic back to 2.2.2.2 goes 1.1.1.1 -> EX4550 -> 1.1.1.2 (MX480-1) and from there, depending on the routing table, either straight to the Internet and 2.2.2.2, or via another L2 path -> 1.1.1.100 -> Internet.
    In either case, this traffic path from MX480-1 does not send any Ethernet frame with source MAC 11:11:11:11:11:11 (host 1.1.1.1).
    I did not explain in the earlier diagram that both routers are connected by iBGP.

    In fact, the architectural problem is that the QFX10002 should properly be the L2 "core switch".

    If both upstream and downstream traffic went through it, the described problem would not exist.
    But for historical reasons, primarily because work in the data center is difficult to carry out, the network has ended up in a state where it is still being reworked.
    The essence of the problem: traffic comes in one way but goes back another.

     

    The ARP table lifetime on MX480-2 does not match the MAC address table lifetime on the QFX10002.

    As a result, the QFX10002 has no entry in its MAC table. There is nowhere for it to come from: ARP exchanges (L2 frames with source MAC 11:11:11:11:11:11 from 1.1.1.1) are rare, and there is no constant traffic from 1.1.1.1 through the QFX10002.

    On the other hand, there is constant traffic 2.2.2.2 -> ... -> 1.1.1.10 -> QFX10002 -> EX4550 -> 1.1.1.1.

    The result: all of this constant traffic from 1.1.1.10 is copied to all L2 interfaces of the QFX10002.

    Accordingly, one option is to set the ARP table TTL on MX480-2 very short, so that ARP requests are sent constantly.

    These requests (and replies) will then cause the QFX10002 to learn MAC 11:11:11:11:11:11 behind the EX4550.

    The other option is the opposite: make the TTL of the MAC address table so long that, once the address is learned, we need not worry about it disappearing from the table.
    That is the solution I see with the standard MAC learning scheme.
    This is why I am talking about "copying" the MAC address table from the EX4550 to the QFX10002 (I do not know of a protocol or method for it).

    It would allow us not to depend on ARP requests from 1.1.1.10.

     

    photo_QFX10002-issue-3.jpg

     

     

     

     



  • 7.  RE: ethernet-switching table issue QFX10002

     
    Posted 06-14-2020 17:54

    With asymmetric routing, you have a routing situation, not an L2 MAC-learning issue, AFAIK.  Without a stateful FW in the picture, asymmetric routing should cause no issue, except perhaps making troubleshooting harder.

     

    Might not a static MAC entry on the QFX10002 provide what you need?

     

    https://www.juniper.net/documentation/en_US/junos/topics/topic-map/mac-addresses.html#id-adding-a-static-mac-address-entry-to-the-ethernet-switching-table-on-a-switch-with
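
    A minimal sketch of such a static entry, assuming the ELS-style syntax described on the linked page. The VLAN, interface and MAC address are the ones from the "show ethernet-switching table" output in the first post; treat them as placeholders for your own values:

    ```
    # Illustrative only: VLAN, interface and MAC are taken from the first post.
    set vlans vlan_350 switch-options interface ae3.0 static-mac 90:b1:1c:30:3b:1e
    ```

    With an MC-LAG pair, the same entry would presumably be needed on both QFX10002 peers, and it has to be maintained by hand whenever the host or its NIC changes.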



  • 8.  RE: ethernet-switching table issue QFX10002

    Posted 06-14-2020 18:11

    Yes, that is also possible:

    fixing the MAC address table manually.

    But it is quite inconvenient from a practical point of view.

    It is possible as a "workaround".

    But the correct way is:

    1) "delete" the L2 path 1.1.1.1 -> EX4550 -> 1.1.1.2

    2) move the L2/L1 "capacity" to 1.1.1.1 -> EX4550 -> QFX10002 -> 1.1.1.2

     

    photo_QFX10002-issue-4.jpg



  • 9.  RE: ethernet-switching table issue QFX10002
    Best Answer

    Posted 06-14-2020 21:00

    Hello,

     


    @A.leb wrote:

    Yes, that is also possible:

    fixing the MAC address table manually.

    But it is quite inconvenient from a practical point of view.

    It is possible as a "workaround".

    But the correct way is:

    1) "delete" the L2 path 1.1.1.1 -> EX4550 -> 1.1.1.2

    2) move the L2/L1 "capacity" to 1.1.1.1 -> EX4550 -> QFX10002 -> 1.1.1.2

     

    You don't need to jump through all these hoops. You just need to make the MAC address timeout and the ARP timeout the same throughout your estate, which is best practice, BTW.

     

    JUNOS MAC address timeout https://www.juniper.net/documentation/en_US/junos/topics/task/configuration/layer-2-services-address-learning-and-forwarding-configuring-table-timeout-interval.html

     

    JUNOS ARP timeout https://www.juniper.net/documentation/en_US/junos/topics/task/configuration/configuring-arp-aging-timer.html
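
    As a rough sketch of what aligning the two timers could look like (the values are assumptions for illustration, not a recommendation). As far as I know, the Junos defaults are 300 seconds for MAC aging and 20 minutes for ARP aging, which is the kind of mismatch that lets the MAC entry expire long before the router re-ARPs:

    ```
    # On the QFX10002: MAC table aging time in seconds (1200 is an assumed value).
    set protocols l2-learning global-mac-table-aging-time 1200

    # On the MX480: ARP aging timer in minutes (20 minutes = 1200 seconds).
    set system arp aging-timer 20
    ```

    Note the different units (seconds versus minutes). After the change, "show ethernet-switching table" on the QFX10002 should keep showing the host's MAC between ARP refreshes. Other switches in the same L2 path, such as the EX4550, would need a matching MAC aging value, and the exact statement can differ by platform.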

     

    HTH

    Thx

    Alex

     

     

     



  • 10.  RE: ethernet-switching table issue QFX10002

    Posted 06-15-2020 05:18

    Agree. Indeed, this is the most practical solution.

    Thank you all for the discussion.