QFX10008 Flooding unicast dhcp like broadcast/unknown

  • 1.  QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 05-28-2021 16:27

    The title says it all, but here is some context. We have a JTAC case open, and they have not yet been able to reproduce the behavior in their lab.

    We have 2 of these boxes doing this. When a unicast DHCPREQUEST or DHCPACK traverses any vlan on the box, it forwards that traffic out all interfaces where vstp is forwarding for the vlan, even though traffic captures from switches linked to the 10k's (which receive the traffic) show that the traffic is l2 unicast.

    Code on 1 is 20.2R2-S3 and the other is base 20.2R2.

    Neither I nor JTAC can see anything in the config that would cause this....but I did find one thing I'm not sure about. Under the vstp config stanza, I'm using an interface range, which, going by this doc, doesn't seem to be officially supported on the qfx (or that's my interpretation of it), whereas for the ex line it seems to specifically list vstp. So I tried a port outside of the range. Flooding still occurs:

    https://www.juniper.net/documentation/us/en/software/junos/interfaces-ethernet-switches/topics/topic-map/switches-interface-range.html

    Has anyone ever heard of such a thing?

    Not sure at this point what parts of sanitized configs should be posted, etc but would appreciate any insight. 



  • 2.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

     
    Posted 05-29-2021 02:10
    The simplest reason is that your 10K doesn't have the destination MAC in its switching table--so it floods the packet to all ports. Have you checked this?
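
    To illustrate that rule, here is a minimal Python sketch of the usual learning-switch forwarding decision. It is not Junos code; the table layout, the function names, and the port names are all hypothetical and only meant to show when flooding is expected.

```python
from typing import Dict, List, Tuple

# Hypothetical MAC table: (vlan id, lowercase MAC) -> egress port name.
MacTable = Dict[Tuple[int, str], str]


def is_group_address(mac: str) -> bool:
    """True for broadcast/multicast: the I/G bit of the first octet is set."""
    return int(mac.split(":")[0], 16) & 0x01 == 1


def egress_ports(table: MacTable, vlan: int, dst_mac: str,
                 ingress_port: str, vlan_ports: List[str]) -> List[str]:
    """Return the ports a frame should leave on, per basic L2 switching."""
    dst = dst_mac.lower()
    flood = [p for p in vlan_ports if p != ingress_port]
    if is_group_address(dst):
        return flood              # broadcast/multicast: flood within the vlan
    port = table.get((vlan, dst))
    if port is None:
        return flood              # unknown unicast: flood within the vlan
    if port == ingress_port:
        return []                 # destination is behind the ingress port: drop
    return [port]                 # known unicast: forward to one port only
```

    With a populated table, a known unicast destination should yield exactly one egress port; only an unknown or group destination should produce the flood list.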


  • 3.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 11 days ago
    Team,

    There is an issue here in the handling of transit DHCPv4/v6 packets that are detected and sent to the host path: they were being re-injected into the ASIC as type flood, which caused unicast DHCP packets to be flooded in the Layer 2 domain. A fix has been identified and will be available in the upcoming EVPN Recommended releases in Q4'22.  TAC ticket 2021-0526-0545 has been updated with additional details.

    Let us know if there are any additional details required.

    Regards,
    Aquin Mathai

    ------------------------------
    Aquin Mathai
    ------------------------------



  • 4.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 05-29-2021 10:54
    I have checked that. It does have the destination mac in its ethernet switching table. It floods every dhcp packet (frame) in every vlan, even if it is unicast and has a known destination mac address. This was confirmed by myself and JTAC.
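
    For anyone who wants to run the same check on a capture, the sketch below classifies one raw Ethernet frame as L2 unicast or not, and as DHCPv4 or not. It is only an illustration under stated assumptions: the function name and the returned keys are made up for this example, it handles at most one 802.1Q tag, looks at IPv4 only, and identifies DHCP purely by UDP ports 67/68.

```python
import struct


def classify_frame(frame: bytes) -> dict:
    """Report the destination MAC, whether it is L2 unicast, and whether
    the payload looks like DHCPv4 (UDP with both ports in {67, 68})."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    dst = frame[0:6]
    ethertype = struct.unpack("!H", frame[12:14])[0]
    offset = 14
    if ethertype == 0x8100:                      # single 802.1Q vlan tag
        if len(frame) < 18:
            raise ValueError("frame too short for an 802.1Q header")
        ethertype = struct.unpack("!H", frame[16:18])[0]
        offset = 18

    # Unicast means the I/G bit (lowest bit of the first octet) is clear.
    l2_unicast = (dst[0] & 0x01) == 0

    dhcpv4 = False
    if ethertype == 0x0800 and len(frame) >= offset + 20:
        ihl = (frame[offset] & 0x0F) * 4         # IPv4 header length in bytes
        protocol = frame[offset + 9]
        udp = offset + ihl
        if protocol == 17 and len(frame) >= udp + 4:
            sport, dport = struct.unpack("!HH", frame[udp:udp + 4])
            dhcpv4 = {sport, dport} <= {67, 68}

    return {
        "dst_mac": ":".join(f"{b:02x}" for b in dst),
        "l2_unicast": l2_unicast,
        "dhcpv4": dhcpv4,
    }
```

    A frame that comes back with both "l2_unicast" and "dhcpv4" set to True, seen on a port other than the one where the destination mac is learned, is the behavior described in this thread.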



  • 5.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

     
    Posted 05-30-2021 05:45
    Well, that is the definition of a Junos bug.  Has JTAC checked the PR (problem report) database to see if it has already been discovered and fixed?

    If not, they will need to duplicate your configuration and topology in the lab and generate a new PR for software development to add to the pipeline.

    ------------------------------
    Steve Puluka BSEET - Juniper Ambassador
    IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
    http://puluka.com/home
    ------------------------------



  • 6.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 05-30-2021 11:50
    I agree. So far JTAC has not found a matching PR for this issue. Also, they have not been able to reproduce the issue on the same version of Junos in their lab, but they have confirmed it in our environment. They aren't using our exact config and there are some differences for sure: their config doesn't use multiple routing-instances of type virtual-router, and doesn't have a dhcp relay configured on their QFX-10008.

    Since I have two of these boxes, I'm going to try an alternate version of Junos on one of them. One is currently running 20.2R2-S3 and the other 20.2R2; the former is the current recommended release and the latter has not been upgraded yet. I was thinking of starting with the other recommended release, 18.4R2-S8, unless someone can suggest a well-known stable release.

    Other than that I'm going to test the confluence of certain features in the config. I have a theory that vlans that I'm seeing this issue in also have an irb bound to them as a l3 interface and also have a dhcp relay configured in the routing instance.


  • 7.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-01-2021 12:37
    Seems there is a correlation between having an irb interface bound to the vlan and this issue. On vlans where there is no irb configured as the L3 interface I'm not seeing the issue. I don't think JTAC saw it this way but they did not have dhcp relay configured on the qfx, or multiple virtual routers with dhcp relay either.


  • 8.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 09-15-2021 05:25
    Just fyi, they are telling me this behavior is by design.


  • 9.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-01-2021 14:08
    If anyone is interested, the issue seems to be triggered by adding a dhcp relay in a non-default routing instance of type virtual-router and having snooping enabled on the relay. Was able to replicate this with JTAC today. Adding "set routing-instances <name> forwarding-options dhcp-relay no-snoop" stops the flooding. I've been told, though, that this should not be needed and that they are filing a PR for the issue.


  • 10.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

     
    Posted 06-02-2021 01:06
    Weird! Thanks for the follow-up.


  • 11.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

     
    Posted 06-02-2021 05:32
    Thanks for the update and following through with JTAC.  It can be a pain to be the first to report an issue and get the PR properly documented so we all benefit.

    ------------------------------
    Steve Puluka BSEET - Juniper Ambassador
    IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
    http://puluka.com/home
    ------------------------------



  • 12.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-03-2021 08:24
    For sure. Case moved to ATAC now and hopefully PR process will go smoothly.


  • 13.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-06-2021 14:05
    As an update: while "no-snoop" under dhcp relay stops the flooding of dhcp packets, it also stops the box from relaying anything at all.

    Beyond frustrated......

    This should never have left development with this issue, and nobody at Juniper seems to care. We cannot be the only shop that relays dhcp across multiple virtual routers on a qfx-10008.

    In their current state these boxes are 100% unusable.


  • 14.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-07-2021 03:26
    I've deployed 10K8s with relays over multiple routing instances. The only issue I've come across is no smart-relay support with EVPN-VXLAN, although I guess you're not using EVPN-VXLAN?

    In the meantime, perhaps you can move the relay function to a standalone device. This is also something I've seen in the past. You would place another switch (kinda on a stick) in the fabric, configured with all your VLANs that require relay. Might be enough to get you going whilst JTAC work on the PR?

    ------------------------------
    DANIEL HEARTY
    Principal Engineer
    ------------------------------



  • 15.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-07-2021 08:31
    Thanks. I appreciate the advice.

    Nope, no EVPN. When you deployed those 10k's with dhcp relay in multiple virtual routers, what version of Junos was that running on?

    I don't think I have the hardware on hand to do the relay on a stick thing. We do have some ex4300-mp's on a stick right off these 10k's, but they have their own issues with dhcp....another open case that is also dhcp related. We've also had some other major issues with the platform that make it unsuitable for this.


    Honestly I really wish Juniper would dedicate some resources to solving these dhcp issues. Since we've been a customer, I'd estimate that 90% of my tickets have been dhcp-related. Seems a lot of the dhcp knobs are ISP-oriented. We'd like a clean dhcp relay that is enterprise focused. 

    I really wish that, rather than having support teams dedicated to product lines, there was an enterprise-focused team.

    Another big want would be a long term, stable version of Junos that is supported both by JTAC and engineering, for customers who aren't really interested in feature enhancements.


  • 16.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-07-2021 10:56
    I think it was 18.4 that we used on the 10k8s. We did see a lot of issues with the platform but once we ironed them out, it's been solid. 

    I get your point on the JTAC support. Although you'll almost certainly have to upgrade for security fixes anyway... 

    Have you considered using EVPN-VXLAN? It sounds like you're using this in a campus environment. Even a collapsed spine could be a good option for you. No need for STP then, just ESI lag to your dist/agg switches. Re-designing because of bugs isn't ideal, but it's an option.

    ------------------------------
    DANIEL HEARTY
    Principal Engineer
    ------------------------------



  • 17.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-07-2021 11:03
    JTAC tried 18.4 in the lab and the issue was still present. We've thought about vxlan but don't have access-layer gear that supports it to make it worth it at this point. This isn't really an stp issue; it's that once you enable dhcp relay in a non-default routing instance, all dhcp packets (well, I should say frames) are flooded out all ports.....even if it is truly unicast traffic.

    Yeah, I get the need to upgrade. I just don't want to upgrade to some "recommended" release and have issues like this surface out of the blue. Would be an instant meltdown at scale. I honestly think that with sufficient quality assurance this should not have made it out of dev. If Juniper's solution is that we must move relays to other gear when it is stated that this feature is supported like this, well, then there has to be some concession on the hardware to do that.



  • 18.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-07-2021 11:09
    A collapsed leaf/spine architecture means that you only run EVPN-VXLAN on the 10k8s. All your other gear would be normal L2. The reason I mentioned STP was I thought you said previously that VSTP seemed to be related to the issue. So my thinking is, no STP no issue. 

    I'm deploying a very large campus with EVPN-VXLAN and a ton of DHCP relay. I sure hope I don't run into the same issue :-/

    ------------------------------
    DANIEL HEARTY
    Principal Engineer
    ------------------------------



  • 19.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-07-2021 13:25
    Sorry I missed the "collapsed" part of your post. Yes this issue has been a show stopper for sure. Case seems to be getting punted around internally and other than that it is radio silence. 

    It doesn't seem to be stp related, or so it seems at this point, as I can make the flooding stop by doing "no-snoop" under all the relay configs, but then it doesn't actually relay any packets.



  • 20.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-07-2021 13:32
    Do you have more than one gateway per VLAN?

    ------------------------------
    DANIEL HEARTY
    Principal Engineer
    ------------------------------



  • 21.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-07-2021 13:39
    Yes. There are 2 of these 10k's doing VRRP in these vlans and they are both doing relay.

    Here is the sticky thing though...it doesn't matter even if I disable the IRBs....if there is a relay configured in any non-default routing-instance on these boxes, it triggers dhcp flooding in every vlan on the box that has an irb configured and bound to the vlan, even those not associated with the non-default routing instance. Just adding this for context.

    I'd have to get rid of the relays in non-default instances, and to do that I'd have to revert relay and probably those l3 gateways back to the aging Cisco devices that these are replacing.


  • 22.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 06-08-2021 04:00
    oh wow, that's brutal! Hopefully JTAC will get you a fix asap. Good luck with it and keep us posted. And if you need any workaround ideas then feel free to drop me a message.

    ------------------------------
    DANIEL HEARTY
    Principal Engineer
    ------------------------------



  • 23.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 09-14-2021 09:44
    So I was told yesterday by JTAC/DEV that when configuring dhcp relay on an irb interface that this flooding (reinjection as DEV calls it) is done by design. Was told that I must set "no-snoop" under my relay config. But it does not work. Totally breaks the relay.
    This is the only vendor I've ever seen take this approach. It doesn't make sense and violates a basic rule of ethernet switching about not flooding unicast frames with known destination addresses.

    So my only choices are to wait while JTAC/DEV figures out how to make "no-snoop" work, or to do dhcp relay on non-jnpr hardware.
    I realize I'll offend the juniper faithful with this, but this is really a deal breaker that should be documented ahead of time and that our SE should have advised about. But we go through SEs like water, and oddly none of them have a networking background.



  • 24.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

     
    Posted 09-15-2021 05:41
    I'm having trouble wrapping my head around how JTAC considers this expected and normal behavior.  For me the definition of unicast is to forward ONLY to the destination.

    And even their response admits the bug for what I would call their work around.

    ------------------------------
    Steve Puluka BSEET - Juniper Ambassador
    IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
    http://puluka.com/home
    ------------------------------



  • 25.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 09-15-2021 14:35
    We just went through that same battle. Along with other things and the UDP bug.  Sorry to hear it. It was never resolved for me and I have taken a different approach (another vendor)

    ------------------------------
    JOSHUA HOLCOMBE
    ------------------------------



  • 26.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 09-16-2021 06:07
    Sorry to hear you're still having issues with this.

    I've seen similar weird behaviour with one of my customers. We resolved this by moving the relay function to a dedicated QFX5110. We then used the following knob on the 10K8:

    set routing-instances XXX forwarding-options dhcp-relay forward-snooped-clients all-interfaces

    Less than ideal I know, but it might buy you some time. I'm going to follow this issue up with JNPR through some of my channels. Would you mind sharing your JTAC ref with me via DM?


    ------------------------------
    DANIEL HEARTY
    Principal Engineer
    ------------------------------



  • 27.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 09-15-2021 14:59
    It's awful, frankly. Did you abandon the QFX platform totally or just move relay to another vendor? Did you try the 'no-snoop' option?

    I challenge anyone from Juniper Networks to defend this design.  I've never seen this on any other vendor's platform.

    I'll just say it....JUNIPER NETWORKS DOES NOT UNDERSTAND DHCP IN THE CONTEXT OF ENTERPRISE NETWORKING, and overall their quality control is the worst of any vendor I've ever worked with.



  • 28.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 09-16-2021 11:22
    Sure I'll send that over, the case number I mean.

    Just one question, when you said this "We then used the following knob on the 10K8:

    set routing-instances XXX forwarding-options dhcp-relay forward-snooped-clients all-interfaces"

    Did you mean you did that on the 10k or the 5k? I wasn't sure, because I thought you said you disabled it on the 10k entirely.

    I've done that on the 10k and the issue still persists. 



  • 29.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 11 days ago
    So the fix is over a year away? Post says q4 2022 or did you mean 2021? 

    Also, you reference EVPN. Our case did not involve that feature.


  • 30.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 9 days ago
    Is this problem specific to this hardware platform?


  • 31.  RE: QFX10008 Flooding unicast dhcp like broadcast/unknown

    Posted 11 days ago

    Hi Scott,

    My apologies. It's a typo. The release with fix would be available in Q4'21 itself.

    https://jtacworkbench.juniper.net/#/caseDetail/2021-0526-0545

    The fix that would be committed against PR reported through this TAC case would address the ucast  DHCP packet flooding issue being discussed in this thread.

    Regards,

    Aquin Mathai



    ------------------------------
    Aquin Mathai
    ------------------------------