I just wanted to let the community know about issues that occur with EX2200 switches running the 15.1Rx Junos versions.
The upgrade from 12.3 to 15.1R1 worked well and just as expected, while each further 15.1 release we upgraded to (including the recent R4 to R5) caused major stability issues on many of the switches.
We tracked down several aspects so far:
- When you copy a new software package to the switch, you will already see the switch's memory jump to well over 75-80% consumed. At that step we did not observe any other problems. However, that memory is not freed once the file has been transferred.
- If you continue and issue 'request system software add..' for the package, the switch consumes the remaining memory up to 100% and also runs at 100% CPU load. What you can expect at that point is STP loops and the unavailability of that switch and of every related subnet affected by the loop.
- Doing the copy and upgrade in one command, e.g. with 'request system software add http://someserver/jinstall... reboot', leads to the same issue.
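Given the points above, it may be worth checking memory usage before starting an install. Here is a minimal Python sketch, assuming the output of 'show chassis routing-engine' has been captured as text and contains a line of the form "Memory utilization NN percent". The function names, the sample text and the 75% threshold (taken from the figure mentioned above) are illustrative assumptions, not a documented Juniper limit.

```python
import re

# Assumed line format; adjust the pattern if your Junos release prints it differently.
_MEM_RE = re.compile(r"Memory utilization\s+(\d+)\s+percent")


def memory_utilization(cli_output):
    """Return the first memory utilization percentage found, or None."""
    match = _MEM_RE.search(cli_output)
    return int(match.group(1)) if match else None


def safe_to_install(cli_output, threshold=75):
    """True only if a utilization value was found and it is below the threshold."""
    used = memory_utilization(cli_output)
    return used is not None and used < threshold


# Hypothetical captured output, for illustration only.
sample = """Routing Engine status:
    Memory utilization          82 percent
"""
print(memory_utilization(sample))
print(safe_to_install(sample))
```

If no utilization line is found, the check deliberately reports "not safe", so a parsing problem never looks like a green light.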
We had a few switches where the upgrade made it through to the switch reboot, after which the device was up with the new software, but the majority of the 2200s just fail in operation.
The only workaround we have found so far is to copy the upgrade files to /var/tmp and reboot before actually installing the upgrade.
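To make the order of the workaround explicit, here is a small Python sketch that only builds the three CLI commands as strings (copy, reboot, install). It does not connect to a switch. The helper name and the PACKAGE.tgz placeholder are hypothetical, so substitute your real package name and server.

```python
def staged_upgrade_commands(package_url, package_file, staging_dir="/var/tmp"):
    """Return the Junos CLI commands for copy -> reboot -> install, in order."""
    staged_path = f"{staging_dir}/{package_file}"
    return [
        # 1. Stage the package on the switch.
        f"file copy {package_url} {staged_path}",
        # 2. Reboot before installing (the workaround described above).
        "request system reboot",
        # 3. Install from the staged copy and reboot into the new release.
        f"request system software add {staged_path} reboot",
    ]


# Placeholder URL and file name, for illustration only.
for command in staged_upgrade_commands("http://someserver/PACKAGE.tgz", "PACKAGE.tgz"):
    print(command)
```

Step 3 only makes sense once the switch is back up after step 2, so the commands have to be run one at a time rather than pasted as a batch.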
I assume this also happens on other platforms, but so far I have had no time to verify it or to open one more JTAC case. Other platforms like the EX3300 do work, however, which I think is simply because they have more RAM than the 2200 (512MB).
Hope this will save some of you from problems 🙂
Did you have these problems upgrading to 15.1R5? The fact that your post is from Jan 5th makes me think that the reference to 15.1R1 might be a typo? I strongly suggest opening a JTAC case to make sure nothing unusual is going on. Please let me know if you end up opening a JTAC case.
@Akeskin: We have still not opened a case. However, there is no typo. We adopted the 15.1 code pretty early due to a policy saying "you must not use firmware that is going to be EoE" - that was before Juniper decided to extend 12.3 support by one year...
I mentioned it in another thread - we still had the problems when we moved our test equipment from R4 to R5. Since R5 was often considered in the past to be the first "mature" version of a new train, I have some hope that things will improve now that the version has become recommended overall.
Just as a note: EX8200 NSSU is also broken in the 15.1 code. It leaves you with the XREs plus half the SREs updated, but then an "access denied" error appears and the other SREs do not upgrade, even if you reboot them manually. A plain software add with a whole-VC reboot works.
I cannot tell if this also applies to other models as I only tested NSSU for the 82's.
It is not very relevant to this topic, but similar behavior happened with our QFX5100: I did the NSSU from the recent 14.1X53-D35 to the latest D40 interim release, and after the NSSU OSPF was not working anymore. OSPF was in init state and not able to peer with the linked Cisco 6509s, although the links worked according to LACP and LLDP, and pinging the L3 interfaces also worked. Restarting the routing process did not help. A reboot of the VC, which was otherwise still working (checked via console), fixed this. I just avoid NSSU wherever I can and only try it if it worked on my test equipment...
Edit: Tested NSSU for EX3300, 4300 and 4500 now and they worked. Note that these were configured as simple L2 devices with a basic in-band L3 VLAN interface for management, without using any fancier routing features.
12.3R12 Recommended Release for EX2200 and EX2200-C
Not only that: we found out that the recommended release for the EX2200-C is broken and leaves the switch permanently showing the red alert due to "can't find temperature-sensors". Even worse, with the revision before, the EX2200-C shuts itself down as soon as it boots up because of "missing temp sensor" and issuing "fire shutdown"...
We went back to the 12 tree and the issues disappeared.