Junos OS


Ask questions and share experiences about Junos OS.
  • 1.  How about a catastrophe scenario ...

    Posted 02-03-2023 08:27

    Bonjour everybody

    Let's see whether YOU can spot the problem before it happens.

    1)      I use the mgmt port on the 10 EX2300-C switches in my lab, for convenience.

    2)      I don't use the mgmt port on the 250 switches in the field. I have a management VLAN shared by all switches.

    3)      Lab and field switches all run Junos 20.4R2S2.

    4)      Last year, we had to add an SFP with an xe interface on 2 field switches.

    5)      I tested a config in my lab and, when I was sure it worked, copied it remotely from the lab to the 2 sites.

    6)      Last week we wanted to upgrade the 250 prod switches to Junos 22.3

    7)      I read the release notes for 22.3 and didn't notice anything that applied to my configurations.

    8)      I followed the same steps that I have been following for 3 years, hundreds of times:

    a.       Have coffee

    b.       Put the new image in /tmp directory on each switch

    c.       MD5-check that the file is OK

    d.       request system software add <new image> no-copy no-validate

    e.       Make sure the image installed correctly by checking that the last line says "will take effect at next reboot"

    f.        Notify clients of the upcoming 15-minute restart of the 2300s on the scheduled date

    g.       Have coffee

    9)      On the scheduled date, at 23:00, I requested a reboot on all switches.

    10)   After 15 minutes, all switches were back online, running Junos 22.3 ... all except 2, with which I had completely lost contact.

     

    It took me a while to figure it out. I learned from my mistake, and I'll share the solution with you next week :)

    Best regards,

     

    __________________________________________

    Michel Lapointe

    GIRAT



  • 2.  RE: How about a catastrophe scenario ...

    Posted 02-06-2023 21:31
    Now, before I write up the solution, I must say that I did what I thought was due diligence:
    I installed 22.3 on a dozen EX2300s located in a separate rack, always powered on and ready to be configured for the next client installation. Everything worked fine.
    Before upgrading the 250 switches in production, I also upgraded 5 EX2300s at some nearby sites that I controlled, just in case something went wrong. Nothing went wrong.
    SOLUTION: 
    As far as I can tell, what happened is this...
    At step 4, I mentioned that I had used my lab configuration to prepare a configuration using an xe interface. I copied that configuration to the 2 production switches without noticing that, as a result, those 2 prod switches now had an me0 management port configured.
    A year (and a few image updates) later, I finally pushed the new 22.3 image to the prod switches. My 2nd mistake was not paying attention to the fact that the request system software add command had the no-validate option.
    Last chance: do you see what's coming?
    When I remotely requested a reboot of all the 2300s, I lost access to the 2 switches that had an me0 mgmt interface with an IP address, since 22.3 refuses a commit when both the vme AND me0 interfaces are configured with an address. vme was still there from the factory configuration:

    error: Address cannot be configured on me0 and vme at the same time

    error: configuration check-out failed

    I was therefore rebooting each of those switches into a non-working configuration. And since I had carefully updated the rescue configuration to match the running one before the reboot, even the rescue configuration could not be loaded on the switch.
    Had I not used the no-validate option when doing the image update, the installation would have aborted with a clear error message:

    Interface control process: <message>Address cannot be configured on me0 and vme at the same time</message>

    Interface control process: </xnm:error>

    mgd: error: configuration check-out failed

    Validation failed

    ERROR: Current configuration not compatible with junos-arm-32-22.3R1.11.tgz
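
    For anyone who wants to look for this conflict before an upgrade, here is a minimal sketch in Python. It assumes you have already collected each switch's configuration in set format (for example from "show configuration | display set"). The function names are hypothetical, and treating a dhcp statement on vme the same way as a static address is my assumption about what the Junos check objects to, not something the error message confirms.

```python
import re

# Matches set-format lines that put IP configuration on a management
# interface, for example:
#   set interfaces me0 unit 0 family inet address 192.0.2.10/24
#   set interfaces vme unit 0 family inet dhcp
# Assumption: a dhcp statement is treated like a static address here.
MGMT_IP_LINE = re.compile(
    r"^set interfaces (me0|vme) unit \d+ family inet6? (address|dhcp)\b"
)


def mgmt_interfaces_with_ip_config(set_lines):
    """Return the set of management interfaces (me0, vme) that carry IP config."""
    found = set()
    for line in set_lines:
        match = MGMT_IP_LINE.match(line.strip())
        if match:
            found.add(match.group(1))
    return found


def has_me0_vme_conflict(set_lines):
    """Return True when both me0 and vme carry IP configuration."""
    return mgmt_interfaces_with_ip_config(set_lines) == {"me0", "vme"}
```

    Feeding it the lines of a saved configuration, has_me0_vme_conflict(lines) returning True would mark a switch to fix before the reboot. It is only a text check and does not replace the validation that the installer performs itself.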


    My lesson learned: don't use the no-validate flag when doing an image upgrade, and upgrade in small groups. You never know when a surprising sequence of events will destroy your best-laid plans.
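
    The "small groups" idea can be sketched in a few lines of Python. This is only an illustration of the control flow: upgrade and is_reachable are hypothetical callables that you would supply yourself (the sketch does not talk to any switch), and the group size is whatever loss you can tolerate in one night.

```python
def batches(items, size):
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def staged_rollout(hosts, size, upgrade, is_reachable):
    """Upgrade hosts group by group and stop at the first group with a loss.

    `upgrade` and `is_reachable` are caller-supplied callables
    (hypothetical placeholders in this sketch).
    Returns (upgraded_hosts, unreachable_hosts).
    """
    done = []
    for group in batches(list(hosts), size):
        for host in group:
            upgrade(host)
        # Check the whole group before touching the next one.
        lost = [host for host in group if not is_reachable(host)]
        done.extend(host for host in group if host not in lost)
        if lost:
            return done, lost
    return done, []
```

    With 250 switches and a group size of 10, a problem like mine would have stopped the rollout after the first group containing an affected switch, instead of surfacing only once the whole fleet had rebooted.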

    Michel Lapointe
