Switching


Ask questions and share experiences about EX and QFX portfolios and all switching solutions across your data center, campus, and branch locations.
  • 1.  Sudden infrequent reboots of EX4200 0x2:watchdog

    Posted 04-25-2022 18:45

    My EX4200 started to infrequently reboot after about 7 years in operation with no reboots. The intermittent reboots are about once a month.

    The reboot reason is 0x2:watchdog

    show chassis routing-engine
    Routing Engine status:
      Slot 0:
        Current state                  Master
        Temperature                 48 degrees C / 118 degrees F
        CPU temperature             48 degrees C / 118 degrees F
        DRAM                      1024
        Memory utilization          49 percent
        CPU utilization:
          User                       6 percent
          Background                 0 percent
          Kernel                     4 percent
          Interrupt                  0 percent
          Idle                      89 percent
        Model                          EX4200-48T, 8 POE
        Serial ID                      BP0208249061
        Start time                     2013-12-17 20:31:57 PST
        Uptime                         1 hour, 4 minutes, 54 seconds
        Last reboot reason             0x2:watchdog
        Load averages:                 1 minute   5 minute  15 minute
                                           0.16       0.35       0.39
    

    There was an uninformative core dump:

    > show system core-dumps core-file-info /var/tmp/vmcore.0.gz
    fpc0:
    --------------------------------------------------------------------------
    'junos' process terminated
    Stack trace:
    #0  0x00000000 in ?? ()
    #0  0x00000000 in ?? ()
    
    {master:0}

    The master log has nothing: it suddenly stopped being written mid-line and resumed with the boot messages, without even a carriage return in between:

    > show log messages.4.gz
    Apr 25 11:01:49  365main-c0326-swi-01 sshd: SSHD_LOGIN_FAILED: Login failed for user 'root' from host '134.209.168.212'                               
    Apr 25 11:01:49  365main-c0326-swi-01 sshd[10932]: failed to update number of consecutive login attempts for user: root                               
    Apr 25 11:01:49  365main-c0326-swi-01 sshd[10932]: Received disconnect from 134.209.168.212: 11: Bye Bye [preauth]                                    
    Apr 25 11:01:49  365main-c0326-swi-01 inetd[1330]: /usr/sbin/sshd[10932]: exited, status 255                                                          
    Apr 25 11:01:52  365main-c0326-swi-01 sshd[10935]: Failed password for root from 61.177.173.5 port 52375 ssh2                                         
    Apr 25 11:01:52  365main-c0326-swi-01 sshd: SSHD_LOGIN_FAILED: Login failed for user 'root' from host '61.177.173.5'                                  
    Apr 25 11:01:52  365main-c0326-swi-01 sshd[10935]: failedDec 17 20:33:57  365main-c0326-swi-01 eventd[967]: SYSTEM_OPERATIONAL: System is operational 
    Dec 17 20:33:57  365main-c0326-swi-01 /kernel: Copyright (c) 1996-2013, Juniper Networks, Inc.                                                        
    Dec 17 20:33:57  365main-c0326-swi-01 /kernel: All rights reserved.                                                                                   
    Dec 17 20:33:57  365main-c0326-swi-01 /kernel: Copyright (c) 1992-2006 The FreeBSD Project.                                                           
    Dec 17 20:33:57  365main-c0326-swi-01 /kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994                               
    Dec 17 20:33:57  365main-c0326-swi-01 /kernel:  The Regents of the University of California. All rights reserved.                                     
    Dec 17 20:33:57  365main-c0326-swi-01 /kernel: JUNOS 12.3R5.7 #0: 2013-12-18 01:32:43 UTC           

    I have two identically configured EX4200s, and both started these random reboots around the same time, a few months ago, so I assumed it was not a hardware problem. The system uptime output showed an excessive load average of about 1.5, and tcpdump in the shell showed a UDP NTP flood. I reconfigured the firewall to protect the routing engine from the NTP traffic, and the load averages dropped to 0.3-0.5. The second switch has not rebooted since, but this one has. There are no chassis alarms.
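
    The exact filter used is not shown in this post. For readers following along, a routing-engine protection filter of this kind might look like the sketch below. The filter name protect-re, the term names, and the NTP server address 192.0.2.10 are placeholders chosen for illustration, not values from this switch.

    ```
    # Hypothetical example: allow NTP only from one trusted server (placeholder address)
    set firewall family inet filter protect-re term ntp-allow from source-address 192.0.2.10/32
    set firewall family inet filter protect-re term ntp-allow from protocol udp
    set firewall family inet filter protect-re term ntp-allow from port ntp
    set firewall family inet filter protect-re term ntp-allow then accept
    # Discard all other NTP traffic addressed to the routing engine
    set firewall family inet filter protect-re term ntp-deny from protocol udp
    set firewall family inet filter protect-re term ntp-deny from port ntp
    set firewall family inet filter protect-re term ntp-deny then discard
    # Accept everything else so management access is not cut off
    set firewall family inet filter protect-re term default then accept
    # Apply the filter to traffic destined for the routing engine
    set interfaces lo0 unit 0 family inet filter input protect-re
    ```

    The final accept term matters in a sketch like this: a filter ends with an implicit discard, so without it the switch would drop all other traffic sent to the routing engine, including SSH.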

    What might be causing the reboots?



  • 2.  RE: Sudden infrequent reboots of EX4200 0x2:watchdog

    Posted 04-25-2022 19:24
    There is a known software bug that is noted in PR1047142 and will require a software update to one of the fixed versions.

    https://supportportal.juniper.net/s/article/Junos-Service-Release-12-3R9-S1-Released

    ------------------------------
    Steve Puluka BSEET - Juniper Ambassador
    IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
    http://puluka.com/home
    ------------------------------



  • 3.  RE: Sudden infrequent reboots of EX4200 0x2:watchdog

    Posted 04-26-2022 05:25

    Thanks. The PR says:

    • PR1047142 - EX4200 rebooting randomly due to watchdog:0x2 without any logs and core.

    In this case, there was a core dump, but it was uninformative. Do you think this update can still fix it?


  • 4.  RE: Sudden infrequent reboots of EX4200 0x2:watchdog

    Posted 04-26-2022 05:26
    Yes, assuming your Junos version is earlier than the listed one. It looks like the same issue, since the core is empty.
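
    For reference, the boot log quoted in the first post reports JUNOS 12.3R5.7, which is earlier than the 12.3R9-S1 service release linked above. A rough sketch of the check-and-upgrade sequence follows; the package path is a placeholder, and the right target release should be confirmed against the PR before installing anything.

    ```
    # Confirm the running Junos release
    show version
    # Install the fixed release and reboot (placeholder package path)
    request system software add /var/tmp/<package-file> reboot
    # After the upgrade, watch for further watchdog reboots and new core files
    show chassis routing-engine | match "Last reboot reason"
    show system core-dumps
    ```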

    ------------------------------
    Steve Puluka BSEET - Juniper Ambassador
    IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
    http://puluka.com/home
    ------------------------------