All I did was reboot an EX4200 stack and one member didn't come up. It was working fine before the reboot. This member has the larger power supplies for PoE on all 48 ports. The front panel display for 'Status' shows PWR0 and PWR1 as 'Absent', which is nuts because they are both present and running (otherwise the front panel wouldn't work). It also shows FANS as 'Failed', but I can hear them. TEMP is 'Ok'. But no port LEDs are on, and J-Web shows the member as 'Inactive'.
If the failures are false alarms and not a real failure, I think your best bet will be to:
Thanks. I forgot to mention that the version of Junos shown on the front panel display was 12 instead of 15, which the rest of the stack is running. However, I don't know if this indicates anything wrong. Do stack members get Junos from the master when booting?
Yes, the version is a good clue.
This likely means the primary partition was corrupted and the EX booted from the backup partition, which still had the old Junos on it.
You will need to upgrade this member again to the same version as the rest of the VC.
Then you will want to run a snapshot to put the correct version onto the backup partition for all members:
request system snapshot slice alternate
Can the member be upgraded while still a part of the stack?
I logged into the inactive member and a 'df' shows the root disk is /dev/da0s1a. On the other active members the root is /dev/da0s2a. So it appears it may have booted to a backup partition as suggested.
I would appreciate advice on what to do next. Can this be resolved while it's part of the VC? If I returned this member to factory default, would this repair /dev/da0s2a? Would the switch then automatically update and join the VC?
The safest way is to disconnect the VC cable, run the upgrade process, and then reconnect when complete.
That said, I have run upgrades on an EX4200 stack member while it was connected to the VC without issue in the past.
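As a rough sketch of the standalone upgrade, assuming the install package has already been copied to the member (the package filename below is a placeholder, not a real file name, and the `user@switch>` prompt is illustrative), the sequence would look something like this. Freeing disk space first is usually worthwhile on these switches, since the install can fail if storage is tight.

```
# Free up space before installing (prompts for confirmation)
user@switch> request system storage cleanup

# <package-file> is a placeholder for the Junos image matching the rest of the VC
user@switch> request system software add /var/tmp/<package-file> reboot

# After the reboot, confirm the version matches the other members
user@switch> show version
```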
We disconnected the member and did a return to factory default. It was now running version 12. We reconnected one VC cable and attempted to update from 12 to 15. This failed due to lack of disk space. We then did the update from USB.
Everything is working again; however, the primary on this member is da0s2a. If I do a: request system snapshot slice alternate
will it be smart enough to realize that slice 2 is the primary, not slice 1? I don't want it to copy slice 1 to 2.
Also, after copying, how would I return slice 1 to being the primary?
And last, where is the 'request system snapshot' command fully documented? It's not in the CLI manual.
The designation of the primary partition is dynamic based on circumstances, so there is no issue with either one being primary and no need to force a change.
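As far as I know, 'slice alternate' copies from whichever slice is currently active to the other one, so with da0s2a booted it should copy slice 2 to slice 1 and not the reverse. A minimal sketch for checking before and after, assuming these commands are available on your EX Series release (the `user@switch>` prompt is illustrative):

```
# Show which partition the switch booted from and which is primary/backup
user@switch> show system storage partitions

# Copy the currently active root partition to the alternate slice
user@switch> request system snapshot slice alternate

# Confirm both slices now report the same Junos version
user@switch> show system snapshot media internal
```

On a Virtual Chassis, the snapshot command should also accept a member scope such as 'all-members', but treat that as something to confirm with the CLI's '?' help on your version.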