I've got a 3 member virtual-chassis EX4200 that won't commit its config. This morning, member0 rebooted and dumped core immediately after a commit was performed. Console showed a lot of this, repeatedly:
I=16105
UNEXPECTED SOFT UPDATE INCONSISTENCY
CLEAR? yes
DIRECTORY CORRUPTED I=8205 OWNER=0 MODE=40755
SIZE=512 MTIME=Jun 14 02:58 2013
DIR=?
UNEXPECTED SOFT UPDATE INCONSISTENCY
SALVAGE? yes
There are also some errors having to do with FIPS failures before the boot finished. Now when I attempt to commit, I get:
# commit check
fpc1:
configuration check succeeds
fpc0:
2020-10-22 11:57:46 EDT: Running FIPS Self-tests
veriexec: /boot/loader: No such file or directory
/sbin/kats/file_integrity: cannot open /boot/loader: No such file or directory
Failed SHA1 checksum of /boot/loader
@ 1603382266 [2020-10-22 11:57:46] fips-error: FIPS Error 1: File integrity test failed
Abort trap (core dumped)
2020-10-22 11:57:47 EDT: FIPS Self-tests Failed
error: configuration check-out failed
fpc1:
error: remote commit-configuration failed on fpc0
fpc2:
configuration check succeeds
fpc1:
error: configuration check-out failed
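For anyone who wants to pick the blocking member out of a long commit log automatically, here is a minimal sketch. It assumes the commit output has been saved as plain text; the function name failing_members is hypothetical, and the only pattern it relies on is the "remote commit-configuration failed on fpcN" line shown above.

```python
import re

def failing_members(output):
    """Return the sorted member names that the master reports as having
    failed the remote commit (hypothetical helper, not a Junos tool)."""
    found = re.findall(r"remote commit-configuration failed on (fpc\d+)", output)
    return sorted(set(found))

# Sample text copied from the commit check output in this post.
sample = (
    "fpc1:\n"
    "configuration check succeeds\n"
    "fpc0:\n"
    "2020-10-22 11:57:47 EDT: FIPS Self-tests Failed\n"
    "error: configuration check-out failed\n"
    "fpc1:\n"
    "error: remote commit-configuration failed on fpc0\n"
    "fpc2:\n"
    "configuration check succeeds\n"
)

print(failing_members(sample))  # ['fpc0']
```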
I also ran a disk check and I'm seeing 'bad read' errors:
% nand-mediack -C
Media check on da0 on ex platforms
Zone 06 Block 0186 Addr 18ba00 : Bad read
Zone 06 Block 0502 Addr 19f600 : Bad read
Zone 06 Block 0523 Addr 1a0b00 : Bad read
Zone 06 Block 0700 Addr 1abc00 : Bad read
Zone 06 Block 0807 Addr 1b2700 : Bad read
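To get a quick count of how many bad reads fall in each zone, a rough sketch like the following could be run against saved nand-mediack output. It assumes the line format is exactly as shown above ("Zone NN Block NNNN Addr XXXXXX : Bad read"); the helper name bad_reads_per_zone is hypothetical.

```python
import re
from collections import Counter

# Pattern assumed from the output in this post; other platforms or
# releases may print the lines differently.
BAD_READ = re.compile(r"Zone (\d+) Block (\d+) Addr ([0-9a-fA-F]+)\s*:\s*Bad read")

def bad_reads_per_zone(output):
    """Count 'Bad read' entries per zone in nand-mediack output text."""
    return Counter(zone for zone, _block, _addr in BAD_READ.findall(output))

# Two lines copied from the media check above.
sample = (
    "Zone 06 Block 0186 Addr 18ba00 : Bad read\n"
    "Zone 06 Block 0502 Addr 19f600 : Bad read\n"
)

for zone, count in bad_reads_per_zone(sample).items():
    print(f"zone {zone}: {count} bad read(s)")
```

A count that keeps growing between runs would be consistent with failing flash rather than a one-off file system problem, though that is an interpretation and not something the tool reports itself.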
I am pretty sure I need to re-install Junos from USB and format the disk in the process, but is there any safe way to commit pending changes without doing this in the meantime?
- Unfortunately, configuration changes cannot be committed as long as this member reports a hardware failure.
- Re-installing Junos with format is mandatory as a trial before declaring this switch dead, in which case an RMA will be needed.
You may use the steps described in this KB for recovery
I would think the commit will not go through until you fix that member, but you can try the following:
# commit full force (this makes all daemons re-check the applied config)
Besides that, you can try removing the member that is causing the issue. In this case it looks like fpc0 is the one blocking the commit, so you can disconnect the VC cables from this member and try to commit again; the commit will then be applied to the other members. If this member is the master, you can try to switch mastership first to avoid issues:
> request chassis routing-engine master switch
Thanks for the response. What about a snapshot on the alternate slice, rebooting, then snapshotting onto the primary slice? Is that worth a shot or just go right to install via the loader from USB?
I am afraid that, since bad-block recovery commands like "nand-mediack / fsck" didn't help, your switch is experiencing real bad blocks.
However, snapshotting from the alternate slice is worth a try as a last resort.
Please refer to the KB https://kb.juniper.net/InfoCenter/index?page=content&id=KB23180 to restore main slice.
1) Snapshot from the backup slice to the main slice:
request system snapshot media internal slice alternate
2) Check the result.
3) Reboot from the main slice:
request system reboot slice alternate media internal
But I would recommend being ready to install from USB in case of failure
Hope this helps. 😎
On legacy EX switches, file system check (fsck) is run with the -C option, which skips the file system corruption check if the partition has been marked clean during the boot "nand-media" check. Due to this, there have been multiple instances where the partition has had file system issues even when cleanly shut down.
In the rare instance that the file system check (fsck) is completed and file system corruptions continue to be seen, you would need to perform an install -format. This will format the file system and all file system corruptions will be removed, along with any previous logs and configuration. To perform format install, refer to
Unfortunately, booting to the alternate slice did not help. I ended up having to install from USB, formatting the disk in the process:
This process was relatively painless and very simple.