Over the past years I've encountered several branch SRXes that would not boot correctly after a power failure. Most of the time they boot from the secondary partition, and a "clean" reboot is enough to fix this. Sometimes, however, the device will not boot at all and gets stuck at the bootloader with the following message:
can't load '/kernel'
can't load '/kernel.old'
Press Enter to stop auto bootsequencing and to enter loader prompt.
Now, the first thing most people do is search for solutions, and so did I. There are a ton of posts on this, but none of them gave me satisfying results.
So I want to share what I have tried on various occasions and what doesn't work. At the moment I have 2 separate SRX300s with this problem, and I've decided to tackle it and get the procedure right once and for all. The SRX in question is running junos-srxsme-15.1X49-D170.4-domestic.
I started by formatting a 16GB USB stick with FAT32 and copying junos-srxsme-15.1X49-D170.4-domestic.tgz onto it. I then booted the SRX300 into the bootloader and tried:
install file:///junos-srxsme-15.1X49-D170.4-domestic.tgz
It starts the installation and looks promising, until:
Starting JUNOS installation:
Source Package: disk1:/junos-srxsme-15.1X49-D170.4-domestic.tgz
Target Media : internal
Product : srx300
Computing slice and partition sizes for /dev/da0 ...
awk: division by zero
input record number 1, file
source line number 3
The target media /dev/da0 (0 bytes) is too small.
The installation cannot proceed
ERROR: Target media is too small
Now, some suggest adding --format to the command, but this doesn't work, and I've read somewhere that it isn't supported on SRX (branch?) devices. On the other faulty SRX300 I got this message instead:
Target device selected for installation: internal media
cannot open package (error 22)
I'll get back to this after I fix the current one.
There are some posts that suggest making a bootable USB stick on a healthy SRX and booting from that. I reformatted my 16GB USB stick to FAT32, plugged it into a healthy SRX300 and used the following command: "run request system snapshot media usb"
When it finished after a while, I plugged the stick into the faulty one and rebooted:
SPI stage 1 bootloader (Build time: May 3 2016 - 23:48:30)
early_board_init: Board type: SRX_300
U-Boot 2013.07-JNPR-3.1 (Build time: May 03 2016 - 23:48:31)
SRX_300 board revision major:1, minor:9, serial #: CV3417AF1962
OCTEON CN7020-AAP pass 1.2, Core clock: 1200 MHz, IO clock: 600 MHz, DDR clock: 667 MHz (1334 Mhz DDR)
Base DRAM address used by u-boot: 0x10fc00000, size: 0x400000
DRAM: 4 GiB
Clearing DRAM...... done
Using default environment
SF: Detected MX25L6405D with page size 256 Bytes, erase size 64 KiB, total 8 MiB
Found valid SPI bootloader at offset: 0x90000, size: 1481840 bytes
U-Boot 2013.07-JNPR-3.1 (Build time: May 03 2016 - 23:50:19)
Using DRAM size from environment: 4096 MBytes
checkboard siege
SATA0: not available
SATA1: not available
SATA BIST STATUS = 0x0
SRX_300 board revision major:1, minor:9, serial #: CV3417AF1962
OCTEON CN7020-AAP pass 1.2, Core clock: 1200 MHz, IO clock: 600 MHz, DDR clock: 667 MHz (1334 Mhz DDR)
Base DRAM address used by u-boot: 0x10f000000, size: 0x1000000
DRAM: 4 GiB
Clearing DRAM...... done
SF: Detected MX25L6405D with page size 256 Bytes, erase size 64 KiB, total 8 MiB
PCIe: Port 0 link active, 1 lanes, speed gen2
PCIe: Link timeout on port 1, probably the slot is empty
PCIe: Port 2 not in PCIe mode, skipping
Net: octeth0
Interface 0 has 1 ports (SGMII)
Type the command 'usb start' to scan for USB storage devices.
Boot Media: eUSB usb
Found TPM SLB9660 TT 1.2 by Infineon
TPM initialized
Hit any key to stop autoboot: 0
SF: Detected MX25L6405D with page size 256 Bytes, erase size 64 KiB, total 8 MiB
SF: 1048576 bytes @ 0x200000 Read: OK
## Starting application at 0x8f0000a0 ...
Consoles: U-Boot console
Found compatible API, ver. 3.1
USB1:
Starting the controller
USB XHCI 1.00
scanning bus 1 for devices... 2 USB Device(s) found
USB0:
Starting the controller
USB XHCI 1.00
scanning bus 0 for devices... 2 USB Device(s) found
scanning usb for storage devices... Device NOT ready
Request Sense returned 02 3A 00
2 Storage Device(s) found
FreeBSD/MIPS U-Boot bootstrap loader, Revision 2.8
(slt-builder@svl-ssd-build-vm06.juniper.net, Tue Feb 10 00:32:30 PST 2015)
Memory: 4096MB
SF: Detected MX25L6405D with page size 256 Bytes, erase size 64 KiB, total 8 MiB
[2]Booting from usb slice 1
\
can't load '/kernel'
can't load '/kernel.old'
Press Enter to stop auto bootsequencing and to enter loader prompt.
So, no joy there. After some more searching I found a post and tried something else:
I stopped autoboot and got into the U-Boot shell, which has the prompt "Octeon srx_300_ram#":
Octeon srx_300_ram# env print
autoload=n
baudrate=9600
boardname=srx_300
boot.btsq.len=0x00010000
boot.btsq.start=0x007e0000
boot.current=primary
boot.devlist=eUSB:usb
boot.env.size=0x00002000
boot.env.start=0x007f0000
boot.upgrade.loader=0x00200000
boot.upgrade.loader.data=0x00200000
boot.upgrade.loader.hdr=0x002fffc0
boot.upgrade.uboot=0x00000000
boot.upgrade.uboot.data=0x00000100
boot.upgrade.uboot.hdr=0x00000030
boot.upgrade.uboot.maxsize=0x00200000
boot.upgrade.uboot.secondary=0x00000000
boot.upgrade.ushell=0x00300000
boot.ver=3.1
bootcmd=sf probe; sf read 0x100000 $(boot.upgrade.loader) 0x100000; bootelf 0x100000
bootdelay=5
disk.install=disk1
dram_size_mbytes=4096
ethact=octeth0
ethaddr=58:00:bb:a0:f1:00
loadaddr=0x20000000
loaddev=disk0:
numcores=2
octeon_failsafe_mode=0
octeon_ram_mode=1
serial#=CVXXXXXXXXXX
stderr=serial
stdin=serial
stdout=serial
ver=U-Boot 2013.07-JNPR-3.1 (Build time: May 03 2016 - 23:50:19)
wmem_selector=9448
Environment size: 1010/8188 bytes
Octeon srx_300_ram# setenv boot.current alternate
Octeon srx_300_ram# boot
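For anyone who wants to compare the environment of a healthy and a faulty box, here is a minimal Python sketch that turns a captured "env print" dump into a dictionary. It assumes the console output was saved as plain text with one name=value pair per line, as in the dump above. The names parse_uboot_env and SAMPLE are made up for illustration and are not part of any Juniper tool.

```python
def parse_uboot_env(text):
    """Parse `env print` output (one name=value per line) into a dict."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the trailing "Environment size: ..." summary.
        if not line or line.startswith("Environment size:"):
            continue
        # Split on the first '=' only; values such as bootcmd may contain more.
        name, sep, value = line.partition("=")
        if sep:
            env[name] = value
    return env


# A few lines copied from the dump above (hypothetical capture variable).
SAMPLE = """\
boot.current=primary
boot.devlist=eUSB:usb
bootcmd=sf probe; sf read 0x100000 $(boot.upgrade.loader) 0x100000; bootelf 0x100000
disk.install=disk1
loaddev=disk0:
Environment size: 1010/8188 bytes
"""

env = parse_uboot_env(SAMPLE)
print(env["boot.current"])             # primary
print(env["boot.devlist"].split(":"))  # ['eUSB', 'usb']
```

Parsing two such dumps and comparing the dictionaries key by key would show quickly whether anything besides boot.current differs between the two devices.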
This time it booted all the way, but the funny thing is that it booted from the USB stick that was still plugged in. The faulty SRX came up with the config of the healthy SRX I used to create the USB snapshot.
Every few minutes the following message shows up on the console:
(da0:umass-sim0:0:0:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
(da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): NOT READY asc:3a,0
(da0:umass-sim0:0:0:0): Medium not present
(da0:umass-sim0:0:0:0): Unretryable error
Opened disk da0 -> 6
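The bootloader line "Request Sense returned 02 3A 00" and the kernel message "NOT READY asc:3a,0 / Medium not present" appear to describe the same condition. A minimal Python sketch for decoding such a triple follows. It assumes the three bytes are sense key, ASC and ASCQ in that order, which matches the kernel output above but is not documented by the bootloader itself. The lookup tables are deliberately tiny and decode_sense is a hypothetical helper name.

```python
# Small subset of the standard SCSI sense keys; not a complete table.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}

# Only the ASC/ASCQ pair that shows up in the logs above.
ASC_ASCQ = {(0x3A, 0x00): "MEDIUM NOT PRESENT"}


def decode_sense(triple):
    """Decode 'KK AA QQ' (assumed order: sense key, ASC, ASCQ), all hex."""
    key, asc, ascq = (int(part, 16) for part in triple.split())
    key_name = SENSE_KEYS.get(key, f"sense key 0x{key:X}")
    detail = ASC_ASCQ.get((asc, ascq), f"ASC/ASCQ 0x{asc:02X}/0x{ascq:02X}")
    return f"{key_name}: {detail}"


print(decode_sense("02 3A 00"))  # NOT READY: MEDIUM NOT PRESENT
```

If that reading is right, the internal flash answers on the USB bus but reports no medium. That would fit the installer seeing /dev/da0 as 0 bytes, although this link is an inference and not something the logs confirm.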
So, where do I go from here? Also, running setenv boot.current alternate and then rebooting with the USB stick removed results in the same can't load '/kernel.old' message.
Kind regards,
Jeroen R