Log in to ask questions, share your expertise, or stay connected to content you value. Don’t have a login? Learn how to become a member.
Generally speaking, I really like working with the SRX. We use 210, 220, and 240 models throughout the company. It's trivially easy to set up tunnels with OSPF to do all kinds of neat inter-office connectivity, and working with JTAC is WAY better than Cisco TAC. (we have a Cisco phone system)
Five years ago, we bought 15 new SRXes from an authorized Juniper dealer, and each was installed in a separate geographic location.
I'm having GRAVE concerns about their reliability. In the past 2 years, 5 of the 15 have failed with a 6th one heading to the toilet.
6 failures out of 15... that's a 40% failure rate in 5 years. For the record, all are on APC UPSes of varying capacities, and utility power problems are extremely rare.Is the SRX really this much of a failure-prone dog? Juniper Netscreens we bought circa 2005-06 are still running TODAY with no problems at all... which is why I was so anxious to adopt the SRX at new locations. But wow.... the problems never end.Are we alone in this experience?
No - you're not the only one. We didn't have any jack- or pushbutton-issues, but loads of problems with bad blocks in NAND which often lead to problems during upgrade (i.e. change of boot partition). ISSU going haywire, Systems responding extremly slow after config change (had to re-image the divice). Or SRXes stuck in bootlaoder for no reason - issuing a 'boot' then brings them up (had it with severeal SRX300 so far) - but of course that has to be done from console, i.e. driving to customers site and do it locally since customers usually don't have serial adapter nor want to / are able to revive their equipment. Not to mention the extended downtime at customers site...
And it's not the SRXes alone - in the last few months, we had increasing problems with EX-switches too.
Corrupted filesystems (no power outage - NAND simply 'slowly dies' during regular operation within 2 years. JTAC tells me that's normal and we have to live with this). Update of a 9 chassis- VC left 4 of the chassis in boot-prompt
Sponatnoues reboot after a simple commit, false emergency fire-shutdowns due to possible bug in CPU temp sensor.
JUNOS Quality suffered massively - we ran into many bugs in the past - most of them 'confidential', i.e. we didn't even had a chance to circumvent them. To make things worse many (not all!) JTAC engineers have a strange way of tackeling problems ('please try to install a different JUNOS-Version in your production environment- we don't know if it will work (potluck), but hey - it's just half an hour of downtime (if you're lucky) and a drive to the customers site (since you might loose network access to the devices and need console access) - it might cost you a few thousand bucks, but be honest-money is not an issue...) or (well NAND problems ar inadvertable - please check nand on all your (200+) devices once a week to quickly identify problems...).
And I have the feeling that often, they didn't even try once to actually install their recommended versions of Junos on the corresponding devices - we had it more than once that the recommendation didn't work at all on the device (too little memory). Funny things then happen (e.g. systems boots, and forward packets but doesn't NAT anymore - no error messages...).
I already complained multiple times toward Juniper to beef up their QA again - so far in vein.