We have a virtual chassis with 6 EX4300 members.
The firmware version is 18.2R1.9.
We have high CPU usage on the primary routing engine.
This load is caused by our SNMP monitoring. We query ordinary values, for example temperature, CPU usage, or just the traffic of the individual ports.
It is precisely this polling of the traffic counters (the values "IF-MIB::ifHCInOctets" and "IF-MIB::ifHCOutOctets") that causes the load.
Here is a top excerpt:
last pid: 11414; load averages: 1.63, 1.42, 1.30 up 418+05:36:18 16:13:06
66 processes: 3 running, 63 sleeping
CPU states: 38.0% user, 0.0% nice, 41.5% system, 0.5% interrupt, 20.1% idle
Mem: 986M Active, 81M Inact, 152M Wired, 560M Cache, 112M Buf, 81M Free
Swap:

  PID USERNAME THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
 1793 root       1  76    0 59772K 38904K RUN    3246.6 30.08% mib2d
 1646 root       2 -52  -52   564M   208M select 2637.2 17.14% pfex_junos
 1792 root       1  76    0 35796K 25132K select 1214.3 12.45% snmpd
 1798 root       1  51    0 41172K 23352K select 518.2H  3.42% pfed
 1633 root       1  49    0 64888K 29128K select 334.9H  2.59% chassisd
11407 root       1  42    0 11836K  5572K select   0:00  0.22% sshd
We collect the values every 10 seconds. We also do this on our older EX4200 switches, where we have absolutely no problems.
If I change the collection interval for the graphs to 20 seconds, the load decreases, but we then see peaks in the graph every 20 seconds.
What could be the problem here? I doubt we are the only ones polling SNMP values, so I hope someone has a tip.
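For context, a graphing tool usually derives a traffic rate from two successive samples of a counter such as ifHCInOctets. The sketch below shows that arithmetic and one possible way peaks can appear. It is an illustration, not a diagnosis confirmed by anything in this thread. The function name and the sample values are hypothetical: the samples assume steady traffic where one poll returns a counter value that had not been refreshed since the previous poll.

```python
COUNTER64_MODULUS = 2 ** 64  # ifHCInOctets / ifHCOutOctets are 64-bit counters


def rate_bps(prev_octets, curr_octets, prev_time, curr_time):
    """Average bits per second between two counter samples."""
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # The modulo handles a counter that wrapped around between the samples.
    delta = (curr_octets - prev_octets) % COUNTER64_MODULUS
    return delta * 8 / elapsed


# Hypothetical (time in seconds, octets) samples. The value at t=20 is
# assumed to be stale, i.e. identical to the value returned at t=10.
samples = [(0, 0), (10, 1_250_000), (20, 1_250_000), (30, 3_750_000)]

rates = [
    rate_bps(prev[1], curr[1], prev[0], curr[0])
    for prev, curr in zip(samples, samples[1:])
]

for (t, _), rate in zip(samples[1:], rates):
    print(f"t={t:>3}s  {rate / 1e6:.1f} Mbit/s")
```

With these assumed samples, the stale reading produces a dip to zero followed by a peak of twice the real rate, even though the underlying traffic is constant.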
Best regards 🙂
Good day!
Please attach the RSI from both the working and the non-working switch so that we can check.
Please don't ask people to publicly post RSI files. There are far too many details in there that should not be on the open internet. Ask instead for the specific, targeted information needed for the issue at hand.
Any SNMP collection interval shorter than 5 minutes is aggressive, and polling more often than once a minute will certainly put load on the CPU, especially on older devices like the EX4200. If you need that level of detail in collection, talk to your monitoring vendor about gRPC support instead of SNMP.
I agree with steve: the SNMP polling interval seems aggressive. Increasing the polling interval should lower the CPU load, because the switch then has fewer SNMP requests to answer.
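As a rough illustration of how the polling interval scales the request load, here is a minimal sketch. The member count (6) and the two polled counters come from this thread. The port count per member is an assumption, since the thread does not say which EX4300 model is in use, and the function name is hypothetical.

```python
def counter_reads_per_second(interfaces, oids_per_interface, interval_seconds):
    """Average number of counter values the switch must serve per second."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return interfaces * oids_per_interface / interval_seconds


MEMBERS = 6              # virtual chassis members, from the original post
PORTS_PER_MEMBER = 48    # ASSUMPTION: not stated in the thread
OIDS_PER_INTERFACE = 2   # ifHCInOctets and ifHCOutOctets

interfaces = MEMBERS * PORTS_PER_MEMBER

for interval in (10, 20, 60, 300):
    reads = counter_reads_per_second(interfaces, OIDS_PER_INTERFACE, interval)
    print(f"every {interval:>3}s: about {reads:.1f} counter reads per second")
```

The load is inversely proportional to the interval, so under these assumptions moving from 10 seconds to 5 minutes cuts the counter reads by a factor of 30.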
Please mark "accept as solution" if this answers your query. Kudos are appreciated too!
Thanks for the confirmation, sharatainapur. 🙂