The IIC controller handles communication over the I2C bus, which is used to read environmental data (PSU voltage, temperature sensors, etc.) from the FRUs in the system. Based on the output of the command requested in the previous emails, I can see that multiple modules are reporting active errors (i.e. TMP75_EXHAUST_48, TMP75_INTAKE_4E, I2CLTC3880, and I2CADS7830). Please note that the error counts are very low. Transient hardware issues of this kind are quite common in modern computer systems, and many operating systems, especially consumer OSes on PCs and smartphones, do not raise an alarm for them. However, a Juniper MX router is a mission-critical device, so we choose to report as much information as we can. For example, every 5 seconds the ichsmb device driver issues a 2-byte read request to the PSU device. This request is generated by chassisd to read the PSU status periodically. If there is any read issue, such as a read timeout caused by a bus signal collision or link noise, the system generates a minor alarm even if the next request succeeds.
Please find below a similar KB article for the reported alarm:
The RE (e.g. RE0) raises an "RE0 IIC access Error" alarm when a resiliency I2C bus access error is detected; under normal circumstances the alarm clears within 24 hours. JTAC suggests rebooting RE0 in a safe maintenance window; if the reboot does not fix the issue, an RMA should be created for RE0. In most cases, these minor alarms are informational only and there is no service impact.
In previous Junos releases, a minor alarm was generated whenever there was a transient hardware failure (e.g. an ichsmb read timeout), which might not be necessary. These requests are periodic, so missing one or two of them is not a problem.
We do have an internal PR, PR/1615863, that suppresses the alarm unless there are three consecutive failures, to avoid alarming the user unnecessarily:
Transient ichsmb failures are retried three times. Failures are logged on every instance, but successful reads are silent. The calling function in chassisd should log an error message only once all 3 retries have failed, to avoid alarming the user. The ichsmb driver continues to log all failures so that there is visibility into failures when required.
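To make the behaviour described in the PR more concrete, here is a minimal Python sketch of the retry-then-report pattern. It is only an illustration and not the actual chassisd or ichsmb code: the function names, logger names, and the simulated reader are all hypothetical, and the only details taken from the PR text are the three attempts, the per-failure driver logging, and the single caller-level error after all attempts fail.

```python
import logging
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # from the PR text: transient failures are retried three times

# Hypothetical logger names, used only to separate the two logging levels.
driver_log = logging.getLogger("ichsmb")
chassisd_log = logging.getLogger("chassisd")


def read_with_retries(read_once: Callable[[], bytes],
                      attempts: int = MAX_ATTEMPTS) -> Optional[bytes]:
    """Try a bus read up to `attempts` times.

    Returns the data on the first success, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            data = read_once()
        except TimeoutError as exc:
            # Driver level: every individual failure stays visible in the log.
            driver_log.warning("read attempt %d/%d failed: %s",
                               attempt, attempts, exc)
            continue
        return data  # successful reads are silent
    # Caller level: report once, only after all attempts have failed.
    chassisd_log.error("read failed after %d attempts", attempts)
    return None


def make_flaky_reader(failures_before_success: int) -> Callable[[], bytes]:
    """Simulated device (assumption): times out N times, then answers."""
    state = {"remaining": failures_before_success}

    def read_once() -> bytes:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise TimeoutError("simulated read timeout")
        return b"\x00\x01"  # placeholder 2-byte status word

    return read_once
```

With this sketch, a reader that times out twice and then answers returns data without any caller-level error, while a reader that times out three times in a row produces exactly one caller-level error. That is the difference between the older behaviour (alarm on any single transient failure) and the behaviour described in the PR.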
The enhancement has been applied in the following releases:
Junos OS Evolved: 22.1R3-EVO, 22.2R2-EVO, 22.3R1-EVO, 22.4R1-EVO
Junos OS: 21.2R3-S2, 22.1R3, 22.2R2, 22.3R1
I hope the above helps.