I have a LUCHIP-related question for you.
We support a service provider network where the P/PE nodes are MXs (MX80/104/240/480/960, with MPC2, MPC2E, MPC3E and MPC5E cards deployed), currently running 16.1R6-S3.
We have seen the number of LUCHIP-related error messages rising month by month.
I have read in the Juniper KB that most of these issues are harmless, and that matches our case: there is no service degradation or FIB corruption. Still, it is disquieting that something unusual is happening at the PFE level and we do not know the reason.
In most cases we are seeing IDMEM/GUMEM memory integrity errors, and we do not know which event or routing update(?) triggers the inconsistency: https://kb.juniper.net/InfoCenter/index?page=content&id=KB24641&actp=search&viewlocale=en_US&searchid=1350354498136&act=login
My questions to you would be:
Thank you so much for your help in advance!
Senior System Engineer
As noted in the KB you mentioned, these are transient errors that the LU driver corrects as soon as they are detected (a software resiliency feature).
You can read about transient errors in the following KB article: https://kb.juniper.net/InfoCenter/index?page=content&id=KB32086
So if the message appears once and then stops, no further action is required.
If these errors occur frequently and have service impact, then it is a matter of concern and you will need to contact JTAC.
These errors are part and parcel of everyday operation and frankly should not be concerning. The error print only indicates that such an event occurred. If the error is repaired, subsequent memory checks will pass and the error message will stop. If the error messages keep appearing, reload the FPC once. If the issue persists after the reload, raise a JTAC case.
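To tell a one-off message from one that keeps appearing, it can help to count the messages per FPC over the collected logs. The following is only a minimal sketch: the log line layout (date, time, host, FPC name before the message), the sample lines, and the one-day threshold are assumptions for illustration and are not Juniper guidance. Only the "LUCHIP(0) IDMEM read error" text is taken from this thread, so the pattern must be adapted to your actual syslog output.

```python
import re
from collections import Counter

# Assumed line layout: "<Mon> <day> <time> <host> <fpc> LUCHIP(<n>) <text>".
# Only the "LUCHIP(0) IDMEM read error" text comes from this thread; the
# prefix fields are hypothetical and must be adapted to your own logs.
LINE_RE = re.compile(
    r"^(?P<day>\w{3}\s+\d+)\s+\S+\s+\S+\s+(?P<fpc>fpc\d+)\s+"
    r"LUCHIP\((?P<chip>\d+)\)\s+(?P<text>.+)$"
)


def count_by_source(lines):
    """Count LUCHIP messages per (fpc, chip) and collect the days they appeared on."""
    counts = Counter()
    days = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a LUCHIP line in the assumed layout
        key = (m["fpc"], int(m["chip"]))
        counts[key] += 1
        # Normalise "Mar  3" to "Mar 3" so padding does not create extra days.
        days.setdefault(key, set()).add(" ".join(m["day"].split()))
    return counts, days


def triage(counts, days, max_days=1):
    """Label a source 'recurring' if seen on more than max_days distinct days.

    max_days is an arbitrary illustrative threshold, not a vendor value.
    """
    return {
        key: "recurring" if len(days[key]) > max_days else "transient"
        for key in counts
    }


# Hypothetical sample lines, made up for this sketch.
SAMPLE = [
    "Mar  3 10:15:02 router1 fpc0 LUCHIP(0) IDMEM read error",
    "Mar  4 11:00:41 router1 fpc0 LUCHIP(0) IDMEM read error",
    "Mar  4 11:02:10 router1 fpc2 LUCHIP(1) IDMEM read error",
    "Mar  4 11:05:00 router1 rpd[1234]: unrelated message",
]

if __name__ == "__main__":
    counts, days = count_by_source(SAMPLE)
    for key, label in sorted(triage(counts, days).items()):
        print(key, counts[key], label)
```

A source labelled "recurring" here would be a candidate for the single FPC reload described above, and for a JTAC case if the messages continue afterwards.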
Could you please provide the error logs that you are getting on the PFE?
LUCHIP(0) IDMEM read error
Is this something like what you are seeing?
These messages are harmless and won't cause any issue for traffic or service; they relate to the internal data memory of the LUCHIP.
These error messages result from failed reads by a thread that proactively locates failing memory locations and repairs them. This thread only checks locations that have been initialized by the control plane. It is not uncommon for this thread to encounter an error. The message can also be caused by a race condition in software that generates a syslog message with no impact.
I hope this information helps you understand the error.