Kindly help us make sense of these log messages:
May 9 17:58:40 PE-0-NSSF-MSA-KE-re0 rpd: task_jsr_alloc: File exists
May 9 17:58:40 PE-0-NSSF-MSA-KE-re0 kernel: jsr_sdrl_setup_primary: Socket already has replication setup (state 0x1)
May 9 17:58:40 PE-0-NSSF-MSA-KE-re0 kernel: jsr_sdrl_pri_alloc: failed 17
May 9 17:58:41 PE-0-NSSF-MSA-KE-re0 rpd: task_jsr_alloc: File exists
May 9 17:58:41 PE-0-NSSF-MSA-KE-re0 kernel: jsr_sdrl_setup_primary: Socket already has replication setup (state 0x1)
Our NMS is showing these alarms regarding BGP on this particular MX:
"MSA-MX BGP AS37305 peer 2C 0F FC 70 00 00 00 08 00 00 00 00 00 00 00 00 has lost more than 20% of receiving prefixes
SP-M120 BGP AS37305 peer 2C 0F FC 70 00 00 00 08 00 00 00 00 00 00 00 01 has lost more than 20% of advertising prefixes
SP-M120 BGP AS37305 peer 29 4F 08 02 is DOWN"
Appreciate your help.
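A side note on the log lines above: the "failed 17" from the kernel and the "File exists" from rpd look like the same error seen from two sides. The snippet below is a minimal sketch that decodes the number, assuming the kernel is reporting a standard POSIX errno value (Junos is FreeBSD-based, and EEXIST is 17 on both FreeBSD and Linux). It runs on any workstation, not on the router.

```python
import errno
import os

# Value taken from the log line "jsr_sdrl_pri_alloc: failed 17".
# Assumption: it is a standard errno, decoded here with the local
# platform's table (EEXIST is 17 on Linux and the BSDs).
code = 17

name = errno.errorcode[code]   # symbolic name of the errno
message = os.strerror(code)    # human-readable text for the errno

print(name, "-", message)
```

Read that way, the messages say rpd tried to set up socket replication on a socket the kernel already considers to have replication configured.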
Howdy, please get the output from the following commands:
"show task replication" from the master RE
"show system switchover" from the backup RE
If either of them shows "Incomplete" or "NotStarted", that is the root cause of the issue. Switching RE mastership over and back should be enough to solve it; you need to do this during a maintenance window.
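If you want to script that check, here is a minimal sketch in Python. It assumes you have already captured the "show task replication" text by some means. The SAMPLE text and the function name unsynced_protocols are hypothetical and only for illustration; the protocol rows are not taken from this thread.

```python
# Hypothetical capture of "show task replication"; the protocol rows
# below are illustrative only.
SAMPLE = """\
Stateful Replication: Enabled
RE mode: Master

    Protocol                Synchronization Status
    OSPF                    Complete
    BGP                     NotStarted
"""


def unsynced_protocols(cli_output):
    """Return (protocol, status) pairs whose status is not 'Complete'."""
    problems = []
    in_table = False
    for line in cli_output.splitlines():
        fields = line.split()
        # The table starts after the "Protocol ... Status" header row.
        if fields[:1] == ["Protocol"]:
            in_table = True
            continue
        # Each table row is expected to be "<protocol> <status>".
        if in_table and len(fields) == 2 and fields[1] != "Complete":
            problems.append((fields[0], fields[1]))
    return problems


print(unsynced_protocols(SAMPLE))
```

Anything the function returns is a protocol whose replication has not finished, which is the condition described above.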
Lil Dexx, JNCIE-ENT#863, 3X JNCIP-[SP-ENT-DC], 4X JNCIA [cloud-DevOps-Junos-Design], Champions Ingenius, SSYB
Thank you for the assistance. I see BGP replication as NotStarted.
-re0# run show task replication
Stateful Replication: Enabled
RE mode: Master
Protocol Synchronization Status
Will an RE switchover solve this, or is there another way?
show system switchover on the backup RE looks like this:
-re1> show system switchover
Graceful switchover: On
Configuration database: Ready
Kernel database: Ready
Switchover Status: Ready
From the output, we can see that this is indeed a replication issue caused by BGP synchronization. The switchover is the way to go; here are the steps you need to follow:
1. Perform mastership switchover from RE0 to RE1
[[!! This will bounce the BGP neighbors due to NSR being broken !!]]
2. Reboot RE0 after it becomes backup.
3. Check the task replication state again. Make sure BGP goes to "Complete" status.
4. If BGP remains "NotStarted", then do another mastership switch from RE1 back to RE0.
NOTE: In my experience, just doing steps 1 and 2 did the job.
Also, check whether your Junos release falls within the "Resolved In" section of this link; if it does not, you are hitting this bug:
Thank you for the help. That solved the issue.
The easiest way to resolve the issue is to disable GRES/NSR, reboot RE1, and then re-enable GRES/NSR. But you may want to collect the logs and core dumps from both REs to understand why task replication failed between RE0 and RE1 in the first place.