Hello,
I've been setting up automatic internet fail-over on our SRX-300 devices and have a problem with the event handling.
The "back-to-normal-maininet" event policy (to be run when RPM indicates the ping monitor is succeeding) is being run multiple times per day when there should be no reason to. The "no-prefer-maininet-pingfail" event policy (to be run when the RPM monitor fails) is never run at any time before these events.
Any ideas?
SRX-300
JunOS 18.2R3-S2.9
system {
services {
syslog {
file rpm.log {
any any;
match rmopd;
archive size 5m files 1;
explicit-priority;
}
}
}
services {
rpm {
traceoptions {
file RPM_status.txt size 10m files 2 world-readable;
flag all;
}
probe main-inet-probe {
test probe-oogle {
probe-type icmp-ping;
target address 8.8.8.8;
probe-count 3;
probe-interval 1;
test-interval 3;
thresholds {
successive-loss 3;
}
destination-interface ge-0/0/5.0;
next-hop 71.13.X.U;
}
}
}
}
event-options {
policy no-prefer-maininet-pingfail {
events [ ping_test_failed ping_probe_failed ];
within 30 {
trigger until 4;
}
within 25 {
trigger on 3;
}
attributes-match {
ping_test_failed.test-owner matches main-inet-probe;
ping_test_failed.test-name matches probe-oogle;
}
then {
priority-override {
severity warning;
}
change-configuration {
retry count 3 interval 20;
commands {
"set routing-options static route 0.0.0.0/0 preference 42000";
}
commit-options {
log "RPM ping fail, disable primary default route";
}
}
execute-commands {
commands {
# this is just help me better monitor when these events are triggered
"request snmp spoof-trap mplsTunnelDown";
}
}
}
}
policy back-to-normal-maininet {
events ping_test_completed;
within 20 {
trigger on 3;
}
within 25 {
trigger until 4;
}
attributes-match {
ping_test_completed.test-owner matches main-inet-probe;
ping_test_completed.test-name matches probe-oogle;
}
then {
priority-override {
severity warning;
}
change-configuration {
retry count 3 interval 20;
commands {
"delete routing-options static route 0.0.0.0/0 preference";
}
commit-options {
log "RPM ping success, return to normal routing";
}
}
execute-commands {
commands {
# this is just help me better monitor when these events are triggered
"request snmp spoof-trap mplsTunnelUp";
}
}
}
}
traceoptions {
file EVENTOPTIONS.txt size 5m files 2 world-readable;
flag all;
}
}
% grep '09:18:' /var/log/RPM_status.txt | egrep -v 'Next-hop|rmop_calc_jitter'
...
Nov 18 09:18:32 test_done: sent 3, test 0
Nov 18 09:18:32 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:18:32 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:18:37 test_done: sent 3, test 0
Nov 18 09:18:37 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:18:37 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:18:42 test_done: sent 3, test 0
Nov 18 09:18:42 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:18:42 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:18:47 test_done: sent 3, test 0
Nov 18 09:18:47 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:18:47 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:18:56 test_done: sent 3, test 0
Nov 18 09:18:56 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:18:56 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:19:05 handle_timeout: time out 3, test 0 entry->probe_state 1
Nov 18 09:19:05 handle_timeout: time out 3, test 0
Nov 18 09:19:05 test_done: sent 3, test 0
Nov 18 09:19:05 PING_TEST_FAILED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:05 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state FAIL
Nov 18 09:19:05 icmp_notify: not sent, test 0, ping 2
Nov 18 09:19:10 test_done: sent 3, test 0
Nov 18 09:19:10 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:10 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:19:15 test_done: sent 3, test 0
Nov 18 09:19:15 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:15 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:19:22 test_done: sent 3, test 0
Nov 18 09:19:22 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:22 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:19:26 handle_timeout: time out 1, test 0 entry->probe_state 1
Nov 18 09:19:26 handle_timeout: time out 1, test 0
Nov 18 09:19:27 icmp_notify: not sent, test 0, ping 0
Nov 18 09:19:27 test_done: sent 3, test 0
Nov 18 09:19:27 PING_TEST_FAILED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:27 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state FAIL
Nov 18 09:19:32 test_done: sent 3, test 0
Nov 18 09:19:32 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:32 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:19:37 test_done: sent 3, test 0
Nov 18 09:19:37 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:37 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:19:46 test_done: sent 3, test 0
Nov 18 09:19:46 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:46 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:19:56 test_done: sent 3, test 0
Nov 18 09:19:56 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:56 RTM_CHANGE gencfg for probe main-inet-probe, test probe-oogle to state PASS
Nov 18 09:20:01 test_done: sent 3, test 0
Nov 18 09:20:01 PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
...
% grep '09:18:' /var/log/rpm.log
...
Nov 18 09:18:32 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:18:37 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:18:42 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:18:47 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:18:56 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:05 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_FAILED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:10 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:15 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:22 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:27 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_FAILED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:32 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:37 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:46 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:19:56 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
Nov 18 09:20:01 ws-firewall rmopd[2051]: %DAEMON-6-PING_TEST_COMPLETED: pingCtlOwnerIndex = main-inet-probe, pingCtlTestName = probe-oogle
...
% egrep '09:(18|19|20):' /var/log/EVENTOPTIONS.txt | egrep 'PING|Execut'
...
Nov 18 09:18:32 Processing event <PING_TEST_COMPLETED>
Nov 18 09:18:32 Processing event <PING_TEST_COMPLETED>
Nov 18 09:18:32 Policy <back-to-normal-maininet>'s conditions dont match for event <PING_TEST_COMPLETED>
Nov 18 09:18:37 event_filter_tree_find: Event id PING_TEST_COMPLETED not found
Nov 18 09:18:37 Processing event <PING_TEST_COMPLETED>
Nov 18 09:18:37 Policy <back-to-normal-maininet>'s conditions dont match for event <PING_TEST_COMPLETED>
Nov 18 09:18:37 event_filter_tree_find: Event id PING_TEST_COMPLETED not found
Nov 18 09:18:42 Processing event <PING_TEST_COMPLETED>
Nov 18 09:18:42 Executing policy: <back-to-normal-maininet>
Nov 18 09:18:42 Changing priority, for policy 'back-to-normal-maininet' triggered on receipt of event 'PING_TEST_COMPLETED'
Nov 18 09:18:42 executing commands for policy 'back-to-normal-maininet' triggered on receipt of event 'PING_TEST_COMPLETED', with '/tmp/evt_cmd_0TKvCe' as the temporary command file, and '/dev/null' as the temporary output file
Nov 18 09:18:42 Executing commands </usr/libexec/invoke-commands /tmp/evt_cmd_0TKvCe /dev/null XML>
Nov 18 09:18:42 event_filter_tree_find: Event id PING_TEST_COMPLETED not found
Nov 18 09:18:44 changing configuration for policy 'back-to-normal-maininet' triggered on receipt of event 'PING_TEST_COMPLETED', with '/tmp/evt_cmd_kasf4N' as the temporary command file
Nov 18 09:18:47 Processing event <PING_TEST_COMPLETED>
Nov 18 09:18:47 Policy <back-to-normal-maininet>'s conditions dont match for event <PING_TEST_COMPLETED>
Nov 18 09:18:47 event_filter_tree_find: Event id PING_TEST_COMPLETED not found
Nov 18 09:18:56 Processing event <PING_TEST_COMPLETED>
Nov 18 09:18:56 Policy <back-to-normal-maininet>'s conditions dont match for event <PING_TEST_COMPLETED>
Nov 18 09:18:56 event_filter_tree_find: Event id PING_TEST_COMPLETED not found
Nov 18 09:19:05 Processing event <PING_TEST_FAILED>
Nov 18 09:19:05 Policy <no-prefer-maininet-pingfail>'s conditions dont match for event <PING_TEST_FAILED>
Nov 18 09:19:06 event_filter_tree_find: Event id PING_TEST_FAILED not found
Nov 18 09:19:10 Processing event <PING_TEST_COMPLETED>
Nov 18 09:19:10 Policy <back-to-normal-maininet>'s conditions dont match for event <PING_TEST_COMPLETED>
Nov 18 09:19:10 event_filter_tree_find: Event id PING_TEST_COMPLETED not found
Nov 18 09:19:16 Processing event <PING_TEST_COMPLETED>
Nov 18 09:19:16 Executing policy: <back-to-normal-maininet>
Nov 18 09:19:16 Changing priority, for policy 'back-to-normal-maininet' triggered on receipt of event 'PING_TEST_COMPLETED'
Nov 18 09:19:16 executing commands for policy 'back-to-normal-maininet' triggered on receipt of event 'PING_TEST_COMPLETED', with '/tmp/evt_cmd_XjFc9d' as the temporary command file, and '/dev/null' as the temporary output file
Nov 18 09:19:16 event_filter_tree_find: Event id PING_TEST_COMPLETED not found
Nov 18 09:19:16 Executing commands </usr/libexec/invoke-commands /tmp/evt_cmd_XjFc9d /dev/null XML>
Nov 18 09:19:22 Processing event <PING_TEST_COMPLETED>
Nov 18 09:19:22 Executing policy: <back-to-normal-maininet>
Nov 18 09:19:22 Changing priority, for policy 'back-to-normal-maininet' triggered on receipt of event 'PING_TEST_COMPLETED'
Nov 18 09:19:22 executing commands for policy 'back-to-normal-maininet' triggered on receipt of event 'PING_TEST_COMPLETED', with '/tmp/evt_cmd_6Zjtdq' as the temporary command file, and '/dev/null' as the temporary output file
Nov 18 09:19:22 event_filter_tree_find: Event id PING_TEST_COMPLETED not found
Nov 18 09:19:22 Executing commands </usr/libexec/invoke-commands /tmp/evt_cmd_6Zjtdq /dev/null XML>
Nov 18 09:19:27 Processing event <PING_TEST_FAILED>
Nov 18 09:19:27 Policy <no-prefer-maininet-pingfail>'s conditions dont match for event <PING_TEST_FAILED>
Nov 18 09:19:27 event_filter_tree_find: Event id PING_TEST_FAILED not found
Nov 18 09:19:27 changing configuration for policy 'back-to-normal-maininet' triggered on receipt of event 'PING_TEST_COMPLETED', with '/tmp/evt_cmd_gk3W5N' as the temporary command file
Nov 18 09:19:28 changing configuration for policy 'back-to-normal-maininet' triggered on receipt of event 'PING_TEST_COMPLETED', with '/tmp/evt_cmd_QiyK2W' as the temporary command file
Nov 18 09:19:32 Processing event <PING_TEST_COMPLETED>
Nov 18 09:19:32 Policy <back-to-normal-maininet>'s conditions dont match for event <PING_TEST_COMPLETED>
Nov 18 09:19:32 event_filter_tree_find: Event id PING_TEST_COMPLETED not found
Nov 18 09:19:37 Processing event <PING_TEST_COMPLETED>
Nov 18 09:19:37 Policy <back-to-normal-maininet>'s conditions dont match for event <PING_TEST_COMPLETED>
...