Automation


Auto resetting an ip-monitoring then/action statement

  • 1.  Auto resetting an ip-monitoring then/action statement

    Posted 04-22-2018 10:40

    Hello everyone. I am new to RPM and ip-monitoring and wanted to know whether there is any automated replacement for running "request services ip-monitoring preempt-restore policy <Policy-Name>" manually.

    Basically, I have an interface that ip-monitoring disables based on a probe, but I need it re-enabled every hour or so for testing purposes, to check whether the primary path is available again.

     

    Thanks. 

     



  • 2.  RE: Auto resetting an ip-monitoring then/action statement
    Best Answer

    Posted 04-22-2018 23:41

    Hello,

    Of course there is a way. You'd need an event-policy like below:

    set event-options generate-event 1hr time-interval 3600
    set event-options policy reseting-ip-monitoring events 1hr
    set event-options policy reseting-ip-monitoring then execute-commands commands "request services ip-monitoring preempt-restore policy <write the policy name here>"
    set event-options policy reseting-ip-monitoring then execute-commands user-name <write Your chosen username here>
    

    HTH

    Thx

    Alex



  • 3.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-23-2018 03:04

    That will run the command every hour regardless of whether or not the IP monitoring event was triggered.

     

    Is there a test that can be run first to know the trigger event took place and only run the restore when that is true?

     



  • 4.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-23-2018 03:12

    Hello,


    @spuluka wrote:

    That will run the command every hour regardless of whether or not the IP monitoring event was triggered.

     

     

     


    Correct, and that's what OP asked for. No conditions attached.

     


    @spuluka wrote:

     

    Is there a test that can be run first to know the trigger event took place and only run the restore when that is true?

     


    No need for that.

    https://www.juniper.net/documentation/en_US/junos/topics/reference/command-summary/request-services-ip-monitoring-preempt-restore-policy.html

    Note: The request services ip-monitoring preempt-restore policy command takes effect only when the RPM probe is in the pass state, and when the policy is in a failover state.
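    A minimal sketch of why the unconditional hourly run is harmless, written as a toy Python model of the documented note rather than anything Junos itself exposes. The function name and its two boolean parameters are hypothetical and chosen only for illustration.

```python
def preempt_restore_takes_effect(probe_passing: bool, policy_failed_over: bool) -> bool:
    """Toy model of the documented note: both conditions must hold."""
    return probe_passing and policy_failed_over


# Enumerate the four situations an hourly event-policy run could meet.
for probe_passing in (True, False):
    for policy_failed_over in (True, False):
        if preempt_restore_takes_effect(probe_passing, policy_failed_over):
            outcome = "restore happens"
        else:
            outcome = "nothing is restored"
        print(f"probe_passing={probe_passing!s:<5} "
              f"policy_failed_over={policy_failed_over!s:<5} -> {outcome}")
```

    Only the combination "probe passing and policy failed over" leads to a restore, which is the one case where a restore is wanted.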

     

    HTH

    Thx
    Alex



  • 5.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-23-2018 05:03

    aarsenlev,
    Thank you, this should do it for me. I will give it a try and see how it goes.

     

    Thanks

    Ali



  • 6.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-23-2018 06:31

    aarsenlev,
    I implemented what you suggested and that part seems to be working. However, I believe this command won't work for me, as I get the following message even when I manually run the command "request services ip-monitoring preempt-restore policy Policy-for-RPM-SW10":

    Restore request failed: Policy Policy-for-RPM-SW10's RPM Probe is currently in failure state

    So it seems I have to find a way to clear the probe/test results as well, which I can't find anywhere. Someone suggested deactivating and reactivating the RPM test, but I don't think that's a valid solution. Do you know of any way to clear the probe/test results?

    Also, how do you verify the timers for events and policies within event-options?

    (Sorry for the long post, but while we're at it, can you please see below.)

    Also, this is my rpm and ip-monitoring config:

    rpm {
        probe Maintenance {
            test SouthRing-SW10 {
                target address 20.20.20.4;
                probe-count 2;
                probe-interval 2;
                test-interval 15;
                routing-instance Maint;
                thresholds {
                    successive-loss 2;
                    total-loss 7;
                }
                destination-interface ge-0/0/3.1020;
            }
        }
    }
    ip-monitoring {
        policy Policy-for-RPM-SW10 {
            no-preempt;
            match {
                rpm-probe Maintenance;
            }
            then {
                interface ge-0/0/3.1020 {
                    disable;
                }
            }
        }
    }
    

    I would like the then statement to be something like the below, but this is not an option on the MXs, SRXs, and QFXs that I have here.

     

    ip-monitoring {
        policy Policy-for-RPM-SW10 {
            no-preempt;
            match {
                rpm-probe Maintenance;
            }
            then {
                route 20.20.20.0/29 {
                    next-hop discard; # to basically null the route
                }
            }
        }
    }
    

    Any suggestions?

     



  • 7.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-23-2018 07:57

    Hello,

    Your RPM probe needs to be in Pass state for this command to work.

    https://www.juniper.net/documentation/en_US/junos/topics/reference/command-summary/request-services-ip-monitoring-preempt-restore-policy.html 

    Note: The request services ip-monitoring preempt-restore policy command takes effect only when the RPM probe is in the pass state, and when the policy is in a failover state.

     

    Did You check that Your RPM probe has recovered before applying this command?

    HTH

    Thx

    Alex

     



  • 8.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-24-2018 07:34

    Yeah, I noticed that's the issue. I think there's a problem with the way I am implementing this.

    Is there any way to clear the probe/test results every hour or so? I can't find the command for it.



  • 9.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-24-2018 11:31

    Hello,


    @ali.taheri wrote:

    Yeah, I noticed that's the issue. I think there's a problem with the way I am implementing this.

    Is there any way to clear the probe/test results every hour or so? I can't find the command for it.


    I think Your problem statement is different to what You actually want. What You want is to re-enable the interface after it has been disabled for ~1hour or so by Your ip-monitoring policy. Is that right?

    If yes then You would need:

    1/ RPM probe

    2/ event-policy #1 to act on RPM probe fail by disabling an interface

    https://www.juniper.net/documentation/en_US/junos/topics/example/junos-script-automation-event-policy-change-configuration.html#jd0e109

    Note that You don't actually need to commit anything to disable an interface - there is an op command "request interface interface <name> down|up" which is the JUNOS CLI equivalent of UNIX "ifconfig <name> down|up". So, event-policy #1 does "request interface interface <name> down".

    3/ a "ticking" event - an event that is periodically raised

    4/ event-policy #2 to act after ~1hr has passed (by virtue of the "ticking" event, see above) since event-policy #1 got triggered. This event-policy #2 does "request interface interface <name> up".

    If that's what You actually want I could write the JUNOS CLI for You then.

    HTH

    Thx

    Alex

     



  • 10.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-24-2018 18:48

    aarsenlev,
    Wow, that is in fact what I want. What I need exactly is this: once the probe fails and the policy disables the port, the interface should be re-enabled every hour so the router can check reachability to a specific destination (via the probe). If the router sees echo replies, keep the interface up; if pings continue to fail, shut down the port again. I had never heard of #3, the ticking event, and I will definitely read some material on that. I understand if you have no time to provide the commands, but if you can, I would really appreciate it.

    Up to this point I have been trying these on SRXs and vSRXs, but in my production network I have some cases where this needs to be deployed on MX480 or MX960. I believe ip-monitoring is not supported on the MX line; is there any way to get the same behavior on MX960/MX480? I see RPM is supported, but what do I deploy to make use of these probes on MX devices?

     

    Thank you so much again. I am learning a lot by simply taking the hints from your suggestions and reading more about them, keep them coming I guess 🙂 



  • 11.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-25-2018 06:06

    Hello,

    First things first - I could not get Your RPM test to work in my lab:

     

    rpm {
        probe Maintenance {
            test SouthRing-SW10 {
                target address 20.20.20.4;
                probe-count 2;
                probe-interval 2;
                test-interval 15;
                routing-instance Maint;
                thresholds {
                    successive-loss 2;
                    total-loss 7;
                }
                destination-interface ge-0/0/3.1020;
            }
        }
    }

    I believe that "successive-loss" and "total-loss" are evaluated as logical AND. Hence, if You have 2 probes in each run, total-loss can never reach 7 before the run finishes. I deactivated "total-loss 7" and the RPM test started to work in my lab.
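    The reasoning can be checked with a few lines of Python. This is only a sketch of the hypothesis stated in this post (all configured thresholds must be crossed, and losses are counted within one test run of probe-count probes); it is not confirmed Junos behaviour, and the function name is made up for illustration.

```python
from typing import Optional


def can_fail_within_one_test(probe_count: int,
                             successive_loss: Optional[int],
                             total_loss: Optional[int]) -> bool:
    """True if every configured loss threshold is reachable in one test run.

    Assumption (from the post, unverified): thresholds are combined with
    logical AND and losses are counted per run of `probe_count` probes.
    """
    thresholds = [t for t in (successive_loss, total_loss) if t is not None]
    # Even if every probe is lost, one run cannot lose more than probe_count.
    # With no explicit thresholds this toy model treats the check as reachable.
    return all(t <= probe_count for t in thresholds)


# Original config: probe-count 2, successive-loss 2, total-loss 7.
print(can_fail_within_one_test(2, 2, 7))      # False
# Same config with "total-loss 7" deactivated.
print(can_fail_within_one_test(2, 2, None))   # True
```

    Under that assumption the original thresholds can never all be met, which matches the test only starting to fail once total-loss was deactivated.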

     

    Secondly, here is the JUNOS CLI code for You to DOWN the interface on RPM probe down and UP it after ~1 hour each hour.

    Actually, the first UP can happen as late as 2 hours or as soon as immediately after DOWN but this is a compromise between functionality and simplicity.

    Thirdly, You don't need ip-monitoring at all, only RPM config and the below code:

     

     

    set event-options policy rpmdown-ifddown events PING_TEST_FAILED
    set event-options policy rpmdown-ifddown attributes-match PING_TEST_FAILED.test-owner matches Maintenance
    set event-options policy rpmdown-ifddown attributes-match PING_TEST_FAILED.test-name matches "SouthRing\-SW10"
    set event-options policy rpmdown-ifddown then execute-commands commands "request interface interface ge-0/0/3.1020 down"
    set event-options generate-event 1hr time-interval 3600
    set event-options policy ifdup events 1hr
    set event-options policy ifdup then execute-commands commands "request interface interface ge-0/0/3.1020 up"

     

     

    Brief explanation how the code works:

    1/ policy rpmdown-ifddown is triggered on RPM test failure. It happens every 24 secs or so with Your RPM config

     

    Apr 25 03:13:27  ROUTER-A rmopd[6442]: %DAEMON-6-PING_TEST_FAILED: pingCtlOwnerIndex = Maintenance, pingCtlTestName = SouthRing-SW10
    Apr 25 03:13:51  ROUTER-A rmopd[6442]: %DAEMON-6-PING_TEST_FAILED: pingCtlOwnerIndex = Maintenance, pingCtlTestName = SouthRing-SW10
    Apr 25 03:14:15  ROUTER-A rmopd[6442]: %DAEMON-6-PING_TEST_FAILED: pingCtlOwnerIndex = Maintenance, pingCtlTestName = SouthRing-SW10

    2/ the "rpmdown-ifddown" policy keeps issuing the equivalent of the UNIX command "ifconfig ge-0/0/3.1020 down" every time it sees the above event (PING_TEST_FAILED), every 24 seconds on average

    3/ these repeated commands do not do any harm if interface is already down

    4/ periodic "ticking" event 1hr is raised every hour starting from the time the "generate-event" line was committed. 

    5/ as You can imagine, event 1hr can occur immediately after the "rpmdown-ifddown" policy acts and could bring the interface up the first time sooner than 1 hour, or as late as 1h59m59s after the "rpmdown-ifddown" policy acts. This is a compromise between simplicity and functionality.

    6/ once interface is brought up, policy "ifdup" keeps issuing JUNOS equivalent of UNIX "ifconfig ge-0/0/3.1020 up" command every 1 hour. These commands are not doing any harm if the interface is already up.

    7/ If the ge-0/0/3.1020 was brought down between successive runs of "ifdup" policy, it will be upped, and the cycle repeats from step 1 above.
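    The interplay of the two policies in the steps above can be sketched as a small Python timeline. It is an idealized model, not a reproduction of Junos: it assumes the 1hr event fires at exact multiples of 3600 s counted from the commit, that PING_TEST_FAILED repeats every 24 s while the target stays unreachable, and it ignores the timing drift of real event policies. The function name and parameters are hypothetical.

```python
def simulate_free_running_ticker(first_failure: int,
                                 tick_interval: int = 3600,
                                 failure_period: int = 24,
                                 horizon: int = 7300):
    """Return (time_in_seconds, new_state) for each interface state change."""
    events = []
    # The "ticking" event: raised every tick_interval seconds after commit.
    t = tick_interval
    while t <= horizon:
        events.append((t, "tick"))
        t += tick_interval
    # PING_TEST_FAILED: raised periodically while the target is unreachable.
    t = first_failure
    while t <= horizon:
        events.append((t, "fail"))
        t += failure_period
    events.sort()

    state = "up"
    transitions = []
    for when, kind in events:
        # Policy ifdup brings the interface up, rpmdown-ifddown takes it down.
        new_state = "up" if kind == "tick" else "down"
        if new_state != state:  # repeated commands on the same state are no-ops
            state = new_state
            transitions.append((when, state))
    return transitions


# Failure shortly after commit: the first "up" comes 3500 s after the "down".
print(simulate_free_running_ticker(first_failure=100))
# Failure just before a tick: the first "up" follows only 10 s later.
print(simulate_free_running_ticker(first_failure=3590))
```

    In both runs the interface goes down again at the next failure event after each tick, which is the repeating cycle described in step 7.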

    Hope this makes sense

    HTH

    Thx
    Alex

     

     

     

     

     

     



  • 12.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-25-2018 17:24

    Alex,
    Thank you so much again for guiding me through this process. What you advised seems to work just fine. However, I was not able to get the command (request interface ge-0/0/3.1020 down) to execute on any of the following devices/Junos versions:

    vSRX (12.1X47-D15.4)

    SRX320 (junos-srxsme-17.3R1.10)

    vMX (JUNOS 14.1R1.10)

    MX480 (JUNOS 15.1F2.8 and JUNOS 14.2R7.5)

    MX960 (JUNOS 10.4R8.5 and JUNOS 11.4R7.5)

     

    Therefore I had to make a minor modification to your config. I hope you don't mind.

     

    This is my RPM config:

    probe TM-Maintenance {
        test TM-SouthRing-SW10 {
            probe-type icmp-ping;
            target address 20.20.20.4;
            probe-count 10;
            probe-interval 4;
            test-interval 2;
            routing-instance Maint;
            thresholds {
                successive-loss 15;
            }
            destination-interface ge-0/0/3.1020;
        }
    }

    This is my event-options config:

    generate-event {
        1hr time-interval 240;
    }
    policy rpmdown-ifddown {
        events PING_TEST_FAILED;
        attributes-match {
            PING_TEST_FAILED.test-owner matches TM-Maintenance;
            PING_TEST_FAILED.test-name matches TM-SouthRing-SW10;
        }
        then {
            change-configuration {
                commands {
                    "set interfaces ge-0/0/3.1020 disable";
                }
                user-name ataheri;
                commit-options {
                    log "Disabling ge-0/0/3.1020 to SouthRing SW10";
                }
            }
        }
    }
    policy ifdup {
        events 1hr;
        then {
            change-configuration {
                commands {
                    "delete interfaces ge-0/0/3.1020 disable";
                }
                user-name ataheri;
                commit-options {
                    log "Enabling ge-0/0/3.1020 to SouthRing SW10";
                }
            }
        }
    }

    **Please note I am only using 240 seconds instead of 3600 for testing purposes (so I don't have to wait an hour to see the results).

    Up to this point, everything seems to be stable and working according to the flow, but I am still seeing that interface go up and down when I don't expect it to. I guess I don't understand the tuning of the probes well. This is what I think the probe should do with my current config:

    1- Send out 1 probe every 4 seconds, 10 probes in total, which should take about 40 seconds.

    2- After 10 probes, wait 2 seconds and start from step 1 again.

    3- For this test to be "failed", 15 consecutive probes have to fail, which with my config should take about 60 seconds.

    4- Once 15 probes fail, disable the port, and bring the port back up after 240 seconds. (I understand this might be as short as 1 second or as long as 479 seconds, depending on when the timer on the probe actually kicks in.)

    5- Once the port is back up, start the icmp-ping probe again, basically back to step 1.

    Did I understand this wrong? If so, can you please provide some examples revolving around these timers?

     

    I also have two follow-up questions:

    1- Is there a way to have both the probes and the periodic "ticking" event agree/coordinate on one timer, so we don't see the variation you explained, which also shows up in my step 4?
    2- I am not able to see the logs in my "show log messages" output. What am I missing, and where should I be looking for the log messages coming from my event-options policy when it commits?

    As always, thank you for your help, and I hope you don't mind me taking so much of your time.



  • 13.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-25-2018 22:36

    Hello,

    To answer Your questions:

    - "request interface interface <name> up|down" is a hidden op command, You need to type it in full. The XML RPC equivalent is <request-interface-operation>. From source code, I see it has been introduced in JUNOS 12.3, so it looks like the majority of Your kit does not support it.

    user@ROUTER-A> request interface interface ge-0/0/0.100 up | display xml rpc 
    <rpc-reply xmlns:junos="http://xml.juniper.net/junos/17.3R2/junos">
        <rpc>
            <request-interface-operation>
                    <undocumented><interface>ge-0/0/0.100</interface></undocumented>
                    <undocumented><operation>up</operation></undocumented>
            </request-interface-operation>
        </rpc>
        <cli>
            <banner></banner>
        </cli>
    </rpc-reply>

     

    - it is possible to reduce the time variance between policy rpmdown-ifddown and policy ifdup runs by introducing yet another, 3rd event policy that does only ticking, so that policy ifdup is only triggered when this 3rd policy has accumulated enough ticks. But generally speaking, there is no time coordination between policies, and the policy execution times intentionally drift/are smeared, so if You want tightly coordinated execution, then an event-script is Your best bet. And that has to be a SLAX script due to the devices that run old JUNOS.

    HTH

    Thx

    Alex

     



  • 14.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-26-2018 05:00

    Alex,
    Thank you so much again, you have been very helpful with each of your responses. 

     

     

    Cheers 

    Ali



  • 15.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-26-2018 13:03

    Hello,

    Actually, on second thought, if You (re)start the 1hr event each time ge-0/0/3.1020 is disabled, it will be raised exactly 1 hour after the interface goes down.

    Please see below Your modified JUNOS CLI code (my additions are the "activate" and "deactivate" commands)

    generate-event {
        1hr time-interval 240;
    }
    policy rpmdown-ifddown {
        events PING_TEST_FAILED;
        attributes-match {
            PING_TEST_FAILED.test-owner matches TM-Maintenance;
            PING_TEST_FAILED.test-name matches TM-SouthRing-SW10;
        }
        then {
            change-configuration {
                commands {
                    "set interfaces ge-0/0/3.1020 disable";
                    "activate event-options generate-event 1hr";
                    "deactivate event-options policy rpmdown-ifddown";
                }
                user-name ataheri;
                commit-options {
                    log "Disabling ge-0/0/3.1020 to SouthRing SW10";
                }
            }
        }
    }
    policy ifdup {
        events 1hr;
        then {
            change-configuration {
                commands {
                    "delete interfaces ge-0/0/3.1020 disable";
                    "deactivate event-options generate-event 1hr";
                    "activate event-options policy rpmdown-ifddown";
                }
                user-name ataheri;
                commit-options {
                    log "Enabling ge-0/0/3.1020 to SouthRing SW10";
                }
            }
        }
    }

     What should happen is:

    1/ once the RPM test fails, policy rpmdown-ifddown will disable the interface, activate the ticking event 1hr and deactivate itself to avoid periodic commits while the RPM test continues to fail.

    2/ after exactly 1 hour, policy ifdup will re-enable the interface, deactivate the ticking event and reactivate policy rpmdown-ifddown

    3/ if the RPM test passes, then no more periodic commits are done, and policy rpmdown-ifddown waits for the RPM test to fail

    4/ if the RPM test fails, then the cycle starts from step 1 above
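    The cycle described in these steps can be sketched as a tiny state machine in Python. It is a toy model of the intended behaviour, not of Junos internals: the class and method names are hypothetical, and it assumes the device starts in the steady state where the ticking event is inactive and policy rpmdown-ifddown is active.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    interface_enabled: bool = True
    ticker_active: bool = False       # models "generate-event 1hr"
    down_policy_active: bool = True   # models "policy rpmdown-ifddown"
    commits: list = field(default_factory=list)

    def on_ping_test_failed(self) -> None:
        """Models policy rpmdown-ifddown reacting to PING_TEST_FAILED."""
        if not self.down_policy_active:
            return  # policy is deactivated: no further commits while failing
        self.interface_enabled = False
        self.ticker_active = True        # start the 1hr countdown
        self.down_policy_active = False  # the policy deactivates itself
        self.commits.append("disable")

    def on_tick(self) -> None:
        """Models policy ifdup reacting to the 1hr ticking event."""
        if not self.ticker_active:
            return  # ticking event is deactivated: nothing is raised
        self.interface_enabled = True
        self.ticker_active = False       # stop ticking until the next failure
        self.down_policy_active = True   # re-arm the failure policy
        self.commits.append("enable")


dev = Device()
for _ in range(3):          # RPM test keeps failing: only one commit results
    dev.on_ping_test_failed()
dev.on_tick()               # countdown expires: interface is re-enabled
dev.on_tick()               # ticker already stopped: no effect
dev.on_ping_test_failed()   # target still unreachable: cycle restarts
print(dev.commits)          # ['disable', 'enable', 'disable']
```

    Compared with the free-running ticker, each failure episode produces one commit to disable and one to enable instead of a commit on every failed test.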

    HTH

    Thx

    Alex



  • 16.  RE: Auto resetting an ip-monitoring then/action statement

    Posted 04-28-2018 06:43

    Alex,
    This makes my setup a lot neater than it was. Thanks for the help 🙂

     

    Cheers,

    Ali