Scripting How-To: Limit the failover of the control plane

By Erdem posted 08-10-2015 12:25


Overview

This script implements all of the interface monitoring features that are included on the SRX, plus the ability to limit the failover of the control plane. It applies to SLAX version 1.0 and higher.

Description

Interface monitoring is an included feature on the platform, but it may not meet all customer requirements. In some scenarios, rapidly failing over the control plane can cause instability. This implementation of interface monitoring accounts for this possibility and limits the user to monitoring only the data plane.

Source Code

GitHub Links

The source code below is also available from GitHub at the following locations:

Example Configuration

Achieving High Availability with Interface Monitoring

Interface monitoring is a feature that is built into JUNOS on the SRX. The built-in configuration can be used for both redundancy group zero and redundancy group one. However, rapid failover of redundancy group zero can cause instability. This JUNOScript was created for those who want redundancy group zero to fail over in the event of an interface failure and are unable to prevent rapid failures. The script operates just like the built-in feature, but it prevents redundancy group zero from being failed over rapidly.

To configure interface monitoring under any redundancy group, an administrator creates an "apply-macro monitor-interfaces" stanza and optionally specifies a weight for each interface, as shown below in figure three. If no weight is specified, it is assumed to be 255, which triggers a failover if the interface fails. If the weight is not great enough to trigger a failover, the intermediate weight is recorded in the configuration under the "apply-macro failover-interface-monitor" stanza in the chassis cluster section. When a failover occurs, a syslog message is generated with type external and level critical.

[Interface Monitoring configuration]

reth-count 4;
control-ports {
    fpc 2 port 0;
    fpc 14 port 0;
}
apply-macro failover-interface-monitor {
    0 128;
}
redundancy-group 0 {
    apply-macro monitor-interfaces {
        xe-17/1/0;
        xe-5/1/0 128;
        xe-5/1/2 128;
    }
    node 0 priority 254;
    node 1 priority 1;
}
redundancy-group 1 {
    apply-macro track-host {
        interval 0.5;
        routing-instance BPSTest1;
        server 1.0.11.11;
        weight 255;
    }
    apply-macro monitor-interfaces {
        xe-17/1/0;
        xe-5/1/0;
        xe-5/2/0;
    }
    node 0 priority 254;
    node 1 priority 1;
}

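The weight handling described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the script itself: the failover threshold of 255, the function name apply_interface_failure, and the dictionary layout are assumptions made for the example.

```python
# Illustrative model of the weight accounting; not part of the SLAX script.
# Assumption: a redundancy group fails over once its accumulated weight
# reaches 255, mirroring the default weight of an interface with no weight set.
FAILOVER_THRESHOLD = 255
DEFAULT_WEIGHT = 255


def apply_interface_failure(monitored, stored_weight, interface):
    """Return (new_stored_weight, should_failover) after `interface` fails.

    monitored     -- dict of interface name -> weight, or None if no weight set
    stored_weight -- intermediate weight already recorded for this group
    """
    if interface not in monitored:
        # The interface is not monitored by this redundancy group.
        return stored_weight, False
    weight = monitored[interface]
    if weight is None:
        weight = DEFAULT_WEIGHT
    total = stored_weight + weight
    return total, total >= FAILOVER_THRESHOLD


# Values taken from redundancy-group 0 in the configuration above.
rg0 = {"xe-17/1/0": None, "xe-5/1/0": 128, "xe-5/1/2": 128}

print(apply_interface_failure(rg0, 0, "xe-5/1/0"))    # first weighted link fails
print(apply_interface_failure(rg0, 128, "xe-5/1/2"))  # second weighted link fails
print(apply_interface_failure(rg0, 0, "xe-17/1/0"))   # unweighted link fails
```

Under these assumptions, a single 128-weight failure only records an intermediate weight of 128 (compare the "0 128" entry under failover-interface-monitor), while a second such failure or the failure of an unweighted interface reaches the threshold.
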
For interface monitoring to take effect, the event-options stanza must be configured. This allows the "SNMP_TRAP_LINK_DOWN" event to be intercepted and lets the monitor-interface script act on it. The configuration in figure four below activates the script for interface-down messages.

[Interface Monitoring configuration]

event-options {
    policy INTERFACE_MONITOR {
        events SNMP_TRAP_LINK_DOWN;
        then {
            event-script monitor-interface.xsl {
                arguments {
                    interface "{$$.interface-name}";
                }
            }
        }
    }
    event-script {
        file monitor-interface.xsl;
    }
}

Validating the Interface Monitoring Configuration

The configuration options that begin with "apply-macro" are all user-created options, so JUNOS does not validate them by default. Custom validation requires a commit script. The commit script "srx-ha-validate.xsl" was created to validate the monitor-interfaces configuration. The script must be placed in the /var/db/scripts/commit directory and then added to the JUNOS configuration with the command "set system scripts commit file srx-ha-validate.xsl". When this configuration is committed, the monitor-interfaces options are validated.

This catches misspelled items such as "interface" and numbers that are outside the range the script supports. If an option is not configured correctly, a warning is emitted to notify the administrator that something is wrong. The script does not block the misconfiguration; it only creates a warning message. This was done to ensure interoperability with the Network and Security Manager (NSM) platform. If a warning message is received, review the message and correct the configuration mistake.

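As a rough illustration of the kind of check such a commit script performs, the Python sketch below validates a set of monitor-interfaces entries and returns warnings instead of raising errors. The accepted weight range of 1 to 255, the interface-name pattern, and the function name are assumptions for the example; the actual rules are defined in srx-ha-validate.xsl.

```python
import re

# Hypothetical rules for illustration only; srx-ha-validate.xsl defines the real ones.
INTERFACE_PATTERN = re.compile(r"^[a-z]+-\d+/\d+/\d+$")
MIN_WEIGHT, MAX_WEIGHT = 1, 255


def validate_monitor_interfaces(entries):
    """Return a list of warning strings for a dict of interface -> weight text.

    A weight of None means no weight was configured, which is allowed.
    """
    warnings = []
    for name, weight in entries.items():
        if not INTERFACE_PATTERN.match(name):
            warnings.append(f"'{name}' does not look like an interface name")
        if weight is None:
            continue
        if not weight.isdecimal() or not MIN_WEIGHT <= int(weight) <= MAX_WEIGHT:
            warnings.append(
                f"weight '{weight}' for '{name}' is outside {MIN_WEIGHT}-{MAX_WEIGHT}"
            )
    return warnings


# Warnings are reported but nothing is rejected, mirroring the behavior described above.
for message in validate_monitor_interfaces(
    {"xe-5/1/0": "128", "xe-5/1/2": "300", "xe5/1/x": None}
):
    print("warning:", message)
```

Returning a list of warnings rather than raising an exception reflects the design choice described above: the administrator is notified, but the commit is not blocked.
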
Global SRX High Availability Configuration Options

Under the chassis cluster configuration, the macro "monitoring-options" can be applied with the value "clear-failover". If this is configured, the manual failover flag is cleared after a failover of the redundancy group occurs. This setting is optional; if it is not configured, the user must clear the manual failover flag. The second option that can be configured under "monitoring-options" is "full-failover". The full-failover option triggers a failover of all redundancy groups, no matter which redundancy group failed its track IP checking. This option ensures that failed redundancy groups follow each other.

The chassis cluster architecture is designed to let the redundancy groups that pass data (redundancy group 1 and greater) fail over between the cluster members as fast as possible to support changing network conditions. The control plane, redundancy group 0, has some unique limitations that do not allow this. On boot, redundancy group 0 determines which chassis should be primary and stays on that chassis member until a failure occurs. The two routing engines, one per chassis, synchronize using the graceful routing engine switchover (GRES) mechanism. The GRES design allows mastership to switch between REs only once every five minutes.

This is why RG0 is not meant to switch over rapidly between chassis, but only in the event of a catastrophic failure. To prevent GRES synchronization issues, a timer has been implemented that stamps the last failover time for RG0. The timestamp is shown in configuration example two. It is in Unix time (seconds since January 1st, 1970) and is set upon the first failover of RG0.

[Chassis Cluster Configuration Options]

chassis {
    cluster {
        apply-macro monitoring-options {
            clear-failover;
            full-failover;
        }
        apply-macro failover-monitoring {
            last-failover 1228859971;
        }
    }
}

While the chassis cluster technology is very robust, it is not always aligned with the operational procedures of organizations. The track IP JUNOScript accommodates these requests by also implementing the "follow the leader" feature. This feature, which is enabled by default, forces RG0 to go where RG1 is located. The two groups can end up on different nodes if RG0 was unable to fail over because less than five minutes had passed since its last failover. Once the difference between the last failover and the current time is more than 300 seconds (five minutes), RG0 automatically fails over to the node where RG1 is located.
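
The hold-down rule can be expressed as a small check. The Python sketch below only models the decision described above (300 seconds between RG0 failovers, then follow RG1); the function names and return strings are assumptions for illustration and do not appear in the script.

```python
from datetime import datetime, timezone

RG0_HOLD_DOWN_SECONDS = 300  # five minutes, as described above


def rg0_may_fail_over(current_time, last_failover):
    """True when more than 300 seconds have passed since RG0 last failed over."""
    return current_time > last_failover + RG0_HOLD_DOWN_SECONDS


def follow_the_leader(rg0_node, rg1_node, current_time, last_failover):
    """Decide what to do with RG0 so that it ends up on the same node as RG1."""
    if rg0_node == rg1_node:
        return "nothing to do"
    if rg0_may_fail_over(current_time, last_failover):
        return f"fail RG0 over to {rg1_node}"
    return "wait"


last_failover = 1228859971  # value from the configuration example
# Show the Unix timestamp as a readable UTC date.
print(datetime.fromtimestamp(last_failover, tz=timezone.utc).isoformat())
# Only 120 seconds have passed, so RG0 stays where it is for now.
print(follow_the_leader("node0", "node1", last_failover + 120, last_failover))
# More than 300 seconds have passed, so RG0 may follow RG1.
print(follow_the_leader("node0", "node1", last_failover + 301, last_failover))
```

The strict "greater than" comparison mirrors the condition used in the SLAX listing, where the current time must exceed the stored timestamp plus 300.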

SLAX Script Contents

/* Machine Crafted with Care (tm) by slaxWriter */
version 1.0;


/*
Copyright Juniper Networks 2008
All rights reserved and owned by Juniper Networks

Interface monitoring is an included feature on the platform but it may not meet all
customer requirements. The script implements all of the features that are included
on the SRX plus the ability to limit the failover of the control plane. In some
scenarios if the control plane is rapidly failed over it can cause instability.
This implementation of interface monitoring accounts for this possibility and limits
the user to monitoring only the data plane.

Author: Rob Cameron (robc@juniper.net)

 */
ns junos = "http://xml.juniper.net/junos/*/junos";
ns xnm = "http://xml.juniper.net/xnm/1.1/xnm";
ns ext = "http://xmlsoft.org/XSLT/namespace";
ns jcs = "http://xml.juniper.net/junos/commit-scripts/1.0";

import "../import/junos.xsl";
/* Script Arguments */
var $arguments = <argument> {
    <name> "interface";
    <description> "specify the name of the interface that failed";
}
param $interface;
/* srx-ha-lib.xsl file start */
/* use a global connection for all rpc connections */
var $connection = jcs:open();
/* Pull the chassis cluster status and use throughout the script */
var $get-cluster-status = <rpc> {
    <command> "show chassis cluster status";
}
var $cluster-status-results = jcs:execute($connection, $get-cluster-status);
var $chassis-cluster-rg-rpc = <rpc> {
    <get-configuration> {
        <configuration> {
            <chassis> {
                <cluster>;
            }
        }
    }
}
/* Pull the redundancy group information out of the configuration; global because it is used throughout */
var $chassis-cluster-config = jcs:execute($connection, $chassis-cluster-rg-rpc);
/* use this as a global to determine the interface ownership by chassis model */
var $product-model = {
    call get-product-model();
}
var $get-interface-ownership = {
    if ($product-model == "srx5600") {
        var $node0-max-interface = 5;
        var $node0-min-interface = 0;
        var $node1-max-interface = 11;
        var $node1-min-interface = 6;
        <max-interface-number> {
            <node0-max> $node0-max-interface;
            <node0-min> $node0-min-interface;
            <node1-max> $node1-max-interface;
            <node1-min> $node1-min-interface;
        }

    } else if ($product-model == "srx5800") {
        var $node0-max-interface = 11;
        var $node0-min-interface = 0;
        var $node1-max-interface = 23;
        var $node1-min-interface = 12;
        <max-interface-number> {
            <node0-max> $node0-max-interface;
            <node0-min> $node0-min-interface;
            <node1-max> $node1-max-interface;
            <node1-min> $node1-min-interface;
        }
    }
}
var $chassis-interface-ownership = ext:node-set($get-interface-ownership);
/* end global section */
/* start template section */
/* Determine the weight currently recorded for the monitored interfaces of a redundancy group */
template get-monitor-interface-current-weight ($redundancy-group) {
    var $results-get-monitor-interface-weight = jcs:execute($connection, $chassis-cluster-rg-rpc);
    var $interface-monitor-weight = $results-get-monitor-interface-weight/chassis/cluster/apply-macro[name == "failover-interface-monitor"]/data[name == $redundancy-group]/value;

    if (boolean($interface-monitor-weight)) {
        <text> $interface-monitor-weight;

    } else {
        <text> "0";
    }
}

/* Determine current time */
template get-current-time () {
    var $rpc-get-current-time = <rpc> {
        <get-system-uptime-information>;
    }
    var $results-get-current-time = jcs:execute($connection, $rpc-get-current-time);

    if ($results-get-current-time/multi-routing-engine-item) {
        var $current-time = $results-get-current-time/multi-routing-engine-item[1]/system-uptime-information/current-time/date-time/@junos:seconds;
        <text> $current-time;

    } else {
        var $current-time = $results-get-current-time/current-time/date-time/@junos:seconds;
        <text> $current-time;
    }
}

/* Check if RG0 is ready to failover */
template check-RG0-failback () {
    var $rg0-last-failover = {
        call get-rg0-last-failover();
    }
    var $current-time = {
        call get-current-time();
    }
    var $local-node = {
        call get-local-node();
    }
    var $rg0-master = {
        call get-master() {
            with $redundancy-group = {
                expr "0";
            }
        }
    }
    var $rg1-master = {
        call get-master() {
            with $redundancy-group = {
                expr "1";
            }
        }
    }
    var $rg0-primary-node = {
        /* Check who is the primary and whether the cluster has not been failed over already */
        if ($cluster-status-results/redundancy-group[1]/device-stats/redundancy-group-status[1] == "secondary" && $cluster-status-results/redundancy-group[1]/device-stats/redundancy-group-status[2] == "primary" && $cluster-status-results/redundancy-group[1]/device-stats/failover-mode[2] == "no") {
            /* failover RG to node0 */
            <text> "0";

        } else if ($cluster-status-results/redundancy-group[1]/device-stats/redundancy-group-status[1] == "primary" && $cluster-status-results/redundancy-group[1]/device-stats/redundancy-group-status[2] == "secondary" && $cluster-status-results/redundancy-group[1]/device-stats/failover-mode[1] == "no") {
            /* failover RG to node1 */
            <text> "1";
        }
    }

    if ($local-node == $rg0-master) {
        if ($rg0-master != $rg1-master) {
            if (($current-time) > ($rg0-last-failover + 300)) {
                call request-rg-failover($node = $rg0-primary-node) {
                    with $redundancy-group = {
                        expr "0";
                    }
                }
            }
        }
    }
}


/* Check to see if the manual failover flag needs to be reset; if it does, reset it for all of the RGs */
template check-and-reset-manual-failover-flag () {
    if (boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"])) {
        var $rpc-check-manual-failover-flag = <rpc> {
            <command> "show chassis cluster status";
        }
        var $result-check-manual-failover-flag = jcs:execute($connection, $rpc-check-manual-failover-flag);
        var $result-check-manual-failover-flag-node-set = ext:node-set($result-check-manual-failover-flag);

        for-each ($result-check-manual-failover-flag//redundancy-group) {
            if (./device-stats/failover-mode[1] == "yes") {
                call reset-failover-flag($redundancy-group = ./redundancy-group-id[1]);
            }
        }
    }
}

/* Read the last RG0 failover time from the configuration */
template get-rg0-last-failover () {
    var $results-get-rg0-last-failover = jcs:execute($connection, $chassis-cluster-rg-rpc);
    var $last-rg0-failover-time = $results-get-rg0-last-failover/chassis/cluster/apply-macro[name == "failover-monitoring"]/data[name == "last-failover"]/value;

    if (boolean($last-rg0-failover-time)) {
        <text> $last-rg0-failover-time;

    } else {
        <text> "0";
    }
}

/* Write the current time into the configuration as the last RG0 failover time */
template set-rg0-last-failover () {
    var $current-time = {
        call get-current-time();
    }
    /* <xsl:value-of select="jcs:output(concat('Setting last failover time for RG0 to ', $current-time))"/> */
    var $rpc-configure-private = <rpc> {
        <open-configuration> {
            <private>;
        }
    }

    expr jcs:execute($connection, $rpc-configure-private);
    var $rpc-set-rg0-last-failover = <rpc> {
        <load-configuration> {
            <configuration> {
                <chassis> {
                    <cluster> {
                        <apply-macro> {
                            <name> "failover-monitoring";
                            <data> {
                                <name> "last-failover";
                                <value> $current-time;
                            }
                        }
                    }
                }
            }
        }
    }
    expr jcs:execute($connection, $rpc-set-rg0-last-failover);
    var $commit = <rpc> {
        <commit-configuration>;
    }
    expr jcs:execute($connection, $commit);
}

/* Write the accumulated interface weight for the redundancy group into the configuration */
template set-track-interface-last-weight ($weight = 0, $redundancy-group) {
    var $current-weight = {
        call get-monitor-interface-current-weight($redundancy-group);
    }
    var $total-weight = $weight + $current-weight;
    /* <xsl:value-of select="jcs:output(concat('Setting track interface weight to ', $weight, ' for RG ', $redundancy-group))"/> */
    /* <xsl:value-of select="jcs:output(concat('Setting last failover time for RG0 to ', $current-time))"/> */
    var $rpc-configure-private = <rpc> {
        <open-configuration> {
            <private>;
        }
    }

    expr jcs:execute($connection, $rpc-configure-private);
    var $rpc-set-track-interface-last-weight = <rpc> {
        <load-configuration> {
            <configuration> {
                <chassis> {
                    <cluster> {
                        <apply-macro> {
                            <name> "failover-interface-monitor";
                            <data> {
                                <name> $redundancy-group;
                                <value> $total-weight;
                            }
                        }
                    }
                }
            }
        }
    }
    expr jcs:execute($connection, $rpc-set-track-interface-last-weight);
    var $commit = <rpc> {
        <commit-configuration>;
    }
    expr jcs:execute($connection, $commit);
}

/* abstract the actual failover command outside of request failover */
template request-rg-failover ($node, $redundancy-group) {

    if ($redundancy-group != "") {
        /* rpc command for failover */
        var $rpc-failover = <rpc> {
            <command> {
                expr "request chassis cluster failover node ";
                expr $node;
                expr " redundancy-group ";
                expr $redundancy-group;
            }
        }

        expr jcs:execute($connection, $rpc-failover);
    }
    /* added to allow the command to take effect */
    expr jcs:sleep(0, 500);
}


/*
request-failover :: Chassis failover
This template performs an rg failover of the requested group

@param redundancy-group specifies the redundancy group to failover, defaults to 1
@param reset-flag specifies the manual failover flag should be cleared, defaults to false
@param fullfailover-flag specifies if both redundancy groups should be failed over, defaults to false
@param rg0-failover-check specifies if the failover time should be checked, defaults to true
 */
template request-failover () {
    param $redundancy-group = {
        expr "1";
    }
    param $reset-flag = {
        expr false();
    }
    param $fullfailover-flag = {
        expr "0";
    }
    param $rg0-failover-check = {
        expr "1";
    }
    /* Define which RG to failover */
    /* force the selection of a parameter disable the default */
    /* Choosing this forces both RGs to failover */
    /* Verify if it is safe to failover RG0 */
    /* determine the other RG that would need to failover in a full failover */
    var $other-redundancy-group = {
        if ($redundancy-group == 0) {
            <text> "1";

        } else if ($redundancy-group == 1) {
            <text> "0";
        }
    }
    /* used to determine if we should fully failover */
    var $rg0-master = {
        call get-master() {
            with $redundancy-group = {
                expr "0";
            }
        }
    }
    var $rg1-master = {
        call get-master() {
            with $redundancy-group = {
                expr "1";
            }
        }
    }
    var $rg0-last-failover = {
        call get-rg0-last-failover();
    }
    var $current-time = {
        call get-current-time();
    }

    /* <xsl:variable name="local-node">
    <xsl:call-template name="get-local-node"/>
    </xsl:variable> */
    var $rg-primary-node = {
        /* Check who is the primary and whether the cluster has not been failed over already */
        if ($cluster-status-results/redundancy-group[$redundancy-group + 1]/device-stats/redundancy-group-status[1] == "secondary" && $cluster-status-results/redundancy-group[$redundancy-group + 1]/device-stats/redundancy-group-status[2] == "primary" && $cluster-status-results/redundancy-group[$redundancy-group + 1]/device-stats/failover-mode[2] == "no") {
            /* failover RG to node0 */
            <text> "0";

        } else if ($cluster-status-results/redundancy-group[$redundancy-group + 1]/device-stats/redundancy-group-status[1] == "primary" && $cluster-status-results/redundancy-group[$redundancy-group + 1]/device-stats/redundancy-group-status[2] == "secondary" && $cluster-status-results/redundancy-group[$redundancy-group + 1]/device-stats/failover-mode[1] == "no") {
            /* failover RG to node1 */
            <text> "1";
        }
    }
    var $rg0-primary-node = {
        /* Check who is the primary and whether the cluster has not been failed over already */
        if ($cluster-status-results/redundancy-group[1]/device-stats/redundancy-group-status[1] == "secondary" && $cluster-status-results/redundancy-group[1]/device-stats/redundancy-group-status[2] == "primary" && $cluster-status-results/redundancy-group[1]/device-stats/failover-mode[2] == "no") {
            /* failover RG to node0 */
            <text> "0";

        } else if ($cluster-status-results/redundancy-group[1]/device-stats/redundancy-group-status[1] == "primary" && $cluster-status-results/redundancy-group[1]/device-stats/redundancy-group-status[2] == "secondary" && $cluster-status-results/redundancy-group[1]/device-stats/failover-mode[1] == "no") {
            /* failover RG to node1 */
            <text> "1";
        }
    }

    /*
    Execute the failover to the other chassis
     */
    if (boolean($rg0-failover-check == 1)) {
        if (boolean($rg-primary-node != "")) {
            if (boolean($fullfailover-flag && ((($rg0-master == "node0") && ($rg1-master == "node0")) || (($rg0-master == "node1") && ($rg1-master == "node1"))))) {
                if (($current-time) > ($rg0-last-failover + 300)) {
                    var $time-diff = $current-time - $rg0-last-failover;

                    expr jcs:syslog(146, concat($time-diff, " seconds since last failover of RG0. Failing over RG0 to node", $rg-primary-node));
                    /* <xsl:value-of select="jcs:output(concat($time-diff, ' seconds since last failover of RG0. Failing over RG0.'))"/> */
                    /* <xsl:value-of select="jcs:output('Requesting failover for RG1')"/> */
                    expr jcs:syslog(146, concat("Requesting failover for RG1 to node", $rg-primary-node));
                    call request-rg-failover($node = $rg-primary-node, $redundancy-group);
                    /* full failover matched requesting RG0 failover */
                    call request-rg-failover($node = $rg0-primary-node, $redundancy-group = $other-redundancy-group);
                    call set-rg0-last-failover();

                } else {
                    var $time-diff = $current-time - $rg0-last-failover;

                    expr jcs:syslog(146, concat("Not enough time has passed to failover RG0 only ", $time-diff, " seconds have passed on node", $rg0-primary-node));
                    /* <xsl:value-of select="jcs:output(concat('Not enough time has passed to failover RG0 only ', $time-diff, ' seconds have passed'))"/> */
                    /* <xsl:value-of select="jcs:output('Requesting failover for RG1')"/> */
                    expr jcs:syslog(146, concat("Requesting failover for RG1 to node", $rg-primary-node));
                    call request-rg-failover($node = $rg-primary-node) {
                        with $redundancy-group = {
                            expr "1";
                        }
                    }
                }

            } else if ($redundancy-group == 0) {
                if ($current-time > ($rg0-last-failover + 300)) {
                    var $time-diff = $current-time - $rg0-last-failover;
                    /* <xsl:value-of select="jcs:output(concat($time-diff, ' seconds since last failover of RG0. Failing over RG0.'))"/> */
                    expr jcs:syslog(146, concat($time-diff, " seconds since last failover of RG0. Failing over RG0 to node", $rg-primary-node));
                    call request-rg-failover($node = $rg-primary-node, $redundancy-group);
                    call set-rg0-last-failover();

                } else {
                    var $time-diff = $current-time - $rg0-last-failover;
                    /* <xsl:value-of select="jcs:output(concat('Not enough time has passed to failover RG0 over ', $time-diff, ' seconds have passed'))"/> */
                    expr jcs:syslog(146, concat("Not enough time has passed to failover RG0 only ", $time-diff, " seconds have passed on node", $rg0-primary-node));
                }

            } else if ($redundancy-group == 1) {
                expr jcs:syslog(146, concat("Requesting failover for RG1 to node", $rg-primary-node));
                call request-rg-failover($node = $rg-primary-node, $redundancy-group);
            }
        }

    } else {
        /* $rpc-failover is not defined in this template; test the target node as the branch above does */
        if (boolean($rg-primary-node != "")) {
            if (boolean($fullfailover-flag)) {
                /* <xsl:value-of select="jcs:output('Requesting full failover for both RG0 and RG1')"/> */
                expr jcs:syslog(146, concat("Requesting failover for RG0 and RG1 to node", $rg-primary-node));
                call request-rg-failover($node = $rg-primary-node, $redundancy-group);
                /* full failover matched requesting RG0 failover */
                call request-rg-failover($node = $rg0-primary-node, $redundancy-group = $other-redundancy-group);
                call set-rg0-last-failover();

            } else if ($redundancy-group == 0) {
                /* <xsl:value-of select="jcs:output('Requesting failover for RG0')"/> */
                expr jcs:syslog(146, concat("Requesting failover for RG0 to node", $rg0-primary-node));
                call request-rg-failover($node = $rg-primary-node, $redundancy-group);
                call set-rg0-last-failover();

            } else if ($redundancy-group == 1) {
                /* <xsl:value-of select="jcs:output('Requesting failover for RG1')"/> */
                expr jcs:syslog(146, concat("Requesting failover for RG1 to node", $rg-primary-node));
                call request-rg-failover($node = $rg-primary-node, $redundancy-group);
            }
        }
    }

    /*
    Clear the failover status if requested
     */
    if (boolean($reset-flag)) {
        if (boolean($fullfailover-flag)) {
            call reset-failover-flag($redundancy-group);
            call reset-failover-flag($redundancy-group = $other-redundancy-group);

        } else {
            call reset-failover-flag($redundancy-group);
        }
    }
}


467	/*
468	reset-failover-flag :: Get Master
469	This template is used to reset the failover flag
470	 
471	@param redundancy-group specifies the redundancy group to failover, 1 is the default
472	 */
473	template reset-failover-flag () {
474	    param $redundancy-group = {
475	        expr "1";
476	    }
477	    var $rpc-clear-failover-flag = <rpc> {
478	        <command> {
479	            expr "request chassis cluster failover reset redundancy-group ";
480	            expr $redundancy-group;
481	        }
482	    }
483	     
484	    /* <xsl:value-of
485	    select="jcs:output(concat('Reseting manual failover for redundancy group ', $redundancy-group))"/> */
486	    expr jcs:syslog(146, concat("Reseting manual failover for redundancy group ", $redundancy-group));
487	    expr jcs:execute($connection, $rpc-clear-failover-flag);
488	    expr jcs:sleep(0, 500);
489	}
490	 
491	 
492	/*
493	get-master :: Get Master
494	This template is used to determine device is master of a redundancy group (RG0 by default).
495	 
496	@param redundancy-group specifies the redundancy group to check master default is 0
497	 */
498	template get-master () {
499	    param $redundancy-group = {
500	        expr "0";
501	    }
502	    /* Determine the master of the chassis cluster */
503	    var $rg-node0-priority = $cluster-status-results/redundancy-group[$redundancy-group + 1]/device-stats/redundancy-group-status[1];
504	    var $rg-node1-priority = $cluster-status-results/redundancy-group[$redundancy-group + 1]/device-stats/redundancy-group-status[2];
505	     
506	    if (not($rg-node0-priority) || not($rg-node1-priority)) {
507	        /* One Device is not communicating to the other the test will not run */
508	         
509	        /* <xsl:value-of
510	        select="jcs:output('Only one cluster member has been found. Check connectivity to the other cluster member.')"
511	        /> */
512	     
513	    } else if ($rg-node0-priority == "primary" && $rg-node1-priority != "primary") {
514	        /* Node0 is the primary returning Node0 */
515	        /* <xsl:value-of select="jcs:output(concat('Node0 is the master for redundancy group: ', $redundancy-group))"/> */
516	        <text> "node0";
517	     
518	    } else if ($rg-node0-priority != "primary" && $rg-node1-priority == "primary") {
519	        /* Node1 is the primary returning Node1 */
520	        /* <xsl:value-of select="jcs:output(concat('Node1 is the master for redundancy group: ', $redundancy-group))"/> */
521	        <text> "node1";
522	     
523	    } else {
524	        /* priorities are the same */
525	        /* <xsl:value-of select="jcs:output('An unexpected result has occured while checking the node status.')" /> */
526	    }
527	}
528	 
529	/* Return the local node value */
530	template get-local-node () {
531	    var $get-local-RE = <rpc> {
532	        <command> "show chassis routing-engine node local";
533	    }
534	    /* Get the local RE node */
535	    var $local-check-results = jcs:execute($connection, $get-local-RE);
536	    <text> $local-check-results/multi-routing-engine-item/re-name;
537	}


/*
get-manual-failover-flag :: Get manual failover flag
This template is used to determine whether the manual failover flag is set

@param redundancy-group specifies the redundancy group to check
 */
template get-manual-failover-flag ($redundancy-group) {
    var $rpc-check-manual-failover-flag = <rpc> {
        <command> "show chassis cluster status";
    }
    var $result-check-manual-failover-flag = jcs:execute($connection, $rpc-check-manual-failover-flag);
    <text> $result-check-manual-failover-flag/redundancy-group[$redundancy-group + 1]/device-stats/failover-mode[1];
}

/* Determine and return the current product model */
template get-product-model () {
    var $rpc-product-model = <rpc> {
        <command> "show version";
    }
    var $results-product-model = jcs:execute($connection, $rpc-product-model);

    if ($results-product-model/multi-routing-engine-item) {
        <text> $results-product-model/multi-routing-engine-item[1]/software-information/product-model;

    } else {
        expr $results-product-model/software-information/product-model;
    }
}

/* end template section */
/* srx-ha-lib.xsl file end */
template main () {
    var $monitored-interfaces-results = <redundancy-groups> {

        for-each ($chassis-cluster-config//cluster/redundancy-group) {
            var $rg = ./name;
            <redundancy-group> {
                <name> $rg;

                for-each (.//apply-macro[name == "monitor-interfaces"]/data) {
                    <monitored-interface> {
                        <name> ./name;
                        <weight> {
                            if (./value) {
                                expr ./value;

                            } else {
                                /* No weight configured: default to 255 */
                                expr "255";
                            }
                        }
                    }
                }
            }
        }
    }
    var $rg0-master = {
        call get-master() {
            with $redundancy-group = {
                expr "0";
            }
        }
    }
    var $rg1-master = {
        call get-master() {
            with $redundancy-group = {
                expr "1";
            }
        }
    }
    var $local-node = {
        call get-local-node();
    }
    var $int-parse-regex = "([a-zA-Z]+)-([0-9]+)/([0-9]+)/([0-9]+)";
    var $monitored-interfaces = ext:node-set($monitored-interfaces-results);

    for-each ($monitored-interfaces//redundancy-groups/redundancy-group) {
        var $redundancy-group = ./name;
        var $monitored-interface = ./monitored-interface/name;
        /* Use the weight of the interface that failed, not the first one listed in the group */
        var $weight = ./monitored-interface[name == translate($interface, " ", "")]/weight;
        var $RG-interface-monitor-weight = {
            call get-monitor-interface-current-weight($redundancy-group);
        }
        var $total-weight = $RG-interface-monitor-weight + $weight;

        if ($monitored-interface == translate($interface, " ", "")) {
            if ($total-weight > 254) {
                expr jcs:output(concat("found and testing ", translate($interface, " ", "")));
                var $int-parse = jcs:regex($int-parse-regex, translate($interface, " ", ""));
                /* jcs:regex returns the whole match first, then each captured group */
                var $media = $int-parse[2];
                var $fpc = $int-parse[3];
                var $pic = $int-parse[4];
                var $port = $int-parse[5];
                /* Determine which chassis owns the interface */
                if (($fpc <= $chassis-interface-ownership/max-interface-number/node0-max) && ($fpc >= $chassis-interface-ownership/max-interface-number/node0-min)) {
                    /* node0 is the interface owner */
                    if ($redundancy-group == "0") {
                        if (($rg0-master == "node0") && ($rg1-master == "node0")) {
                            call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"]), $fullfailover-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "full-failover"])) {
                                with $redundancy-group = {
                                    expr "0";
                                }
                            }

                        } else if (($rg0-master == "node0") && not($rg1-master == "node0")) {
                            call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"])) {
                                with $redundancy-group = {
                                    expr "0";
                                }
                            }

                        } else if (not($rg0-master == "node0") && ($rg1-master == "node0")) {
                            if (boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "full-failover"])) {
                                call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"])) {
                                    with $redundancy-group = {
                                        expr "1";
                                    }
                                }
                            }

                        } else if (not($rg0-master == "node0") && not($rg1-master == "node0")) {
                            /* Do nothing: node0 is not the master for either RG */
                        }

                    } else if ($redundancy-group == "1") {
                        if (($rg0-master == "node0") && ($rg1-master == "node0")) {
                            call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"]), $fullfailover-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "full-failover"])) {
                                with $redundancy-group = {
                                    expr "1";
                                }
                            }

                        } else if (($rg0-master == "node0") && not($rg1-master == "node0")) {
                            if (boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "full-failover"])) {
                                call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"])) {
                                    with $redundancy-group = {
                                        expr "0";
                                    }
                                }
                            }

                        } else if (not($rg0-master == "node0") && ($rg1-master == "node0")) {
                            call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"])) {
                                with $redundancy-group = {
                                    expr "1";
                                }
                            }

                        } else if (not($rg0-master == "node0") && not($rg1-master == "node0")) {
                            /* Do nothing: node0 is not the master for either RG */
                        }
                    }

                } else if (($fpc <= $chassis-interface-ownership/max-interface-number/node1-max) && ($fpc >= $chassis-interface-ownership/max-interface-number/node1-min)) {
                    /* node1 is the interface owner */
                    /* node1 processing */
                    if ($redundancy-group == "0") {
                        if (($rg0-master == "node1") && ($rg1-master == "node1")) {
                            call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"]), $fullfailover-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "full-failover"])) {
                                with $redundancy-group = {
                                    expr "0";
                                }
                            }

                        } else if (($rg0-master == "node1") && not($rg1-master == "node1")) {
                            call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"])) {
                                with $redundancy-group = {
                                    expr "0";
                                }
                            }

                        } else if (not($rg0-master == "node1") && ($rg1-master == "node1")) {
                            if (boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "full-failover"])) {
                                call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"])) {
                                    with $redundancy-group = {
                                        expr "1";
                                    }
                                }
                            }

                        } else if (not($rg0-master == "node1") && not($rg1-master == "node1")) {
                            /* Do nothing: node1 is not the master for either RG */
                        }

                    } else if ($redundancy-group == "1") {
                        if (($rg0-master == "node1") && ($rg1-master == "node1")) {
                            call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"]), $fullfailover-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "full-failover"])) {
                                with $redundancy-group = {
                                    expr "1";
                                }
                            }

                        } else if (($rg0-master == "node1") && not($rg1-master == "node1")) {
                            if (boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "full-failover"])) {
                                call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"])) {
                                    with $redundancy-group = {
                                        expr "0";
                                    }
                                }
                            }

                        } else if (not($rg0-master == "node1") && ($rg1-master == "node1")) {
                            call request-failover($reset-flag = boolean($chassis-cluster-config/chassis/cluster/apply-macro[name == "monitoring-options"]/data[name == "clear-failover"])) {
                                with $redundancy-group = {
                                    expr "1";
                                }
                            }

                        } else if (not($rg0-master == "node1") && not($rg1-master == "node1")) {
                            /* Do nothing: node1 is not the master for either RG */
                        }
                    }
                }

            } else {
                call set-track-interface-last-weight($redundancy-group, $weight);
            }

        } else {
            expr jcs:output(concat("The interface ", $interface, " is not monitored by RG", $redundancy-group));
        }
    }
}

match / {
    <op-script-results> {
        if (boolean($interface)) {
            /* Determine the local node executing the script */
            var $local-node-name = {
                call get-local-node();
            }
            /* Determine the master for RG0 */
            var $master-node-name = {
                call get-master() {
                    with $redundancy-group = {
                        expr "0";
                    }
                }
            }

            if ($local-node-name == $master-node-name) {
                expr jcs:output("Node is the master of RG0, checking tracked objects");
                /* Run the interface monitoring */
                expr jcs:output(concat("failed interface ", $interface));
                call main();

            } else {
                /* This node is not the master, so it does nothing */
                expr jcs:output("Node is not the master of RG0. Nothing to do!");
            }

        } else {
            expr jcs:output("Please specify an interface name using the interface argument");
        }
    }
}
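
The node0 and node1 branches of main() above apply the same decision table with only the node name changed. As a rough sketch, and not part of the original script, the choice of which redundancy group to fail over could be factored into a single helper. The template name select-failover-group and the parameters $owner and $full-failover are hypothetical names introduced here for illustration. The sketch only selects the group; it does not reproduce the way main() passes $fullfailover-flag to request-failover when the owning node is master of both groups.

```slax
/*
 * Hypothetical helper, not part of the original script.
 * Emits the redundancy group to fail over ("0" or "1"), or nothing.
 *
 * @param owner            node that owns the failed interface ("node0" or "node1")
 * @param redundancy-group group that monitors the failed interface ("0" or "1")
 * @param rg0-master       current master of redundancy group 0
 * @param rg1-master       current master of redundancy group 1
 * @param full-failover    boolean; pass it inline, for example as boolean(...),
 *                         so that it is not converted to a result tree fragment
 */
template select-failover-group ($owner, $redundancy-group, $rg0-master, $rg1-master, $full-failover) {
    if ($redundancy-group == "0") {
        if ($rg0-master == $owner) {
            /* The owner is master of RG0: fail RG0 over */
            expr "0";

        } else if ($rg1-master == $owner && $full-failover) {
            /* The owner is master of RG1 only: move RG1 when full-failover is set */
            expr "1";
        }

    } else if ($redundancy-group == "1") {
        if ($rg1-master == $owner) {
            /* The owner is master of RG1: fail RG1 over */
            expr "1";

        } else if ($rg0-master == $owner && $full-failover) {
            /* The owner is master of RG0 only: move RG0 when full-failover is set */
            expr "0";
        }
    }
}
```

Under these assumptions, main() would call the helper once with $owner set to "node0" or "node1", depending on the FPC range check, and call request-failover only when the result is non-empty.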

SLAX Script Contents

/* Machine Crafted with Care (tm) by slaxWriter */
version 1.0;


/*
- $Id: mpls-lsp.slax,v 1.1 2007/10/17 18:37:04 phil Exp $
-
- Copyright (c) 2004-2005, Juniper Networks, Inc.
- All rights reserved.
-
 */
ns junos = "http://xml.juniper.net/junos/*/junos";
ns xnm = "http://xml.juniper.net/xnm/1.1/xnm";
ns jcs = "http://xml.juniper.net/junos/commit-scripts/1.0";

import "../import/junos.xsl";

/*
- This example turns a list of addresses into a list of MPLS LSPs.
- An apply-macro under [protocols mpls] is configured with a
- set of addresses and a 'color' parameter.  Each address is
- turned into an LSP configuration, with the color as an admin-group.
 */
match configuration {
    var $mpls = protocols/mpls;

    for-each ($mpls/apply-macro[data/name == "color"]) {
        var $color = data[name == "color"]/value;
        <transient-change> {
            <protocols> {
                <mpls> {

                    for-each (data[not(value)]/name) {
                        <label-switched-path> {
                            <name> $color _ "-lsp-" _ .;
                            <to> .;
                            <admin-group> {
                                <include> $color;
                            }
                        }
                    }
                }
            }
        }
    }
}

XML Script Contents

<?xml version="1.0"?>
<script>
<title>monitor-interface.slax</title>
<author>rcameron</author>
<synopsis>
This script implements all of the features that are included on the SRX plus the ability to limit the failover of the control plane.
</synopsis>
<coe>event</coe>
<type>HA</type>

<description>
Interface monitoring is an included feature on the platform but it may not meet all customer requirements. This implementation of interface monitoring limits the user to monitoring only the data plane.
</description>

 <example>
 <title>Example</title>
 <config>example-1.conf</config>
 </example>

<xhtml:script xmlns:xhtml="http://www.w3.org/1999/xhtml"
src="../../../../../web/leaf.js"
type="text/javascript"/>
</script>
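
The metadata above names example-1.conf, which is not reproduced here, and the fragment below is not that file. It is only a minimal sketch of the apply-macro statements that the monitor-interface script listing reads: monitor-interfaces under each redundancy group, and monitoring-options with clear-failover and full-failover under chassis cluster. The macro names are taken from the listing; the interface names and weights are made-up values.

```
chassis {
    cluster {
        /* Optional flags the script checks before requesting a failover */
        apply-macro monitoring-options {
            clear-failover;
            full-failover;
        }
        redundancy-group 0 {
            apply-macro monitor-interfaces {
                /* No weight given, so the script treats it as 255 */
                ge-0/0/1;
            }
        }
        redundancy-group 1 {
            apply-macro monitor-interfaces {
                /* Hypothetical weights of 128 each */
                ge-0/0/2 128;
                ge-0/0/3 128;
            }
        }
    }
}
```

With these assumed values, the script requests a failover only when the stored weight for the group plus the weight of the failed interface exceeds 254. The loss of ge-0/0/1 would do that on its own, while redundancy group 1 would need both of its monitored interfaces to fail.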

#ScriptingHow-To
#eventscript
#interface
#monitor
#Slax
#How-To
