Layer 3 Virtual Private Network Inter-AS option using SRv6 as underlay transport on MX and ACX7000 routers.
Introduction
This is the 5th blog post in the SRv6 series. It is co-authored by Krzysztof Szarkowicz and Rajesh M, and discusses L3VPN (Layer 3 Virtual Private Network) Inter-AS (Inter-Autonomous System) Option C, using SRv6 as the underlay transport. This can be used by an SP (Service Provider) in deployments where IGP (Interior Gateway Protocol) domains are not merged, for example:
- the network is divided into separate ASes (Autonomous Systems) as a result of network mergers and migrations
- the network is divided into separate AS domains due to logical separation (e.g., one AS for the data center network and another for the transport network)
- the network is divided into separate AS domains due to operational reasons (network size, different operations teams maintaining different parts of the network, etc.)
This blog post is based on the capabilities of Junos 22.3 running on MX Series and ACX7000 routers. Configuration and operational command outputs have been collected on vMX in our labs.
You can test all the concepts described in this article yourself; we created labs in JCL and vLabs:
JCL (Junivators and partners)
vLabs (open to all)
In this blog post, the following IP (Internet Protocol) addressing is used (a worked example for one router follows the list):
Transport Infrastructure (P/PE)
- Router-ID: 198.51.100.<XX>
- Loopback: 2001:db8:bad:cafe:<area>00::<XX>/128
- SRv6 locator: fc01:<area>00:<XX>::/48
- Core Links: 2001:db8:beef:<area>00::<XXYY>:<local-ID>/112
PE-CE links:
- IPv4: <VLAN>.<XX>.<YY>.<local-ID>/24
- IPv6: 2001:db8:babe:face:<VLAN>::<XXYY>:<local-ID>/112
VPN (Virtual Private Network) Loopbacks (CE/PE):
- 192.168.<VLAN>.<XX>/32
- 2001:db8:abba:<VLAN>::<XX>/128
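To make the templates concrete: for PE11 (XX = 11) in IS-IS area 49.0001 (area = 1), the plan expands to the values below. The loopback and SRv6 locator also appear later in Configuration 2; the Router-ID value simply follows the template.
- Router-ID: 198.51.100.11
- Loopback: 2001:db8:bad:cafe:100::11/128
- SRv6 locator: fc01:100:11::/48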
Architecture
RFC 4364 (BGP/MPLS IP Virtual Private Networks) describes, in Section 10, three different ways of providing L3VPN services over a multi-AS network. These three architectures are commonly referenced as Inter-AS Option 10A, Inter-AS Option 10B, and Inter-AS Option 10C, given that options a), b), and c) are described in Section 10. Or, simply, they can be referenced as Inter-AS Option A, Inter-AS Option B, or Inter-AS Option C.
This blog post focuses on the Inter-AS Option C architecture. However, as opposed to the original Inter-AS Option C described in RFC 4364, this time the underlay is not MPLS but SRv6. Figure 1 summarizes the example network topology used as the basis for the L3VPN Inter-AS Option C discussion.
Figure 1: Inter-AS Topology and L3VPN service prefix distribution
The example network has two domains, each with a different AS number (AS 65501 and AS 65502, respectively), and each with its own IGP (IS-IS L2 area 49.0001 and IS-IS L2 area 49.0002). There is no IGP connectivity between these domains – they are interconnected via BGP only.
The essence of Inter-AS Option C is as follows:
- Service prefixes are exchanged between domains without modification of the BGP NEXT_HOP attribute. In the context of SRv6, this also means they are exchanged without modification of the SRv6 service SID (i.e., without modification of the END.DT4, END.DT6, or END.DT46 SRv6 SIDs). This implies that between PEs in different domains there must be an end-to-end transport tunnel suitable for transporting L3VPN traffic. In the example of this blog post, there is a full mesh of BGP sessions between PEs (within, as well as between, domains) carrying L3VPN prefixes, as outlined in Figure 1. In most deployments, instead of creating a full BGP mesh, an RR (Route Reflector) architecture would be used, with PEs within a domain peering via iBGP sessions to the RRs of that domain, and L3VPN prefixes exchanged between domains via eBGP sessions between the RRs. To simplify this blog post, an RR architecture is not used here.
- Transport prefixes (e.g., loopbacks) are exchanged between domains with the BGP NEXT_HOP attribute being changed (set to the local address, the so-called "next-hop self" – nhs – action) at domain boundaries, as outlined in Figure 2. In the context of SRv6 (see the L3VPN over SRv6 blog post for more details), the transport end-points for L3VPN services are not loopbacks but SRv6 locators, hence the nhs action must be executed on SRv6 locators as well. Both loopbacks and SRv6 locators are distributed via IPv6 unicast (AFI/SAFI=2/1).
Figure 2: Inter-AS Topology and transport prefix (loopbacks, SRv6 locators) distribution
In Inter-AS Option C, there is no need to maintain service prefixes on the ASBRs (autonomous system boundary routers), as the end-to-end tunnels between PEs in different domains provide the end-to-end transport capability. This tunnel is a hierarchical tunnel, where the inter-domain tunnel (towards an SRv6 locator from the remote domain) is carried in each domain inside an intra-domain tunnel (towards an SRv6 locator in the same domain). In this blog post, we will discuss the details of this architecture.
Base Configuration for Exchanging Transport Prefixes
For reference, Configuration 1 shows the base IS-IS configuration on router P1. A similar configuration is deployed on all other P and PE routers in the topology, and the details of this configuration were discussed in previous SRv6 blog posts. Please note that the inter-AS link (ge-0/0/3) is not included in IS-IS.
1 routing-options {
2 source-packet-routing {
3 srv6 {
4 locator SL-000 fc01:100:1::/48;
5 no-reduced-srh;
6 }
7 }
8 resolution {
9 preserve-nexthop-hierarchy;
10 }
11 router-id 198.51.100.1;
12 autonomous-system 65501;
13 ipv6-router-id 2001:db8:bad:cafe:100::1;
14 forwarding-table {
15 export PS-LOAD-BALANCE;
16 }
17 }
18 protocols {
19 isis {
20 apply-groups GR-ISIS;
21 interface ge-0/0/0.0;
22 interface ge-0/0/1.0;
23 interface ge-0/0/2.0;
24 interface lo0.0 {
25 passive;
26 }
27 source-packet-routing {
28 srv6 {
29 locator SL-000 {
30 end-sid fc01:100:1::;
31 }
32 }
33 }
34 level 1 disable;
35 level 2 {
36 wide-metrics-only;
37 }
38 reference-bandwidth 1000g;
39 no-ipv4-routing;
40 }
41 }
Configuration 1: Base IS-IS configuration on P1
There is no IS-IS connectivity between the two domains, so before the BGP sessions outlined in Figure 1 can be established, connectivity between the loopbacks of the remote domains must be provided. This is the task of the BGP sessions outlined in Figure 2. Therefore, let's have a look at the base BGP configuration for distributing loopbacks (to support the establishment of the multi-hop eBGP sessions outlined in Figure 1), as well as for distributing SRv6 locators (to support the creation of end-to-end SRv6 tunnels for L3VPN services), taking PE11 as an example (Configuration 2).
1 policy-options {
2 policy-statement PS-BGP-IPV6-EXP {
3 term TR-LOCAL-LOOPBACK {
4 from {
5 protocol direct;
6 rib inet6.0;
7 interface lo0.0;
8 route-filter 2001:db8:bad:cafe:100::11/128 exact;
9 }
10 then {
11 community add CM-LOOPBACK-65501;
12 accept;
13 }
14 }
15 term TR-LOCAL-LOCATOR {
16 from {
17 rib inet6.0;
18 route-filter fc01:100:11::/48 exact;
19 }
20 then {
21 community add CM-LOCATOR-65501;
22 accept;
23 }
24 }
25 then reject;
26 }
27 community CM-LOCATOR-65501 members 65501:1002;
28 community CM-LOOPBACK-65501 members 65501:1001;
29 }
30 protocols {
31 bgp {
32 (…)
33 group GR-IBGP-IPV6-TRANSPORT {
34 local-address 2001:db8:bad:cafe:100::11;
35 family inet6 {
36 unicast;
37 }
38 export PS-BGP-IPV6-EXP;
39 neighbor 2001:db8:bad:cafe:100::1 {
40 description P1;
41 }
42 neighbor 2001:db8:bad:cafe:100::2 {
43 description P2;
44 }
45 }
46 (…)
47 }
48 }
Configuration 2: BGP distribution of transport IPv6 prefixes on PE11
This is a pretty standard configuration, advertising the local loopback and the local SRv6 locator, each tagged with a community identifying the origin AS and the prefix type (loopback vs. locator). These communities will be used later in BGP policies.
The ASBR routers (P1, P2, P3, P4) have a different BGP configuration, as they perform different tasks (Configuration 3).
1 policy-options {
2 policy-statement PS-EBGP-IMP {
3 then accept;
4 }
5 policy-statement PS-EBGP-IPV6-EXP {
6 term TR-IPV6 {
7 from {
8 protocol bgp;
9 rib inet6.0;
10 community CM-LOCAL-AS;
11 }
12 then {
13 next-hop self;
14 accept;
15 }
16 }
17 then reject;
18 }
19 policy-statement PS-IBGP-IPV6-EXP {
20 term TR-LOCAL-AS {
21 from community CM-LOCAL-AS;
22 then reject;
23 }
24 term TR-REMOTE-AS {
25 from {
26 protocol bgp;
27 rib inet6.0;
28 }
29 then {
30 next-hop self;
31 accept;
32 }
33 }
34 then reject;
35 }
36 community CM-LOCAL-AS members 65501:*;
37 }
38 protocols {
39 bgp {
40 (…)
41 group GR-IBGP-IPV6-TRANSPORT {
42 local-address 2001:db8:bad:cafe:100::1;
43 family inet6 {
44 unicast;
45 }
46 export PS-IBGP-IPV6-EXP;
47 neighbor 2001:db8:bad:cafe:100::11 {
48 description PE11;
49 }
50 neighbor 2001:db8:bad:cafe:100::12 {
51 description PE12;
52 }
53 neighbor 2001:db8:bad:cafe:100::2 {
54 description P2;
55 }
56 }
57 group GR-EBGP-IPV6-TRANSPORT {
58 local-address 2001:db8:beef::0104:1;
59 import PS-EBGP-IMP;
60 family inet6 {
61 unicast;
62 }
63 export PS-EBGP-IPV6-EXP;
64 peer-as 65502;
65 neighbor 2001:db8:beef::0104:4 {
66 description P4;
67 }
68 }
69 (…)
70 defaults {
71 ebgp {
72 no-policy {
73 receive reject-always;
74 advertise reject-always;
75 }
76 }
77 }
78 }
Configuration 3: BGP distribution of transport IPv6 prefixes on P1
Again, this is a pretty standard configuration. Both export policies (for the iBGP group and for the eBGP group) use next-hop self (lines 13 and 30), as discussed earlier. There is a basic loop-prevention mechanism based on AS-specific communities (lines 10, 21, and 36), following best common practice. The iBGP sessions are multi-hop (established between loopbacks), while the eBGP session is single-hop (established between link addresses).
Additionally, as best common practice based on RFC 8212, an eBGP session should, by default, block prefix exchange in both the inbound and outbound directions. RFC 8212-compliant behavior is achieved in Junos with explicit configuration (lines 70-77). Accepting and sending prefixes thus requires explicit import and export policies. Therefore, in addition to the explicit export policy (line 63), an explicit import policy is defined as well (lines 2-4 and 59). For simplification, this blog post uses an 'allow all' policy as the import policy; in a real deployment, you would probably define a more restrictive policy here (a sketch of what that could look like is shown below). Please note that this explicit import policy is needed for the eBGP sessions only.
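As an illustration only, a more restrictive eBGP import policy could accept just the remote-domain loopbacks and SRv6 locators based on the addressing plan of this lab. The following is a minimal sketch under that assumption – the policy name and the exact prefix ranges are not part of the lab configuration:

policy-options {
    policy-statement PS-EBGP-IMP-STRICT {
        term TR-REMOTE-TRANSPORT {
            from {
                rib inet6.0;
                /* /128 loopbacks from the 2001:db8:bad:cafe::/64 block */
                route-filter 2001:db8:bad:cafe::/64 prefix-length-range /128-/128;
                /* SRv6 locators, up to /48, from the fc01::/16 block */
                route-filter fc01::/16 upto /48;
            }
            then accept;
        }
        then reject;
    }
}

Such a policy would then replace PS-EBGP-IMP as the import policy of the eBGP group.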
Verification of inter-AS transport path
Now, having the transport prefix distribution in place, let's verify it, starting on PE11 (CLI-Output 1).
1 kszarkowicz@PE11> show route advertising-protocol bgp 2001:db8:bad:cafe:100::1 detail
2
3 inet6.0: 27 destinations, 31 routes (27 active, 0 holddown, 0 hidden)
4 * 2001:db8:bad:cafe:100::11/128 (1 entry, 1 announced)
5 BGP group GR-IBGP-IPV6-TRANSPORT type Internal
6 Nexthop: Self
7 Localpref: 100
8 AS path: [65501] I
9 Communities: 65501:1001
10
11 * fc01:100:11::/48 (1 entry, 1 announced)
12 BGP group GR-IBGP-IPV6-TRANSPORT type Internal
13 Nexthop: Self
14 MED: 0
15 Localpref: 100
16 AS path: [65501] I
17 Communities: 65501:1002
CLI-Output 1: Advertising IPv6 Prefixes on PE11
This looks good! PE11 advertises its own loopback (line 4) and its own SRv6 locator (line 11) to P1. Also, the appropriate communities are attached (lines 9 and 17). Now, let's check the next router, P1 (CLI-Output 2).
1 kszarkowicz@P1> show route advertising-protocol bgp 2001:db8:beef::104:4 detail
2
3 inet6.0: 29 destinations, 37 routes (29 active, 0 holddown, 0 hidden)
4 2001:db8:bad:cafe:100::11/128 (2 entries, 2 announced)
5 BGP group GR-EBGP-IPV6-TRANSPORT type External
6 Nexthop: Self
7 Flags: Nexthop Change
8 AS path: [65501] I
9 Communities: 65501:1001
10
11 2001:db8:bad:cafe:100::12/128 (2 entries, 2 announced)
12 BGP group GR-EBGP-IPV6-TRANSPORT type External
13 Nexthop: Self
14 Flags: Nexthop Change
15 AS path: [65501] I
16 Communities: 65501:1001
17
18 fc01:100:11::/48 (2 entries, 2 announced)
19 BGP group GR-EBGP-IPV6-TRANSPORT type External
20 Nexthop: Self
21 Flags: Nexthop Change
22 AS path: [65501] I
23 Communities: 65501:1002
24
25 fc01:100:12::/48 (2 entries, 2 announced)
26 BGP group GR-EBGP-IPV6-TRANSPORT type External
27 Nexthop: Self
28 Flags: Nexthop Change
29 AS path: [65501] I
30 Communities: 65501:1002
CLI-Output 2: Advertising IPv6 Prefixes over eBGP on P1
This also looks good! The loopbacks and SRv6 locators from PE11 (lines 4 and 18) and PE12 (lines 11 and 25) are sent by P1 to P4 in the other AS. Also, as instructed in the configuration, the next-hop is changed to self (lines 6-7, 13-14, 20-21, 27-28). Now, let's check the next router, P4 (CLI-Output 3).
1 kszarkowicz@P4> show route advertising-protocol bgp 2001:db8:bad:cafe:200::21 detail
2
3 inet6.0: 29 destinations, 37 routes (29 active, 0 holddown, 0 hidden)
4 * 2001:db8:bad:cafe:100::11/128 (2 entries, 1 announced)
5 BGP group GR-IBGP-IPV6-TRANSPORT type Internal
6 Nexthop: Self
7 Flags: Nexthop Change
8 Localpref: 100
9 AS path: [65502] 65501 I
10 Communities: 65501:1001
11
12 * 2001:db8:bad:cafe:100::12/128 (2 entries, 1 announced)
13 BGP group GR-IBGP-IPV6-TRANSPORT type Internal
14 Nexthop: Self
15 Flags: Nexthop Change
16 Localpref: 100
17 AS path: [65502] 65501 I
18 Communities: 65501:1001
19
20 * fc01:100:11::/48 (2 entries, 1 announced)
21 BGP group GR-IBGP-IPV6-TRANSPORT type Internal
22 Nexthop: Self
23 Flags: Nexthop Change
24 Localpref: 100
25 AS path: [65502] 65501 I
26 Communities: 65501:1002
27
28 * fc01:100:12::/48 (2 entries, 1 announced)
29 BGP group GR-IBGP-IPV6-TRANSPORT type Internal
30 Nexthop: Self
31 Flags: Nexthop Change
32 Localpref: 100
33 AS path: [65502] 65501 I
34 Communities: 65501:1002
CLI-Output 3: Advertising IPv6 Prefixes over iBGP on P4
Everything looks fine here as well. When P4 sends the transport prefixes to PE21, the next-hop of the transport prefixes from the remote AS is again changed to self, which is what we wanted (lines 6-7, 13-14, 20-21, 27-28). Just one last check, on PE21 (CLI-Output 4).
1 kszarkowicz@PE21> show route receive-protocol bgp 2001:db8:bad:cafe:200::4 detail
2
3 inet.0: 6 destinations, 6 routes (6 active, 0 holddown, 0 hidden)
4
5 RI-VRF20.inet.0: 12 destinations, 22 routes (7 active, 0 holddown, 10 hidden)
6
7 RI-VRF30.inet.0: 12 destinations, 22 routes (7 active, 0 holddown, 10 hidden)
8
9 iso.0: 1 destinations, 1 routes (1 active, 0 holddown, 0 hidden)
10
11 mpls.0: 2 destinations, 2 routes (2 active, 0 holddown, 0 hidden)
12
13 bgp.l3vpn.0: 42 destinations, 42 routes (22 active, 0 holddown, 20 hidden)
14
15 inet6.0: 27 destinations, 31 routes (27 active, 0 holddown, 0 hidden)
16 2001:db8:bad:cafe:100::11/128 (2 entries, 1 announced)
17 Accepted MultipathContrib
18 Nexthop: 2001:db8:bad:cafe:200::4
19 Localpref: 100
20 AS path: 65501 I
21 Communities: 65501:1001
22
23 2001:db8:bad:cafe:100::12/128 (2 entries, 1 announced)
24 Accepted MultipathContrib
25 Nexthop: 2001:db8:bad:cafe:200::4
26 Localpref: 100
27 AS path: 65501 I
28 Communities: 65501:1001
29
30 fc01:100:11::/48 (2 entries, 1 announced)
31 Accepted MultipathContrib
32 Nexthop: 2001:db8:bad:cafe:200::4
33 Localpref: 100
34 AS path: 65501 I
35 Communities: 65501:1002
36
37 fc01:100:12::/48 (2 entries, 1 announced)
38 Accepted MultipathContrib
39 Nexthop: 2001:db8:bad:cafe:200::4
40 Localpref: 100
41 AS path: 65501 I
42 Communities: 65501:1002
43
44 inet6.3: 3 destinations, 3 routes (3 active, 0 holddown, 0 hidden)
45
46 RI-VRF20.inet6.0: 15 destinations, 25 routes (10 active, 0 holddown, 10 hidden)
47
48 RI-VRF30.inet6.0: 15 destinations, 25 routes (10 active, 0 holddown, 10 hidden)
49
50 bgp.l3vpn-inet6.0: 44 destinations, 44 routes (24 active, 0 holddown, 20 hidden)
51
52 bgp.rtarget.0: 4 destinations, 10 routes (4 active, 0 holddown, 0 hidden)
CLI-Output 4: Receiving IPv6 Prefixes on PE21
Perfect! PE21 sees the remote transport prefixes received from P4 with the BGP NEXT_HOP set to the loopback of P4 (lines 18, 25, 32, 39). All ASBRs change the next-hop to self, so, in theory, we should now be able to ping between remote PEs (CLI-Output 5).
1 kszarkowicz@PE21> ping 2001:db8:bad:cafe:100::11 count 1
2 PING6(56=40+8+8 bytes) 2001:db8:beef:200::321:21 --> 2001:db8:bad:cafe:100::11
3
4 --- 2001:db8:bad:cafe:100::11 ping6 statistics ---
5 1 packets transmitted, 0 packets received, 100% packet loss
CLI-Output 5: Ping from PE21 to PE11 (first try)
Unfortunately, although the prefixes are correctly exchanged (verification was shown in only one direction in this blog post, but the other direction is similar), the ping doesn't work (line 5)!
Oops, wait a minute! We exchanged only loopbacks and SRv6 locators, and our ping is, by default, sourced from the interface address (line 2). So, let's try a ping sourced from the loopback (CLI-Output 6).
1 kszarkowicz@PE21> ping source 2001:db8:bad:cafe:200::21 2001:db8:bad:cafe:100::11 count 1
2 PING6(56=40+8+8 bytes) 2001:db8:bad:cafe:200::21 --> 2001:db8:bad:cafe:100::11
3 16 bytes from 2001:db8:bad:cafe:100::11, icmp_seq=0 hlim=62 time=8.762 ms
4
5 --- 2001:db8:bad:cafe:100::11 ping6 statistics ---
6 1 packets transmitted, 1 packets received, 0% packet loss
7 round-trip min/avg/max/std-dev = 8.762/8.762/8.762/0.000 ms
8
9 kszarkowicz@PE21> ping source 2001:db8:bad:cafe:200::21 fc01:100:11:: count 1
10 PING6(56=40+8+8 bytes) 2001:db8:bad:cafe:200::21 --> fc01:100:11::
11 16 bytes from 2001:db8:beef:100::111:11, icmp_seq=0 hlim=62 time=13.639 ms
12
13 --- fc01:100:11:: ping6 statistics ---
14 1 packets transmitted, 1 packets received, 0% packet loss
15 round-trip min/avg/max/std-dev = 13.639/13.639/13.639/0.000 ms
CLI-Output 6: Ping from PE21 to PE11 (second try)
Fortunately, sourcing the ping from the local loopback solves the issue. Both the remote loopback and the remote SRv6 locator (or, to be more precise, the remote SRv6 END SID – check the SRv6 Basics: Locator and End SIDs blog post for more details) are reachable (lines 6 and 14).
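As a side note (not used in this lab), Junos can source locally generated packets, such as this ping, from the loopback address by default with the default-address-selection knob – a minimal sketch:

system {
    /* use the loopback as the default source for locally generated packets */
    default-address-selection;
}

With that in place, the plain ping from CLI-Output 5 would also have been sourced from the loopback.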
Base Configuration for Exchanging L3VPN Prefixes
It is now time to look at L3VPN prefix distribution. As indicated earlier (Figure 1), in this blog post we exchange L3VPN prefixes via a full mesh of iBGP/eBGP sessions between the PE routers. Let's look at PE11 for an example BGP configuration (Configuration 4).
1 policy-options {
2 policy-statement PS-EBGP-IMP {
3 then accept;
4 }
5 policy-statement PS-EBGP-L3VPN-EXP {
6 term TR-EBGP {
7 from {
8 protocol bgp;
9 external;
10 }
11 then reject;
12 }
13 term TR-RTC {
14 from rib bgp.rtarget.0;
15 then accept;
16 }
17 term TR-L3VPN {
18 from community RT;
19 then {
20 community delete CM-RTE-TYPE;
21 accept;
22 }
23 }
24 then reject;
25 }
26 community CM-RTE-TYPE members 0x306:*:*;
27 community RT members target:*:*;
28 }
29 protocols {
30 bgp {
31 (…)
32 group GR-IBGP-L3VPN {
33 local-address 2001:db8:bad:cafe:100::11;
34 family inet-vpn {
35 unicast {
36 extended-nexthop;
37 advertise-srv6-service;
38 accept-srv6-service;
39 }
40 }
41 family inet6-vpn {
42 unicast {
43 advertise-srv6-service;
44 accept-srv6-service;
45 }
46 }
47 family route-target;
48 neighbor 2001:db8:bad:cafe:100::12 {
49 description P12;
50 }
51 }
52 group GR-EBGP-L3VPN {
53 multihop {
54 no-nexthop-change;
55 }
56 local-address 2001:db8:bad:cafe:100::11;
57 import PS-EBGP-IMP;
58 family inet-vpn {
59 unicast {
60 extended-nexthop;
61 advertise-srv6-service;
62 accept-srv6-service;
63 }
64 }
65 family inet6-vpn {
66 unicast {
67 advertise-srv6-service;
68 accept-srv6-service;
69 }
70 }
71 family route-target;
72 export PS-EBGP-L3VPN-EXP;
73 peer-as 65502;
74 neighbor 2001:db8:bad:cafe:200::21 {
75 description P21;
76 }
77 neighbor 2001:db8:bad:cafe:200::22 {
78 description P22;
79 }
80 }
81 (…)
82 multipath {
83 list-nexthop;
84 }
85 rfc8950-compliant;
86 (…)
Configuration 4: BGP distribution of L3VPN prefixes on PE11
A similar configuration is deployed on all other PE routers as well. Most of the stanzas were explained in previous SRv6 blog posts, so we will concentrate here only on the few new aspects of the configuration specific to Inter-AS Option C.
As mentioned earlier, in Inter-AS Option C the L3VPN service prefixes must be sent without changing the NEXT_HOP attribute. This is the default behavior for iBGP, but not for eBGP, where the NEXT_HOP attribute is changed to the local address of the BGP session. To prevent this change, dedicated configuration is required (lines 53-54). In the particular example of this blog post, though, this configuration is not strictly needed, as the eBGP sessions run directly between the PE routers originating the prefixes. However, if the eBGP sessions distributing L3VPN prefixes are established between RRs, as mentioned earlier in the introduction, explicit configuration preventing the NEXT_HOP change would be required on the RRs (a rough sketch is shown below).
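For illustration, in such an RR-based design the inter-domain eBGP group on the RR would carry the no-nexthop-change statement, while the iBGP group towards the local-domain PEs would carry the cluster statement. The sketch below shows only the relevant knobs; the group names, the RR loopbacks ending in ::251, and the cluster ID are hypothetical, and (…) stands for the same address families, SRv6 service knobs, and policies as in Configuration 4:

protocols {
    bgp {
        /* reflect L3VPN routes to the local-domain PEs */
        group GR-IBGP-L3VPN-CLIENTS {
            cluster 198.51.100.251;
            local-address 2001:db8:bad:cafe:100::251;
            (…)
        }
        /* multi-hop eBGP to the remote-domain RR; keep the originating PE as NEXT_HOP */
        group GR-EBGP-L3VPN-RR {
            multihop {
                no-nexthop-change;
            }
            local-address 2001:db8:bad:cafe:100::251;
            peer-as 65502;
            neighbor 2001:db8:bad:cafe:200::251;
            (…)
        }
    }
}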
Further, as already discussed in the context of exchanging transport prefixes over eBGP sessions, we need explicit import (line 57) and export (line 72) policies in the eBGP group. The import BGP policy (lines 2-4) is the same as the one already discussed. The export BGP policy (lines 5-23) is a simple policy that advertises RT-constraint NLRIs as well as local L3VPN prefixes. In the test lab, OSPFv3 is used as the PE-CE protocol (configuration not shown for brevity). The OSPF Route Type extended community is removed on export (line 20), so the information about the original OSPF route type and OSPF area is lost, and on the remote PE the prefixes are imported into OSPF as Type 5 (external) prefixes. Depending on the actual deployment, this BGP export policy might differ significantly between use cases.
Verification of L3VPN prefix distribution
Having this basic configuration in place, let's check whether L3VPN prefix distribution works properly (CLI-Output 7).
1 kszarkowicz@PE11> show bgp summary
2 (…)
3 Peer AS InPkt OutPkt OutQ Flaps Last Up/Dwn State|#Active/Received/Accepted/Damped...
4 2001:db8:bad:cafe:100::1 65501 9454 9421 0 0 2d 23:33:28 Establ
5 inet6.0: 4/4/4/0
6 2001:db8:bad:cafe:100::2 65501 9454 9421 0 0 2d 23:33:22 Establ
7 inet6.0: 4/4/4/0
8 2001:db8:bad:cafe:100::12 65501 9469 9476 0 0 2d 23:33:14 Establ
9 bgp.rtarget.0: 0/4/4/0
10 bgp.l3vpn.0: 10/10/10/0
11 bgp.l3vpn-inet6.0: 10/10/10/0
12 RI-VRF20.inet.0: 0/5/5/0
13 RI-VRF30.inet.0: 0/5/5/0
14 RI-VRF20.inet6.0: 0/5/5/0
15 RI-VRF30.inet6.0: 0/5/5/0
16 2001:db8:bad:cafe:200::21 65502 9452 9451 0 0 2d 23:33:13 Establ
17 bgp.rtarget.0: 2/2/2/0
18 bgp.l3vpn.0: 0/10/10/0
19 bgp.l3vpn-inet6.0: 0/10/10/0
20 RI-VRF20.inet.0: 0/5/5/0
21 RI-VRF30.inet.0: 0/5/5/0
22 RI-VRF20.inet6.0: 0/5/5/0
23 RI-VRF30.inet6.0: 0/5/5/0
24 2001:db8:bad:cafe:200::22 65502 9467 9421 0 0 2d 23:33:09 Establ
25 bgp.rtarget.0: 2/2/2/0
26 bgp.l3vpn.0: 0/10/10/0
27 bgp.l3vpn-inet6.0: 0/10/10/0
28 RI-VRF20.inet.0: 0/5/5/0
29 RI-VRF30.inet.0: 0/5/5/0
30 RI-VRF20.inet6.0: 0/5/5/0
31 RI-VRF30.inet6.0: 0/5/5/0
CLI-Output 7: BGP session state on PE11
All expected BGP sessions are up, which means transport connectivity between the remote PEs is OK (and we verified that previously with ping). For each VRF, we are advertising/receiving five IPv4 and five IPv6 prefixes (lines 12-15, 20-23, 28-31) over each BGP session distributing L3VPN prefixes. However, what is suspicious is that none of the prefixes received in the VRF tables becomes active (note the leading '0' in the Active/Received/Accepted counters) – neither from the iBGP neighbor (lines 8-15) nor from the eBGP neighbors (lines 16-31). So, we need a closer look at it (CLI-Output 8).
1 kszarkowicz@PE11> show route receive-protocol bgp 2001:db8:bad:cafe:200::21 table RI-VRF20.inet.0 hidden detail
2
3 RI-VRF20.inet.0: 12 destinations, 22 routes (7 active, 0 holddown, 10 hidden)
4 20.21.92.0/24 (2 entries, 0 announced)
5 Import Accepted
6 Route Distinguisher: 198.51.100.21:20
7 VPN Label: 16
8 Nexthop: 2001:db8:bad:cafe:200::21
9 AS path: 65502 I
10 Communities: target:65000:20
11
12 20.22.92.0/24 (2 entries, 0 announced)
13 Import Accepted
14 Route Distinguisher: 198.51.100.21:20
15 VPN Label: 16
16 Nexthop: 2001:db8:bad:cafe:200::21
17 MED: 2
18 AS path: 65502 I
19 Communities: target:65000:20
20
21 192.168.20.21/32 (2 entries, 0 announced)
22 Import Accepted
23 Route Distinguisher: 198.51.100.21:20
24 VPN Label: 16
25 Nexthop: 2001:db8:bad:cafe:200::21
26 AS path: 65502 I
27 Communities: target:65000:20
28
29 192.168.20.22/32 (2 entries, 0 announced)
30 Import Accepted
31 Route Distinguisher: 198.51.100.21:20
32 VPN Label: 16
33 Nexthop: 2001:db8:bad:cafe:200::21
34 MED: 2
35 AS path: 65502 I
36 Communities: target:65000:20
37
38 192.168.20.92/32 (2 entries, 0 announced)
39 Import Accepted
40 Route Distinguisher: 198.51.100.21:20
41 VPN Label: 16
42 Nexthop: 2001:db8:bad:cafe:200::21
43 MED: 1
44 AS path: 65502 I
45 Communities: target:65000:20
CLI-Output 8: L3VPN prefixes received at PE11 from PE21
Hmm. The prefixes received at PE11 from PE21 look pretty OK. The next-hop is set to the PE21 loopback (lines 8, 16, 25, 33, 42), and the correct route-targets are attached (lines 10, 19, 27, 36, 45). At first sight, nothing is really suspicious. So, let's look further (CLI-Output 9).
1 kszarkowicz@PE11> show route 192.168.20.92/32 table RI-VRF20 hidden extensive
2
3 RI-VRF20.inet.0: 12 destinations, 22 routes (7 active, 0 holddown, 10 hidden)
4 192.168.20.92/32 (2 entries, 0 announced)
5 BGP Preference: 170/-101
6 Route Distinguisher: 192.169.2.22:20
7 Next hop type: Unusable, Next hop index: 0
8 Address: 0x7b3f394
9 Next-hop reference count: 80, key opaque handle: 0x0, non-key opaque handle: 0x0
10 Source: 2001:db8:bad:cafe:200::22
11 State: <Secondary Hidden Ext ProtectionCand>
12 Local AS: 65501 Peer AS: 65502
13 Age: 35:59 Metric: 1
14 Validation State: unverified
15 Task: BGP_65502.2001:db8:bad:cafe:200::22
16 AS path: 65502 I
17 Communities: target:65000:20
18 Import Accepted
19 VPN Label: 16
20 Localpref: 100
21 Router ID: 198.51.100.22
22 Primary Routing Table: bgp.l3vpn.0
23 Thread: junos-main
24 Indirect next hops: 1
25 Protocol next hop: 2001:db8:bad:cafe:200::22
26 Label operation: Push 16
27 Label TTL action: prop-ttl
28 Load balance label: Label 16: None;
29 Indirect next hop: 0x0 - INH Session ID: 0
30 (…)
CLI-Output 9: Detailed view of remote L3VPN prefix on PE11
Well, there is something wrong with the next hop: it is unusable (line 7). The protocol next hop is the loopback of the remote PE (line 25). So, what is wrong with it? Let's figure it out (CLI-Output 10).
1 kszarkowicz@PE11> show route 2001:db8:bad:cafe:200::22
2
3 inet6.0: 27 destinations, 31 routes (27 active, 0 holddown, 0 hidden)
4 + = Active Route, - = Last Active, * = Both
5
6 2001:db8:bad:cafe:200::22/128
7 *[BGP/170] 2d 23:57:32, localpref 100, from 2001:db8:bad:cafe:100::1
8 AS path: 65502 I, validation-state: unverified
9 > to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
10 to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
11 [BGP/170] 2d 23:57:32, localpref 100, from 2001:db8:bad:cafe:100::2
12 AS path: 65502 I, validation-state: unverified
13 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
14 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
CLI-Output 10: PE22 loopback visibility on PE11
Well, the remote loopback is there, and we can ping it (as we checked earlier, when discussing transport connectivity between the ASes – CLI-Output 6).
But, looking from the L3VPN perspective, the problem is that the remote loopback is only in inet6.0, not in inet6.3. For L3VPN next-hop resolution, next-hops must be in inet6.3. That said, we are looking at L3VPN over SRv6, so, as discussed in the SRv6 SID Encoding and Transposition blog post, next-hop resolution happens via the SRv6 SID announced together with the L3VPN prefix, not via the NEXT_HOP attribute. Therefore, if there is a problem with L3VPN next-hop resolution, we should rather look at the SRv6 SIDs, not the NEXT_HOP attributes.
So, let's see what SRv6 SIDs we are receiving from the remote PE (CLI-Output 8). Ooooops! We don't receive any (CLI-Output 8)! Does the remote PE send an SRv6 SID at all (CLI-Output 11)?
1 kszarkowicz@PE22> show route advertising-protocol bgp 2001:db8:bad:cafe:100::11 detail table RI-VRF20.inet.0
2
3 RI-VRF20.inet.0: 7 destinations, 12 routes (7 active, 0 holddown, 0 hidden)
4 * 20.21.92.0/24 (2 entries, 1 announced)
5 BGP group GR-EBGP-L3VPN type External
6 Route Distinguisher: 192.169.2.22:20
7 VPN Label: 16
8 Nexthop: Self
9 Flags: Nexthop Change
10 MED: 2
11 AS path: [65502] I
12 Communities: target:65000:20
13
14 * 20.22.92.0/24 (2 entries, 1 announced)
15 BGP group GR-EBGP-L3VPN type External
16 Route Distinguisher: 192.169.2.22:20
17 VPN Label: 16
18 Nexthop: Self
19 Flags: Nexthop Change
20 AS path: [65502] I
21 Communities: target:65000:20
22
23 * 192.168.20.21/32 (2 entries, 1 announced)
24 BGP group GR-EBGP-L3VPN type External
25 Route Distinguisher: 192.169.2.22:20
26 VPN Label: 16
27 Nexthop: Self
28 Flags: Nexthop Change
29 MED: 2
30 AS path: [65502] I
31 Communities: target:65000:20
32
33 * 192.168.20.22/32 (2 entries, 1 announced)
34 BGP group GR-EBGP-L3VPN type External
35 Route Distinguisher: 192.169.2.22:20
36 VPN Label: 16
37 Nexthop: Self
38 Flags: Nexthop Change
39 AS path: [65502] I
40 Communities: target:65000:20
41
42 * 192.168.20.92/32 (2 entries, 1 announced)
43 BGP group GR-EBGP-L3VPN type External
44 Route Distinguisher: 192.169.2.22:20
45 VPN Label: 16
46 Nexthop: Self
47 Flags: Nexthop Change
48 MED: 1
49 AS path: [65502] I
50 Communities: target:65000:20
CLI-Output 11: PE22 advertisements towards PE11
Well, it doesn’t.
The problem is that, by default, for increased security, the BGP Prefix-SID attribute (which carries the SRv6 SID – please check lines 181-212 in Packet Capture 1 of the SRv6 SID Encoding and Transposition blog post) is neither advertised nor accepted over eBGP sessions. Advertising/accepting the BGP Prefix-SID attribute requires explicit configuration, as outlined in Configuration 5.
1 protocols {
2 bgp {
3 group GR-EBGP-L3VPN {
4 advertise-prefix-sid;
5 accept-prefix-sid;
6 }
7 }
8 }
Configuration 5: Enabling advertisements/acceptance of BGP Prefix SID
With this change in place (on all PEs), we now receive SRv6 SIDs from the remote PEs (CLI-Output 12).
1 kszarkowicz@PE11> show route receive-protocol bgp 2001:db8:bad:cafe:200::21 table RI-VRF20.inet.0 hidden detail | match SID
2 SRv6 SID: fc01:200:21:: Behavior: 19 BL: 48 NL: 0 FL: 16 AL: 0 TL: 16 TO: 48
3 SRv6 SID: fc01:200:21:: Behavior: 19 BL: 48 NL: 0 FL: 16 AL: 0 TL: 16 TO: 48
4 SRv6 SID: fc01:200:21:: Behavior: 19 BL: 48 NL: 0 FL: 16 AL: 0 TL: 16 TO: 48
5 SRv6 SID: fc01:200:21:: Behavior: 19 BL: 48 NL: 0 FL: 16 AL: 0 TL: 16 TO: 48
6 SRv6 SID: fc01:200:21:: Behavior: 19 BL: 48 NL: 0 FL: 16 AL: 0 TL: 16 TO: 48
CLI-Output 12: SRv6 SIDs received from PE21 on PE11
Now, checking one of the remote prefixes again (CLI-Output 13 and CLI-Output 14), we see some changes.
1 kszarkowicz@PE11> show route receive-protocol bgp 2001:db8:bad:cafe:200::22 table RI-VRF20.inet.0 hidden detail 192.168.20.92/32
2
3 RI-VRF20.inet.0: 12 destinations, 22 routes (7 active, 0 holddown, 10 hidden)
4 192.168.20.92/32 (2 entries, 0 announced)
5 Import Accepted MultiNexthop RecvNextHopIgnored
6 Route Distinguisher: 192.169.2.22:20
7 VPN Label: 16896
8 Nexthop: 2001:db8:bad:cafe:200::22
9 MED: 1
10 AS path: 65502 I
11 Communities: target:65000:20
12 SRv6 SID: fc01:200:22:: Behavior: 19 BL: 48 NL: 0 FL: 16 AL: 0 TL: 16 TO: 48
CLI-Output 13: Remote L3VPN prefix received from PE22 on PE11
1 kszarkowicz@PE11> show route 192.168.20.92/32 table RI-VRF20 hidden extensive
2
3 RI-VRF20.inet.0: 12 destinations, 22 routes (7 active, 0 holddown, 10 hidden)
4 192.168.20.92/32 (2 entries, 0 announced)
5 BGP Preference: 170/-101
6 Route Distinguisher: 192.169.2.22:20
7 Next hop type: Unusable, Next hop index: 0
8 Address: 0x7b3f394
9 Next-hop reference count: 80, key opaque handle: 0x0, non-key opaque handle: 0x0
10 Source: 2001:db8:bad:cafe:200::22
11 State: <Secondary Hidden Ext ProtectionCand>
12 Local AS: 65501 Peer AS: 65502
13 Age: 5:57 Metric: 1
14 Validation State: unverified
15 Task: BGP_65502.2001:db8:bad:cafe:200::22
16 AS path: 65502 I
17 Communities: target:65000:20
18 Import Accepted MultiNexthop RecvNextHopIgnored
19 SRv6 SID: fc01:200:22:: Behavior: 19 BL: 48 NL: 0 FL: 16 AL: 0 TL: 16 TO: 48
20 VPN Label: 16896
21 Localpref: 100
22 Router ID: 198.51.100.22
23 Primary Routing Table: bgp.l3vpn.0
24 Thread: junos-main
25 Indirect next hops: 1
26 Protocol next hop: fc01:200:22::
27 Indirect next hop: 0x0 - INH Session ID: 0
28 (…)
CLI-Output 14: Detailed view of remote L3VPN prefix on PE11
The BGP NEXT_HOP attribute is still the remote loopback (CLI-Output 13, line 8). However, next-hop resolution is based on the SRv6 SID (CLI-Output 13, line 12, and CLI-Output 14, lines 19 and 26), not on the NEXT_HOP attribute. Why this is so was explained in the SRv6 SID Encoding and Transposition blog post. This is good news.
Troubleshooting of L3VPN Next-Hop Resolution
However, the bad news is that the next-hop is still not resolved (CLI-Output 14, line 7). So, we need to troubleshoot further (CLI-Output 15).
1 kszarkowicz@PE11> show route fc01:200:22::
2
3 inet6.0: 27 destinations, 31 routes (27 active, 0 holddown, 0 hidden)
4 + = Active Route, - = Last Active, * = Both
5
6 fc01:200:22::/48 *[BGP/170] 3d 02:52:03, localpref 100, from 2001:db8:bad:cafe:100::1
7 AS path: 65502 I, validation-state: unverified
8 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
9 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
10 [BGP/170] 3d 02:52:03, localpref 100, from 2001:db8:bad:cafe:100::2
11 AS path: 65502 I, validation-state: unverified
12 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
13 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
CLI-Output 15: PE22 SRv6 Locator visibility on PE11
The remote PE's SRv6 locator is present only in the inet6.0 RIB (Routing Information Base). For L3VPN next-hop resolution to work, the next-hop must be present in the inet6.3 RIB; inet6.0 is not sufficient. As you probably remember from the very first SRv6 blog post (SRv6 Basics: Locator and End SID), SRv6 locators from the local domain are distributed via IS-IS using two TLVs (Type-Length-Value): an IPv6 prefix (IS-IS TLV 236: IPv6 IP Reachability) and an SRv6 locator (IS-IS TLV 27: SRv6 Locator). The first TLV is used to install the SRv6 locator into the inet6.0 RIB as a "normal" IPv6 prefix, while the second TLV is used for installation into the inet6.3 RIB. With BGP, however, we are distributing both IPv6 loopbacks and SRv6 locators as "normal" IPv6 prefixes. So, what do we do? Let's keep remote IPv6 loopbacks in the inet6.0 RIB only, while installing SRv6 locators into both inet6.0 and inet6.3 with a RIB group (Configuration 6).
1 policy-options {
2 policy-statement PS-IPV6-TRANSPORT {
3 term TR-LOCATOR {
4 from community CM-LOCATOR;
5 then accept;
6 }
7 then reject;
8 }
9 community CM-LOCATOR members *:1002;
10 }
11 routing-options {
12 rib-groups {
13 RG-IPV6-TRANSPORT {
14 import-rib [ inet6.0 inet6.3 ];
15 import-policy PS-IPV6-TRANSPORT;
16 }
17 }
18 }
19 protocols {
20 bgp {
21 group GR-IBGP-IPV6-TRANSPORT {
22 family inet6 {
23 unicast {
24 rib-group RG-IPV6-TRANSPORT;
25 }
26 }
27 }
28 }
29 }
Configuration 6: Installing remote SRv6 locators in inet6.3 RIB
The configuration has a RIB group (lines 12-17) listing two RIBs: inet6.0 (the primary import RIB) and inet6.3 (the secondary import RIB). This RIB group is attached to the BGP IPv6 unicast address family (lines 22-26). Thus, all received IPv6 unicast prefixes undergo the import route policy used by the RIB group (lines 1-10 and 15). Prefixes rejected by the route policy are installed into the primary RIB only (inet6.0), while prefixes accepted by the route policy are installed into both the primary and secondary RIBs (inet6.0 and inet6.3).
So, what is the route policy doing? It accepts prefixes carrying the community CM-LOCATOR – which are exactly our SRv6 locators, since we attach different communities to IPv6 loopbacks and SRv6 locators (see Configuration 2, lines 21 and 27-28, as well as CLI-Output 1, line 17).
This configuration is required only on PE routers. P (or ASBR) routers do not host L3VPN services, hence they do not require next-hop resolution via inet6.3 RIB.
Did this configuration change help? (CLI-Output 16).
1 kszarkowicz@PE11> show route fc01:200:22::
2
3 inet6.0: 27 destinations, 31 routes (27 active, 0 holddown, 0 hidden)
4 + = Active Route, - = Last Active, * = Both
5
6 fc01:200:22::/48 *[BGP/170] 3d 03:32:16, localpref 100, from 2001:db8:bad:cafe:100::1
7 AS path: 65502 I, validation-state: unverified
8 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
9 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
10 [BGP/170] 3d 03:32:16, localpref 100, from 2001:db8:bad:cafe:100::2
11 AS path: 65502 I, validation-state: unverified
12 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
13 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
14
15 inet6.3: 5 destinations, 7 routes (5 active, 0 holddown, 0 hidden)
16 + = Active Route, - = Last Active, * = Both
17
18 fc01:200:22::/48 *[BGP/170] 00:31:37, localpref 100, from 2001:db8:bad:cafe:100::1
19 AS path: 65502 I, validation-state: unverified
20 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
21 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
22 [BGP/170] 00:31:37, localpref 100, from 2001:db8:bad:cafe:100::2
23 AS path: 65502 I, validation-state: unverified
24 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
25 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
CLI-Output 16: PE22 SRv6 Locator visibility on PE11
Yes, it did! The remote SRv6 locator is now in both inet6.0 and inet6.3. Just for comparison, let's check a remote IPv6 loopback (CLI-Output 17).
1 kszarkowicz@PE11> show route 2001:db8:bad:cafe:200::22
2
3 inet6.0: 27 destinations, 31 routes (27 active, 0 holddown, 0 hidden)
4 + = Active Route, - = Last Active, * = Both
5
6 2001:db8:bad:cafe:200::22/128
7 *[BGP/170] 3d 03:55:30, localpref 100, from 2001:db8:bad:cafe:100::1
8 AS path: 65502 I, validation-state: unverified
9 > to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
10 to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
11 [BGP/170] 3d 03:55:30, localpref 100, from 2001:db8:bad:cafe:100::2
12 AS path: 65502 I, validation-state: unverified
13 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
14 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
CLI-Output 17: PE22 IPv6 Loopback visibility on PE11
It looks like our RIB group works correctly. Remote IPv6 loopbacks are installed only into inet6.0, while remote SRv6 locators are installed into both inet6.0 and inet6.3. So far, so good.
Tunneling remote SRv6 Locators over SRv6 tunnels
Returning to our original problem of next-hop resolution for the L3VPN prefix: did it help? Let's check (CLI-Output 18).
1 kszarkowicz@PE11> show route 192.168.20.92/32 table RI-VRF20 hidden extensive
2
3 RI-VRF20.inet.0: 12 destinations, 22 routes (7 active, 0 holddown, 10 hidden)
4 192.168.20.92/32 (2 entries, 0 announced)
5 BGP Preference: 170/-101
6 Route Distinguisher: 192.169.2.22:20
7 Next hop type: Unusable, Next hop index: 0
8 Address: 0x7b3f394
9 Next-hop reference count: 88, key opaque handle: 0x0, non-key opaque handle: 0x0
10 Source: 2001:db8:bad:cafe:200::22
11 State: <Secondary Hidden Ext ProtectionCand>
12 Local AS: 65501 Peer AS: 65502
13 Age: 3:04:34 Metric: 1
14 Validation State: unverified
15 Task: BGP_65502.2001:db8:bad:cafe:200::22
16 AS path: 65502 I
17 Communities: target:65000:20
18 Import Accepted MultiNexthop RecvNextHopIgnored
19 SRv6 SID: fc01:200:22:: Behavior: 19 BL: 48 NL: 0 FL: 16 AL: 0 TL: 16 TO: 48
20 VPN Label: 16896
21 Localpref: 100
22 Router ID: 198.51.100.22
23 Primary Routing Table: bgp.l3vpn.0
24 Thread: junos-main
25 Indirect next hops: 1
26 Protocol next hop: fc01:200:22::
27 Indirect next hop: 0x0 - INH Session ID: 0
28 (…)
CLI-Output 18: Detailed view of remote L3VPN prefix on PE11
Surprise, surprise – it didn't! There is still some problem with the next-hop (lines 7 and 18), although our next-hop (the SRv6 locator) is now present in the inet6.3 RIB. So, what is still wrong here?
The problem is that installing a prefix into the inet6.3 RIB does not, by itself, make it 'SRv6-tunneling capable'. For comparison, let's look at a local (not remote) SRv6 locator (CLI-Output 19).
1 kszarkowicz@PE11> show route fc01:100:12::
2
3 inet6.0: 27 destinations, 31 routes (27 active, 0 holddown, 0 hidden)
4 + = Active Route, - = Last Active, * = Both
5
6 fc01:100:12::/48 *[IS-IS/18] 3d 16:42:01, metric 2000
7 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
8 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0
9
10 inet6.3: 5 destinations, 7 routes (5 active, 0 holddown, 0 hidden)
11 + = Active Route, - = Last Active, * = Both
12
13 fc01:100:12::/48 *[SRV6-ISIS/14] 3d 16:42:01, metric 2000
14 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0, SRV6-Tunnel, Dest: fc01:100:12::
15 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0, SRV6-Tunnel, Dest: fc01:100:12::
CLI-Output 19: PE12 SRv6 Locator visibility on PE11
If you compare CLI-Output 16 and CLI-Output 19, you see the major difference in inet6.3. The remote-domain SRv6 locator (CLI-Output 16, lines 18-25) currently resolves over a plain IPv6 next-hop, while the local-domain SRv6 locator (CLI-Output 19, lines 13-15) resolves over an SRv6 tunnel. For that reason, in the current state of the example network, L3VPN resolution with a local-domain SRv6 locator works, while L3VPN resolution with a remote-domain SRv6 locator fails – we need an SRv6 tunnel here.
So, how do we force the remote SRv6 locator to use an SRv6 tunnel (so that it is eligible as a next-hop for L3VPN) and not a plain IPv6 next-hop? To answer this question, let's go back almost to the beginning of this blog post. In Configuration 3, on the ASBRs (P1, P2, P3, P4), next-hop self is configured for all IPv6 prefixes (meaning all remote IPv6 loopbacks and remote SRv6 locators) advertised over iBGP sessions (line 30). This is the classical Inter-AS Option C configuration. As a result of this action, these IPv6 prefixes (both remote IPv6 loopbacks and remote SRv6 locators) are advertised with the ASBR's loopback as the NEXT_HOP attribute (e.g., CLI-Output 4, lines 18, 25, 32, 39). It also means that both remote IPv6 loopbacks and remote SRv6 locators, when they finally arrive at a PE, are resolved over the IPv6 loopback of the ASBR. IPv6 loopbacks are not capable of SRv6 tunneling – we have SRv6 locators for that. Installing remote SRv6 locators in inet6.3 doesn't change the resolution scheme – they are still resolved over the non-SRv6-tunneling-capable IPv6 loopback of the ASBR.
Thus, to change this, we need to change the way SRv6 locators are advertised by the ASBRs over iBGP sessions. Instead of simply using 'next-hop self', which sets the NEXT_HOP attribute to the source address of the iBGP session (i.e., the local loopback), for SRv6 locators we need to set the NEXT_HOP attribute not to the local loopback, but to the local SRv6 locator (Configuration 7).
1 [edit policy-options policy-statement PS-IBGP-IPV6-EXP]
2 term TR-LOCAL-AS { ... }
3 + term TR-LOOPBACK {
4 + from {
5 + rib inet6.0;
6 + community CM-LOOPBACK;
7 + }
8 + then {
9 + next-hop self;
10 + accept;
11 + }
12 + }
13 + term TR-LOCATOR {
14 + from {
15 + protocol bgp;
16 + rib inet6.0;
17 + community CM-LOCATOR;
18 + }
19 + then {
20 + next-hop fc01:100:1::;
21 + accept;
22 + }
23 + }
24 - term TR-REMOTE-AS {
25 - from {
26 - protocol bgp;
27 - rib inet6.0;
28 - }
29 - then {
30 - next-hop self;
31 - accept;
32 - }
33 - }
34 [edit policy-options]
35 + community CM-LOCATOR members *:1002;
36 + community CM-LOOPBACK members *:1001;
37 }
Configuration 7: iBGP export policy on P1
The new iBGP export policy on the ASBRs replaces the iBGP export policy used so far (defined in Configuration 3, lines 19-35). The difference is that the term TR-REMOTE-AS is split into two terms: TR-LOOPBACK and TR-LOCATOR. Remote IPv6 loopbacks are readvertised with the 'next-hop self' action, as previously. However, remote SRv6 locators are now readvertised by the ASBRs with the 'next-hop <local-SRv6-END-SID>' action. For readability, the complete resulting policy is shown below.
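Assembled purely from Configuration 3 and the patch in Configuration 7 (no new statements added), the full iBGP export policy on P1 now reads:

policy-options {
    policy-statement PS-IBGP-IPV6-EXP {
        term TR-LOCAL-AS {
            from community CM-LOCAL-AS;
            then reject;
        }
        term TR-LOOPBACK {
            from {
                rib inet6.0;
                community CM-LOOPBACK;
            }
            then {
                next-hop self;
                accept;
            }
        }
        term TR-LOCATOR {
            from {
                protocol bgp;
                rib inet6.0;
                community CM-LOCATOR;
            }
            then {
                next-hop fc01:100:1::;
                accept;
            }
        }
        then reject;
    }
    community CM-LOCAL-AS members 65501:*;
    community CM-LOCATOR members *:1002;
    community CM-LOOPBACK members *:1001;
}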
Let’s check if that finally helped with L3VPN resolution (CLI-Output 20).
1 kszarkowicz@PE11> show route receive-protocol bgp 2001:db8:bad:cafe:100::1 table inet6
2
3 inet6.0: 27 destinations, 31 routes (27 active, 0 holddown, 0 hidden)
4 Prefix Nexthop MED Lclpref AS path
5 2001:db8:bad:cafe:200::21/128
6 * 2001:db8:bad:cafe:100::1 100 65502 I
7 2001:db8:bad:cafe:200::22/128
8 * 2001:db8:bad:cafe:100::1 100 65502 I
9 * fc01:200:21::/48 fc01:100:1:: 100 65502 I
10 * fc01:200:22::/48 fc01:100:1:: 100 65502 I
11
12 inet6.3: 5 destinations, 7 routes (5 active, 0 holddown, 0 hidden)
13 Prefix Nexthop MED Lclpref AS path
14 * fc01:200:21::/48 fc01:100:1:: 100 65502 I
15 * fc01:200:22::/48 fc01:100:1:: 100 65502 I
CLI-Output 20: Receiving IPv6 prefixes at PE11
We certainly see some changes. Remote IPv6 loopbacks have the ASBR's loopback as the NEXT_HOP attribute (lines 6 and 8), while remote SRv6 locators have the ASBR's SRv6 locator as the NEXT_HOP attribute (lines 9, 10, 14, and 15). Compare this with the similar earlier output on PE21 (CLI-Output 4), where the NEXT_HOP attribute for both loopbacks and SRv6 locators was the loopback of the ASBR.
Does this make remote SRv6 locators SRv6-tunneling capable? Let's check (CLI-Output 21).
1 kszarkowicz@PE11> show route fc01:200:22:: active-path
2
3 inet6.0: 27 destinations, 31 routes (27 active, 0 holddown, 0 hidden)
4 + = Active Route, - = Last Active, * = Both
5
6 fc01:200:22::/48 *[BGP/170] 00:29:58, localpref 100, from 2001:db8:bad:cafe:100::1
7 AS path: 65502 I, validation-state: unverified
8 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0, SRV6-Tunnel, Dest: fc01:100:1::
9 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0, SRV6-Tunnel, Dest: fc01:100:2::
10
11 inet6.3: 5 destinations, 7 routes (5 active, 0 holddown, 0 hidden)
12 + = Active Route, - = Last Active, * = Both
13
14 fc01:200:22::/48 *[BGP/170] 00:29:58, localpref 100, from 2001:db8:bad:cafe:100::1
15 AS path: 65502 I, validation-state: unverified
16 to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0, SRV6-Tunnel, Dest: fc01:100:1::
17 > to fe80::5604:dff:fe00:82a6 via ge-0/0/2.0, SRV6-Tunnel, Dest: fc01:100:2::
CLI-Output 21: PE22 SRv6 Locator visibility on PE11
This now looks better (compared to CLI-Output 16), as the remote SRv6 locator is resolved over an SRv6 tunnel towards the SRv6 END SID of the ASBR (lines 8, 9, 16, and 17).
And did it eventually resolve the problem with L3VPN resolution? Let's check that as well (CLI-Output 22 and CLI-Output 23).
1 kszarkowicz@PE11> show route 192.168.20.92/32 table RI-VRF20 active-path
2 (…)
3 192.168.20.92/32 @[BGP/170] 00:37:32, MED 1, localpref 100, from 2001:db8:bad:cafe:200::21
4 AS path: 65502 I, validation-state: unverified
5 > to fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0, SRV6-Tunnel, Dest: fc01:100:1::
CLI-Output 22: View of remote L3VPN prefix on PE11
1 kszarkowicz@PE11> show route 192.168.20.92/32 table RI-VRF20 active-path extensive
2
3 RI-VRF20.inet.0: 12 destinations, 23 routes (12 active, 0 holddown, 0 hidden)
4 192.168.20.92/32 (3 entries, 2 announced)
5 State: <CalcForwarding>
6 TSI:
7 KRT in-kernel 192.168.20.92/32 -> {list:composite(686), composite(690)}
8 OSPF3 realm ipv4-unicast area : 0.0.0.0, LSA ID : 0.0.0.4, LSA type : Extern
9 @BGP Preference: 170/-101
10 Route Distinguisher: 198.51.100.21:20
11 Next hop type: Indirect, Next hop index: 0
12 Address: 0x7b41af4
13 Next-hop reference count: 10, key opaque handle: 0x0, non-key opaque handle: 0x0
14 Source: 2001:db8:bad:cafe:200::21
15 Next hop type: Chain, Next hop index: 687
16 Next hop: via Chain Tunnel Composite, SRv6
17 Next hop: ELNH Address 0x7b40d64, selected
18 SRV6-Tunnel: Reduced-SRH Encap-mode Remove-Last-Sid
19 Src: 2001:db8:bad:cafe:100::11 Dest: fc01:100:1::
20 Segment-list[0] fc01:100:1::
21 Next hop type: Router, Next hop index: 691
22 Address: 0x7b40d64
23 Next-hop reference count: 7, key opaque handle: 0x0, non-key opaque handle: 0x0
24 Next hop: fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0 weight 0x1
25 Protocol next hop: fc01:200:21::
26 Composite next hop: 0x74cbba0 686 INH Session ID: 370
27 Indirect next hop: 0x7e5c3c4 1048580 INH Session ID: 370
28 State: <Secondary Active Ext ProtectionCand>
29 Local AS: 65501 Peer AS: 65502
30 Age: 38:59 Metric: 1 Metric2: 1000
31 Validation State: unverified
32 ORR Generation-ID: 0
33 Task: BGP_65502.2001:db8:bad:cafe:200::21
34 Announcement bits (1): 2-RI-VRF20-OSPF3
35 AS path: 65502 I
36 Communities: target:65000:20
37 Import Accepted MultiNexthop RecvNextHopIgnored
38 SRv6 SID: fc01:200:21:: Behavior: 19 BL: 48 NL: 0 FL: 16 AL: 0 TL: 16 TO: 48
39 VPN Label: 16896
40 Localpref: 100
41 Router ID: 198.51.100.21
42 Primary Routing Table: bgp.l3vpn.0
43 Thread: junos-main
44 Composite next hops: 1
45 Protocol next hop: fc01:200:21:: Metric: 1000
46 Composite next hop: 0x74cbba0 686 INH Session ID: 370
47 Indirect next hop: 0x7e5c3c4 1048580 INH Session ID: 370
48 Indirect path forwarding next hops: 1
49 Next hop type: Chain
50 Next hop: fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
51 fc01:200:21::/48 Originating RIB: inet6.3
52 Metric: 1000 Node path count: 1
53 Indirect next hops: 1
54 Protocol next hop: fc01:100:1:: Metric: 1000
55 Inode flags: 0x204 path flags: 0x80
56 Path fnh link: 0x74c7820 path inh link: 0x7263280
57 Indirect next hop: 0x7e5ab44 1048575 INH Session ID: 367
58 Indirect path forwarding next hops: 1
59 Next hop type: Chain
60 Next hop: fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
61 fc01:100:1::/48 Originating RIB: inet6.3
62 Metric: 1000 Node path count: 1
63 Forwarding nexthops: 1
64 Next hop type: Chain
65 Next hop: fe80::5604:dff:fe00:2ba0 via ge-0/0/1.0
CLI-Output 23: Detailed view of remote L3VPN prefix on PE11
Wow! We did it! The remote L3VPN prefix is no longer hidden. It is resolved via an SRv6 Chain Tunnel Composite next-hop (line 16), using an SRv6 tunnel sourced from the local loopback and destined to the SRv6 END SID of the ASBR (lines 18-20).
And, as a final check, let's verify forwarding (CLI-Output 24).
1 kszarkowicz@CE91> ping 192.168.20.92 routing-instance RI-20 count 1
2 PING 192.168.20.92 (192.168.20.92): 56 data bytes
3 64 bytes from 192.168.20.92: icmp_seq=0 ttl=62 time=6.232 ms
4
5 --- 192.168.20.92 ping statistics ---
6 1 packets transmitted, 1 packets received, 0% packet loss
7 round-trip min/avg/max/stddev = 6.232/6.232/6.232/0.000 ms
CLI-Output 24: CE to CE verification
Uff! Together, we did it! CE-to-CE forwarding in SRv6 Inter-AS Option C finally works.
Inter-AS Option C with pure transit P routers
However, we are not done yet. Let's modify the topology slightly by shutting down a few links (Figure 3).
Figure 3: Modified topology
With this modification, the path within AS 65501 has one router in the PE role (PE11), two pure transit (P role) routers (P2 and PE12), and one router in the ASBR role (P1).
Does the ping between the CEs still work (CLI-Output 25)?
1 kszarkowicz@CE91> ping 192.168.20.92 routing-instance RI-20 count 1
2 PING 192.168.20.92 (192.168.20.92): 56 data bytes
3
4 --- 192.168.20.92 ping statistics ---
5 1 packets transmitted, 0 packets received, 100% packet loss
CLI-Output 25: CE to CE verification
Bad luck. After this change, the ping no longer works. So, no time to relax – there is still work to do!
For troubleshooting, let's start a ping at 10 pps (packets per second), so that we can observe where the packets are dropped (CLI-Output 26).
1 kszarkowicz@CE91> ping 192.168.20.92 routing-instance RI-20 count 100000 interval 0.1
2 PING 192.168.20.92 (192.168.20.92): 56 data bytes
CLI-Output 26: CE to CE ping with 10 pps
Then, let's observe which router on the path from CE91 to CE92 drops the packets, using the 'monitor interface traffic' command (CLI-Output 27).
1 PE11 Seconds: 946 Time: 03:25:55
2
3 Interface Link Input packets (pps) Output packets (pps)
4 ge-0/0/0 Up 7753 (10) 889 (1)
5 lc-0/0/0 Up 0 0
6 pfh-0/0/0 Up 0 0
7 ge-0/0/1 Down 915 (0) 1001 (0)
8 ge-0/0/2 Up 2593 (0) 9588 (10)
9
10
11
12 P2 Seconds: 1009 Time: 03:27:14
13
14 Interface Link Input packets (pps) Output packets (pps)
15 ge-0/0/0 Up 841 (0) 1042 (0)
16 lc-0/0/0 Up 0 0
17 pfh-0/0/0 Up 0 0
18 ge-0/0/1 Up 10489 (10) 2732 (0)
19 ge-0/0/2 Up 2742 (1) 10469 (10)
20
21
22
23 PE12 Seconds: 0 Time: 03:27:54
24
25 Interface Link Input packets (pps) Output packets (pps)
26 ge-0/0/0 Up 361 (0) 920 (0)
27 lc-0/0/0 Up 0 0
28 pfh-0/0/0 Up 0 0
29 ge-0/0/1 Up 3121 (2) 11134 (10)
30 ge-0/0/2 Up 10916 (10) 2810 (1)
31
32
33
34 P1 Seconds: 1045 Time: 03:28:13
35
36 Interface Link Input packets (pps) Output packets (pps)
37 ge-0/0/0 Down 826 (0) 841 (0)
38 lc-0/0/0 Up 0 0
39 pfh-0/0/0 Up 0 0
40 ge-0/0/1 Up 1001 (0) 1144 (0)
41 ge-0/0/2 Up 11329 (10) 3155 (0)
42 ge-0/0/3 Up 720 (0) 832 (0)
CLI-Output 27: Traffic volumes at different routers in AS 65501
On routers PE11, P2, and PE12 we see 10 pps coming in and 10 pps going out, so we are good there. However, on router P1, 10 pps is coming in, but there is no corresponding outgoing traffic. Thus, P1 is dropping the packets, and we need to figure out why.
For that, let's capture the transit traffic (how to capture transit traffic is a subject for another blog post) on a couple of links, so that we can see the encapsulation of the original CE-to-CE packet (Figure 4, Figure 5).
Figure 4: Traffic capture on PE11-P2 link
Figure 5: Traffic capture on PE12-P1 link
What we can see (Figure 4) is that PE11 encapsulates the original IPv4 packet with a single IPv6 header destined to the END.DT4 SID on PE21. So, the SRv6 tunnel to the SRv6 END SID of P1 is effectively not used. If you go back to CLI-Output 23, you can see that "Reduced-SRH Encap-mode Remove-Last-Sid" (line 18) is used. Therefore, the last SID, fc01:100:1:: (line 20), which is the END SID of P1, is removed from the encapsulation before the packet is sent out.
Now, the packet with destination address fc01:200:21:420:: (the END.DT4 SID on PE21) is sent to P2. Let's check what routing instructions P2 has for such a packet (CLI-Output 28).
1 kszarkowicz@P2> show route fc01:200:21:420::
2
3 inet6.0: 28 destinations, 32 routes (28 active, 0 holddown, 0 hidden)
4 + = Active Route, - = Last Active, * = Both
5
6 fc01:200:21::/48 *[BGP/170] 00:27:50, localpref 100, from 2001:db8:bad:cafe:100::1
7 AS path: 65502 I, validation-state: unverified
8 > to fe80::5604:dff:fe00:1060 via ge-0/0/2.0, SRV6-Tunnel, Dest: fc01:100:1::
CLI-Output 28: PE21 SRv6 Locator visibility on P2
The instruction is to send the packet via an SRv6 tunnel to the SRv6 END SID of P1 (line 8). This is what can be seen in Figure 5 – router P2 pushes an additional IPv6 header. Why this instruction? As you probably remember, this is the result of Configuration 7.
Now, such a packet arrives at P1. The destination address of the packet is the END SID of P1, which means the packet is subject to local processing and is not forwarded further (please check the SRv6 Basics: Locator and End SIDs blog post). And this is exactly what happens on P1 – the packet is not forwarded.
So, what can we do? END SIDs can be advertised with different flavors (draft-ietf-lsr-isis-srv6-extensions, RFC 8986):
- PSP – Penultimate Segment Pop
- USP – Ultimate Segment Pop
- USD – Ultimate Segment Decapsulation
Flavors can be explicitly configured. The Junos SRv6 implementation supports all three flavors, so it can be used in SRv6 deployments requiring any of them.
For the particular case observed in this blog post, PSP is of particular interest. With PSP support advertised by P1, the penultimate node (PE12 in our example) will pop the outermost segment (the header added by P2). With this, the packet will arrive at P1 with only one IPv6 header, as originally generated by PE11 (Figure 4).
In general, it is advisable to enable advertisement of support for all three flavors on all routers, so that the configuration is prepared for different use cases requiring different flavors (Configuration 8).
1 protocols {
2 isis {
3 source-packet-routing {
4 srv6 {
5 locator SL-000 {
6 end-sid fc01:100:2:: {
7 flavor {
8 psp;
9 usp;
10 usd;
11 }
12 }
13 }
14 }
15 }
16 }
17 }
Configuration 8: Enabling advertisement of support for PSP, USP, USD END SID flavors
As a result, IS-IS advertises support for the configured END SID flavors (CLI-Output 29).
1 kszarkowicz@PE12> show isis database P2 extensive | match "locator| SID"
2 SRv6 Locator: fc01:100:2::/48, Metric: 0, MTID: 0, Flags: 0x0, Algorithm: 0
3 SRv6 SID: fc01:100:2::, Flavor: PSP, USP, USD
CLI-Output 29: IS-IS advertisement of END SID flavors
Now, all routers in the IS-IS domain (including PE12) know that P2 supports the PSP flavor (among others) for its END SID (line 3). With the same configuration applied on P1, PE12 also knows that P1 supports PSP for its END SID fc01:100:1::, so PE12 can safely remove the outer header before sending the packet to P1. With this configuration change, CE-to-CE data plane connectivity finally works for the use case with transit P routers as well (CLI-Output 30).
1 kszarkowicz@CE91> ping 192.168.20.92 routing-instance RI-20 count 1
2 PING 192.168.20.92 (192.168.20.92): 56 data bytes
3 64 bytes from 192.168.20.92: icmp_seq=0 ttl=62 time=8.894 ms
4
5 --- 192.168.20.92 ping statistics ---
6 1 packets transmitted, 1 packets received, 0% packet loss
7 round-trip min/avg/max/stddev = 8.894/8.894/8.894/0.000 ms
CLI-Output 30: CE to CE verification
Next steps
In the next blog post, we will show an interesting use case: guaranteed link slicing with SRv6, where the Function field is further divided to carry a Slice ID and a VPN ID.
Useful links
RFC 4364: BGP/MPLS IP Virtual Private Networks
RFC 8212: Default External BGP (EBGP) Route Propagation Behavior without Policies
RFC 8986: Segment Routing over IPv6 (SRv6) Network Programming
draft-ietf-lsr-isis-srv6-extensions: IS-IS Extensions to Support Segment Routing over IPv6 Dataplane
SRv6 in Junos: https://www.juniper.net/documentation/us/en/software/junos/is-is/topics/topic-map/infocus-isis-srv6-network-programming.html
TechPost 1: SRv6 Basics Locator and End-SIDs - https://community.juniper.net/blogs/krzysztof-szarkowicz/2022/06/29/srv6-basics-locator-and-end-sids
TechPost 2: L3VPN on SRv6 - https://community.juniper.net/blogs/krzysztof-szarkowicz/2022/08/11/l3vpn-over-srv6
TechPost 3: SRv6 Summarisation - https://community.juniper.net/blogs/krzysztof-szarkowicz/2022/09/28/srv6-summarization
TechPost 4: SRv6 SID Encoding and Transposition - https://community.juniper.net/blogs/krzysztof-szarkowicz/2022/12/02/srv6-sid-encoding-and-transposition
TechPost 5: SRv6 L3VPN Inter-AS Option-C - https://community.juniper.net/blogs/krzysztof-szarkowicz/2023/02/06/srv6-l3vpn-inter-as-option-c
TechPost 6: Link Slicing with MPLS and SRv6 Underlays - https://community.juniper.net/blogs/krzysztof-szarkowicz/2023/05/12/link-slicing-with-mpls-and-srv6-underlays
Glossary
- AS: Autonomous System
- ASBR: Autonomous System Boundary Router
- BGP: Border Gateway Protocol
- CE: Customer Edge
- CLI: Command Line Interface
- eBGP: external Border Gateway Protocol
- iBGP: internal Border Gateway Protocol
- ID: Identifier
- IGP: Interior Gateway Protocol
- Inter-AS: Inter Autonomous System
- IP: Internet Protocol
- IPv4: Internet Protocol version 4
- IPv6: Internet Protocol version 6
- IS-IS: Intermediate System to Intermediate System
- L2: Level 2
- L3VPN: Layer 3 Virtual Private Network
- MPLS: Multiprotocol Label Switching
- NHS: Next Hop Self
- OSPF: Open Shortest Path First
- P: Provider
- PE: Provider Edge
- PSP: Penultimate Segment Pop
- RFC: Request for Comments
- RIB: Routing Information Base
- RR: Route Reflector
- SID: Segment Identifier
- SP: Service Provider
- SRv6: Segment Routing over IPv6
- TLV: Type Length Value
- USD: Ultimate Segment Decapsulation
- USP: Ultimate Segment Pop
- VPN: Virtual Private Network
Acknowledgements
Many thanks to Anton Elita for his thorough review and suggestions, and Aditya T R for preparing JCL and vLabs topologies.
Revision History
Version | Author(s) | Date | Comments
1 | Krzysztof Szarkowicz | Feb 2022 | Initial publication
#Routing
#SolutionsandTechnology