Seamless Core Redundancy over Service Provider Network

polaris
Posted 11-30-2021 09:54

Seamless Core Redundancy over Service Provider Network

Securely and Transparently Extending MPLS Backbone through Layer 3 VPN Service Provider

Salah Al Buraiky and Mohammad I. Al Ghannam

Introduction

Link redundancy is essential for business continuity, especially for critical sites. Large enterprises usually prefer to use their own physical communication infrastructure, such as dark-fiber optical links or transmission circuits (TDM or WDM). This, however, is not always feasible due to high cost, land permit restrictions or time constraints. Constructing a fiber optic cable over a distance of, say, 150 miles (around 240 kilometers) could cost millions of dollars. Obtaining a land permit and performing the construction could take several months.

Service providers offer alternative solutions for a backup link. Nowadays, Layer 2 and Layer 3 VPN (Virtual Private Network) services, based on MPLS (Multiprotocol Label Switching) technology, are offered as cost-effective solutions for private enterprise site-to-site connections.

This article shows how to securely extend an enterprise MPLS network through a Service Provider network offering Layer 3 VPN Service. The driver is to establish a physically diverse backup path for a critical site to connect to the enterprise network. Overlay techniques such as IP Security (IPSec) and Generic Routing Encapsulation (GRE) will be used for on-the-wire confidentiality and transportability, while RSVP-TE (Resource Reservation Protocol with Traffic Engineering extensions) will be used to manage failover to the backup link and link restoration.

The configuration samples assume Junos-based routers, but no proprietary protocols or platform-specific features have been employed and thus the concepts are applicable to any standard routing platform. The article assumes knowledge of the basic dynamic routing protocols (such as OSPF and BGP), knowledge of signaling (label distribution) protocols such as LDP and RSVP-TE, and a passing familiarity with IPSec and GRE.

I. Problem

A. Network Overview

The site whose availability we are to protect is part of a high speed MPLS enterprise network internally employing Layer 2 VPN technologies (such as VPLS), Layer 3 VPN technologies (routing and forwarding virtualization via VRF, Virtual Routing and Forwarding, tables) and MPLS-based traffic engineering techniques such as link protection and explicit paths. Most sites in the network have two high-capacity Ethernet uplinks backing each other with MPLS-based local protection enabled on them for rapid failover. The network uses OSPF as an IGP (Interior Gateway Protocol) and all sites have RSVP-TE LSP (Label Switched Path) connections to the Data Center.

Like other sites, the site under consideration has a main enterprise-owned fiber optic uplink. For cost and path diversity reasons, however, the backup link has been obtained from a Service Provider through a Layer 3 VPN circuit. In what follows, we will see how a variety of overlay techniques can transform this Service Provider link into a full-fledged MPLS-capable backbone link.

B. Problem Statement

In many cases, Layer 3 VPN (L3VPN) is the only backup link option. In some cases, it may be the most cost-effective or fastest-to-deploy option. A challenge with the L3VPN service provided by carriers is that, as is, it is a pure IP transport service (like the Internet). Unlike an optical fiber link, a transmission circuit or Layer 2 VPN connectivity (such as VPLS), an L3VPN circuit only accepts IP packets. It can't be directly used to transport MPLS-labeled traffic. As such, it can't be used for extending the MPLS backbone of an enterprise and consequently can't be used to carry enterprise MPLS services such as Layer 3 VPN or VPLS traffic. The problem at hand is to use an L3VPN circuit as a full-fledged MPLS backup link by means of overlay techniques.

Figure 1

II. Solution

A. Solution Outline

The configuration we seek aims primarily at abstracting the carrier's network so that it appears as a standard MPLS-capable link to the router in the site to be protected and the uplink core router on the other side.

Enterprise traffic entering the service provider cloud needs to be able to continue carrying the enterprise assigned MPLS labels in order to support the enterprise's own internal MPLS services.

By default, this isn't possible since the Layer 3 VPN Service offered to the enterprise by the Service Provider is a pure IP transport service that neither accepts nor recognizes labeled traffic coming from the customer.

To overcome this, we'll utilize Generic Routing Encapsulation (GRE) as an overlay capable of transporting MPLS over an IP network (the Service Provider L3VPN Service in our case). An MPLS packet encapsulated within GRE becomes an IP packet, capable of being carried over an L3VPN circuit and capable of being protected by IPSec, as we'll see.

The traffic carried over the L3VPN leased line needs to be cryptographically protected, since it is transported through a third-party network. For this, we'll employ IP Security (IPSec) tunneling.

We'll specifically utilize IPSec's Encapsulating Security Payload (ESP) with 3DES (Triple Data Encryption Standard) for encryption and HMAC-MD5 (Hash-based Message Authentication Code using Message Digest 5) for integrity protection. We use these for illustration purposes here, but stronger algorithms, such as AES-256 (Advanced Encryption Standard with 256-bit keys) and SHA-2 (Secure Hash Algorithm 2), must be used in practice.

In addition, the solution needs to satisfy the following three failover requirements:

1) It needs to prefer the primary link when it's available,

2) It needs to switch to the backup link as soon as the primary fails and

3) It needs to revert back to the primary link as soon as it is available again after a failure.

We'll use traffic engineering capabilities of RSVP-TE to realize these goals. Since overlay protocols are the essence of our solution, the following section will provide a general overview of their nature and capabilities followed by a more in-depth look.

B. Overview of Overlay Protocols

An overlay is an architectural abstraction used to present a simplified underlay network to upper layers. Overlays are typically realized at the forwarding level using encapsulation protocols such as IPSec, GRE and MPLS. The abstract virtual links created by such protocols are often called tunnels.

IPSec abstracts the underlying network (the Service Provider Network in our case) as a secure link that can carry IP traffic. The GRE tunnel is, within our solution, established within the IPSec tunnel. It abstracts the service provider network as a single, MPLS-capable link (see figure 1).

The router at the site under consideration and the core router connected to it at the other end over the Service Provider network see each other as directly connected neighbors. The link between them is the GRE tunnel. The GRE tunnel, in turn, is seen as an IP link with MPLS encapsulation enabled, and as such can be used as a full-fledged MPLS link employable by RSVP-TE to establish an MPLS Label Switched Path (LSP) with a specified path (explicit path) and path protection. In what follows, we go deeper into each of these technologies.

Both IPSec and GRE encapsulation involve embedding the IP packet within a new IP packet. The original IP header becomes the inner header, while the new header becomes the outer header. In addition to the new IP header, each of IPSec and GRE adds an auxiliary header between the inner and the outer IP headers (see figure 2).

In the case of IPSec, this additional header could be the Authentication Header (AH) or the Encapsulating Security Payload (ESP) header. In the case of GRE, the auxiliary header is simply called the GRE header. Note that IPSec can encapsulate any IP packet, including GRE packets. It can't, however, encapsulate non-IP packets, such as Ethernet frames or MPLS-labeled packets.

GRE, on the other hand, can encapsulate a myriad of packet types, including MPLS packets. Its header has a dedicated field indicating the encapsulated protocol. We need both GRE and IPSec because IPSec can't encapsulate MPLS packets, whereas GRE offers neither encryption nor integrity verification.
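The dedicated field mentioned above is the 2-byte Protocol Type field of the GRE header. As a minimal sketch (not taken from the article's routers), the following Python builds the basic 4-byte GRE header with no optional fields and marks the payload as MPLS unicast. The function names are made up for illustration; the protocol numbers are the standard EtherType values.

```python
import struct

# EtherType-style values carried in the GRE Protocol Type field.
GRE_PROTO_IPV4 = 0x0800
GRE_PROTO_MPLS_UNICAST = 0x8847


def build_gre_header(protocol_type: int) -> bytes:
    """Return a basic 4-byte GRE header (hypothetical helper).

    The first 16 bits hold the flags and version; zero means no checksum,
    key or sequence number and GRE version 0.
    """
    flags_and_version = 0
    return struct.pack("!HH", flags_and_version, protocol_type)


def payload_protocol(gre_header: bytes) -> int:
    """Read the Protocol Type field back from a GRE header."""
    _flags, protocol_type = struct.unpack("!HH", gre_header[:4])
    return protocol_type


header = build_gre_header(GRE_PROTO_MPLS_UNICAST)
print(header.hex())  # 00008847
```

The receiving tunnel endpoint reads the same field to decide that what follows the GRE header is an MPLS label stack rather than an IP packet.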

The security provided by IPSec and the transportability provided by GRE come at a price. They add complexity to configuration and troubleshooting and increase the size of packets. The larger packets resulting from the encapsulation overhead consume additional bandwidth and may exceed the MTU of the links they traverse. Packets larger than the MTU are either fragmented or dropped. MTU is therefore an aspect that must be considered.

Appendices B and C provide guidance for calculating overhead and understanding the impact of exceeding link or network MTU.

Figure 2

C. Solution Protocols In-Depth

I. Overlay Encapsulation Protocols

i. IPSec Protocol

Because traffic is transported over a service provider network, encryption is mandatory. IPSec is used to encrypt the data between the two sites. Packets are encrypted, an authentication code is generated for them, and they are then encapsulated using the Encapsulating Security Payload (ESP), the most widely used IPSec protocol. Even though MPLS routers can perform the IPSec encryption and integrity protection themselves, our design has dedicated non-MPLS routers (the CE routers) providing this function. These routers are also used for establishing the BGP peering required for connecting to the service provider's L3VPN circuit.

ii. GRE Protocol

Generic Routing Encapsulation (GRE) was specified to enable transporting any protocol, whether it is IPv4 unicast, IPv4 multicast, IPv6 unicast, IPv6 multicast or even MPLS, over an IP network. Before GRE, each of these encapsulations had its own specification. As explained earlier, the need for GRE stems from the fact that enterprise MPLS-labeled traffic can't be sent directly through the Service Provider's network, since within the Service Provider these labels would have no meaning. Remember also that IPSec can't encapsulate labeled packets either.

iii. MPLS Protocol

MPLS is often described as a Layer 2.5 protocol or a shim-layer protocol. Its header (or stack of headers) is inserted between the layer 2 header (Ethernet) and the IP header. Each header consists of four bytes that carry a 20-bit label used for forwarding decisions at each hop. Unlike IP addresses, MPLS labels are locally significant: their use and meaning are known only to the router that assigned them. Additional labels are also added when local protection Traffic Engineering (TE) techniques such as link protection are employed.
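To make the four-byte layout concrete, here is a small Python sketch that packs and unpacks one MPLS label stack entry: a 20-bit label, 3 traffic class bits, a 1-bit bottom-of-stack flag and an 8-bit TTL. The function names and the label values (300112 and 16) are hypothetical and chosen only for illustration.

```python
import struct


def encode_label_entry(label: int, tc: int = 0, bottom: bool = True, ttl: int = 64) -> bytes:
    """Pack one 4-byte MPLS label stack entry (hypothetical helper)."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    word = (label << 12) | ((tc & 0x7) << 9) | (int(bottom) << 8) | (ttl & 0xFF)
    return struct.pack("!I", word)


def decode_label_entry(entry: bytes) -> dict:
    """Unpack a 4-byte MPLS label stack entry into its fields."""
    (word,) = struct.unpack("!I", entry)
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "bottom": bool((word >> 8) & 1),
        "ttl": word & 0xFF,
    }


# A two-label stack: outer transport label first, inner service label last.
stack = encode_label_entry(300112, bottom=False) + encode_label_entry(16, bottom=True)
print(len(stack))                     # 8
print(decode_label_entry(stack[:4]))  # outer (transport) entry
print(decode_label_entry(stack[4:]))  # inner (service) entry
```

Only the last entry has the bottom-of-stack bit set, which is how a router knows where the label stack ends and the payload begins.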

(See figure 3 for traffic flow and encapsulation, decapsulation sequence)

Figure 3

II. Traffic Engineering Protocol

Traffic engineering is the act of specifying or influencing the path traffic takes during steady state operation or upon failure. In our example MPLS network, paths are governed primarily by LSPs established by RSVP-TE. By default, an RSVP-TE LSP follows the path calculated by the IGP. The goals of our traffic engineering design, in terms of preference, failover and restoration, are:

Preference: Avoid using the Service Provider network when the internal high capacity link is available.

Failover: Re-direct traffic to the Service Provider network upon detecting a failure in the internal link.

Restoration: Return traffic to the preferred path as soon as it becomes available again.

In what follows, we'll expound these:

Preference: In order to make the GRE link the least preferable, we set the IGP metric (OSPF link cost) to a high value. This makes the link less preferable (by default) for any RSVP-TE LSP calculated to or from the site. We can, additionally, use MPLS link attributes to specifically exclude the GRE link from the main path.

Failover: When the main link fails, the IGP path will be recalculated and the LSP will be re-signaled to take the backup path (the one that goes through the Service Provider). Some packet loss will be experienced while the IGP converges and the alternative LSP gets signaled. To minimize packet loss during the failover, local protection techniques (such as link protection) can be used or, alternatively, the backup LSP can be pre-signaled and kept in standby mode ready to be used immediately.

Restoration: When the main link becomes available again, the LSP will, if no additional measures are taken, stay on the backup path indefinitely. It will only switch back if the backup link fails after the main link has returned to service. We'd like traffic to switch back to the main, better path as soon as it is revived, and thus must either employ path re-optimization or configure the main path as an LSP primary path (revertive path).

With re-optimization, the LSP paths are periodically re-evaluated and, if a better path (a revived main path that has a lower IGP cost, for example) is detected, the LSP is re-signaled over it in a smooth (Make Before Break or MBB) manner. Re-optimization can be enabled globally so that it applies to all LSPs, or it can be applied per LSP. Enabling re-optimization for the LSP allows it to revert to the main, better path once it becomes available.

Another way to return to the main path once it becomes available is to configure it as a revertive primary path (Junos calls revertive paths primary paths and non-revertive paths secondary paths). This way, the router will keep checking for the availability of the main path and will re-signal it once it's available again.
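The difference between revertive and non-revertive behavior can be illustrated with a toy model. The following Python is a simplified sketch, not how RSVP-TE is implemented: the Path class, the select_active function and the cost values are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    igp_cost: int
    up: bool = True


def select_active(paths, current=None, revertive=True):
    """Pick the path an LSP should use (toy model).

    A revertive LSP always returns to the best usable path.
    A non-revertive LSP stays where it is until its current path fails.
    """
    usable = [p for p in paths if p.up]
    if not usable:
        return None
    best = min(usable, key=lambda p: p.igp_cost)
    if revertive or current is None or not current.up:
        return best
    return current


main = Path("internal-fiber", igp_cost=10)
backup = Path("gre-tunnel", igp_cost=10000)
paths = [main, backup]

active = select_active(paths)              # steady state: internal fiber
main.up = False
active = select_active(paths, active)      # failover: GRE tunnel
main.up = True
reverted = select_active(paths, active, revertive=True)
stuck = select_active(paths, active, revertive=False)
print(reverted.name, stuck.name)  # internal-fiber gre-tunnel
```

In this model, re-optimization and a revertive primary path both correspond to revertive=True; they differ only in what triggers the re-evaluation.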

III. Implementation

A. Implementation Steps

1. Establish BGP Peering with the Service Provider: The first step is to establish BGP peering with the Service Provider in order to make our CE routers part of the Layer 3 VPN assigned to us by the Service Provider. In our design, we have dedicated Customer Edge (CE) Routers for this function. This step accomplishes IP reachability between CE routers in the to-be-protected site and the core uplink site.

2. Establish Encryption over the Layer 3 VPN Circuit: With IP reachability achieved, Internet Key Exchange (IKE) can be used to establish an IPSec Security Association (encrypted session). In our design, the CE routers are used for this purpose. An encrypted CE-to-CE tunnel will be established through the Service Provider's cloud.

3. Establish Generic Routing Encapsulation (GRE): The MPLS routers on each end of the L3VPN circuit are now to be configured with GRE interfaces that encapsulate MPLS-labeled packets into GRE packets (thus transforming them into IP packets) and send them to the CE routers, where they are encrypted, encapsulated into IPSec (ESP) packets and sent to the Service Provider. The pair of GRE interfaces on each side of the L3VPN circuit will appear as a single logical link and the MPLS routers on each side will be directly connected over this overlay link.

4. Enable MPLS Forwarding and Core Protocols over the GRE Interfaces: Since we now have a direct (overlay) link between the MPLS routers on each side, we can enable MPLS, the IGP (OSPF) and RSVP-TE on the GRE interfaces and treat the entire Service Provider network like an internal MPLS-capable direct link between our MPLS routers.

5. Establish the Main LSP to the Data Center: With RSVP-TE and MPLS fully functional on the GRE link, LSP paths can now be configured for traffic engineering. We'll focus on one pair of RSVP-TE LSP connections, the LSP connections linking the site to the Data Center. Other end-to-end LSP connections may exist to and from the site, but the concept is the same. The main LSP needs to avoid the GRE link. Setting a high OSPF cost on the GRE interface achieves that (this is true even for a network that uses LDP instead of RSVP-TE) (see figure 4).

As an additional precaution, MPLS RSVP-TE link attributes can be used to exclude any path that contains the GRE link. The way this works is that interfaces are assigned attributes (also called colors or affinities) and the LSP path setup can demand the inclusion or exclusion of one or more of these attributes. The main path for the LSP (the one that is supposed to prefer the internal links rather than the GRE link) will be configured as a revertive one. A revertive path (called an LSP primary path in Junos) is a path that, after it fails, will be reverted to once it becomes available again.

6. Establish the Backup LSP to the Data Center: The backup path, which has no restrictions and thus is allowed to use the GRE link, is established as a non-revertive path (called an LSP secondary path in Junos). To pre-signal the path (i.e. signal it before a failure happens) we set the path as a standby path. Pre-signaling (employing the standby mode) makes failover faster and provides a way to monitor the backup path when it is not utilized (see figure 4).

Figure 4
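The effect of link colors on path selection in steps 5 and 6 can be sketched as a constrained shortest-path computation: links carrying an excluded color are pruned first, then the cheapest remaining path is chosen. The Python below is only an illustration of that idea. The topology, the "core-a" node, the costs and the cspf function are hypothetical and are not taken from the article's network.

```python
import heapq


def cspf(links, src, dst, exclude=frozenset()):
    """Cheapest path from src to dst, ignoring links with an excluded color."""
    graph = {}
    for a, b, cost, colors in links:
        if exclude & colors:
            continue  # link carries an excluded color: prune it
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == dst:
            return total, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, cost in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (total + cost, nxt, path + [nxt]))
    return None  # no path satisfies the constraint


# Hypothetical topology: (end A, end B, IGP cost, colors).
LINKS = [
    ("lightcity", "core-a", 10, set()),
    ("core-a", "data-center", 10, set()),
    ("lightcity", "springville", 10000, {"gre-interface"}),
    ("springville", "data-center", 10, set()),
]

primary = cspf(LINKS, "lightcity", "data-center", exclude={"gre-interface"})
print(primary)  # (20, ['lightcity', 'core-a', 'data-center'])

# Simulate failure of the internal fiber uplink.
after_failure = [l for l in LINKS if l[:2] != ("lightcity", "core-a")]
print(cspf(after_failure, "lightcity", "data-center", exclude={"gre-interface"}))  # None
print(cspf(after_failure, "lightcity", "data-center"))
# (10010, ['lightcity', 'springville', 'data-center'])
```

With the fiber down, the constrained primary path has nowhere to go, while the unconstrained secondary path is free to use the GRE link. That is the division of labor between the primary and secondary paths described above.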

B. Implementation Configuration

In what follows, we'll provide the main Junos commands required to implement the GRE tunneling and the RSVP-TE LSP paths for the main link and the GRE-based backup link. IPSec, which is in our design configured on a pair of dedicated, not necessarily Junos-based routers, is not covered.

The implementation steps are as follows:

1- Activate Linecard Tunnel Interface: Juniper MX routers perform GRE encapsulation and decapsulation in hardware on the linecard and the first step is to activate tunneling support on the linecards.

Commands:

set chassis fpc 0 pic 0 tunnel-services bandwidth 1g

Verification:

show interface terse gr-0/0/10

Comments:

The above activates the tunneling interface on PIC 0 on linecard 0. Since physical interfaces on the PIC are typically assigned port numbers 0 to 9, the GRE interface is assigned port number 10, making its physical interface name (its ifd) gr-0/0/10

2- GRE Tunnel Basic Parameters: The next step after instantiating the GRE interface is to configure its logical parameters, most importantly its tunnel source and destination IP addresses. These are the source and destination of the outer IP header. The source is typically the IP address of the interface leading to the Service Provider (in our case it is the interface connected to the CE responsible for IPSec encryption and peering with the Service Provider).

Commands:

edit interfaces gr-0/0/10

set description lightcity_gr-0/0/10-springville_gr-2/0/10

set unit 0 tunnel source 10.0.0.1

set unit 0 tunnel destination 10.10.0.1

Verification:

show interface gr-0/0/10 | match "IP-Header "

3- GRE MTU-Related Parameters: As mentioned earlier, GRE increases the size of the packet (by 24 bytes, see appendix B). This may lead to the packet exceeding the MTU of the egress link it's supposed to leave through. The GRE interface could be configured to allow fragmenting such packets (dividing them into smaller chunks) and reassembling such packets when received from the other side of the GRE tunnel:

Commands:

edit interfaces gr-0/0/10

set unit 0 tunnel allow-fragmentation

set unit 0 reassemble-packets

set unit 0 clear-dont-fragment-bit

4- GRE Class of Service (CoS): GRE encapsulation provides the packet with a new header. To ensure that the encapsulated packet is accorded the CoS treatment of the original packet, the value of the Type of Service (TOS) byte needs to be copied to the outer header. As a side note, remember that the router performing IPSec encapsulation needs to perform a similar copy, since it will add another outer IP header.

Commands:

set unit 0 copy-tos-to-outer-ip-header

Verification:

show interface gr-0/0/10.0 | match copy

5- GRE Keepalive: GRE is stateless. As long as the egress interface it uses is up and the tunnel destination address appears in the routing table, the GRE interface will remain up and any static or direct routes associated with it will remain active. If, for some reason, the destination becomes unreachable while still remaining in the routing table, traffic blackholing may occur, since traffic will continue to be forwarded through the GRE tunnel. To track the actual status of the GRE tunnel, keepalives may be used. These are special GRE packets (with no payload) that are sent periodically to the other side and are expected to be returned as an indication of the tunnel's health.

Commands:

set protocols oam gre-tunnel interface gr-0/0/10.0

Verification:

show oam gre-keepalive interface gr-0/0/10.0

6- RSVP-TE LSP Paths: Any LSP to or from the protected site must utilize the internal path when available and switch, upon failure, to the backup path (the path based on the GRE tunnel through the Service Provider). We'd like the LSP to switch back to the main link as soon as it becomes available again after a failure. To achieve this, we establish, as in the sample below, a primary LSP path that takes the internal (non-GRE) path and a secondary LSP path that utilizes the Service Provider. In Junos, a primary path is a revertive one, meaning that the router will periodically check for its availability and will re-establish it whenever it is available. By default, RSVP-TE will signal the LSP over the path that is congruent with the IGP-calculated path. This will exclude the GRE tunnel, since it has a higher IGP cost. As an extra precaution, however, we use link colors to explicitly exclude the GRE tunnel. We have also chosen to enable node-link protection to minimize traffic loss while the secondary path is being signaled and established.

Commands:

edit protocols mpls label-switched-path data-center-to-lightcity

set to 172.12.0.1

set node-link-protection

set primary data-center-to-lightcity-primary admin-group exclude gre-interface

set secondary data-center-to-lightcity-secondary

set path data-center-to-lightcity-primary

set path data-center-to-lightcity-secondary

Conclusion

In this article we have seen how overlay techniques can be used as modular building blocks to overcome limitations of the available transport cloud. GRE in particular is a powerful tool in that regard. It can be used to transport MPLS over an IP-only network, including the Internet, thus extending the MPLS cloud over any network.

Once MPLS has been extended over a network, that network becomes available for MPLS services such as Layer 2 VPNs (VPLS) and Layer 3 VPNs. IPSec provides the confidentiality and integrity verification that GRE lacks. In any design that employs overlays and their associated tunnels, MTU issues must be considered. If the design, as in our case, employs overlays for backup connectivity, then traffic engineering techniques must be carefully applied to ensure that the main link is preferred, that failover occurs as intended and that traffic is restored to the main link once it has recovered.

Our design overcomes, using open standards, the limitation of L3VPN leased circuits but a price is paid in terms of complexity and packet encapsulation overhead (which translates to wasted bandwidth). A new and efficient, albeit proprietary, approach to this problem is offered by 128T, a company acquired in 2020 by Juniper Networks. 128T's routers can achieve the same overlay goals, including encryption and integrity protection, with much less packet overhead by employing NAT and a number of algorithmic techniques to dispense with tunneling.

Appendix A

Configuration

Critical Site

Lightcity (172.12.0.1)

edit protocols mpls label-switched-path lightcity-to-data-center

set to 172.12.100.1

set node-link-protection

set primary lightcity-to-data-center-primary admin-group exclude gre-interface

set secondary lightcity-to-data-center-secondary

set path lightcity-to-data-center-primary

set path lightcity-to-data-center-secondary

Core Site

Springville (172.12.10.1)

edit interfaces gr-2/0/10

set description springville_gr-2/0/10_lightcity_gr-1/0/10

set unit 0 point-to-point

set unit 0 clear-dont-fragment-bit

set unit 0 reassemble-packets

set unit 0 tunnel source 10.10.0.1

set unit 0 tunnel destination 10.0.0.1

set unit 0 tunnel allow-fragmentation

set unit 0 family inet mtu 1390

set unit 0 family inet address 192.168.0.1/30

set unit 0 family mpls mtu 1378

edit protocols ospf area 0.0.0.0

set interface gr-2/0/10.0 interface-type p2p

set interface gr-2/0/10.0 metric 10000

edit protocols rsvp

set interface gr-2/0/10.0 authentication-key /* SECRET-DATA */

set interface gr-2/0/10.0 link-protection

Data Center

Data-center (172.12.100.1)

edit protocols mpls label-switched-path data-center-to-lightcity

set to 172.12.0.1

set node-link-protection

set primary data-center-to-lightcity-primary admin-group exclude gre-interface

set secondary data-center-to-lightcity-secondary

set path data-center-to-lightcity-primary

set path data-center-to-lightcity-secondary

LIGHTCITY

set chassis fpc 1 pic 0 tunnel-services bandwidth 1g

edit interfaces gr-1/0/10

set description lightcity_gr-1/0/10-springville_gr-2/0/10

set unit 0 tunnel source 10.0.0.1

set unit 0 tunnel destination 10.10.0.1

set unit 0 clear-dont-fragment-bit

set unit 0 tunnel allow-fragmentation

set unit 0 reassemble-packets

set unit 0 family inet mtu 1390

set unit 0 family inet address 192.168.0.2/30

set unit 0 family mpls mtu 1378

OSPF

edit protocols ospf area 0.0.0.0

set interface gr-1/0/10.0 interface-type p2p

set interface gr-1/0/10.0 metric 10000

…

Additional OSPF configuration as needed (BFD, LDP synchronization, authentication, …)

…

edit protocols rsvp

set interface gr-1/0/10.0 authentication-key /* SECRET-DATA */

set interface gr-1/0/10.0 link-protection

…

Additional RSVP-TE configuration as needed.
…

edit protocols mpls label-switched-path lightcity-to-springville-GRE to 10.20.0.1

Appendix B

The Encapsulation Overhead
A. IP Security (IPSec) Overhead

The Triple Data Encryption Standard (3DES) is used for encryption and HMAC-MD5 (Hash-based Message Authentication Code using Message Digest 5) is used for integrity verification. 3DES has a block size of 64 bits (8 bytes), meaning that it encrypts data 8 bytes at a time. In addition to the encryption key (which is generated as part of the key exchange done by IKE), the previous block of data is, in the so-called chaining encryption mode, an input to the encryption.

The first block doesn't have a preceding block and thus an Initialization Vector (IV) is added to the data to be used as a previous block for the first data block. The to-be-encrypted data must be a multiple of the block size and thus must be padded to be so if it isn't.

The byte right after the padding (called the Padding Length field) specifies the number of padding bytes. This is followed by a byte-wide field that specifies the Next Header (the protocol embedded within the ESP packet). These two fields, along with the original packet and the padding, are encrypted before being embedded into the ESP packet. Thus, the size of the data to be encrypted is the size of the original packet plus two bytes, with padding added on top of that to make the total a multiple of 8.

A packet of size 1500 bytes will have 2 bytes added to it as the ESP Padding Length and Next Header fields. This makes the total data to be encrypted 1502 bytes. Since 1502 is not a multiple of 8 (1502 modulo 8 = 6), two bytes of padding will be added, making the total to be encrypted 1504 bytes. Add to that an 8-byte Initialization Vector, and the packet will be 1512 bytes. With the Padding Length field, the Next Header field and the (random) Initialization Vector (IV) set, the packet is encrypted.

The encrypted packet will be augmented with an ESP header consisting of 8 bytes: the 4-byte Security Parameter Index (SPI), which is used as an index to the Security Association Database that tells the router what encryption algorithms and keys to use to process the packet, and a 4-byte Sequence Number (SN) that is used to counter replay attacks. The total with these is now 1520 bytes. Add to that the HMAC-MD5 authentication data, which has a fixed size of 12 bytes, and the total becomes 1532 bytes. Add to that the outer IP header of 20 bytes, and the IPsec packet becomes 1552 bytes, two of which are padding.

Padding = [Block Size - (Original Packet Size + 2) Modulo Block Size] Modulo Block Size = [8 - (1500 + 2) Modulo 8] Modulo 8 = 2

IPsec Overhead = Padding + IV + SPI + SN + HMAC + NH + PL + Outer Header = 2 + 8 + 4 + 4 + 12 + 1 + 1 + 20 = 52

The minimum IPsec overhead is 50 bytes, while the maximum is 57 bytes.

B. Generic Routing Encapsulation (GRE) Overhead

GRE (Generic Routing Encapsulation) is a header specification for encapsulating protocols such as MPLS, IPv6 and IPv4 multicast in an IPv4 packet. It enables sending these different types of traffic over an IPv4 network, be it a private network, a leased service provider network or even the Internet. In the design at hand, we'll be using it to ship MPLS packets over a service provider network. GRE adds 24 bytes to the encapsulated packet (a 4-byte GRE header and 20 bytes for the outer IP header).

The outer IP header carries the tunnel source and destination addresses, jointly taking 8 bytes. The GRE header itself has a 2-byte field that specifies the encapsulated packet (the payload packet, the MPLS-encapsulated packet in our case) in addition to other fields such as flags and an optional checksum. Continuing with the example we started with above, the packet with the IPSec header (specifically the ESP header) has become 1552 bytes. Adding the GRE encapsulation makes it 1576 bytes.

C. Multiprotocol Label Switching (MPLS) Overhead

MPLS is often described as a Layer 2.5 protocol or a shim-layer protocol. Its header is inserted between the layer 2 header (Ethernet) and the IP header. Each header consists of four bytes that carry a 20-bit label. Unlike IP addresses, MPLS labels are locally significant: their use and meaning are known only to the router that assigned them.

MPLS headers are often stacked on a packet to provide different functions. Frequently, the first label is a VPN label, that enables the packet to be assigned to one of different VPN instances existing on the ultimate PE router. This is often encapsulated within a label that is used to transport the packet across the core (the LSP label or the transport label). Other labels may also exist such as the bypass label added when local protection TE techniques such as link protection are employed.

L3VPN traffic and VPLS traffic will typically have two MPLS headers (each with a label), an inner label (called the service or VPN label) and an outer label (called the transport or LSP label). Adding these two headers to the example packet makes it 1584 bytes. This is the size of the packet that will be handed over to the service provider.

D. Service Provider Overhead

Considering the example packet we've been working with so far, the packet that is handed over to the service provider (an IPsec packet encapsulating a GRE packet that in turn encapsulates an MPLS packet) will have a size of 1584 bytes. The Service Provider network must support an MTU of at least 1584 bytes, or else the packet must be fragmented into chunks no larger than the Service Provider's supported MTU. Note also that the packet may experience additional encapsulation within the Service Provider's cloud. The Service Provider may have an MPLS core with a VPN assigned to each customer. In that case, two MPLS headers will be added on top of the packet handed over by the customer.
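As a cross-check of the 1584-byte figure, the sketch below adds the overheads in the order the packet actually traverses the design (MPLS labels first, then GRE on the MPLS router, then IPsec ESP on the CE router) rather than the order in which this appendix introduces them. For the 1500-byte example both orders give the same total. The function names are hypothetical, and the ESP parameters are those of the worked example (8-byte block, 8-byte IV, 12-byte HMAC).

```python
MPLS_LABEL = 4   # bytes per label stack entry
GRE_HEADER = 4   # basic GRE header, no optional fields
IP_HEADER = 20   # IPv4 header without options


def esp_overhead(size, block=8, iv=8, icv=12):
    """ESP tunnel-mode overhead for a packet of the given size."""
    padding = -(size + 2) % block           # pad (size + 2 trailer bytes) to a block multiple
    return padding + 2 + iv + 8 + icv + IP_HEADER


def encapsulation_steps(ip_packet=1500, labels=2):
    """Return (stage, size) pairs as the packet moves toward the provider."""
    steps = [("original IP packet", ip_packet)]
    size = ip_packet + labels * MPLS_LABEL
    steps.append(("after MPLS labels", size))
    size += GRE_HEADER + IP_HEADER
    steps.append(("after GRE", size))
    size += esp_overhead(size)
    steps.append(("after IPsec ESP", size))
    return steps


for stage, size in encapsulation_steps():
    print(f"{stage}: {size} bytes")
# original IP packet: 1500 bytes
# after MPLS labels: 1508 bytes
# after GRE: 1532 bytes
# after IPsec ESP: 1584 bytes
```

If the Service Provider adds two MPLS headers of its own inside its core, the packet grows by a further 8 bytes there, to 1592 bytes.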

Appendix C

Maximum Transmission Unit (MTU)

Links, and by extension networks, have a largest allowable packet size, called the Maximum Transmission Unit (MTU). Usually (but not always) the term MTU is used to indicate the maximum without considering the layer 2 framing overhead (like Ethernet's headers). The added overhead of overlay encapsulations such as IPSec and GRE consumes additional bandwidth and may also lead to packets exceeding the MTU of links in the path.

If, along the path, the packet encounters a layer 2 forwarding device whose egress link has an MTU smaller than the packet, the packet is silently dropped.

If the device is a layer 3 forwarding device whose egress link has an MTU smaller than the packet, then one of two scenarios could occur:

1- If the device is willing to fragment and the packet has a clear DF bit, then the device will divide the large packet into smaller packets (fragments) that are to be reassembled at the destination.

2- If the device is not willing to fragment or the packet has a set DF (Don't Fragment) bit, then the device will drop the packet and send back an ICMP "Fragmentation Needed" error (the IPv4 counterpart of the ICMPv6 "Packet Too Big" message).
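The cases above can be summarized as a small decision function. This is only a model of the behavior described in this appendix, not of any particular router's implementation; the function and parameter names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    FRAGMENT = "fragment"
    DROP_SILENT = "drop silently"
    DROP_WITH_ICMP = "drop and send ICMP error"


def handle_packet(packet_bytes, egress_mtu, is_layer3, df_set, can_fragment=True):
    """Decide what a forwarding device does with a packet at its egress link."""
    if packet_bytes <= egress_mtu:
        return Action.FORWARD
    if not is_layer3:
        # A layer 2 device cannot fragment or signal the sender.
        return Action.DROP_SILENT
    if can_fragment and not df_set:
        return Action.FRAGMENT
    return Action.DROP_WITH_ICMP


# The 1584-byte example packet meeting a 1500-byte egress MTU with DF set.
print(handle_packet(1584, 1500, is_layer3=True, df_set=True).value)
# drop and send ICMP error
```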

It is often better to fragment packets before they leave for the Service Provider network, so that they are no larger than the smallest MTU allowed within it. This can be done by setting the MTU of the egress interface (the interface connecting to the Service Provider) to that lower value, and by allowing fragmentation: the DF bit must be cleared and the router at the edge must be willing to fragment. Another feature that helps adapt packet sizes to the path MTU is TCP Maximum Segment Size (MSS) adjustment. This can be configured under the family section of the GRE interface in Junos.
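To make the two techniques concrete, here is a minimal Python sketch of how an IPv4 packet splits at a given MTU, and of how an MSS value could be derived from the path MTU. It assumes a 20-byte IPv4 header and a 20-byte TCP header with no options; the 1500-byte provider MTU and the 84-byte tunnel overhead are hypothetical inputs, not figures taken from the article.

```python
IPV4_HEADER_BYTES = 20  # assumes no IP options
TCP_HEADER_BYTES = 20   # assumes no TCP options


def fragment_sizes(packet_bytes, mtu_bytes, header_bytes=IPV4_HEADER_BYTES):
    """Sizes of the IPv4 fragments produced for a packet at a given MTU."""
    if packet_bytes <= mtu_bytes:
        return [packet_bytes]
    # Every fragment except the last carries a payload that is a multiple of 8.
    max_chunk = (mtu_bytes - header_bytes) // 8 * 8
    if max_chunk <= 0:
        raise ValueError("MTU too small to carry any payload")
    payload = packet_bytes - header_bytes
    sizes = []
    while payload > 0:
        chunk = min(max_chunk, payload)
        sizes.append(header_bytes + chunk)
        payload -= chunk
    return sizes


def clamped_mss(path_mtu_bytes, tunnel_overhead_bytes):
    """Largest TCP segment that avoids fragmentation after encapsulation."""
    return (path_mtu_bytes - tunnel_overhead_bytes
            - IPV4_HEADER_BYTES - TCP_HEADER_BYTES)


print(fragment_sizes(1584, 1500))  # [1500, 104]
print(clamped_mss(1500, 84))       # 1376
```

With these inputs, the 1584-byte example packet would leave as one full-size fragment and one small one, which is the inefficiency that MSS adjustment tries to avoid for TCP traffic.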

Appendix D

A Packet's Journey Across the Tunnel

We assume that the main (non-GRE) path has failed and packets are now forwarded through the Service Provider. In that case the packet's journey looks as follows:

1- In an MPLS network divided into different Layer 3 VPNs ("VRFs"), an IP packet is typically labeled with an inner label (a service label) that indicates the VPN it should be sent to and then an outer label (called a transport label) is stacked over it indicating the LSP it should take to reach the remote PE (the PE hosting the destination of the packet).

2- The packet with its two-label stack will be sent through the GRE interface, which will add a GRE header (indicating that the encapsulated packet is an MPLS packet) and an outer IP header. The source address in the outer IP header will be that of the interface leading to the service provider (the interface connected to the CE router). The destination address will be that of the corresponding interface of the router on the other side of the Service Provider cloud. The GRE packet will then be sent to the CE router.

3- The CE router will encrypt the GRE packet and add a cryptographically generated integrity check code to it. It will add an ESP header that contains some cryptographic parameters and then will add an outer IP header. The outer header's source address will be that of the local CE router, while the destination address will be that of the other CE router (which, in the context of IPsec, is its peer gateway). The IPSec packet will then be sent to the Service Provider.

4- The Service Provider, which is oblivious to the encrypted packet embedded within, to the GRE packet buried in it and to the MPLS packet encapsulated within GRE, will just forward the packet within the customer's VPN from end to end. The Service Provider will add its own MPLS service and transport labels before sending the packet through its core. These Service Provider labels will be removed before the packet is handed over to the customer on the other side.

5- The CE on the other side will receive an IPSec packet. It will verify its integrity and decrypt it. It will strip the outer header, revealing the GRE packet within, and will forward the GRE packet to the MPLS router.

6- The MPLS router will strip the outer header and the GRE header, retrieving the MPLS encapsulated packet.

7- The packet will be forwarded toward the Data Center over the MPLS cloud like any packet transported totally within the enterprise network.
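The seven steps above can be traced with a simple model in which a packet is a list of header names, outermost first. This is only a conceptual sketch: the header names and helper functions are hypothetical, and details such as the ESP trailer and integrity check value are left out.

```python
def encapsulate(layers, *headers):
    """Push headers on the outside; headers are given outermost first."""
    return list(headers) + list(layers)


def decapsulate(layers, *expected):
    """Strip the expected outer headers, failing if the packet differs."""
    for name in expected:
        if not layers or layers[0] != name:
            raise ValueError(f"expected {name!r} as outermost header")
        layers = layers[1:]
    return list(layers)


packet = ["customer IP packet"]

# 1- Ingress PE pushes the service label, then the transport label.
labeled = encapsulate(packet, "transport label", "service label")
# 2- GRE interface adds a GRE header and an outer IP header.
gre = encapsulate(labeled, "outer IP (GRE endpoints)", "GRE")
# 3- Local CE encrypts and adds ESP plus its own outer IP header.
ipsec = encapsulate(gre, "outer IP (CE to CE)", "ESP")
# 4- Service Provider pushes and later pops its own two labels.
in_core = encapsulate(ipsec, "SP transport label", "SP service label")
handed_back = decapsulate(in_core, "SP transport label", "SP service label")
# 5- Remote CE verifies, decrypts and strips the IPsec headers.
gre_again = decapsulate(handed_back, "outer IP (CE to CE)", "ESP")
# 6- Remote MPLS router strips the outer IP and GRE headers.
mpls_again = decapsulate(gre_again, "outer IP (GRE endpoints)", "GRE")

# 7- The MPLS packet continues as if it had never left the enterprise core.
print(mpls_again)
# ['transport label', 'service label', 'customer IP packet']
```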

Resources

RFC 2784, Generic Routing Encapsulation (GRE): www.ietf.org/rfc/rfc2784.txt

RFC 4303, IP Encapsulating Security Payload (ESP): www.ietf.org/rfc/rfc4303.txt

Thomas, Thomas, Pavlichek, Doris E., Dwyer, Lawrence H., Chowbay, Rajah and Downing, Wayne W. "Juniper Networks Reference Guide: JUNOS Routing, Configuration, and Architecture", Addison-Wesley

Hanks, Richard Douglas and Reynolds, Harry "Juniper MX Series", O'Reilly Media

Juniper Networks Website: http://www.juniper.com/

Disclaimer: The information presented in this article is to provide conceptual understanding.

The opinions expressed are the authors' own and do not reflect the view of their employer. They have made their best effort to verify and lab test the concepts presented but assume no liability or responsibility for any errors.

November 30, 2021 (Tuesday)

------------------------------
Salah Buraiky
------------------------------
