3. OSPF

3.1. In This Chapter

This chapter provides information about configuring the Open Shortest Path First (OSPF) protocol.


3.2. Configuring OSPF

Open Shortest Path First (OSPF) is a hierarchical link state protocol. OSPF is an interior gateway protocol (IGP) used within large autonomous systems (ASs). OSPF routers exchange state, cost, and other relevant interface information with neighbors. The information exchange enables all participating routers to establish a network topology map. Each router applies the Dijkstra algorithm to calculate the shortest path to each destination in the network. The resulting OSPF forwarding table is submitted to the routing table manager to calculate the routing table.

When a router is started with OSPF configured, OSPF, along with the routing-protocol data structures, is initialized and waits for indications from lower-layer protocols that its interfaces are functional. Nokia’s implementation of OSPF conforms to the OSPF Version 2 specifications presented in RFC 2328, OSPF Version 2, and to the OSPF Version 3 specifications presented in RFC 2740, OSPF for IPv6. Routers running OSPF can be enabled with minimal configuration. All default and command parameters can be modified.

Changes between OSPF for IPv4 and OSPFv3 for IPv6 include the following:

  1. Addressing semantics have been removed from OSPF packets and the basic link-state advertisements (LSAs). New LSAs have been created to carry IPv6 addresses and prefixes.
  2. OSPFv3 runs on a per-link basis, instead of on a per-IP-subnet basis.
  3. Flooding scope for LSAs has been generalized.
  4. Unlike OSPFv2, OSPFv3 authentication relies on the IPv6 authentication header (AH) and encapsulating security payload (ESP).
  5. Most packets in OSPF for IPv6 are almost as compact as those in OSPF for IPv4, even with the larger IPv6 addresses.
  6. Most field and packet-size limitations present in OSPF for IPv4 have been relaxed.
  7. Option handling has been made more flexible.

Key OSPF features are:

  1. Backbone areas
  2. Stub areas
  3. Not-So-Stubby areas (NSSAs)
  4. Virtual links
  5. Authentication
  6. Route redistribution
  7. Routing interface parameters
  8. OSPF-TE extensions (Nokia’s implementation allows MPLS fast reroute)

3.2.1. OSPF Areas

The hierarchical design of OSPF allows a collection of networks to be grouped into a logical area. An area’s topology is concealed from the rest of the AS, which significantly reduces OSPF protocol traffic. With proper network design and area route aggregation, the size of the routing table can be drastically reduced, which decreases both OSPF route calculation time and topological database size.

Routing in the AS takes place on two levels, depending on whether the source and destination of a packet reside in the same area (intra-area routing) or different areas (inter-area routing). In intra-area routing, the packet is routed solely on information obtained within the area; no routing information obtained from outside the area is used.

Routers that belong to more than one area are called area border routers (ABRs). An ABR maintains a separate topological database for each area it is connected to. Every router that belongs to the same area has an identical topological database for that area.

3.2.1.1. Backbone Area

The OSPF backbone area, area 0.0.0.0, must be contiguous and all other areas must be connected to the backbone area. The backbone distributes routing information between areas. If it is not practical to connect an area to the backbone (see area 0.0.0.5 in Figure 5) then the ABRs (such as routers Y and Z) must be connected via a virtual link. The two ABRs form a point-to-point-like adjacency across the transit area (see area 0.0.0.4).

Figure 5:  Backbone Area 

3.2.1.2. Stub Area

A stub area is a designated area that does not allow external route advertisements. Routers in a stub area do not maintain external routes. A single default route to an ABR replaces all external routes. This OSPF implementation supports the optional summary route (type-3) advertisement suppression from other areas into a stub area. This feature further reduces topological database sizes and OSPF protocol traffic, memory usage, and CPU route calculation time.

In Figure 5, areas 0.0.0.1, 0.0.0.2 and 0.0.0.5 could be configured as stub areas. A stub area cannot be designated as the transit area of a virtual link and a stub area cannot contain an AS boundary router. An AS boundary router exchanges routing information with routers in other ASs.

3.2.1.3. Not-So-Stubby Area

Another OSPF area type is called a Not-So-Stubby area (NSSA). NSSAs are similar to stub areas in that no external routes are imported into the area from other OSPF areas. External routes learned by OSPF routers in the NSSA are advertised as type-7 LSAs within the NSSA and are translated by ABRs into type-5 external route advertisements for distribution into other areas of the OSPF domain. An NSSA cannot be designated as the transit area of a virtual link.

In Figure 5, area 0.0.0.3 could be configured as an NSSA.

3.2.1.3.1. OSPF Super Backbone

The 77x0 PE routers have implemented a version of the BGP/OSPF interaction procedures as defined in RFC 4577, OSPF as the Provider/Customer Edge Protocol for BGP/MPLS IP Virtual Private Networks (VPNs). Features included in this RFC include:

  1. Loop prevention
  2. Handling LSAs received from the CE
  3. Sham links
  4. Managing VPN-IPv4 routes received by BGP

VPRN routes can be distributed among the PE routers by BGP. If the PE uses OSPF to distribute routes to the CE router, the standard procedures governing BGP/OSPF interactions cause routes from one site to be delivered to another in type 5 LSAs, as AS-external routes.

The MPLS VPN super backbone behaves like an additional layer of hierarchy in OSPF. The PE-routers that connect the respective OSPF areas to the super backbone function as OSPF Area Border Routers (ABR) in the OSPF areas to which they are attached. In order to achieve full compatibility, they can also behave as AS Boundary Routers (ASBR) in non-stub areas.

The PE-routers insert inter-area routes from other areas into the area where the CE-router is present. The CE-routers are not involved at any level, nor are they aware of the super backbone or of other OSPF areas present beyond the MPLS VPN super backbone.

The CE always assumes the PE is an ABR:

  1. If the CE is in the backbone, then the CE router assumes that the PE is an ABR linking one or more areas to the backbone.
  2. If the CE is not in the backbone, then the CE believes that the backbone is on the other side of the PE.
  3. As such, the super backbone looks like another area to the CE.

Figure 6:  PEs Connected to an MPLS-VPN Super Backbone 

In Figure 6, the PEs are connected to the MPLS-VPN super backbone. To distinguish whether two OSPF instances are in fact the same and require Type 3 LSAs to be generated, or are two separate routing instances for which type 5 external LSAs must be generated, the concept of a domain ID is introduced.

The domain ID is carried with the MP-BGP update and indicates the source OSPF Domain. When the routes are being redistributed into the same OSPF Domain, the concepts of super backbone described above apply and Type 3 LSAs are generated. If the OSPF domain does not match, then the route type will be external.
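The domain ID comparison described above can be sketched as follows. This is a minimal illustration, not the router's implementation: the function name is hypothetical, domain IDs are modeled as plain strings, and cases that RFC 4577 treats separately (such as routes that were already external in the source domain, or routes without a domain ID) are deliberately left out.

```python
def lsa_type_for_redistributed_route(route_domain_id: str, local_domain_id: str) -> int:
    """Pick the LSA type a PE originates toward the CE for a route received over MP-BGP.

    Hypothetical helper: returns 3 (summary LSA, inter-area route) when the
    domain ID carried in the MP-BGP update matches the local OSPF domain,
    and 5 (AS-external LSA) when it does not.
    """
    if route_domain_id == local_domain_id:
        return 3  # same OSPF domain: super backbone behavior applies
    return 5      # different OSPF domain: route is presented as external


# Example with made-up domain IDs.
same_domain = lsa_type_for_redistributed_route("0.0.0.10", "0.0.0.10")
other_domain = lsa_type_for_redistributed_route("0.0.0.10", "0.0.0.20")
```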

Configuring the super backbone (not the sham links) makes all destinations learned by PEs with matching domain IDs inter-area routes.

When configuring sham links, these links become intra-area routes if they are present in the same area.

3.2.1.3.2. Sham Links

Figure 7:  Sham Links 

In Figure 7, the red link between CE-3 and CE-4 could be a low-speed OC-3/STM-1 link. However, because it establishes an intra-area route between CE-3 and CE-4, the potentially high-speed PE-1 to PE-2 connection will not be used. Even with a super backbone configuration, the PE-1 to PE-2 connection is regarded as an inter-area connection, and OSPF prefers intra-area routes over inter-area routes.

The (green) sham link is constructed as an intra-area link between the PE routers; a normal OSPF adjacency is formed and the link-state database is exchanged across the MPLS-VPRN. As a result, the desired intra-area connectivity is created. The costs of the green and red links can then be managed so that the red link becomes a standby link that is used only if the VPN fails.

Because the sham link forms an adjacency over the MPLS-VPRN backbone network, be aware that when protocol-protection is enabled in the config>sys>security>cpu-protection>protocol-protection context, the operator must explicitly allow the OSPF packets to be received over the backbone network. This is performed using the allow-sham-links parameter of the protocol-protection command.

3.2.1.3.3. Implementing the OSPF Super Backbone

With the OSPF super backbone architecture, the continuity of OSPF routing is preserved:

  1. The OSPF intra-area LSAs (type-1 and type-2) advertised by the CE are inserted into the MPLS-VPRN super backbone by the PE adjacent to the CE, which redistributes the OSPF route into MP-BGP.
  2. The MP-BGP route is propagated to other PE routers and inserted as an OSPF route into other OSPF areas. Because the PEs across the super backbone always act as ABRs, they generate inter-area route OSPF summary LSAs (Type 3).
  3. The inter-area route can now be propagated into other OSPF areas by other customer-owned ABRs within the customer site.
  4. Customer Area 0 (backbone) routes, when carried across the MPLS-VPRN using MP-BGP, appear as Type 3 LSAs even if the customer area remains area 0 (backbone).

A BGP extended community (OSPF domain ID) provides the source domain of the route. This domain ID is not carried by OSPF but carried by MP-BGP as an extended community attribute.

If the configured extended community value matches the receiving OSPF domain, then the OSPF super backbone is implemented.

From a BGP perspective, the cost is copied into the MED attribute.

3.2.1.3.4. Loop Avoidance

If a route sent from a PE router to a CE router could then be received by another PE router from one of its own CE routers then it is possible for routing loops to occur. RFC 4577 specifies several methods of loop avoidance.

3.2.1.3.5. DN-BIT

When a Type 3 LSA is sent from a PE router to a CE router, the DN bit in the LSA options field is set. This is used to ensure that if any CE router sends this Type 3 LSA to a PE router, the PE router will not redistribute it further.

When a PE router needs to distribute to a CE router a route that comes from a site outside the latter's OSPF domain, the PE router presents itself as an ASBR (Autonomous System Border Router), and distributes the route in a type 5 LSA. The DN bit MUST be set in these LSAs to ensure that they will be ignored by any other PE routers that receive them.

DN-BIT loop avoidance is also supported.
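The DN bit check can be illustrated with a short sketch. It assumes the bit position given in RFC 4576 (the high-order bit, 0x80, of the LSA Options field); the function names are hypothetical and the LSA is reduced to its Options byte.

```python
DN_BIT = 0x80  # high-order bit of the LSA Options field, per RFC 4576 (assumed here)


def set_dn_bit(options: int) -> int:
    """Return the Options byte a PE would send toward a CE, with the DN bit set."""
    return options | DN_BIT


def pe_must_ignore(options: int) -> bool:
    """Return True if a PE receiving this LSA from a CE must not redistribute it."""
    return bool(options & DN_BIT)


# A PE sets the bit when advertising to the CE; another PE that later
# receives the same LSA from one of its own CEs sees the bit and ignores it.
advertised = set_dn_bit(0x02)
loop_detected = pe_must_ignore(advertised)
```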

3.2.1.3.6. Route Tag

If a particular VRF in a PE is associated with an instance of OSPF, then by default it is configured with a special OSPF route tag value called the VPN route tag. This route tag is included in the Type 5 LSAs that the PE originates and sends to any of the attached CEs. The configuration and inclusion of the VPN Route Tag is required for backward compatibility with deployed implementations that do not set the DN bit in Type 5 LSAs.

3.2.1.3.7. Sham Links

A sham link is only required if a backdoor link (shown as the red link in Figure 7) is present; otherwise, configuring an OSPF super backbone is usually sufficient.

3.2.2. OSPFv3 Authentication

OSPFv3 authentication requires IPv6 IPsec and supports the following:

  1. IPsec transport mode
  2. AH and ESP
  3. Manual keyed IPsec Security Association (SA)
  4. Authentication Algorithms MD5 and SHA1

To pass OSPFv3 authentication, OSPFv3 peers must have matching inbound and outbound SAs configured using the same SA parameters (SPI, keys, etc.). The implementation must allow the use of one SA for both inbound and outbound directions.

This feature is supported on IES and VPRN interfaces as well as on virtual links.

The re-keying procedure defined in RFC 4552, Authentication/Confidentiality for OSPFv3, supports the following:

  1. For every router on the link, create an additional inbound SA for the interface being re-keyed using a new SPI and the new key.
  2. For every router on the link, replace the original outbound SA with one using the new SPI and key values. The SA replacement operation should be atomic with respect to sending OSPFv3 packets on the link so that no OSPFv3 packets are sent without authentication or encryption.
  3. For every router on the link, remove the original inbound SA.

The key rollover procedure automatically starts when the operator changes the configuration of the inbound static-sa or bi-directional static-sa under an interface or virtual link. Within the KeyRolloverInterval time period, OSPFv3 accepts packets with both the previous inbound static-sa and the new inbound static-sa, and continues to use the previous outbound static-sa. When the timer expires, OSPFv3 accepts only packets with the new inbound static-sa and uses the new outbound static-sa for outgoing OSPFv3 packets.
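The rollover window can be modeled as a small state object. This is a sketch under simplifying assumptions: the class and method names are hypothetical, each SA is represented only by a name, a single bi-directional SA is assumed per direction change, and the interval value used in the example is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class KeyRollover:
    """Hypothetical model of the KeyRolloverInterval behavior described above."""
    old_sa: str
    new_sa: str
    started_at: float   # time (seconds) at which the configuration change was made
    interval: float     # KeyRolloverInterval in seconds (value is an assumption)

    def in_window(self, now: float) -> bool:
        return now - self.started_at < self.interval

    def accepts_inbound(self, sa: str, now: float) -> bool:
        # During the window both the previous and the new inbound SA are accepted.
        if self.in_window(now):
            return sa in (self.old_sa, self.new_sa)
        return sa == self.new_sa

    def outbound_sa(self, now: float) -> str:
        # The previous outbound SA stays in use until the timer expires.
        return self.old_sa if self.in_window(now) else self.new_sa


rollover = KeyRollover(old_sa="sa-1", new_sa="sa-2", started_at=0.0, interval=10.0)
```

With these example values, a packet protected by "sa-1" is still accepted at t=5 but rejected at t=10, at which point outgoing packets switch to "sa-2".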

3.2.2.1. OSPFv3 Graceful Restart Helper

This feature extends the Graceful Restart helper function supported under other protocols to OSPFv3.

The primary difference between the graceful restart helper for OSPFv2 and for OSPFv3 is that OSPFv3 uses a different grace-LSA format.

As SR OS platforms can support a fully non-stop routing model for control plane high availability, SR OS routers have no need for graceful restart as defined by the IETF in various RFCs for each routing protocol. However, since the router does need to co-exist in multi-vendor networks and other routers do not always support a true non-stop routing model with stateful failover between routing control planes, there is a need to support a Graceful Restart Helper function.

Graceful restart helper mode allows SR OS-based systems to provide a grace period to other routers which have requested it, during which the SR OS systems will continue to use routes authored by or transiting the router requesting the grace period. This is typically used when another router is rebooting the control plane but the forwarding plane is expected to continue to forward traffic based on the previously available FIB.

The format of the Graceful OSPF restart (GRACE) LSA is:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |           LS age              |0|0|0|          11             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                       Link State ID                           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                    Advertising Router                         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                    LS sequence number                         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |        LS checksum            |            Length             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       +-                            TLVs                             -+
       |                             ...                               |

See section 2.2 of RFC 5187, OSPFv3 Graceful Restart.

The Link State ID of a grace-LSA in OSPFv3 is the Interface ID of the interface originating the LSA.

The format of each TLV is:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Type             |             Length            |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                            Value...                           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 
                                TLV Format

Grace-LSA TLVs are formatted according to section 2.3.2 of RFC 3630, Traffic Engineering (TE) Extensions to OSPF Version 2. The Grace-LSA TLVs are used to carry the Grace period (type 1) and the reason the router initiated the graceful restart process (type 2).
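As an illustration of the TLV layout above, the following sketch walks the TLV portion of a grace-LSA body. It assumes the RFC 3630 convention that the Length field covers only the value and that each value is padded to a four-octet boundary; the function names and the sample bytes are hypothetical.

```python
import struct

GRACE_PERIOD_TLV = 1     # grace period in seconds, 4-octet value
RESTART_REASON_TLV = 2   # restart reason code, 1-octet value


def parse_tlvs(data: bytes) -> dict:
    """Split the TLV portion of a grace-LSA into {type: raw value bytes}."""
    tlvs = {}
    offset = 0
    while offset + 4 <= len(data):
        tlv_type, length = struct.unpack_from("!HH", data, offset)
        value = data[offset + 4: offset + 4 + length]
        if len(value) < length:
            raise ValueError("truncated TLV")
        tlvs[tlv_type] = value
        # Advance past the header and the value rounded up to a 4-octet boundary.
        offset += 4 + ((length + 3) & ~3)
    return tlvs


def decode_grace(tlvs: dict) -> dict:
    """Decode the two TLV types named in the text, if present."""
    result = {}
    if GRACE_PERIOD_TLV in tlvs:
        result["grace_period"] = struct.unpack("!I", tlvs[GRACE_PERIOD_TLV])[0]
    if RESTART_REASON_TLV in tlvs:
        result["reason"] = tlvs[RESTART_REASON_TLV][0]
    return result


# Made-up sample: grace period of 120 seconds, reason code 1 (plus 3 pad octets).
SAMPLE = struct.pack("!HHI", 1, 4, 120) + struct.pack("!HHB3x", 2, 1, 1)
decoded = decode_grace(parse_tlvs(SAMPLE))
```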

Other information in RFC 5187 is directed to routers that require the full graceful restart mechanism because they do not support a stateful transition from the primary to the backup control plane module (CPM).

3.2.3. Virtual Links

The backbone area in an OSPF AS must be contiguous and all other areas must be connected to the backbone area. Sometimes, this is not possible. You can use virtual links to connect to the backbone through a non-backbone area.

Figure 5 depicts routers Y and Z as the start and end points of the virtual link while area 0.0.0.4 is the transit area. In order to configure virtual links, the router must be an ABR. Virtual links are identified by the router ID of the other endpoint, another ABR. These two endpoint routers must be attached to a common area, called the transit area. The area through which you configure the virtual link must have full routing information.

Transit areas pass traffic from an area adjacent to the backbone or to another area. The traffic does not originate in, nor is it destined for, the transit area. The transit area cannot be a stub area or an NSSA.

Virtual links are part of the backbone, and behave as if they were unnumbered point-to-point networks between the two routers. A virtual link uses the intra-area routing of its transit area to forward packets. Virtual links are brought up and down through the building of the shortest-path trees for the transit area.

3.2.4. Neighbors and Adjacencies

A router uses the OSPF Hello protocol to discover neighbors. A neighbor is a router configured with an interface to a common network. The router sends hello packets to a multicast address and receives hello packets in return.

In broadcast networks, a designated router and a backup designated router are elected. The designated router is responsible for sending link-state advertisements (LSAs) describing the network, which reduces the amount of network traffic.

The routers attempt to form adjacencies. An adjacency is a relationship formed between a router and the designated or backup designated router. For point-to-point networks, no designated or backup designated router is elected. An adjacency must be formed with the neighbor.

To significantly improve adjacency forming and network convergence, a network should be configured as point-to-point if only two routers are connected, even if the network is a broadcast medium such as Ethernet.

When the link-state databases of two neighbors are synchronized, the routers are considered to be fully adjacent. When adjacencies are established, pairs of adjacent routers synchronize their topological databases. Not every neighboring router forms an adjacency. Routing protocol updates are only sent to and received from adjacencies. Routers that do not become fully adjacent remain in the two-way neighbor state.
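The effect of electing a designated router can be shown with simple arithmetic. The sketch below compares the number of full adjacencies on a broadcast network when every router pairs with every other router against the case where routers become fully adjacent only with the designated and backup designated routers. The function names are hypothetical, and the count assumes both a designated and a backup designated router are elected.

```python
def full_mesh_adjacencies(routers: int) -> int:
    """Adjacencies if every pair of routers on the segment became fully adjacent."""
    return routers * (routers - 1) // 2


def dr_bdr_adjacencies(routers: int) -> int:
    """Adjacencies when routers are fully adjacent only with the DR and BDR.

    Each of the (routers - 2) other routers forms two adjacencies (DR and BDR),
    and the DR and BDR form one adjacency with each other.
    """
    if routers < 2:
        return 0
    return 2 * (routers - 2) + 1


# For a segment with 10 routers.
mesh = full_mesh_adjacencies(10)
with_dr = dr_bdr_adjacencies(10)
```

With two routers both counts equal one, which is consistent with treating such a link as point-to-point.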

3.2.5. Link-State Advertisements

Link-state advertisements (LSAs) describe the state of a router or network, including router interfaces and adjacency states. Each LSA is flooded throughout an area. The collection of LSAs from all routers and networks forms the protocol's topological database.

The distribution of topology database updates takes place along adjacencies. A router sends LSAs to advertise its state according to the configured interval and when the router's state changes. These packets include information about the router's adjacencies, which allows detection of non-operational routers.

When a router discovers a routing table change or detects a change in the network, link state information is advertised to other routers to maintain identical routing tables. Router adjacencies are reflected in the contents of its link state advertisements. The relationship between adjacencies and the link states allows the protocol to detect non-operating routers. Link state advertisements flood the area. The flooding mechanism ensures that all routers in an area have the same topological database. The database consists of the collection of LSAs received from each router belonging to the area.

OSPF sends only the part that has changed and only when a change has taken place. From the topological database, each router constructs a tree of shortest paths with itself as root. OSPF distributes routing information between routers belonging to a single AS.
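The shortest-path tree computation can be sketched with a textbook Dijkstra implementation. This is a generic illustration rather than the router's SPF code: the topology, router names, and costs are made up, and the link-state database is reduced to a dictionary of neighbor costs.

```python
import heapq


def shortest_path_tree(graph: dict, root: str):
    """Return (cost, parent) maps describing the shortest-path tree rooted at root.

    graph maps each router to {neighbor: link cost}.
    """
    cost = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > cost.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            candidate = dist + link_cost
            if candidate < cost.get(neighbor, float("inf")):
                cost[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return cost, parent


# Hypothetical four-router area with symmetric link costs.
TOPOLOGY = {
    "A": {"B": 10, "C": 5},
    "B": {"A": 10, "C": 3, "D": 1},
    "C": {"A": 5, "B": 3, "D": 9},
    "D": {"B": 1, "C": 9},
}
costs, parents = shortest_path_tree(TOPOLOGY, "A")
```

In this example, router A reaches B through C (cost 8) rather than over the direct link (cost 10), and reaches D through B (cost 9).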

3.2.6. Metrics

In OSPF, all interfaces have a cost value or routing metric used in the OSPF link-state calculation. A metric value is configured based on hop count, bandwidth, or other parameters, to compare different paths through an AS. OSPF uses cost values to determine the best path to a particular destination: the lower the cost value, the more likely the interface will be used to forward data traffic.

Costs are also associated with externally derived routing data, such as routes learned from an Exterior Gateway Protocol (EGP) like BGP. This data is passed transparently throughout the AS and is kept separate from the OSPF protocol's link state data. Each external route can be tagged by the advertising router, enabling the passing of additional information between routers on the boundaries of the AS.

3.2.7. Authentication

All OSPF protocol exchanges can be authenticated. This means that only trusted routers can participate in autonomous system routing. Nokia’s implementation of OSPF supports plain text (also called simple password) and Message Digest 5 (MD5) authentication.

MD5 allows an authentication key to be configured per network. Routers in the same routing domain must be configured with the same key. When the MD5 hashing algorithm is used for authentication, MD5 is used to verify data integrity by creating a 128-bit message digest from the data input. It is unique to that data. Nokia’s implementation of MD5 allows the migration of an MD5 key by using a key ID for each unique key.
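The digest computation and the role of the key ID can be sketched as follows. The sketch follows the general approach of RFC 2328 Appendix D (the key, padded to 16 octets, is appended to the packet before hashing) but omits the OSPF header handling and the cryptographic sequence number; the function names, keys, and packet bytes are hypothetical.

```python
import hashlib
import hmac


def ospf_md5_digest(packet: bytes, key: bytes) -> bytes:
    """Return the 128-bit (16-octet) MD5 digest over the packet and the padded key."""
    padded_key = key.ljust(16, b"\x00")[:16]
    return hashlib.md5(packet + padded_key).digest()


def verify(packet: bytes, key_id: int, digest: bytes, keys: dict) -> bool:
    """Check a received digest against the key selected by the packet's key ID."""
    key = keys.get(key_id)
    if key is None:
        return False
    return hmac.compare_digest(ospf_md5_digest(packet, key), digest)


# During a key migration both the old and the new key are configured,
# each under its own key ID, so packets signed with either are accepted.
KEYS = {1: b"old-secret", 2: b"new-secret"}
PACKET = b"example OSPF packet bytes"
signed_with_new = ospf_md5_digest(PACKET, KEYS[2])
```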

By default, authentication is not enabled on an interface.

3.2.8. IP Subnets

OSPF enables the flexible configuration of IP subnets. Each distributed OSPF route has a destination and mask. A network mask is a 32-bit number that indicates the range of IP addresses residing on a single IP network/subnet. This specification displays network masks as hexadecimal numbers; for example, the network mask for a class C IP network is displayed as 0xffffff00. Such a mask is often displayed as 255.255.255.0.

Two different subnets of the same IP network number can have different masks; this is called variable-length subnetting. A packet is routed to the longest or most specific match. Host routes are considered to be subnets whose masks are all ones (0xffffffff).
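Mask notation and longest-match selection can be illustrated with Python's standard ipaddress module. The route list and function names below are made up for the example.

```python
import ipaddress


def mask_to_dotted(mask: int) -> str:
    """Convert a 32-bit mask such as 0xffffff00 to dotted-decimal notation."""
    return str(ipaddress.IPv4Address(mask))


def longest_match(destination: str, routes: list):
    """Return the most specific route (as a string) that contains destination, or None."""
    address = ipaddress.ip_address(destination)
    networks = [ipaddress.ip_network(route) for route in routes]
    matching = [network for network in networks if address in network]
    if not matching:
        return None
    return str(max(matching, key=lambda network: network.prefixlen))


# Variable-length subnets of the same network, plus a host route (/32).
ROUTES = ["10.1.0.0/16", "10.1.2.0/24", "10.1.2.3/32"]
best = longest_match("10.1.2.9", ROUTES)
```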

3.2.9. Preconfiguration Recommendations

Prior to configuring OSPF, the router ID must be available. The router ID is a 32-bit number assigned to each router running OSPF. This number uniquely identifies the router within an AS. OSPF routers use the router IDs of the neighbor routers to establish adjacencies. Neighbor IDs are learned when Hello packets are received from the neighbor.

Before configuring OSPF parameters, ensure that the router ID is derived by one of the following methods:

  1. Define the value in the config>router router-id context.
  2. Define the system interface in the config>router>interface ip-int-name context (used if the router ID is not specified in the config>router router-id context).
    A system interface must have an IP address with a 32-bit subnet mask. The system interface is used as the router identifier by higher-level protocols such as OSPF and IS-IS. The system interface is assigned during the primary router configuration process when the interface is created in the logical IP interface context.
  3. If you do not specify a router ID, then the last four bytes of the MAC address are used.

Note:

On the BGP protocol level, a BGP router ID can be defined in the config>router>bgp router-id context and is only used within BGP.
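The order of preference among the three router ID sources can be sketched as below. The function name and inputs are hypothetical, and the way the last four MAC bytes are rendered in dotted-decimal form is an assumption made for illustration only.

```python
import ipaddress
from typing import Optional


def derive_router_id(configured: Optional[str], system_ip: Optional[str], mac: str) -> str:
    """Pick a router ID in the order: explicit value, system interface address, MAC.

    Hypothetical helper; mac is a colon-separated string such as "00:11:22:33:44:55".
    """
    if configured:
        return configured          # config>router router-id
    if system_ip:
        return system_ip           # address of the system interface
    last_four = bytes.fromhex(mac.replace(":", ""))[-4:]
    return str(ipaddress.IPv4Address(last_four))


explicit = derive_router_id("10.20.1.3", "10.0.0.1", "00:11:22:33:44:55")
from_system = derive_router_id(None, "10.0.0.1", "00:11:22:33:44:55")
from_mac = derive_router_id(None, None, "00:11:22:33:44:55")
```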

3.2.10. Multiple OSPF Instances

The main route table manager (RTM) can create multiple instances of OSPF by extending the current creation of an instance. A given interface can only be a member of a single OSPF instance. When an interface is configured in a given domain and needs to be moved to another domain, the interface must first be removed from the old instance and re-created in the new instance.

3.2.10.1. Route Export Policies for OSPF

Route policies allow specification of the source OSPF process ID in the from and to parameters in the config>router>policy-options>policy-statement>entry>from context, for example from protocol ospf instance-id.

If an instance-id is specified, only routes installed by that instance are picked up for announcement. If no instance-id is specified, then only routes installed by the base instance are announced. The all keyword announces routes installed by all instances of OSPF.

When announcing internal (intra/inter-area) OSPF routes from another process, the default type should be type-1, and metric set to the route metric in RTM. For AS-external routes, by default the route type (type-1/2) should be preserved in the originated LSA, and metric set to the route metric in RTM. By default, the tag value should be preserved when an external OSPF route is announced by another process. All these can be changed with explicit action statements.

Export policies should allow match criteria based on the OSPF route hierarchy, for example, only intra-area, only inter-area, only external, or only internal (intra/inter-area). It must also be possible to filter based on existing tag values.

3.2.10.2. Preventing Route Redistribution Loops

The legacy method for this was to assign a tag value to each OSPF process and mark each external route originated within that domain with that value. However, since the tag value must be preserved throughout different OSPF domains, this only catches loops that go back to the originating domain and not where looping occurs in a remote set of domains. To prevent this type of loop, the route propagation information in the LSA must be accumulative. The following method has been implemented:

  1. The OSPF tag field in the AS-external LSAs is treated as a bit mask, rather than a scalar value. In other words, each bit in the tag value can be independently checked, set or reset as part of the routing policy.
  2. When a set of OSPF domains are provisioned in a network, each domain is assigned a specific bit value in the 32-bit tag mask. When an external route is originated by an ASBR using an internal OSPF route in a given domain, a corresponding bit is set in the AS-external LSA. As the route gets redistributed from one domain to another, more bits are set in the tag mask, each corresponding to the OSPF domain the route visited. Route redistribution looping is prevented by checking the corresponding bit as part of the export policy--if the bit corresponding to the announcing OSPF process is already set, the route is not exported there.
    From the CLI perspective, this involves adding a set of from tag and action tag commands that allow for bit operations.
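The bit-mask method can be sketched as follows. This is an illustration of the logic only, not of the policy CLI: the function names are hypothetical, and the assignment of bits to three domains A, B, and C is made up.

```python
from typing import Optional


def domain_bit(index: int) -> int:
    """Return the tag bit assigned to a domain (index 0..31 in the 32-bit tag)."""
    if not 0 <= index < 32:
        raise ValueError("the OSPF tag is a 32-bit field")
    return 1 << index


def export_tag(tag: int, source_bit: int, target_bit: int) -> Optional[int]:
    """Return the tag to announce into the target domain, or None to suppress.

    The route is suppressed when the target domain's bit is already set,
    meaning the route has already visited that domain. Otherwise the bit of
    the domain the route is being taken from is added to the mask.
    """
    if tag & target_bit:
        return None
    return tag | source_bit


A, B, C = domain_bit(0), domain_bit(1), domain_bit(2)
tag_in_b = export_tag(0, source_bit=A, target_bit=B)          # A -> B
tag_in_c = export_tag(tag_in_b, source_bit=B, target_bit=C)   # B -> C
back_to_a = export_tag(tag_in_c, source_bit=C, target_bit=A)  # C -> A is blocked
```

Because the mask accumulates every domain visited, the loop is caught even when the route returns through a remote set of domains rather than directly from its neighbor.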

3.2.11. Multi-Address Support for OSPFv3

While OSPFv3 was originally designed to carry only IPv6 routing information, the protocol has been extended to add support for other address families through work within the IETF (RFC 5838). These extensions within SR OS allow separate OSPFv3 instances to be used for IPv6 or IPv4 routing information.

To configure an OSPFv3 instance to distribute IPv4 routing information, a specific OSPFv3 instance must be configured using an instance ID within the range specified by the RFC. For unicast IPv4, the range is 64 to 95.

The following shows the basic configuration steps needed to create the OSPFv3 (ospf3) instance to carry IPv4 routing information. Once the instance is created, the OSPFv3 instance can be configured as needed for the associated network areas, interfaces, and other protocol attributes as you would for OSPFv2.

Example:
config
router
ospf3 64 10.20.1.3
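The instance ID 64 used in the example above falls in the IPv4 unicast range. A small sketch of the range lookup follows. The IPv4 unicast range comes from the text; the other ranges are taken from RFC 5838 and are included only for context, without implying that every address family is supported on a given platform. The function name is hypothetical.

```python
def address_family_for_instance_id(instance_id: int) -> str:
    """Map an OSPFv3 instance ID to the address family ranges of RFC 5838."""
    if not 0 <= instance_id <= 255:
        raise ValueError("instance ID must fit in one octet (0..255)")
    if instance_id <= 31:
        return "ipv6-unicast"
    if instance_id <= 63:
        return "ipv6-multicast"
    if instance_id <= 95:
        return "ipv4-unicast"      # range stated in the text: 64 to 95
    if instance_id <= 127:
        return "ipv4-multicast"
    return "unassigned"


family = address_family_for_instance_id(64)
```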

3.2.12. IP Fast-reroute (IP FRR) For OSPF and IS-IS Prefixes

This feature provides for the use of the Loop-Free Alternate (LFA) backup next-hop for forwarding in-transit and CPM generated IP packets when the primary next-hop is not available. This means that a node resumes forwarding IP packets to a destination prefix without waiting for the routing convergence.

When any of the following events occurs, the IGP instructs the IOM or the forwarding engine, in the fast path, to enable the LFA backup next-hop:

  1. OSPF/IS-IS interface goes operationally down: physical or local admin shutdown.
  2. Timeout of a BFD session to a next-hop when BFD is enabled on the OSPF/IS-IS interface.

IP FRR is supported on IPv4 and IPv6 OSPF/IS-IS prefixes forwarded in the base router instance to a network IP interface or to an IES SAP interface or spoke interface. It is also supported for VPRN VPN-IPv4 OSPF prefixes and VPN-IPv6 OSPF prefixes forwarded to a VPRN SAP interface or spoke interface.

IP FRR also provides an LFA backup next-hop for the destination prefix of a GRE tunnel used in an SDP or in VPRN auto-bind.

The LFA next-hop pre-computation by IGP is described in RFC 5286 Basic Specification for IP Fast Reroute: Loop-Free Alternates.

3.2.12.1. IP FRR Configuration

The user first enables Loop-Free Alternate (LFA) computation by SPF under the IS-IS routing protocol level or under the OSPF routing protocol instance level:

CLI Syntax:
config>router>isis>loopfree-alternate
config>router>ospf>loopfree-alternate
config>service>vprn>ospf>loopfree-alternate

The above commands instruct the IGP SPF to attempt to pre-compute both a primary next-hop and an LFA next-hop for every learned prefix. When found, the LFA next-hop is populated into the RTM along with the primary next-hop for the prefix.

Next, the user enables IP FRR to cause RTM to download to the IOM or the forwarding engine an LFA next-hop, when found by SPF, in addition to the primary next-hop for each prefix in the FIB.

CLI Syntax:
config>router>ip-fast-reroute

3.2.12.1.1. Reducing the Scope of the LFA Calculation by SPF

The user can instruct IGP to not include all interfaces participating in a specific IS-IS level or OSPF area in the SPF LFA computation. This provides a way of reducing the LFA SPF calculation where it is not needed.

CLI Syntax:
config>router>isis>level>loopfree-alternate-exclude
config>router>ospf>area>loopfree-alternate-exclude

The user can also exclude a specific IP interface from being included in the LFA SPF computation by IS-IS or OSPF:

CLI Syntax:
config>router>isis>if>loopfree-alternate-exclude
config>router>ospf>area>if>loopfree-alternate-exclude

When an interface is excluded from the LFA SPF in IS-IS, it is excluded in both level 1 and level 2. When the user excludes an interface from the LFA SPF in OSPF, it is excluded in all areas. However, the above OSPF command can only be executed under the area in which the specified interface is primary and once enabled, the interface is excluded in that area and in all other areas where the interface is secondary. If the user attempts to apply it to an area where the interface is secondary, the command will fail.

Finally, the user can apply the same above commands for an OSPF instance within a VPRN service:

CLI Syntax:
config>service>vprn>ospf>area>loopfree-alternate-exclude
config>service>vprn>ospf>area>if>loopfree-alternate-exclude

3.2.12.2. ECMP Considerations

Whenever the SPF computation determines that there is more than one primary next-hop for a prefix, it does not program any LFA next-hop in RTM. In this case, IP prefixes resolve to the multiple primary next-hops, which provides the required protection.

3.2.12.3. IP FRR and RSVP Shortcut (IGP Shortcut)

When both IGP shortcut and LFA are enabled in IS-IS or OSPF, and IP FRR is also enabled, the following additional IP FRR capabilities are supported:

  1. A prefix which is resolved to a direct primary next-hop can be backed up by a tunneled LFA next-hop.
  2. A prefix which is resolved to a tunneled primary next-hop will not have an LFA next-hop. It will rely on RSVP FRR for protection.

The LFA SPF is extended to use IGP shortcuts as LFA next-hops as explained in OSPF and IS-IS Support for Loop-Free Alternate Calculation.

3.2.12.4. IP FRR and BGP Next-Hop Resolution

An LFA backup next-hop will be able to protect the primary next-hop to reach a prefix advertised by a BGP neighbor. The BGP next-hop will thus remain up when the FIB switches from the primary IGP next-hop to the LFA IGP next-hop.

3.2.12.5. OSPF and IS-IS Support for Loop-Free Alternate Calculation

SPF computation in IS-IS and OSPF is enhanced to compute LFA routes for each learned prefix and populate them in RTM.

Figure 8 illustrates a simple network topology with point-to-point (P2P) interfaces and highlights three routes to reach router R5 from router R1.

Figure 8:  Example Topology with Primary and LFA Routes 

The primary route is via R3. The LFA route via R2 has two equal-cost paths to reach R5. The path by way of R3 protects against failure of link R1-R3. This route is computed by R1 by checking that the cost for R2 to reach R5 by way of R3 is lower than the cost by way of routers R1 and R3. This condition is referred to as the “loop-free criterion”.

The path by way of R2 and R4 can be used to protect against the failure of router R3. However, with the link R2-R3 metric set to 5, R2 sees the same cost to forward a packet to R5 by way of R3 and R4. Thus R1 cannot guarantee that enabling the LFA next-hop R2 will protect against R3 node failure. This means that the LFA next-hop R2 provides link-protection only for prefix R5. If the metric of link R2-R3 is changed to 8, then the LFA next-hop R2 provides node protection since a packet to R5 will always go over R4. In other words, it is required that R2 becomes loop-free with respect to both the source node R1 and the protected node R3.

Consider now the case where the primary next-hop uses a broadcast interface as illustrated in Figure 9.

Figure 9:  Example Topology with Broadcast Interfaces 

In order for next-hop R2 to be a link-protect LFA for route R5 from R1, it must be loop-free with respect to the R1-R3 link Pseudo-Node (PN). However, since R2 also has a link to that PN, its cost to reach R5 by way of the PN is the same as its cost by way of router R4. Thus R1 cannot guarantee that enabling the LFA next-hop R2 will protect against a failure impacting link R1-PN, since this may cause the entire subnet represented by the PN to go down. If the metric of link R2-PN is changed to 8, then the R2 next-hop will be an LFA providing link protection.

The following are the detailed equations for this criterion as provided in RFC 5286, Basic Specification for IP Fast Reroute: Loop-Free Alternates:

  1. Rule 1: Link-protect LFA backup next-hop (primary next-hop R1-R3 is a P2P interface):
    Distance_opt(R2, R5) < Distance_opt(R2, R1) + Distance_opt(R1, R5)
    and,
    Distance_opt(R2, R5) >= Distance_opt(R2, R3) + Distance_opt(R3, R5)
  2. Rule 2: Node-protect LFA backup next-hop (primary next-hop R1-R3 is a P2P interface):
    Distance_opt(R2, R5) < Distance_opt(R2, R1) + Distance_opt(R1, R5)
    and,
    Distance_opt(R2, R5) < Distance_opt(R2, R3) + Distance_opt(R3, R5)
  3. Rule 3: Link-protect LFA backup next-hop (primary next-hop R1-R3 is a broadcast interface):
    Distance_opt(R2, R5) < Distance_opt(R2, R1) + Distance_opt(R1, R5)
    and,
    Distance_opt(R2, R5) < Distance_opt(R2, PN) + Distance_opt(PN, R5)
    where PN stands for the R1-R3 link Pseudo-Node.
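
Rules 1 and 2 can be checked mechanically once the shortest-path distances are known. The following Python sketch is illustrative only: the link metrics are assumed values chosen to reproduce the behavior described for Figure 8 (they are not taken from the figure), and the function names are hypothetical. Rule 3 would use the same comparison with the PN in place of the primary next-hop R3.

```python
import heapq

def build_graph(links):
    """Turn {(a, b): metric} into a symmetric adjacency map."""
    graph = {}
    for (a, b), metric in links.items():
        graph.setdefault(a, {})[b] = metric
        graph.setdefault(b, {})[a] = metric
    return graph

def distance_opt(graph, source):
    """Dijkstra: lowest cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist[node]:
            continue  # stale heap entry
        for neighbor, metric in graph[node].items():
            new_cost = cost + metric
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist

def classify_lfa(graph, source, primary, candidate, dest):
    """Return 'node-protect', 'link-protect', or None for a P2P primary next-hop."""
    from_candidate = distance_opt(graph, candidate)
    from_source = distance_opt(graph, source)
    from_primary = distance_opt(graph, primary)
    # Loop-free criterion (first inequality of Rules 1 and 2).
    if not from_candidate[dest] < from_candidate[source] + from_source[dest]:
        return None
    # Second inequality: strict "<" gives Rule 2, otherwise Rule 1 applies.
    if from_candidate[dest] < from_candidate[primary] + from_primary[dest]:
        return "node-protect"
    return "link-protect"

# Assumed metrics (hypothetical): every link costs 5.
ASSUMED_METRICS = {
    ("R1", "R2"): 5, ("R1", "R3"): 5, ("R2", "R3"): 5,
    ("R2", "R4"): 5, ("R3", "R5"): 5, ("R4", "R5"): 5,
}

print(classify_lfa(build_graph(ASSUMED_METRICS), "R1", "R3", "R2", "R5"))
raised = {**ASSUMED_METRICS, ("R2", "R3"): 8}
print(classify_lfa(build_graph(raised), "R1", "R3", "R2", "R5"))
```

With the assumed metrics, the first call prints link-protect and the second, with the R2-R3 metric raised to 8, prints node-protect.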

For the case of a P2P interface, if SPF finds multiple LFA next-hops for a given primary next-hop, it applies the following selection algorithm:

  1. It will pick the node-protect type in favor of the link-protect type.
  2. If there is more than one LFA next-hop within the selected type, then it will pick one based on the least cost.
  3. If more than one LFA next-hop with the same cost results from step 2, then SPF will select the first one. This is not a deterministic selection and will vary following each SPF calculation.

For the case of a broadcast interface, a node-protect LFA is not necessarily a link protect LFA if the path to the LFA next-hop goes over the same PN as the primary next-hop. Similarly, a link protect LFA may not guarantee link protection if it goes over the same PN as the primary next-hop. The selection algorithm when SPF finds multiple LFA next-hops for a given primary next-hop is modified as follows:

  1. The algorithm splits the LFA next-hops into two sets:
    1. The first set consists of LFA next-hops which do not go over the PN used by primary next-hop.
    2. The second set consists of LFA next-hops which do go over the PN used by the primary next-hop.
  2. If there is more than one LFA next-hop in the first set, it will pick the node-protect type in favor of the link-protect type.
  3. If there is more than one LFA next-hop within the selected type, then it will pick one based on the least cost.
  4. If more than one LFA next-hop with equal cost results from step 3, SPF will select the first one from the remaining set. This is not a deterministic selection and will vary following each SPF calculation.
  5. If no LFA next-hop results from step 4, SPF will rerun steps 2 to 4 using the second set.

This algorithm is more flexible than strictly applying Rule 3 above, that is, the link-protect rule in the presence of a PN as specified in RFC 5286. A node-protect LFA which does not avoid the PN, and therefore does not guarantee link protection, can still be selected as a last resort. Similarly, a link-protect LFA which does not avoid the PN may still be selected as a last resort.
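
The two-set selection for broadcast interfaces can be sketched as follows. This is a minimal illustration, not the SR OS implementation: the candidate fields (`name`, `protection`, `cost`, `over_primary_pn`) and the function names are hypothetical, and list order stands in for the order in which SPF found the candidates.

```python
def pick(pool):
    """Prefer node-protect, then least cost, then the first candidate found."""
    if not pool:
        return None
    node_protect = [c for c in pool if c["protection"] == "node-protect"]
    chosen = node_protect or pool
    best_cost = min(c["cost"] for c in chosen)
    return next(c for c in chosen if c["cost"] == best_cost)

def select_lfa_broadcast(candidates):
    """Try LFAs that avoid the primary next-hop's PN first, then the rest."""
    avoids_pn = [c for c in candidates if not c["over_primary_pn"]]
    over_pn = [c for c in candidates if c["over_primary_pn"]]
    return pick(avoids_pn) or pick(over_pn)

candidates = [
    {"name": "R2", "protection": "node-protect", "cost": 10, "over_primary_pn": True},
    {"name": "R4", "protection": "link-protect", "cost": 30, "over_primary_pn": False},
]
print(select_lfa_broadcast(candidates)["name"])
```

In this example the link-protect candidate R4 is chosen because it avoids the PN. If R4 is removed, R2 is still returned as the last-resort choice.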

Both the computed primary next-hop and LFA next-hop for a given prefix are programmed into RTM.

3.2.12.5.1. Loop-Free Alternate Calculation in the Presence of IGP Shortcuts

In order to expand the coverage of the LFA backup protection in a network, RSVP LSP based IGP shortcuts can be placed selectively in parts of the network and be used as an LFA backup next-hop.

When IGP shortcut is enabled in IS-IS or OSPF on a given node, all RSVP LSP originating on this node and with a destination address matching the router-id of any other node in the network are included in the main SPF by default.

In order to limit the time it takes to compute the LFA SPF, the user must explicitly enable the use of an IGP shortcut as an LFA backup next-hop using one of two optional arguments of the existing LSP-level IGP shortcut command:

config>router>mpls>lsp>igp-shortcut [lfa-protect | lfa-only]

The lfa-protect option allows an LSP to be included in both the main SPF and the LFA SPFs. For a given prefix, the LSP can be used either as a primary next-hop or as an LFA next-hop but not both. If the main SPF computation selected a tunneled primary next-hop for a prefix, the LFA SPF will not select an LFA next-hop for this prefix and the protection of this prefix will rely on the RSVP LSP FRR protection. If the main SPF computation selected a direct primary next-hop, then the LFA SPF will select an LFA next-hop for this prefix but will prefer a direct LFA next-hop over a tunneled LFA next-hop.

The lfa-only option allows an LSP to be included in the LFA SPFs only, such that the introduction of IGP shortcuts does not impact the main SPF decision. For a given prefix, the main SPF always selects a direct primary next-hop. The LFA SPF will select an LFA next-hop for this prefix but will prefer a direct LFA next-hop over a tunneled LFA next-hop.

Thus, the selection algorithm applied when SPF finds multiple LFA next-hops for a given primary next-hop is modified as follows:

  1. The algorithm splits the LFA next-hops into two sets:
    1. The first set consists of direct LFA next-hops.
    2. The second set consists of tunneled LFA next-hops, after excluding the LSPs which use the same outgoing interface as the primary next-hop.
  2. The algorithm continues with the first set if it is not empty; otherwise, it continues with the second set.
  3. If the second set is used, the algorithm selects the tunneled LFA next-hop whose endpoint corresponds to the node advertising the prefix.
    1. If more than one tunneled next-hop exists, it selects the one with the lowest LSP metric.
    2. If still more than one tunneled next-hop exists, it selects the one with the lowest tunnel-id.
    3. If none is available, it continues with the rest of the tunneled LFAs in the second set.
  4. Within the selected set, the algorithm splits the LFA next-hops into two sets:
    1. The first set consists of LFA next-hops which do not go over the PN used by primary next-hop.
    2. The second set consists of LFA next-hops which go over the PN used by the primary next-hop.
  5. If there is more than one LFA next-hop in the selected set, it will pick the node-protect type in favor of the link-protect type.
  6. If there is more than one LFA next-hop within the selected type, then it will pick one based on the least total cost for the prefix. For a tunneled next-hop, it means the LSP metric plus the cost of the LSP endpoint to the destination of the prefix.
  7. If there is more than one LFA next-hop within the selected type (ecmp-case) in the first set, it will select the first direct next-hop from the remaining set. This is not a deterministic selection and will vary following each SPF calculation.
  8. If there is more than one LFA next-hop within the selected type (ecmp-case) in the second set, it will pick the tunneled next-hop with the lowest cost from the endpoint of the LSP to the destination prefix. If there remains more than one, it will pick the tunneled next-hop with the lowest tunnel-id.

3.2.12.5.2. Loop-Free Alternate Calculation for Inter-Area/Inter-Level Prefixes

When SPF resolves OSPF inter-area prefixes or IS-IS inter-level prefixes, it will compute an LFA backup next-hop to the same exit area/border router as used by the primary next-hop.

3.3. Loop-Free Alternate Shortest Path First (LFA SPF) Policies

An LFA SPF policy allows the user to apply specific criteria, such as admin group and SRLG constraints, to the selection of an LFA backup next-hop for a subset of prefixes that resolve to a specific primary next-hop. The feature introduces the concept of a route next-hop template to influence LFA backup next-hop selection.

3.3.1. Configuring a Route Next-Hop Policy Template

The LFA SPF policy consists of applying a route next-hop policy template to a set of prefixes.

The user first creates a route next-hop policy template under the global router context:

CLI Syntax:
config>router>route-next-hop-policy>template template-name

A policy template can be used in both IS-IS and OSPF to apply the specific criteria described in the next sub-sections to prefixes protected by LFA. Each instance of IS-IS or OSPF can apply the same policy template to one or more prefix lists and to one or more interfaces.

The commands within the route next-hop policy use the begin-commit-abort model introduced with BFD templates. The following are the steps to create and modify the template:

  1. To create a template, the user enters the name of the new template directly under route-next-hop-policy context.
  2. To delete a template which is not in use, the user enters the no form for the template name under the route-next-hop-policy context.
  3. The user enters the editing mode by executing the begin command under route-next-hop-policy context. The user can then edit and change any number of route next-hop policy templates. However, the parameter value will still be stored temporarily in the template module until the commit is executed under the route-next-hop-policy context. Any temporary parameter changes will be lost if the user enters the abort command before the commit command.
  4. The user is allowed to create or delete a template instantly once in the editing mode without the need to enter the commit command. Furthermore, the abort command if entered will have no effect on the prior deletion or creation of a template.

After the commit command is issued, IS-IS or OSPF will re-evaluate the templates and if there are any net changes, it will schedule a new LFA SPF to re-compute the LFA next-hop for the prefixes associated with these templates.

3.3.1.1. Configuring Affinity or Admin Group Constraints

Administrative groups (admin groups), also known as affinity, are used to tag IP interfaces which share a specific characteristic with the same identifier. For example, an admin group identifier could represent all links which connect to core routers, or all links which have bandwidth higher than 10G, or all links which are dedicated to a specific service.

The user first configures locally on each router the name and identifier of each admin group:

CLI Syntax:
config>router>if-attribute>admin-group group-name value group-value

A maximum of 32 admin groups can be configured per system.

Next the user configures the admin group membership of the IP interfaces used in LFA. The user can apply admin groups to IES, VPRN, or network IP interface.

CLI Syntax:
config>router>interface>if-attribute>admin-group group-name [group-name...(up to 5 max)]
config>service>ies>if>if-attribute>admin-group group-name [group-name...(up to 5 max)]
config>service>vprn>if>if-attribute>admin-group group-name [group-name...(up to 5 max)]

The user can add as many admin groups as configured to a given IP interface. The same command can be applied multiple times.

Note:

The configured admin-group membership will be applied in all levels/areas the interface is participating in. The same interface cannot have different memberships in different levels/areas.

The no form of the admin-group command under the interface deletes one or more of the admin-group memberships of the interface. It deletes all memberships if no group name is specified.

Finally, the user adds the admin group constraint into the route next-hop policy template:

CLI Syntax:
configure router route-next-hop-policy template template-name
include-group group-name [pref 1]
include-group group-name [pref 2]
exclude-group group-name

Each group is entered individually. The include-group statement instructs the LFA SPF selection algorithm to pick a subset of LFA next-hops among the links which belong to one or more of the specified admin groups. A link which does not belong to at least one of the admin-groups is excluded. However, a link can still be selected if it belongs to one of the groups in an include-group statement but also belongs to other groups which are not part of any include-group statement in the route next-hop policy.

The pref option is used to provide a relative preference for the admin group to select. A lower preference value means that LFA SPF will first attempt to select an LFA backup next-hop which is a member of the corresponding admin group. If none is found, then the admin group with the next higher preference value is evaluated. If no preference is configured for a given admin group name, then it is assumed to be the least preferred, that is, to have numerically the highest preference value.

When evaluating multiple include-group statements within the same preference, any link which belongs to one or more of the included admin groups can be selected as an LFA next-hop. There is no relative preference based on how many of those included admin groups the link is a member of.

The exclude-group statement simply prunes all links belonging to the specified admin group before making the LFA backup next-hop selection for a prefix.

If the same group name is part of both include and exclude statements, the exclude statement will win. In other words, the exclude statement can be viewed as having an implicit preference value of 0.
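
The interaction of include-group, pref, and exclude-group described above can be sketched in Python as follows. This is an illustrative model only: the function name, the data layout, and the link and group names are hypothetical, and `links` is assumed to hold only the links of already computed LFA candidates.

```python
import math

def eligible_links(links, include, exclude):
    """
    links:   {link_name: set of admin-group names the link belongs to}
    include: {group_name: pref value, or None when no pref is configured}
    exclude: set of excluded group names
    Returns the links in the most preferred include-group level that has members.
    """
    # exclude-group wins: prune first, as if it had an implicit preference of 0.
    survivors = {name: groups for name, groups in links.items()
                 if not (groups & exclude)}
    if not include:
        return sorted(survivors)

    def pref(group):
        value = include[group]
        return math.inf if value is None else value  # no pref = least preferred

    # Evaluate preference levels from the lowest (most preferred) value up.
    for level in sorted({pref(g) for g in include}):
        groups_at_level = {g for g in include if pref(g) == level}
        hits = sorted(n for n, groups in survivors.items()
                      if groups & groups_at_level)
        if hits:
            return hits
    return []

links = {"to-R2": {"core"}, "to-R4": {"metro"}, "to-R6": {"core", "legacy"}}
print(eligible_links(links, {"core": 1, "metro": 2}, {"legacy"}))
print(eligible_links(links, {"core": 1, "metro": 2}, {"core"}))
```

In the first call only to-R2 remains, because to-R6 is pruned by the excluded group. In the second call the core group appears in both statements, so the exclude wins and the selection falls through to the metro group.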

Note:

The admin-group criterion is applied before running the LFA next-hop selection algorithm. The modified LFA next-hop selection algorithm is shown in Modification to LFA Next-Hop Selection Algorithm.

3.3.1.2. Configuring SRLG Group Constraints

Shared Risk Link Group (SRLG) is used to tag IP interfaces which share a specific fate with the same identifier. For example, an SRLG group identifier could represent all links which use separate fibers but are carried in the same fiber conduit. If the conduit is accidentally cut, all the fiber links are cut, which means all IP interfaces using these fiber links will fail. Thus the user can enable the SRLG constraint to select an LFA next-hop for a prefix which avoids all interfaces that share fate with the primary next-hop.

The user first configures locally on each router the name and identifier of each SRLG group:

CLI Syntax:
config>router>if-attribute>srlg-group group-name value group-value

A maximum of 1024 SRLGs can be configured per system.

Next, the user configures the SRLG group membership of the IP interfaces used in LFA. The user can apply SRLG groups to an IES, VPRN, or network IP interface.

CLI Syntax:
config>router>if>if-attribute>srlg-group group-name [group-name...(up to 5 max)]
config>service>vprn>if>if-attribute>srlg-group group-name [group-name...(up to 5 max)]
config>service>ies>if>if-attribute>srlg-group group-name [group-name...(up to 5 max)]

The user can add a maximum of 64 SRLG groups to a given IP interface. The same command can be applied multiple times.

Note:

The configured SRLG membership will be applied in all levels/areas the interface is participating in. The same interface cannot have different memberships in different levels/areas.

The no form of the srlg-group command under the interface deletes one or more of the SRLG memberships of the interface. It deletes all SRLG memberships if no group name is specified.

Finally, the user adds the SRLG constraint into the route next-hop policy template:

CLI Syntax:
configure router route-next-hop-policy template template-name
srlg-enable

When this command is applied to a prefix, the LFA SPF will select an LFA next-hop, among the computed ones, which uses an outgoing interface that does not participate in any of the SRLGs of the outgoing interface used by the primary next-hop.
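
The SRLG constraint amounts to a set-disjointness test between the SRLGs of each candidate outgoing interface and those of the primary next-hop interface. A minimal Python sketch follows; the function name and the interface and SRLG names are hypothetical.

```python
def srlg_disjoint(candidates, primary_srlgs):
    """
    candidates:    {interface_name: set of SRLG names on that interface}
    primary_srlgs: set of SRLG names on the primary next-hop interface
    Keeps only the interfaces that share no SRLG with the primary next-hop.
    """
    return [name for name, srlgs in candidates.items()
            if not (srlgs & primary_srlgs)]

candidates = {
    "to-R2": {"conduit-A"},          # same conduit as the primary
    "to-R4": {"conduit-B"},
    "to-R6": set(),                  # no SRLG membership configured
}
print(srlg_disjoint(candidates, {"conduit-A"}))
```

Here to-R2 is pruned because it shares conduit-A with the primary next-hop interface, while to-R4 and to-R6 remain eligible.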

Note:

The SRLG and admin-group criteria are applied before running the LFA next-hop selection algorithm.

3.3.1.3. Interaction of IP and MPLS Admin Group and SRLG

The LFA SPF policy feature generalizes the use of admin-group and SRLG to other types of interfaces. To that end, it is important that the new IP admin groups and SRLGs be compatible with the ones already supported in MPLS. The following rules are implemented:

  1. The definition of admin groups and SRLGs is moved under the new config>router>if-attribute context. When upgrading customers to R12, all user-configured admin groups and SRLGs under the config>router>mpls context will automatically be moved into the new context. The configuration of admin groups and SRLGs under the config>router>mpls context in CLI is deprecated.
  2. The binding of an MPLS interface to a group, i.e., configuring membership of an MPLS interface in a group, continues to be performed under config>router>mpls>interface context.
  3. The binding of a local or remote MPLS interface to an SRLG in the SRLG database continues to be performed under the config>router>mpls>srlg-database context.
  4. The binding of an IS-IS/OSPF interface to a group is performed in the config>router>if>if-attribute or config>service>vprn>if>if-attribute or config>service>ies>if>if-attribute contexts. This is used by IS-IS or OSPF in route next-hop policies.
  5. Only the admin groups and SRLGs bound to an MPLS interface context or the SRLG database context are advertised in TE link TLVs and sub-TLVs when the traffic-engineering option is enabled in IS-IS or OSPF. IES and VPRN interfaces do not have their attributes advertised in TE TLVs.

3.3.1.4. Configuring Protection Type and Next-Hop Type Preferences

The user can select whether link protection or node protection is preferred in the selection of an LFA next-hop for all IP prefixes and LDP FEC prefixes to which a route next-hop policy template is applied. The default in the SR OS implementation is node protection. The implementation will fall back to the other type if no LFA next-hop of the preferred type is found.

The user can also select if tunnel backup next-hop or IP backup next-hop is preferred. The default in SR OS implementation is to prefer IP next-hop over tunnel next-hop. The implementation will fall back to the other type if no LFA next-hop of the preferred type is found.

The following options are thus added to the route next-hop policy template:

CLI Syntax:
configure router route-next-hop-policy template template-name
protection-type {link | node}
nh-type {ip | tunnel}

When the route next-hop policy template is applied to an IP interface, all prefixes using this interface as a primary next-hop will follow the protection type and next-hop type preference specified in the template.

3.3.2. Application of Route Next-Hop Policy Template to an Interface

Once the route next-hop policy template is configured with the desired policies, the user can apply it to all prefixes whose primary next-hop uses a specific interface name. The following commands achieve that:

CLI Syntax:
config>router>isis>if>lfa-policy-map route-nh-template template-name
config>router>ospf(3)>area>if>lfa-policy-map route-nh-template template-name
config>service>vprn>ospf(3)>area>if>lfa-policy-map route-nh-template template-name

When a route next-hop policy template is applied to an interface in IS-IS, it is applied in both level 1 and level 2. When a route next-hop policy template is applied to an interface in OSPF, it is applied in all areas. However, the above CLI command in an OSPF interface context can only be executed under the area in which the specified interface is primary and then applied in that area and in all other areas where the interface is secondary. If the user attempts to apply it to an area where the interface is secondary, the command will fail.

If the user has excluded the interface from LFA using the loopfree-alternate-exclude command, an LFA policy applied to the interface has no effect.

Finally, if the user applied a route next-hop policy template to a loopback interface or to the system interface, the command will not be rejected but it will result in no action taken.

3.3.3. Excluding Prefixes from LFA SPF

In the current SR OS implementation, the user can exclude an interface in IS-IS or OSPF, an OSPF area, or an IS-IS level from the LFA SPF.

This feature adds the ability to exclude prefixes from a prefix policy which matches on prefixes or on IS-IS tags:

CLI Syntax:
config>router>isis>loopfree-alternate-exclude prefix-policy prefix-policy1 [prefix-policy2…up to 5]
config>router>ospf(3)>loopfree-alternate-exclude prefix-policy prefix-policy1 [prefix-policy2…up to 5]
config>service>vprn>ospf(3)>loopfree-alternate-exclude prefix-policy prefix-policy1 [prefix-policy2…up to 5]

The prefix policy is configured as in existing SR OS implementation:

CLI Syntax:
config
    router
        policy-options
            [no] prefix-list prefix-list1
                prefix 62.225.16.0/24 prefix-length-range 32-32
            [no] policy-statement prefix-policy1
                entry 10
                    from
                        prefix-list "prefix-list1"
                    exit
                    action accept
                    exit
                exit
                default-action reject
            exit

If the user enabled the IS-IS prefix prioritization based on tag, it will also apply to SPF LFA. If a prefix is excluded from LFA, then it will not be included in LFA calculation regardless of its priority; however, the prefix tag will be used in the main SPF.

Note:

Prefix tags are not defined for OSPF protocol.

When the user does not explicitly specify a default action in the prefix policy, the loopfree-alternate-exclude command applies a default action of “reject”. Thus, regardless of whether the user explicitly added the statement “default-action reject” to the prefix policy, a prefix which does not match any entry in the policy is not excluded and is accepted into the LFA SPF.
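
The effect of the example prefix policy on LFA exclusion can be modeled with the Python standard library. This is only a sketch of the matching logic as described in this section: the function names are hypothetical, and the interpretation of prefix-length-range 32-32 as "host routes within 62.225.16.0/24" is an assumption.

```python
import ipaddress

def matches(prefix, base, min_len, max_len):
    """True if prefix lies inside base and its length is within the range."""
    net = ipaddress.ip_network(prefix)
    base_net = ipaddress.ip_network(base)
    return (net.version == base_net.version
            and net.subnet_of(base_net)
            and min_len <= net.prefixlen <= max_len)

def excluded_from_lfa(prefix):
    """Model of prefix-policy1 when used with loopfree-alternate-exclude."""
    # entry 10: from prefix-list1, action accept -> the prefix is excluded.
    if matches(prefix, "62.225.16.0/24", 32, 32):
        return True
    # default-action reject (explicit or implicit) -> the prefix is kept in LFA SPF.
    return False

for prefix in ("62.225.16.5/32", "62.225.16.0/24", "10.0.0.1/32"):
    print(prefix, excluded_from_lfa(prefix))
```

Only the /32 inside 62.225.16.0/24 is excluded; the other two prefixes fall through to the default action and remain in the LFA SPF.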

3.3.4. Modification to LFA Next-Hop Selection Algorithm

This feature modifies the LFA next-hop selection algorithm. The SRLG and admin-group criteria are applied before running the LFA next-hop selection algorithm. In other words, the following links are pruned first: links which do not belong to one or more of the admin-groups in the include-group statements, links which belong to admin-groups that have been explicitly excluded using the exclude-group statement, and links which belong to the SRLGs used by the primary next-hop of a prefix.

This pruning applies only to IP next-hops. Tunnel next-hops can have the admin-group or SRLG constraint applied to them under MPLS. For example, if a tunnel next-hop is using an outgoing interface which belongs to a given SRLG ID, the user can enable the srlg-frr option under the config>router>mpls context to be sure the RSVP LSP FRR backup LSP will not use an outgoing interface with the same SRLG ID. A prefix which is resolved to a tunnel next-hop is protected by the RSVP FRR mechanism and not by the IP FRR mechanism. Similarly, the user can include or exclude admin-groups for the RSVP LSP and its FRR bypass backup LSP in the MPLS context. The admin-group constraints will, however, be applied to the selection of the outgoing interface of both the LSP primary path and its FRR bypass backup path.

The following is the modified LFA selection algorithm which is applied to prefixes resolving to a primary next-hop which uses a given route next-hop policy template.

  1. Split the LFA next-hops into two sets:
    1. IP or direct next-hops.
    2. Tunnel next-hops after excluding the LSPs which use the same outgoing interface as the primary next-hop.
  2. Prune the IP LFA next-hops which use the following links:
    1. links which do not include one or more of the admin-groups in the include-group statements in the route next-hop policy template.
    2. links which belong to admin-groups which have been explicitly excluded using the exclude-group statement in the route next-hop policy template.
    3. links which belong to the SRLGs used by the primary next-hop of a prefix.
  3. Continue with the set indicated in the nh-type value in the route next-hop policy template if not empty; otherwise continue with the other set.
  4. Within IP next-hop set:
    1. prefer LFA next-hops which do not go over the Pseudo-Node (PN) used by the primary next-hop
    2. Within selected subset prefer the node-protect type or the link-protect type according to the value of the protection-type option in the route next-hop policy template.
    3. Within the selected subset, select the best admin-group(s) according to the preference specified in the value of the include-group option in the route next-hop policy template.
    4. Within selected subset, select lowest total cost of a prefix.
    5. If same total cost, select lowest router-id.
    6. If same router-id, select lowest interface-index.
  5. Within tunnel next-hop set:
    1. Select tunnel next-hops whose endpoint corresponds to the node owning or advertising the prefix.
      1. Within selected subset, select the one with the lowest cost (lowest LSP metric).
      2. If same lowest cost, select tunnel with lowest tunnel-index.
    2. If none is available, continue with rest of the tunnel LFA next-hop set.
    3. Prefer LFA next-hops which do not go over the Pseudo-Node (PN) used by the primary next-hop.
    4. Within selected subset prefer the node-protect type or the link-protect type according to the value of the protection-type in the route next-hop policy template.
    5. Within selected subset, select lowest total cost of a prefix. For a tunnel next-hop, it means the LSP metric plus the cost of the LSP endpoint to the destination of the prefix.
    6. If same total cost, select lowest endpoint to destination cost.
    7. If same endpoint to destination cost, select lowest router-id.
    8. If same router-id, select lowest tunnel-index.
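
The ordering applied within the IP next-hop set (step 4) is a chain of tie-breakers, which can be modeled as a lexicographic comparison. The following Python sketch is illustrative only: it assumes pruning (step 2) has already been done, that router-ids are dotted IPv4 strings, and that each candidate carries the hypothetical fields shown; `group_pref` stands for the best include-group preference of the link, with a missing value treated as least preferred.

```python
import ipaddress
import math

def select_ip_lfa(candidates, protection_type="node"):
    """Pick the best IP LFA next-hop using the step 4 tie-break order."""
    def key(c):
        return (
            c["over_primary_pn"],                  # 4.1: avoid the primary's PN
            c["protection"] != protection_type,    # 4.2: preferred protection type
            c.get("group_pref", math.inf),         # 4.3: best admin-group preference
            c["total_cost"],                       # 4.4: lowest total cost
            ipaddress.ip_address(c["router_id"]),  # 4.5: lowest router-id
            c["if_index"],                         # 4.6: lowest interface-index
        )
    return min(candidates, key=key) if candidates else None

candidates = [
    {"name": "A", "over_primary_pn": False, "protection": "link", "group_pref": 1,
     "total_cost": 10, "router_id": "10.0.0.2", "if_index": 3},
    {"name": "B", "over_primary_pn": False, "protection": "node", "group_pref": 2,
     "total_cost": 20, "router_id": "10.0.0.1", "if_index": 2},
    {"name": "C", "over_primary_pn": True, "protection": "node", "group_pref": 1,
     "total_cost": 5, "router_id": "10.0.0.3", "if_index": 1},
]
print(select_ip_lfa(candidates, "node")["name"])
print(select_ip_lfa(candidates, "link")["name"])
```

With protection-type node, candidate B is selected even though A has a better group preference and cost, because the protection type is compared first. With protection-type link, A is selected. Candidate C loses in both cases because it goes over the PN used by the primary next-hop.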

3.4. LFA Protection using Segment Routing Backup Node SID

One of the challenges in MPLS deployments across multiple IGP areas or domains, such as in a seamless MPLS design, is the provision of FRR local protection in access and metro domains that make use of a ring, a square, or a partial mesh topology. In order to implement IP, LDP, or SR FRR in such topologies, one needs to implement the remote LFA feature. Remote LFA provides a Segment Routing (SR) tunneled LFA next-hop for an IP prefix, an LDP tunnel, or an SR tunnel. For prefixes outside of the area or domain, the access or aggregation router must push four labels: service label, BGP label for destination PE, LDP/RSVP/SR label to reach the exit ABR/ASBR, and finally one label for the remote LFA next-hop. Small routers deployed in these parts of the network have limited MPLS label stack size support.

Figure 10 illustrates the label stack required for the primary next-hop and the remote LFA next-hop computed by aggregation node AGN2 for the inter-area prefix of a remote PE. For an inter-area BGP label unicast route prefix for which ABR1 is the primary exit ABR, AGN2 resolves it to the transport tunnel of ABR1 and thus uses the remote LFA next-hop of ABR1 for protection. The primary next-hop uses two transport labels plus a service label. The remote LFA next-hop for ABR1 uses PQ node AGN5 and pushes three transport labels plus a service label.

Seamless MPLS with Fast Restoration requires up to four labels to be pushed by AGN2, as shown in Figure 10.

Figure 10:  Label Stack for Remote LFA in Ring Topology 

The objective of the LFA protection with a Backup Node SID feature is to reduce the label stack pushed by AGN2 for BGP label unicast inter-area prefixes. When link AGN2-AGN1 fails, packets are forced away from the failure and towards ABR2, which acts as the backup for ABR1 (and vice-versa when ABR2 is the primary exit ABR for the BGP label unicast inter-area prefix). This requires that ABR2 advertise a special label for the loopback of ABR1 which attracts packets normally destined to ABR1. ABR2 forwards these packets to ABR1 via the inter-ABR link.

As a result, AGN2 will push the label advertised by ABR2 to back up ABR1 on top of the BGP label for the remote PE and the service label. This keeps the label stack for the LFA next-hop the same size as that of the primary next-hop. It is also the same size as the remote LFA next-hop for a local prefix within the ring.
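
The label stack sizes discussed in this section can be compared with a short Python sketch. The label descriptions are placeholders written for illustration, not actual label values, and each stack is listed from the top of the stack down.

```python
SERVICE = "service label"
BGP = "BGP label for the remote PE"

# Each stack is listed top of stack first (illustrative descriptions only).
stacks = {
    "primary next-hop": ["transport label to ABR1", BGP, SERVICE],
    "remote LFA next-hop": ["label for PQ node AGN5",
                            "transport label to ABR1", BGP, SERVICE],
    "backup node SID LFA next-hop": ["backup node SID label advertised by ABR2",
                                     BGP, SERVICE],
}

for name, stack in stacks.items():
    print(f"{name}: {len(stack)} labels")
```

The remote LFA next-hop needs four labels, while the backup node SID next-hop needs three, the same as the primary next-hop.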

3.4.1. Configuring LFA using Backup Node SID

Enable this feature by configuring a backup node SID at an ABR/ASBR that acts as a backup to the primary exit ABR/ASBR of inter-area/inter-AS routes learned as BGP labeled routes.

CLI Syntax:
config>router>ospf>segment-routing
backup-node-sid ip-prefix/prefix-length index 0..4294967295
backup-node-sid ip-prefix/prefix-length label 1..4294967295

The user can enter either a label or an index for the backup node SID.

Note:

This feature only allows the configuration of a single backup node SID per IGP instance and per ABR/ASBR. In other words, only a pair of ABR/ASBR nodes can back up each other in a given IGP domain. Each time the user invokes the above command within the same IGP instance, it will override any previous configuration of the backup node SID. The same ABR/ASBR can, however, participate in multiple IGP instances and provide a backup support within each instance.

3.4.2. Detailed Operation of LFA Protection using Backup Node SID

As shown in Figure 11, LFA for seamless MPLS supports environments where the boundary routers are either:

  1. ABR nodes that connect multiple domains with iBGP, each domain using a different area of the same IGP instance
  2. ASBR nodes that connect domains running different IGP instances and use iBGP within a domain and eBGP to the other domains.

Figure 11:  Backup ABR Node SID 

The following steps describe the configuration and behavior of LFA Protection using Backup Node SID:

  1. The user configures node SID 100 in ABR1 for its loopback prefix 1.1.1.1/32. This is the regular node SID. ABR1 advertises the prefix SID sub-TLV for this node SID in IGP and installs the ILM using a unique label.
  2. Each router receiving the prefix sub-TLV for node SID 100 resolves it as explained in Segment Routing in Shortest Path Forwarding. However, changes to the programming of the backup NHLFE of node SID 100 based on receiving the backup node SID for prefix 1.1.1.1/32 are defined in Duplicate SID Handling.
  3. The user configures a backup node SID 200 in ABR2 for the loopback 1.1.1.1/32 of ABR1. The SID value must be different from that assigned by ABR1 for the same prefix. ABR2 installs the ILM, which performs a swap operation from the label of SID 200 to that of SID 100. The ILM must point to a direct link and next-hop to reach 1.1.1.1/32 of ABR1 as its primary next-hop. IGP examines all adjacencies established in the same area as that of prefix 1.1.1.1/32 and determines which ones have ABR1 as a direct neighbor and with the best cost. If more than one adjacency has the best cost, IGP selects the one with the lowest interface index. If there is no adjacency to reach ABR1, the prefix SID for the backup node is flushed and is not resolved. This is to prevent using any other non-direct path to reach ABR1. As a result, any traffic received on the ILM of SID 200 will be blackholed.
  4. If resolved, ABR2 advertises the prefix SID sub-TLV for this backup node SID 200 and indicates in the SR Algorithm field that a modified SPF algorithm, referred to as “Backup-constrained-SPF”, is required to resolve this node SID.
  5. Each router receiving the prefix sub-TLV for the backup node SID 200 performs the following steps.
    Note:

    The following resolution steps do not require a CLI command to be enabled.

    1. Determines which router is being backed up. This is achieved by checking the router-id owner of the prefix sub-TLV that was advertised with the same prefix but without the backup flag and which is used as the best route for the prefix. In this case, it should be ABR1. Then the router runs a modified SPF by removing node ABR1 from the topology to resolve the backup node SID 200. The primary next-hop should point to the path to ABR2 in the counter-clockwise direction of the ring.
      Note:

      The router will not compute an LFA or a remote LFA for node SID 200 because the main SPF used a modified topology.

    2. Installs the ILM and primary NHLFE for the backup node SID.
      Note:

      Only a swap label operation is configured by all routers for the backup node SID. There is no push operation and no tunnel for the backup node SID is added into the TTM.

    3. Programs the backup node SID as the LFA backup for the SR tunnel to node SID of 1.1.1.1/32 of ABR1. In other words, each router overrides the remote LFA backup for prefix 1.1.1.1/32 which is normally PQ node AGN5.
    4. If the router is adjacent to ABR1, for example AGN1, it also programs the backup node SID as the LFA backup for the protection of any adjacency SID to ABR1.
  6. When node AGN2 resolves a BGP label route for an inter-area prefix for which the primary ABR exit router is ABR1, it will use the backup node SID of ABR1 as the remote LFA backup instead of the SID to the PQ node (AGN5 in this example) to save on the pushed label stack.
    AGN2 continues to resolve the prefix SID for any remote PE prefix which is summarized into the local area of AGN2 as usual. AGN2 programs a primary next-hop and an RLFA next-hop. The RLFA will use AGN5 as the PQ node and will push two labels, as it does for an intra-area prefix SID. There is no need to use the backup node SID for this prefix SID and force its backup path to go to ABR1. The backup path may exit from ABR2 if the cost from ABR2 to the destination prefix is shorter.
  7. If the user excludes a link from LFA in the IGP instance (config>router>ospf>area>interface>loopfree-alternate-exclude or config>router>isis>interface>loopfree-alternate-exclude) then a backup Node SID which resolves to that interface will not be used as a remote LFA backup in the same way as in a regular LFA or a PQ remote LFA next-hop behavior.
  8. If the OSPF neighbor of a router is put into overload or if the metric of an OSPF interface to that neighbor is set to LSInfinity (0xFFFF), a Backup Node SID that resolves to that neighbor will not be used as a remote LFA backup in the same way as in a regular LFA or a PQ remote LFA next-hop behavior.
  9. If the IS-IS neighbor of a router is put into overload or if the metric of an IS-IS interface to that neighbor is set to overload max-metric (0xfffffe), a Backup Node SID that resolves to that neighbor will not be used as a remote LFA backup in the same way as in a regular LFA or a PQ remote LFA next-hop behavior.
    Note:

    Other routers in the network will not forward transit traffic to the router in overload.

  10. If the IS-IS interface to a neighbor is set to maximum link metric (0xffffff), a Backup Node SID that resolves to that neighbor will not be used as a remote LFA backup in the same way as in a regular LFA or a PQ remote LFA next-hop behavior.
  11. LFA policy is supported for IP next-hops only. It is not supported with tunnel next-hops such as IGP shortcuts or remote LFA tunnels. A Backup Node SID is also a tunnel next-hop and, as such, a user-configured LFA policy is not applied to check constraints such as admin-groups and SRLG against the outgoing interface of the selected Backup Node SID.
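Two of the steps above lend themselves to a short Python sketch: the adjacency selection performed by ABR2 in step 3, and the modified SPF of step 5 that removes the backed-up node from the topology. The seven-node ring, the uniform link cost of 10, and all function names are assumptions made for this illustration; they are not taken from Figure 11 or from the SR OS implementation.

```python
import heapq


def select_direct_adjacency(adjacencies, backed_up):
    """Pick the interface ABR2 uses to reach the backed-up node directly.

    adjacencies: list of (neighbor, cost, interface_index) tuples.
    Returns the chosen interface index, or None when there is no direct
    adjacency, in which case the backup node SID is not resolved.
    """
    candidates = [(cost, if_index)
                  for neighbor, cost, if_index in adjacencies
                  if neighbor == backed_up]
    # Best cost first, lowest interface index as the tie-breaker.
    return min(candidates)[1] if candidates else None


def first_hop(graph, source, target, removed=None):
    """Dijkstra SPF returning (cost, first hop), or None if unreachable.

    The optional 'removed' node is pruned from the topology, which is the
    idea behind the modified SPF used for the backup node SID.
    """
    best = {source: 0}
    heap = [(0, source, None)]  # (cost, node, first hop towards node)
    visited = set()
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            return cost, hop
        for neighbor, link_cost in graph[node].items():
            if neighbor == removed or neighbor in visited:
                continue
            new_cost = cost + link_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(
                    heap, (new_cost, neighbor, hop if hop else neighbor))
    return None


def ring(nodes, cost=10):
    """Build a bidirectional ring over the given node names."""
    graph = {name: {} for name in nodes}
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        graph[a][b] = cost
        graph[b][a] = cost
    return graph


# Hypothetical ring: ABR1-AGN1-AGN2-AGN3-AGN4-AGN5-ABR2-ABR1.
RING = ring(["ABR1", "AGN1", "AGN2", "AGN3", "AGN4", "AGN5", "ABR2"])
```

In this assumed ring, the regular SPF from AGN2 to ABR2 goes through AGN1 and ABR1, while the SPF with ABR1 removed points away from ABR1, through AGN3.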

3.4.3. Duplicate SID Handling

When IGP in a router issues or receives an LSA/LSP containing a prefix SID sub-TLV for a node SID or a backup node SID with a SID value that is a duplicate of an existing SID or backup node SID, the following resolution is followed.

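Table 10:  Handling of Duplicate SID  

                      | New LSA/LSP
Old LSA/LSP           | Backup Node SID | Local Backup Node SID | Node SID      | Local Node SID
----------------------+-----------------+-----------------------+---------------+---------------
Backup Node SID       | Old             | New                   | New           | New
Local Backup Node SID | Old             | Equal                 | New           | New
Node SID              | Old             | Old                   | Equal/Old (1) | Equal/New (2)
Local Node SID        | Old             | Old                   | Equal/Old (1) | Equal/Old (1)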

    Notes:

  1. Equal/Old means the following:
    1. If the prefix is duplicate, it is equal and thus no change is needed. Keep the old LSA/LSP.
    2. If the prefix is not duplicate, then still keep the old LSA/LSP.
  2. Equal/New means the following:
    1. If the prefix is also duplicate, it is equal and thus no change is needed. Keep the old LSA/LSP.
    2. If the prefix is not duplicate, then pick a new prefix and use the new LSA/LSP.
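The resolution rules of Table 10 can be expressed as a lookup, as in the following Python sketch. It assumes that the rows of the table are the old LSA/LSP and the columns are the new LSA/LSP, and that a plain "Equal" entry means no change is needed. The names used are hypothetical.

```python
# OUTCOME[old_kind][new_kind], assuming rows = old LSA/LSP, columns = new.
OUTCOME = {
    "backup": {
        "backup": "old", "local-backup": "new",
        "node": "new", "local-node": "new"},
    "local-backup": {
        "backup": "old", "local-backup": "equal",
        "node": "new", "local-node": "new"},
    "node": {
        "backup": "old", "local-backup": "old",
        "node": "equal/old", "local-node": "equal/new"},
    "local-node": {
        "backup": "old", "local-backup": "old",
        "node": "equal/old", "local-node": "equal/old"},
}


def resolve_duplicate(old_kind, new_kind, same_prefix):
    """Return "old" or "new": which LSA/LSP wins for a duplicate SID value.

    same_prefix tells whether the prefix is a duplicate as well, which is
    what the Equal/Old and Equal/New notes depend on.
    """
    outcome = OUTCOME[old_kind][new_kind]
    if outcome == "equal":
        return "old"  # assumed: equal entries need no change
    if outcome in ("equal/old", "equal/new"):
        # Same prefix: equal, keep the old one. Otherwise use the second half.
        return "old" if same_prefix else outcome.split("/")[1]
    return outcome
```

For example, a new local node SID that collides with an old remote node SID on a different prefix resolves to the new LSA/LSP, while the same collision on the same prefix keeps the old one.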

3.4.4. OSPF Control Plane Extensions

All routers supporting this feature must advertise support of the new Algorithm “Backup-constrained-SPF” of value 2 in the SR-Algorithm TLV, which is advertised in the Router Information Opaque LSA. This is in addition to the default supported algorithm “IGP-metric-based-SPF” of value 0. The following shows the encoding of the prefix SID sub-TLV to indicate a node SID of type backup and to indicate the modified SPF algorithm in the SR Algorithm field. The values used in the Flags field and in the Algorithm field are SR OS proprietary.

The new Algorithm (0x2) field and values are used by this feature.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Type             |             Length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Flags     |   Reserved    |      MT-ID    |Algorithm (0x2)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     SID/Index/Label (variable)                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Table 11:  OSPF Control Plane Extension Fields  

Field           | Value
----------------+------------------------------------------------------------
Type            | 2
Length          | variable
Flags           | 1 octet field. The defined flags are shown after this table and described in Table 12.
MT-ID           | Multi-Topology ID, as defined in RFC 4915.
Algorithm       | One octet identifying the algorithm the Prefix-SID is associated with. A value of (0x2) indicates the modified Shortest Path First (SPF) algorithm, which removes from the topology the node which is backed up by the backup node SID. This value is SR OS proprietary.
SID/Index/Label | According to the V and L flags, it contains either a 32-bit index defining the offset in the SID/Label space advertised by this router, or a 24-bit label where the 20 rightmost bits are used for encoding the label value.

The following flags are defined; the “B” flag is new:

     0  1  2  3  4  5  6  7
   +--+--+--+--+--+--+--+--+
   |  |NP|M |E |V |L |B |  |
   +--+--+--+--+--+--+--+--+

Table 12:  OSPF Control Plane Extension Flags  

Flag       | Description
-----------+------------------------------------------------------------
NP-Flag    | No-PHP flag. If set, then the penultimate hop MUST NOT pop the Prefix-SID before delivering the packet to the node that advertised the Prefix-SID.
M-Flag     | Mapping Server Flag. If set, the SID is advertised from the Segment Routing Mapping Server functionality as described in [I-D.filsfils-spring-segment-routing-ldp-interop].
E-Flag     | Explicit-Null Flag. If set, any upstream neighbor of the Prefix-SID originator MUST replace the Prefix-SID with a Prefix-SID having an Explicit-NULL value (0 for IPv4) before forwarding the packet.
V-Flag     | Value/Index Flag. If set, then the Prefix-SID carries an absolute value. If not set, then the Prefix-SID carries an index.
L-Flag     | Local/Global Flag. If set, then the value/index carried by the Prefix-SID has local significance. If not set, then the value/index carried by this Sub-TLV has global significance.
B-Flag     | This flag is used by the Protection using Backup Node SID feature. If set, then the SID is a backup SID for the prefix. This value is SR OS proprietary.
Other bits | Reserved. These MUST be zero when sent and are ignored when received.
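The sub-TLV layout and flag positions described above can be exercised with a small Python encoder and decoder. This is a sketch under several assumptions: bit 0 of the Flags octet is taken to be the most significant bit (the usual convention for such diagrams), a 3-octet label is used only when both the V and L flags are set, and the sub-TLV is padded to a 4-octet boundary with the padding excluded from the Length field. The function and constant names are hypothetical.

```python
import struct

SUBTLV_TYPE = 2
ALGO_BACKUP_CONSTRAINED_SPF = 2

# Flag masks, assuming bit 0 in the diagram is the most significant bit.
FLAG_NP = 0x40
FLAG_M = 0x20
FLAG_E = 0x10
FLAG_V = 0x08
FLAG_L = 0x04
FLAG_B = 0x02  # backup node SID


def encode_prefix_sid(flags, sid, mt_id=0, algorithm=0):
    """Encode a prefix SID sub-TLV as bytes (illustrative only)."""
    if flags & FLAG_V and flags & FLAG_L:
        if not 0 <= sid < 2 ** 20:
            raise ValueError("label must fit in 20 bits")
        sid_bytes = sid.to_bytes(3, "big")   # 24-bit field, 20-bit label
    else:
        sid_bytes = sid.to_bytes(4, "big")   # 32-bit index
    length = 4 + len(sid_bytes)              # flags..algorithm plus SID
    body = struct.pack("!BBBB", flags, 0, mt_id, algorithm) + sid_bytes
    padding = b"\x00" * (-length % 4)        # assumed 4-octet alignment
    return struct.pack("!HH", SUBTLV_TYPE, length) + body + padding


def decode_prefix_sid(data):
    """Decode bytes produced by encode_prefix_sid into a dictionary."""
    tlv_type, length = struct.unpack_from("!HH", data, 0)
    flags, _reserved, mt_id, algorithm = struct.unpack_from("!BBBB", data, 4)
    sid = int.from_bytes(data[8:4 + length], "big")
    return {"type": tlv_type, "length": length, "flags": flags,
            "mt_id": mt_id, "algorithm": algorithm, "sid": sid,
            "backup": bool(flags & FLAG_B)}
```

A backup node SID with index 200 would then be encoded with the B flag set and the Algorithm field set to 2, for example `encode_prefix_sid(FLAG_NP | FLAG_B, 200, algorithm=ALGO_BACKUP_CONSTRAINED_SPF)`.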

3.5. Segment Routing in Shortest Path Forwarding

Segment Routing in Shortest Path Forwarding is configured in OSPF using the same procedures as those used to configure it in IS-IS. See Segment Routing in Shortest Path Forwarding in the IS-IS section for more information.

3.6. OSPF LSA Filtering

The SR OS OSPF implementation supports a configuration option to filter outgoing OSPF LSAs on selected OSPFv2 or OSPFv3 interfaces. This feature should be used with some caution because it goes against the principle that all OSPF routers in an area should have a synchronized Link State Database (LSDB), but it can be a useful resource saving in certain hub and spoke topologies where learning routes through OSPF is only needed in one direction (for example, from spoke to hub).

Three filtering options are available (configurable per interface):

  1. Do not flood any LSAs out the interface. This option is suitable if the neighbor is simply-connected and has a statically configured default route with the address of this interface as next-hop.
  2. Flood a minimum set of self-generated LSAs out the interface (e.g. router-LSA, intra-area-prefix-LSA, and link-LSA and network-LSA corresponding to the connected interface); suppress all non-self-originated LSAs. This option is suitable if the neighbor is simply-connected and has a statically configured default route with a loopback or system interface address as next-hop.
  3. Flood a minimum set of self-generated LSAs (e.g. router-LSA, intra-area-prefix-LSA, and link-LSA and network-LSA corresponding to the connected interface) and all self-generated type-3, type-5 and type-7 LSAs advertising a default route (0/0) out the interface; suppress all other flooded LSAs. This option is suitable if the neighbor is simply-connected and does not have a statically configured default route.
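The three filtering options can be summarized as a per-LSA flooding decision, sketched below in Python. The option numbers follow the list above; the LSA model, its field names, and the reading that only the link-LSA and network-LSA are tied to the connected interface are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Lsa:
    """Hypothetical, simplified view of an LSA for filtering purposes."""
    kind: str                        # "router", "network", "link",
                                     # "intra-area-prefix", "type-3", ...
    self_originated: bool
    for_connected_interface: bool = False
    advertises_default_route: bool = False


ALWAYS_IN_MINIMUM_SET = {"router", "intra-area-prefix"}
PER_INTERFACE_IN_MINIMUM_SET = {"link", "network"}   # assumed reading
DEFAULT_CAPABLE = {"type-3", "type-5", "type-7"}


def in_minimum_set(lsa):
    """True for the minimum set of self-generated LSAs."""
    if not lsa.self_originated:
        return False
    if lsa.kind in ALWAYS_IN_MINIMUM_SET:
        return True
    return (lsa.kind in PER_INTERFACE_IN_MINIMUM_SET
            and lsa.for_connected_interface)


def should_flood(lsa, option):
    """Decide whether an LSA is flooded out the interface for options 1-3."""
    if option == 1:
        return False                 # flood nothing
    if option == 2:
        return in_minimum_set(lsa)
    if option == 3:
        own_default = (lsa.self_originated
                       and lsa.kind in DEFAULT_CAPABLE
                       and lsa.advertises_default_route)
        return in_minimum_set(lsa) or own_default
    raise ValueError("unknown filtering option")
```

With option 3, a self-generated type-5 LSA carrying 0/0 is flooded, while a type-5 LSA for any other prefix, or one learned from another router, is suppressed.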

3.7. FIB Prioritization

The RIB processing of specific routes can be prioritized through the use of the rib-priority command. This command allows specific routes to be prioritized through the protocol processing so that updates are propagated to the FIB as quickly as possible.

Configuring the rib-priority command either within the global OSPF or OSPFv3 routing context or under a specific OSPF/OSPFv3 interface context enables this feature. Under the global OSPF context, a prefix list can be specified that identifies which route prefixes should be considered high priority. If the rib-priority high command is configured under an OSPF interface context, then all routes learned through that interface are considered high priority.

The routes that have been designated as high priority will be the first routes processed and then passed to the FIB update process so that the forwarding engine can be updated. All known high priority routes should be processed before the OSPF routing protocol moves on to other standard priority routes. This feature will have the most impact when there are a large number of routes being learned through the OSPF routing protocols.
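The processing order described above can be sketched in Python as follows. The route representation, the function names, and the choice to treat a prefix-list entry as matching any route that falls within it are assumptions for illustration; the sketch does not reflect how SR OS evaluates prefix lists internally.

```python
import ipaddress


def is_high_priority(route, high_prefixes, high_interfaces):
    """Return True if a route is high priority by interface or by prefix.

    route: dict with "prefix" (string) and "interface" (string) keys.
    high_prefixes: iterable of ipaddress network objects.
    high_interfaces: collection of interface names.
    """
    if route["interface"] in high_interfaces:
        return True
    net = ipaddress.ip_network(route["prefix"])
    return any(net.version == p.version and net.subnet_of(p)
               for p in high_prefixes)


def processing_order(routes, high_prefixes=(), high_interfaces=()):
    """Order routes so that all high-priority routes are handled first.

    sorted() is stable, so routes keep their relative order within each
    priority class.
    """
    return sorted(
        routes,
        key=lambda r: not is_high_priority(r, high_prefixes, high_interfaces))
```

Routes matched by the global prefix list or learned through a high-priority interface are moved to the front; everything else follows in its original order.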

3.8. OSPF Configuration Process Overview

Figure 12 displays the process to provision basic OSPF parameters.

Figure 12:  OSPF Configuration and Implementation Flow 

3.9. Configuration Notes

This section describes OSPF configuration caveats.

3.9.1. General

  1. Before OSPF can be configured, the router ID must be configured.
  2. The basic OSPF configuration includes at least one area and an associated interface.
  3. All default and command parameters can be modified.

3.9.1.1. OSPF Defaults

The following list summarizes the OSPF configuration defaults:

  1. By default, a router has no configured areas.
  2. An OSPF instance is created in the administratively enabled state.