What can be done (and what has been done) about it
This is the final installment in the BGP security series and covers BGP configuration best practices.
Since BGP lacks an intrinsic mechanism to secure routing, several mechanisms have been adopted to overcome this limitation. Among them, RPKI has seen the fastest growth in deployment and importance in recent years. RPKI (Resource Public Key Infrastructure) is a mechanism to cryptographically sign records, called Route Origin Authorizations (ROAs), that associate a network prefix with its legitimate origin AS (Figure 1). ROAs are kept in an RPKI database, and when a router receives a BGP announcement it checks against the RPKI database to verify that the rightmost AS of the AS path is the expected origin for the NLRI carried in the BGP announcement. This is usually called RPKI validation.
Figure 1: ROA examples
The result of the check can be valid if the origin is the AS number specified in the ROA, invalid if the origin is not the AS number specified in the ROA, or not-found if no ROA for the announced prefix has been found. Unfortunately, although RPKI is a very powerful mechanism and it is important that everyone signs their prefixes, it is still not widely adopted (Figure 2). Moreover, it is well known that RPKI validation alone is not enough to catch every illegitimate announcement.
Figure 2: RPKI adoption as of 2 March 2020
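The three validation outcomes described above can be sketched in a few lines of Python. This is a minimal illustration, not a router implementation: the `Roa` class and `validate_origin` function are hypothetical names, the prefixes and AS numbers come from the documentation ranges, and the `max_length` field (which real ROAs carry alongside the prefix and origin AS) goes slightly beyond what the text above describes.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Roa:
    """Hypothetical, simplified ROA record."""
    prefix: str      # e.g. "192.0.2.0/24"
    max_length: int  # longest prefix length this ROA authorizes
    origin_as: int   # AS allowed to originate the prefix


def validate_origin(prefix: str, as_path: list[int], roas: list[Roa]) -> str:
    """Return "valid", "invalid" or "not-found" for an announcement."""
    announced = ip_network(prefix)
    origin = as_path[-1]  # rightmost AS of the AS path
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # Skip ROAs of the other address family or that do not cover the prefix.
        if announced.version != roa_net.version or not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.origin_as == origin and announced.prefixlen <= roa.max_length:
            return "valid"
    # A covering ROA exists but none matches: invalid. No covering ROA: not-found.
    return "invalid" if covered else "not-found"


roas = [Roa("192.0.2.0/24", 24, 64500)]
print(validate_origin("192.0.2.0/24", [64510, 64500], roas))  # valid
print(validate_origin("192.0.2.0/24", [64510, 64501], roas))  # invalid
print(validate_origin("198.51.100.0/24", [64500], roas))      # not-found
```

Note that only the last element of the AS path is inspected; the rest of the path plays no role in the result.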
For example, consider the scenario depicted in Figure 3. This is the same scenario described in the previous post, except that AS4 has now signed its prefix P in the RPKI database. Suppose that the only AS performing RPKI validation is AS1. When AS1 receives the leaked route from AS3, it checks whether the rightmost AS in the AS_PATH (AS4) is the expected origin. The result of the check is valid because the ROA is found and AS4 is specified as the legitimate origin for P, so AS1 accepts the leaked route even though it performs RPKI validation.
Figure 3: RPKI and route leaks
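A toy version of this scenario shows why the leak goes unnoticed. The sketch below assumes a simplified exact-match check; the `ROAS` table, the `origin_check` function, the documentation prefix standing in for P, and the intermediate hop AS2 in the leaked path are all illustrative assumptions rather than details taken from the figure.

```python
# Hypothetical ROA table: prefix P (a documentation prefix here) -> authorized origin AS.
ROAS = {"203.0.113.0/24": 4}  # AS4 has signed P


def origin_check(prefix: str, as_path: list[int]) -> str:
    """Simplified exact-match origin validation: only the rightmost AS is inspected."""
    if prefix not in ROAS:
        return "not-found"
    return "valid" if as_path[-1] == ROAS[prefix] else "invalid"


P = "203.0.113.0/24"
leaked_path = [3, 2, 4]  # as seen by AS1 via AS3; AS2 is a hypothetical intermediate hop
forged_origin = [3, 5]   # a different AS claims to originate P

print(origin_check(P, leaked_path))    # valid: the leak is not detected
print(origin_check(P, forged_origin))  # invalid
```

The leaked path still ends in AS4, so the check has nothing to object to; only a change of origin would be flagged.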
To overcome these limitations, the community has proposed other mechanisms such as BGPsec, which, unlike RPKI, is an extension of BGP itself. In BGPsec, each AS cryptographically signs the BGP messages it sends to its neighbors to create a chain of trust meant to “provide confidence that every AS on the path of AS’s listed in the UPDATE message has explicitly authorized the advertisement of the route” (RFC 8205). This chain of trust would give BGP messages a strong security defense, but it still would not fix every BGP routing vulnerability [1]. In addition, BGPsec introduces a major new challenge that has slowed its adoption: each router must cryptographically verify every BGP message it receives and sign every BGP message it sends. This will likely introduce a computational overhead on routers that can be addressed only by upgrading them with crypto hardware accelerators [2].
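The chain-of-trust idea can be illustrated with a toy model. The sketch below is not BGPsec: it replaces the public-key signatures and RPKI router certificates that BGPsec relies on with HMAC and a shared table of per-AS secrets, purely so that it runs with the Python standard library. The names `KEYS`, `originate`, `forward` and `verify` are hypothetical. What it does show is the chaining: each AS signs the prefix, the AS it is sending to, and the previous signature, so a verifier has to recompute one signature per hop.

```python
import hashlib
import hmac

# Hypothetical per-AS secrets (a stand-in for real per-router key pairs).
KEYS = {asn: f"secret-of-AS{asn}".encode() for asn in (64500, 64501, 64502)}

Segment = tuple[int, int, bytes]  # (signer AS, target AS, signature)


def _sign(signer: int, target: int, prefix: str, prev_sig: bytes) -> bytes:
    """Sign the prefix, the signer, the intended next AS and the previous signature."""
    msg = f"{prefix}|{signer}|{target}|".encode() + prev_sig
    return hmac.new(KEYS[signer], msg, hashlib.sha256).digest()


def originate(prefix: str, origin: int, target: int) -> list[Segment]:
    return [(origin, target, _sign(origin, target, prefix, b""))]


def forward(prefix: str, segments: list[Segment], signer: int, target: int) -> list[Segment]:
    prev_sig = segments[-1][2]
    return segments + [(signer, target, _sign(signer, target, prefix, prev_sig))]


def verify(prefix: str, segments: list[Segment], receiver: int) -> bool:
    """Check every hop's signature and that each hop addressed the next signer."""
    if not segments:
        return False
    prev_sig = b""
    for i, (signer, target, sig) in enumerate(segments):
        if not hmac.compare_digest(sig, _sign(signer, target, prefix, prev_sig)):
            return False
        next_as = segments[i + 1][0] if i + 1 < len(segments) else receiver
        if target != next_as:
            return False
        prev_sig = sig
    return True


route = originate("192.0.2.0/24", 64500, 64501)
route = forward("192.0.2.0/24", route, 64501, 64502)
print(verify("192.0.2.0/24", route, 64502))      # True
print(verify("192.0.2.0/24", route[:1], 64502))  # False: AS64501 was cut out of the path
```

Even in this toy form, the cost is visible: verification does one cryptographic operation per AS in the path for every message, which is the overhead the paragraph above refers to.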
Other mechanisms focused on preventing the propagation of route leaks have been considered in the IETF community (LDM, such as ASPA, Path RPKI, and AS cones), but most of them are still under discussion and have not yet been adopted.
While waiting for a widely adopted solution that would solve most of the routing problems described in this article, AS administrators are encouraged to adopt MANRS (Mutually Agreed Norms for Routing Security, https://www.manrs.org/) to secure BGP routing. MANRS is a global initiative, supported by the Internet Society, that describes a set of best practices each AS administrator should follow to make the global BGP routing infrastructure more robust and secure. MANRS requires AS administrators to:
- Filter inbound and outbound BGP messages
- Facilitate the coordination among operators by publishing up-to-date contact information
- Facilitate the validation of announcements by keeping up-to-date IRR entries
- Impede the propagation of illegitimate traffic by applying anti-spoofing techniques
To achieve these goals, network administrators currently use mechanisms (often handcrafted) tailored to solve specific routing problems, such as peer-lock, filtering based on IRR entries, and/or a maximum number of prefixes allowed on a BGP session. Each of these approaches has pros and cons, but all they can do is stop the propagation of routing anomalies. At the moment, none of them can guarantee that the administrator’s networks are not being hijacked somewhere in the world.
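To make the three mechanisms named above concrete, here is a minimal sketch of an inbound filter for a single BGP session. It is an illustration under simplifying assumptions, not vendor behavior: the `InboundFilter` class and its parameters are hypothetical, the IRR-derived prefix list is matched exactly, the prefix limit counts accepted prefixes (real implementations differ on what they count and on how they react), and peer-lock is reduced to rejecting any path containing a protected AS on a session that is not expected to carry it.

```python
from ipaddress import ip_network


class InboundFilter:
    """Hypothetical per-session inbound policy combining three common checks."""

    def __init__(self, irr_prefixes, max_prefixes, protected_asns=()):
        self.allowed = {ip_network(p) for p in irr_prefixes}  # built from IRR entries
        self.max_prefixes = max_prefixes                      # session prefix limit
        self.protected_asns = set(protected_asns)             # peer-lock list
        self.accepted = set()
        self.session_up = True

    def accept(self, prefix: str, as_path: list[int]) -> bool:
        if not self.session_up:
            return False
        net = ip_network(prefix)
        # IRR-based filtering: only prefixes registered for this neighbor.
        if net not in self.allowed:
            return False
        # Peer-lock: this neighbor must not send paths through protected ASes.
        if self.protected_asns & set(as_path):
            return False
        # Maximum prefixes: tear the session down once the limit is exceeded.
        self.accepted.add(net)
        if len(self.accepted) > self.max_prefixes:
            self.session_up = False
            self.accepted.clear()
            return False
        return True


f = InboundFilter(
    irr_prefixes=["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"],
    max_prefixes=2,
    protected_asns={64496},
)
print(f.accept("192.0.2.0/24", [64500]))                   # True
print(f.accept("10.0.0.0/8", [64500]))                     # False: not in the IRR list
print(f.accept("198.51.100.0/24", [64500, 64496, 64510]))  # False: peer-lock
```

All three checks act only on what this router receives, which is why they can limit propagation but cannot tell the administrator what other networks are seeing.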
As described in this article, it is of utmost importance to react to routing anomalies as soon as possible to limit their effects and reduce the disruption they cause. For example, the solution in the Cloudflare case was to phone (over the PSTN) the network administrators of the AS leaking the routes and of its provider and ask them to apply special filters.
A viable approach to reduce the MTTR of these incidents as much as possible is to use BGP monitoring and alerting platforms. Most of these platforms allow their users to set up alarms when a routing anomaly like a hijack or leak happens, helping the AS administrator to identify the root cause of the problem and to take the necessary countermeasures. The more diverse the BGP data sources used by the monitoring infrastructure are, the more effective it will be at detecting and alerting on such anomalies.
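As a rough illustration of what such an alarm might check, the sketch below classifies BGP updates seen at several vantage points against a list of monitored prefixes. It is a simplified assumption of how a monitoring tool could work, not a description of any particular platform: `classify_update`, `scan`, the alert labels and the collector names are all hypothetical, and real platforms also look at the rest of the AS path to detect leaks.

```python
from ipaddress import ip_network


def classify_update(prefix: str, as_path: list[int], monitored: dict[str, set[int]]):
    """Return an alert label, or None if the update looks legitimate or unrelated."""
    announced = ip_network(prefix)
    origin = as_path[-1]
    for own_prefix, legit_origins in monitored.items():
        own = ip_network(own_prefix)
        if announced.version != own.version or origin in legit_origins:
            continue
        if announced == own:
            return "origin-hijack"      # same prefix, unexpected origin AS
        if announced.subnet_of(own):
            return "subprefix-hijack"   # more-specific prefix, unexpected origin AS
    return None


def scan(updates, monitored):
    """updates: iterable of (collector, prefix, as_path); returns (collector, alert) pairs."""
    alerts = []
    for collector, prefix, as_path in updates:
        alert = classify_update(prefix, as_path, monitored)
        if alert:
            alerts.append((collector, alert))
    return alerts


monitored = {"192.0.2.0/24": {64500}}
updates = [
    ("collector-a", "192.0.2.0/24", [64510, 64500]),  # legitimate
    ("collector-b", "192.0.2.0/25", [64511, 64505]),  # seen only from this vantage point
]
print(scan(updates, monitored))  # [('collector-b', 'subprefix-hijack')]
```

In this example the anomaly is visible only from `collector-b`; a monitor fed by `collector-a` alone would have raised no alarm, which is the practical argument for diverse data sources.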
Read our ebook Comprehensive Guide to BGP Monitoring for a deeper look into BGP routing.
1. https://labs.apnic.net/?p=447
2. https://www.washingtonpost.com/sf/business/2015/05/31/net-of-insecurity-part-2