On April 1, 2020, Rostelecom pulled a not-so-hilarious April Fools’ prank on the Internet: they hijacked 8,877 networks. The modus operandi partially resembles what we saw last summer with the Allegheny-DQE leak, which spread to Verizon, Cloudflare, and many other carriers and resulted in a widespread global outage.
By analyzing data collected by Route Views, RIPE NCC RIS, and our private collectors, we see that at 7:28 pm UTC, Rostelecom (AS12389) announced about 10k more-specific networks in the wild via its peer Rascom (AS20764). These routes were up for about 10 minutes, redirecting traffic for several major Internet services, such as AWS, Google, and Cloudflare. The leak was then propagated by Rascom’s providers, CenturyLink (AS3356) and Cogent (AS174), probably because of weak ingress filtering. Aftab Siddiqui (Internet Society) wrote an excellent blog entry about this event.
In this post, we would like to focus our attention on something very different that happened this time: the role of RPKI in limiting the spread of this BGP hijack attempt.
First, consider that no AS can reach every Internet destination on its own. The Internet is a composition of different companies serving different locations, and each must rely on its providers to reach every destination that is not directly connected or reachable via peers and customers.
The exception is a set of 16 AS’s that are commonly considered provider-free. These AS’s are worldwide transit providers that can reach every Internet destination relying only on their peers. They are directly connected to one another and share their customer cones, so that all traffic arriving at this clique can be directed to the proper destination. In the Rostelecom-Rascom case, the hijacked routes reached this clique through two entry points: CenturyLink (AS3356) and Cogent (AS174).
During the last couple of years, several of the provider-free AS’s have started to implement Route Origin Validation (ROV). They extract the advertised network and the origin AS from incoming BGP UPDATE messages and test them against the Route Origin Authorizations (ROAs) that cover that network. A ROA is a cryptographically signed object published in the RPKI stating that an AS is authorized to originate a certain prefix, up to a maximum prefix length. If at least one covering ROA is found and none of them authorizes the announcing AS at that prefix length, the announcement is considered invalid and dropped.
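The validation step described above can be sketched in a few lines of Python. This is a minimal illustration that follows the general logic of RFC 6811 origin validation, not the implementation of any particular router or validator. The `Roa` class, the `validate_origin` function, and the prefixes and AS numbers (taken from the ranges reserved for documentation) are all assumptions made for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Roa:
    """Hypothetical, simplified view of a validated ROA payload."""
    prefix: str      # prefix the ROA covers
    max_length: int  # longest prefix length the origin may announce
    asn: int         # AS authorized to originate the prefix


def validate_origin(prefix: str, origin_asn: int, roas: list[Roa]) -> str:
    """Return 'valid', 'invalid', or 'not-found' for one announcement."""
    announced = ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # Skip ROAs of the other address family or that do not cover the route.
        if announced.version != roa_net.version:
            continue
        if not announced.subnet_of(roa_net):
            continue
        covered = True
        # A covering ROA matches only if both origin AS and length agree.
        if roa.asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered but never matched: invalid. Not covered at all: not-found.
    return "invalid" if covered else "not-found"


# Illustrative data only: documentation prefixes and AS numbers.
roas = [Roa("198.51.100.0/24", 24, 64496)]

print(validate_origin("198.51.100.0/24", 64496, roas))  # legitimate origin
print(validate_origin("198.51.100.0/25", 64511, roas))  # more specific, wrong origin
print(validate_origin("203.0.113.0/24", 64511, roas))   # no covering ROA
```

Note that a more-specific announcement is invalid even when it comes from the authorized AS if it is longer than the ROA’s maximum length, which is why more-specific hijacks of signed prefixes are caught by ROV.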
This policy can be applied to all BGP neighbors, as Telia (AS1299) and NTT (AS2914) do, or only to peers, as AT&T (AS7018), PCCW (AS3491) and TATA (AS6453) do.
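The difference between the two scopes can be illustrated with a small sketch. The function name `accept_route`, the role labels, and the scope labels are hypothetical and chosen only for this example; real policies are expressed in router configuration, not Python.

```python
def accept_route(state: str, neighbor_role: str, rov_scope: str) -> bool:
    """Decide whether to accept a route under a simplified ROV policy.

    state:         'valid', 'invalid', or 'not-found' (origin validation result)
    neighbor_role: 'customer', 'peer', or 'provider' (who sent the route)
    rov_scope:     'all' to filter every neighbor, 'peers' to filter peers only
    """
    if state != "invalid":
        # Valid and not-found routes are accepted in both variants.
        return True
    if rov_scope == "all":
        return False
    # Peers-only scope: invalid routes are dropped only when a peer sends them.
    return neighbor_role != "peer"


# An invalid route learned from a customer under each scope.
print(accept_route("invalid", "customer", "all"))    # False
print(accept_route("invalid", "customer", "peers"))  # True
```

Under this simplified model, a peers-only policy still accepts an invalid route that arrives from a customer, so the choice of scope determines which entry points into a network remain open to a leak.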
The net result is that routes whose destination network is announced by an unauthorized origin are not propagated, which protects BGP neighbors from using invalid routes. This was particularly useful this time, since 78% of the hijacked networks had a ROA, while only 22% of them were not signed. Curiously enough, 8% of the networks in the list legitimately belong to Rostelecom, which signed them in RPKI, and were thus marked as valid.
The effect can be appreciated by looking at the data that most of the provider-free AS’s share with route collectors. The histogram shows the number of more-specific networks announced by each provider-free AS connected to a route collector, and clearly shows the different behavior of the AS’s that applied ROV and those that have not yet applied it. If every provider-free AS had been applying ROV, only the networks without a ROA would have spread all over the Internet, confining the disruption to the services running on top of them.
The road to a more secure Internet is still long and full of obstacles. Most networks do not currently have a ROA, and the number of important AS’s applying ROV is still too small to be largely effective. A recent survey conducted by CAIDA among 75 network operators showed that, besides limited adoption, the biggest obstacles they perceived were the high cost and complexity of deployment, leading them to prefer other security mechanisms such as route filtering and deaggregation.
This is a well-known chicken-or-egg problem that has affected RPKI since its early adoption. Unless a strong incentive is in place to encourage network operators to both sign their routes and drop invalid announcements, RPKI will not be as effective as it could be.
The key to resolving this impasse is for every network operator to do their part by simply signing ROAs for each of their networks. As pointed out in the survey, wider adoption is one of the keys to a successful deployment of RPKI. The steps to sign your ROAs are very easy, and a great guideline can be found on the MANRS website. A large base of signed ROAs would be a great incentive to push transit networks to deploy ROV, thus completing the RPKI application.
However, security measures such as RPKI are not enough by themselves, since bad actors will always find new ways to get around them. Security must also be combined with a strong 24/7 monitoring strategy that alerts you whenever there are changes made to your BGP routes.
Imagine living in a neighborhood where there are a lot of break-ins – simply locking your doors and windows isn’t enough to feel completely safe; you also want a home alarm system that wakes you up when there’s a forced entry. And make no mistake, the Internet is a dangerous neighborhood with a lot of malicious and resourceful criminals prowling the streets. The only way to get a good night’s sleep is by ensuring that you and your customers are protected, and that requires BOTH strong security measures and round-the-clock monitoring to catch anything that slips through.
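As a rough illustration of what such monitoring looks for, the sketch below checks one observed announcement against a list of prefixes you originate. It is a simplified example under stated assumptions, not a description of how any monitoring product works: the function `classify_announcement`, the `monitored` mapping, and the documentation prefixes and AS numbers are all hypothetical, and a real monitor would consume a live feed from route collectors.

```python
from ipaddress import ip_network


def classify_announcement(prefix: str, origin_asn: int,
                          monitored: dict[str, int]) -> list[tuple[str, str]]:
    """Return alerts as (kind, monitored_prefix) pairs for one announcement.

    monitored maps each prefix you originate to its expected origin AS.
    Announcements from the expected origin raise no alert in this sketch.
    """
    announced = ip_network(prefix)
    alerts = []
    for own_prefix, expected_asn in monitored.items():
        own = ip_network(own_prefix)
        if announced.version != own.version or origin_asn == expected_asn:
            continue
        if announced == own:
            # Same prefix, unexpected origin AS.
            alerts.append(("origin-change", own_prefix))
        elif announced.subnet_of(own):
            # A more-specific route from an unexpected origin attracts traffic
            # away from the covering prefix, the pattern seen in this incident.
            alerts.append(("more-specific", own_prefix))
    return alerts


monitored = {"203.0.113.0/24": 64496}
print(classify_announcement("203.0.113.128/25", 64511, monitored))
print(classify_announcement("203.0.113.0/24", 64496, monitored))
```

The first call reports a more-specific announcement from an unexpected origin, and the second returns an empty list because the route comes from the expected AS.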
Learn more about BGP Security in our blog series, One Year in BGP (In)Security.
To learn more about how Catchpoint can help, download our handbook, BGP Monitoring with Catchpoint Network Insights.