
Who Turned the Internet Off?

In November 2011 we discovered a jump in failures across all cities – on both backbone carriers and Last Mile locations on Time Warner, AT&T, and Verizon FiOS.

It started this Monday morning like any other, catching up with the office in NY via Skype. All of a sudden I lost my internet connectivity. It wasn't the first time that had happened, so I figured it might be the router or the ISP, and I simply rebooted the router – but no luck.

Fifteen minutes later the internet came back on its own, and I got back on Skype to continue my conversation. To my surprise, the person on the other end of the line – all the way in New York City – had just experienced the same connectivity problem. We quickly found out we were not alone: several people on Twitter were complaining of issues across the US and around the globe, and Gizmodo put up a quick article blaming it on Time Warner.

We took a look at the performance monitoring data collected by Catchpoint monitoring stations around the world and discovered a jump in failures across all cities – on both backbone carriers and Last Mile locations on Time Warner, AT&T, and Verizon FiOS. The problem started at around 9:15 am EDT and lasted until 9:30 am EDT.

Our probes captured failures for major websites, ad-serving companies, content delivery networks, public DNS resolvers… as if someone had flipped some kind of kill switch. Several of the traceroutes captured during this period showed routes that went nowhere – the routers did not know which paths to take to reach the destination.

  • From Singapore AWS to a major CDN (why it was not served out of Singapore is a topic for another post):
Tracing route to i.cdn.turner.com [8.26.197.254] over a maximum of 30 hops:

1 * * * Timed Out

2 <1 ms <1 ms <1 ms ec2-175-41-128-192.ap-southeast-1.compute.amazonaws.com[175.41.128.192]

3 <1 ms <1 ms <1 ms ec2-175-41-128-233.ap-southeast-1.compute.amazonaws.com[175.41.128.233]

4 * * * Timed Out

5 * * * Timed Out

6 1 ms 1 ms 1 ms 116.51.17.45

7 2 ms 2 ms 2 ms ae-2.r20.sngpsi02.sg.bb.gin.ntt.net[129.250.4.142]

8 185 ms 187 ms 187 ms as-3.r20.snjsca04.us.bb.gin.ntt.net[129.250.3.88]

9 186 ms 173 ms 175 ms ae-1.r07.snjsca04.us.bb.gin.ntt.net[129.250.5.53]

10 * * * Unknown

11 * * * Unknown

12 * * * Unknown

13 * * * Unknown

  • From Hong Kong to Amazon S3:
Tracing route to xyz.s3.amazonaws.com [207.171.185.201] over a maximum of 30 hops:

1 2 ms <1 ms <1 ms 110.232.176.161

2 170 ms 132 ms 63 ms ge4-6.br02.hkg04.pccwbtn.net[63.218.1.197]

3 147 ms 147 ms 147 ms sjp-brdr-03.inet.qwest.net[63.146.27.165]

4 * * * Unknown

5 * * * Unknown

6 * * * Unknown

7 * * * Unknown
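Spotting the "route to nowhere" pattern in traces like these can be automated. Below is a minimal sketch (not Catchpoint's actual tooling – the parsing rules and sample text are assumptions based on the tracert-style output above): a trace is flagged when it ends in a run of hops where all three probes went unanswered.

```python
import re

def unreachable_tail(traceroute_text, min_dead_hops=3):
    """Return True if the trace ends in a run of hops that never answered.

    A hop counts as dead if none of its probes returned an RTT, and the
    trace is flagged when it *ends* in `min_dead_hops` or more dead hops --
    the 'route to nowhere' signature in the traces above.
    """
    dead_streak = 0
    for line in traceroute_text.strip().splitlines():
        line = line.strip()
        if not re.match(r"^\d+\s", line):      # skip non-hop lines
            continue
        if re.search(r"<?\d+\s*ms", line):     # at least one probe answered
            dead_streak = 0
        else:                                  # all probes lost on this hop
            dead_streak += 1
    return dead_streak >= min_dead_hops

sample = """
1 2 ms <1 ms <1 ms 110.232.176.161
2 170 ms 132 ms 63 ms ge4-6.br02.hkg04.pccwbtn.net[63.218.1.197]
3 147 ms 147 ms 147 ms sjp-brdr-03.inet.qwest.net[63.146.27.165]
4 * * * Unknown
5 * * * Unknown
6 * * * Unknown
7 * * * Unknown
"""
print(unreachable_tail(sample))  # -> True: the trace dies mid-path
```

The `min_dead_hops` threshold keeps a single silent hop (common and harmless, like hop 1 in the Singapore trace) from triggering a false alarm.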

There has been no official report on what happened; however, we were able to collect various reports from NANOG, Twitter, and emails pointing to multiple concurrent issues:

– A very large number of BGP announcements/updates/withdrawals. Looking at BGPmon, we can see the increase at the exact same time.

[Image: BGPmon graph]

Some other graphs from Team Cymru:

Internet Routing Table Delta:

[Image: Internet prefix delta, Team Cymru]

BGP Announcements / Withdrawals:

[Image: BGP announcements/withdrawals, Team Cymru]

– Problems with Juniper routers, which could be tied to the BGP announcements (source: NANOG).

– DNS cache poisoning in Brazil that was creating havoc around the world (http://net-security.org/secworld.php?id=11903).
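A spike like the one in the BGPmon and Team Cymru graphs above can be flagged with a simple threshold over per-minute update counts. Here is a rough sketch – the timestamps, baseline, and multiplier are illustrative assumptions, not values from the actual incident feed:

```python
from collections import Counter
from datetime import datetime

def churn_spikes(update_times, baseline=100, multiplier=10):
    """Bucket BGP update timestamps per minute and flag any minute whose
    count exceeds `multiplier` times the assumed `baseline` rate."""
    per_minute = Counter(t.replace(second=0, microsecond=0) for t in update_times)
    return sorted(m for m, n in per_minute.items() if n > baseline * multiplier)

# Hypothetical feed: normal chatter at 9:10, then a burst at 9:15 am
normal = [datetime(2011, 11, 7, 9, 10, s % 60) for s in range(90)]
burst = [datetime(2011, 11, 7, 9, 15, s % 60) for s in range(2000)]
print(churn_spikes(normal + burst))  # only the 9:15 minute is flagged
```

In practice the baseline would be learned from the table's normal churn rate rather than hard-coded, which is essentially what the BGPmon and Team Cymru dashboards visualize.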

Let’s hope this was a one-time glitch or human mistake that can be resolved, and not something worse.
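The DNS cache poisoning report above suggests a simple sanity check: ask several independent resolvers the same question and compare the answers, since a poisoned cache will disagree with the others. The sketch below builds a raw DNS query packet with only the standard library; the resolver addresses in the comments are illustrative, and a real check would also compare TTLs and authority data:

```python
import random
import struct

def build_dns_query(hostname, qtype=1):
    """Build a raw DNS query packet by hand (qtype 1 = A record).

    Sending the same packet over UDP port 53 to several independent
    resolvers (e.g. 8.8.8.8 and 208.67.222.222) and comparing the answer
    sections is a crude poisoning check.
    """
    txid = random.randint(0, 0xFFFF)
    # Header: id, flags (recursion desired), 1 question, 0 answer/auth/extra
    header = struct.pack(">HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    question = qname + struct.pack(">HH", qtype, 1)  # QTYPE, QCLASS=IN
    return header + question

pkt = build_dns_query("www.example.com")
# To actually send it:
# socket.socket(socket.AF_INET, socket.SOCK_DGRAM).sendto(pkt, ("8.8.8.8", 53))
```

The random transaction ID in the header is itself part of the defense – predictable IDs are one of the things that make cache poisoning attacks easier.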

======= UPDATE =======

Juniper Networks confirmed a bug in their routers that caused the BGP issues.

Bulletin: http://pastebin.com/HBWiH92j

[Image: Juniper's tweet]

Mehdi – Catchpoint

