Published on Sat, Oct 22, 2016 by Aaron
Yesterday, October 21, 2016, there was a large DDoS attack against Dyn, one of the leading authoritative DNS providers. The attack started around 11 AM UTC and lasted for hours, severely hurting the reachability of big-name sites like Twitter, GitHub and PayPal. It hurt the CDN Fastly too. Mainstream media naturally picked up the story, and it seems the attack was carried out using the Mirai IoT botnet. The exact nature and scale of this attack are currently not known to the public. 24 hours after the attack ended, it is still a much-talked-about topic on Twitter.
This article is not about what exactly happened, who was behind the attack or why they did it. We want to show the real impact of the attack on the performance of Dyn's authoritative DNS service globally, using our unique RUM for DNS data. TurboBytes monitors the real-world performance of authoritative DNS providers from across the globe, 24/7, by running tests in the browsers of millions of people that are connected to thousands of networks.
Dyn official report: "On Friday October 21, 2016 at approximately 11:10 UTC, Dyn came under attack by a large Distributed Denial of Service (DDoS) attack against our Managed DNS infrastructure in the US-East region. Customers affected may have seen regional resolution failures in US-East and intermittent spikes in latency globally. Dyn's engineers were able to successfully mitigate the attack at approximately 13:20 UTC, and shortly after, the attack subsided."
This text says very little about the level of pain Dyn customers experienced. You could read it as if only some customers' sites were unreachable for some users (in US-East) and there was an occasional slow response on other continents.
Our RUM for DNS data clearly shows it was pretty bad:
The chart above shows the Failratio of Dyn and other DNS providers as measured through recursive resolvers (Google Public DNS, OpenDNS and ISP resolvers) in the state of New York. Our tests are initiated in the browser and use a random subdomain to force the recursive resolver to fetch the response from the authoritative server instead of answering from its cache.
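To illustrate the random subdomain trick: a hostname that has never been queried before cannot be in any resolver's cache, so the recursive resolver has to ask the authoritative nameserver. The snippet below is a minimal sketch of generating such hostnames, not the actual TurboBytes test code. The zone name `dns-test.example` is hypothetical, and the approach assumes the zone has a wildcard record so that any random label resolves.

```python
import secrets


def cache_busting_hostname(zone: str, label_bytes: int = 8) -> str:
    """Return a hostname with a random left-most label under `zone`.

    A fresh random label is very unlikely to be cached anywhere, so a
    recursive resolver must contact the authoritative server to answer.
    """
    label = secrets.token_hex(label_bytes)  # 2 hex characters per byte
    return f"{label}.{zone.strip('.')}"


if __name__ == "__main__":
    # Hypothetical test zone; a wildcard record is assumed to exist.
    print(cache_busting_hostname("dns-test.example"))
```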
During the two-hour attack, Failratio averaged 45% and peaked at 80%. In our book, that qualifies as an outage.
When Dyn's servers did respond, they often did so very slowly.
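For readers who want to compute similar numbers from their own measurements, here is a small sketch. It assumes Failratio simply means failed lookups divided by total lookups per time bucket, which is our reading of the term and not a definition taken from the TurboBytes pipeline. The counts are made up for illustration and are not the measurements behind the chart.

```python
from typing import List, Tuple


def fail_ratio(failed: int, total: int) -> float:
    """Fraction of lookups that failed; 0.0 when nothing was measured."""
    return failed / total if total else 0.0


def summarize(buckets: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (average, peak) fail ratio over (failed, total) buckets.

    The average is weighted by the number of tests in each bucket.
    """
    total_failed = sum(failed for failed, _ in buckets)
    total_tests = sum(total for _, total in buckets)
    peak = max(fail_ratio(failed, total) for failed, total in buckets)
    return fail_ratio(total_failed, total_tests), peak


if __name__ == "__main__":
    # Made-up (failed, total) counts per time bucket.
    sample = [(5, 50), (30, 40), (20, 60), (5, 50)]
    average, peak = summarize(sample)
    print(f"average={average:.0%} peak={peak:.0%}")
```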
From our data we can confirm the problems in the US were limited to US-East. But what happened outside the US? Most countries were just fine, but many users in Germany and France definitely noticed Dyn failing during the first wave of the attack:
Dyn official report: "At roughly 15:50 UTC a second DDoS attack began against the Managed DNS platform. This attack was distributed in a more global fashion. Affected customers may have seen intermittent resolution issues as well as increased global latency. At approximately 17:00 UTC, our engineers were again able to mitigate the attack and service was restored."
That end time of 17:00 may be a typo, as the original status post states the attack lasted much longer and the incident was marked as Resolved at 22:17 UTC.
Our data shows the second wave indeed started at roughly 15:50 UTC and ended at ~ 20:30 UTC.
In the US the Failratio averaged 32% and again the peak was 80%. Dyn performance was different in other countries, like the United Kingdom and France:
In both major European markets, the Failratio peaked at 100% and this peak lasted 30 minutes. This means no query to Dyn got a response! After the peak the Failratio dropped to ~ 10% in the United Kingdom, but in France it stabilized at a still very high 60%. The France chart also shows that the performance of other DNS providers was degraded for two hours. After taking a closer look at our data it became clear this pattern showed only on AS3215, which belongs to Orange, the biggest consumer ISP in France. Dyn was completely down from AS3215. Failratio was stable at 100%. Ouch. This must have impacted millions of people!
Interestingly, for two hours the other major DNS providers also suffered on the Orange network. Is this related to the attack against Dyn? Maybe. We don't know. The websites of Dyn customers also suffered in Brazil.
During the second wave, we ran some Pulse tests to view the real-world DNS behaviour from 80 machines around the globe, most connected to consumer ISP networks.
The first test ran at 17:33:44 UTC and showed a 70.1% error rate when our Pulse agents tried to query for soundcloud.com directly from Dyn's servers (
The second test ran at 17:35:10 UTC. This time the Pulse agents queried for soundcloud.com via Google Public DNS, OpenDNS and their ISP resolvers. The error rate was 40.6%.
The record in question has a TTL of 600 (10 minutes), so logically the error rate was lower when querying recursive resolvers: some of them could serve the response from cache and did not have to reach out to Dyn. Another possible explanation for the difference in error rates is that, in an effort to mitigate attack traffic, Dyn may have blocked end-user IPs from hitting their nameservers in order to give recursive resolvers a better chance of reaching Dyn. In hindsight, we should have run more Pulse tests using random subdomains to conclusively test whether recursive resolvers could reach Dyn.
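The caching effect can be shown with a toy model. The sketch below is a deliberately simplified recursive resolver, not how Google Public DNS, OpenDNS or any ISP resolver is implemented. The class name, the hostnames and the IP address (from the 192.0.2.0/24 documentation range) are all made up for illustration.

```python
from typing import Callable, Dict, Optional, Tuple

# An authoritative lookup returns (answer, ttl_seconds), or None on failure.
Authoritative = Callable[[str], Optional[Tuple[str, int]]]


class CachingResolver:
    """Toy recursive resolver that honours the TTL of cached answers."""

    def __init__(self, query_authoritative: Authoritative) -> None:
        self._query = query_authoritative
        self._cache: Dict[str, Tuple[str, float]] = {}  # name -> (answer, expiry)

    def resolve(self, name: str, now: float) -> Optional[str]:
        cached = self._cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh: no trip to the authoritative
        result = self._query(name)
        if result is None:
            return None  # authoritative unreachable and nothing fresh cached
        answer, ttl = result
        self._cache[name] = (answer, now + ttl)
        return answer


if __name__ == "__main__":
    state = {"up": True}

    def authoritative(name: str) -> Optional[Tuple[str, int]]:
        return ("192.0.2.10", 600) if state["up"] else None

    resolver = CachingResolver(authoritative)
    print(resolver.resolve("www.example.com", now=0))      # fetched and cached
    state["up"] = False                                    # the outage starts
    print(resolver.resolve("www.example.com", now=300))    # served from cache
    print(resolver.resolve("www.example.com", now=601))    # TTL expired
    print(resolver.resolve("x7f3.example.com", now=300))   # never cached
```

In this model a popular name keeps resolving for up to 10 minutes into an outage, while a random subdomain fails immediately, which is why random subdomains give a cleaner measurement of whether the authoritative servers are reachable.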
This attack was a big one. The impact was tremendous because it lasted for several hours and it was against a leading DNS provider, bringing down many popular websites and online services in nations on multiple continents. TurboBytes is in the unique position to see the real-world impact for users on consumer ISP networks. Our RUM for DNS data does not lie.
On April 2, 2015, we published the blog post Why You Should Use Two DNS Providers. It explains how recursive resolvers work and why using more than one DNS provider makes your website or online service much more reliable and resilient against an attack like this one against Dyn.
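The core idea fits in a few lines. A recursive resolver that gets no answer from one nameserver of a zone retries the others, so a zone delegated to two providers stays resolvable while one provider is down. The sketch below is a rough model under that assumption: real resolvers also weigh round-trip times and use timeouts, and the nameserver hostnames are hypothetical.

```python
from typing import Iterable, Optional, Set


def first_reachable(nameservers: Iterable[str], unreachable: Set[str]) -> Optional[str]:
    """Return the first nameserver that would answer, or None if all fail."""
    for nameserver in nameservers:
        if nameserver not in unreachable:
            return nameserver
    return None


if __name__ == "__main__":
    provider_a = ["ns1.provider-a.example", "ns2.provider-a.example"]
    provider_b = ["ns1.provider-b.example", "ns2.provider-b.example"]
    outage = set(provider_a)  # provider A is completely down

    # Zone delegated to provider A only: nothing answers.
    print(first_reachable(provider_a, outage))
    # Zone delegated to both providers: provider B still answers.
    print(first_reachable(provider_a + provider_b, outage))
```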
The TurboBytes tool Pulse can come in handy in case you want to diagnose DNS performance on eyeball networks worldwide. Next time you think something may be wrong, visit https://pulse.turbobytes.com/.
Consider using OpenDNS as your primary resolver, from home and the office.
Their founder David Ulevitch tweeted yesterday about a cool feature that helps you experience the Internet as you expect it to be even if a DNS provider like Dyn is down:
Pro-tip: OpenDNS users generally see the Internet as they should. We do a good job of handing "last known good" IPs when we can't resolve.— ☁ David Ulevitch ☁ (@davidu) October 21, 2016