Alexa Topsites: Fastest and Slowest

September 19, 2017

At Latency.at we measure the performance and availability of your sites and services from multiple global locations and provide the results as Prometheus metrics. While our users can aggregate, filter and graph the metrics as they wish, we deployed a Grafana + Prometheus based demo at https://demo.latency.at. That Prometheus server monitors the following sites from the Alexa Topsites List:

Additionally the demo scrapes https://latency.at and my personal blog https://5pi.de.

In this blog post we explore the available metrics and see what we can find out about the performance and availability of the monitored sites.

Lowest Latency

Let's start with the obvious question: What are the fastest sites?

all.png bottomk(5, avg_over_time(probe_duration_seconds[1h]))

Please note: Due to the design of Prometheus, topk/bottomk select the largest/smallest values at each point in time, so a timeseries only shows up at the timestamps where it ranks among them. This is why some timeseries have gaps. See this issue for more details.
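To illustrate why the gaps appear, here is a minimal Python sketch of a per-timestamp bottomk selection. It is not Prometheus code; the series names and values are hypothetical and only mimic the behavior described above.

```python
# Hypothetical samples: series name -> value at each of three evaluation steps.
samples = {
    "site-a": [0.10, 0.30, 0.12],
    "site-b": [0.20, 0.15, 0.25],
    "site-c": [0.25, 0.18, 0.11],
}


def bottomk_per_step(samples, k):
    """Pick the k smallest series at every step independently."""
    steps = len(next(iter(samples.values())))
    # None marks a step where the series was not selected (a gap in the graph).
    result = {name: [None] * steps for name in samples}
    for step in range(steps):
        ranked = sorted(samples, key=lambda name: samples[name][step])
        for name in ranked[:k]:
            result[name][step] = samples[name][step]
    return result


graph = bottomk_per_step(samples, k=2)
for name, points in graph.items():
    print(name, points)
```

Every series drops out of the selection at one step, so each of the three graphed lines has a gap even though the underlying data is complete.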

Across all regions, that would be https://wikipedia.org in San Francisco. But providing good performance in a single location is easy, so let's look at the sites with the lowest average latency:

all-average.png bottomk(5, avg by (instance) (avg_over_time(probe_duration_seconds[1h])))

Unsurprisingly, the best sites are global players like Google/YouTube and Facebook. But small sites like my blog also perform similarly with an off-the-shelf CDN configuration. For static content, low latency isn’t hard to achieve after all.
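The difference between the two rankings above can be sketched in a few lines of Python. The site names, the region names and all numbers are hypothetical; the point is only that the fastest single measurement and the lowest per-site average can pick different sites.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical probe durations in seconds, keyed by (instance, region).
durations = {
    ("https://site-a.example", "sfo"): 0.05,
    ("https://site-a.example", "fra"): 0.45,
    ("https://site-b.example", "sfo"): 0.12,
    ("https://site-b.example", "fra"): 0.14,
}


def avg_by_instance(durations):
    """Average over all regions per site, like `avg by (instance) (...)`."""
    grouped = defaultdict(list)
    for (instance, _region), seconds in durations.items():
        grouped[instance].append(seconds)
    return {instance: mean(values) for instance, values in grouped.items()}


def bottomk(values, k):
    """Return the k keys with the smallest values."""
    return sorted(values, key=values.get)[:k]


fastest_single = bottomk(durations, 1)[0]
fastest_average = bottomk(avg_by_instance(durations), 1)[0]
print(fastest_single, fastest_average)
```

Here site-a wins the single-location ranking thanks to its San Francisco result, while site-b wins once all regions are averaged.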

Highest Latency

Now what are the worst-performing sites?

worst.png topk(5, avg by (instance) (avg_over_time(probe_duration_seconds[1h])))

The globally worst-performing sites target Chinese customers, so global performance isn’t a priority there. We also don’t have any probes in China yet, which would compensate for that somewhat.

More surprising though, the next worst site is https://mail.live.com. That’s the Microsoft Mail login, a service targeting a global audience. Yet it isn’t doing very well latency-wise.

Let’s look at this in more detail.

mail-live-grafana.png

You can see for yourself in our demo.

Even in the US the latency is quite bad. The origin is probably somewhere on the East Coast, given the similar latency from San Francisco and Frankfurt. If we look at the request phases, we see that the request is slow across the board.

In the typical modern web stack, handling a request touches many quite different services. First you depend on DNS performance, then on the public internet between the client and the system that terminates the traffic, usually a reverse proxy. The TLS handshake happens there and depends mainly on that system’s CPU. After that, the actual backend processes the request.
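As a rough illustration of these phases, the following Python sketch times DNS resolution, the TCP connect, the TLS handshake and the wait for the first response byte separately. It is not how our probes are implemented; the function name and the phase names used as dictionary keys are assumptions made for this example.

```python
import socket
import ssl
import time


def measure_phases(host, port=443, timeout=5.0):
    """Return rough per-phase durations in seconds for one HTTPS request."""
    timings = {}

    # DNS: resolve the hostname to a socket address.
    start = time.perf_counter()
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    timings["resolve"] = time.perf_counter() - start

    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        # Network: TCP handshake with whatever terminates the traffic.
        start = time.perf_counter()
        sock.connect(sockaddr)
        timings["connect"] = time.perf_counter() - start

        # TLS: the handshake runs inside wrap_socket on a connected socket.
        start = time.perf_counter()
        context = ssl.create_default_context()
        sock = context.wrap_socket(sock, server_hostname=host)
        timings["tls"] = time.perf_counter() - start

        # Backend: time from sending the request to the first response byte.
        start = time.perf_counter()
        request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        sock.recv(1)
        timings["processing"] = time.perf_counter() - start
    finally:
        sock.close()
    return timings
```

Calling `measure_phases("mail.live.com")` from a machine with network access returns a dictionary with one duration per phase. A single run is noisy, so repeated measurements from several locations are needed before drawing conclusions.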

If we divide mail.live.com’s latency in each phase by the average latency of all sites in that phase, maybe we can figure out which phase is particularly slow compared to other sites:

mail-live-vs-avg.png avg by (phase) (probe_http_duration_seconds{instance="https://mail.live.com"}) / avg by (phase) (probe_http_duration_seconds)

Processing and transfer times are pretty average, even a bit faster. Everything else, though, is way slower than the average, with DNS resolution taking almost an order of magnitude more time than for the average site.

If Microsoft wants to improve performance here, DNS is the place to look.
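The per-phase comparison used in this section can be sketched in Python as follows. The sites and numbers are hypothetical, and "resolve" as the name of the DNS phase is an assumption. As in the query, the site under inspection is included in the overall average.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (instance, phase, duration in seconds).
samples = [
    ("https://slow.example", "resolve", 0.40),
    ("https://slow.example", "processing", 0.10),
    ("https://fast.example", "resolve", 0.02),
    ("https://fast.example", "processing", 0.14),
]


def avg_by_phase(samples, instance=None):
    """Average duration per phase, optionally restricted to one instance."""
    grouped = defaultdict(list)
    for sample_instance, phase, seconds in samples:
        if instance is None or sample_instance == instance:
            grouped[phase].append(seconds)
    return {phase: mean(values) for phase, values in grouped.items()}


def phase_ratio(samples, instance):
    """Ratio of one site's per-phase average to the average of all sites."""
    site = avg_by_phase(samples, instance)
    overall = avg_by_phase(samples)
    return {phase: site[phase] / overall[phase] for phase in site}


ratios = phase_ratio(samples, "https://slow.example")
for phase, ratio in ratios.items():
    print(f"{phase}: {ratio:.2f}x the average")
```

A ratio above 1 marks a phase where the site is slower than the average site, and a ratio below 1 marks a phase where it is faster.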

Next?

Have something you want to monitor too? Sign up for an account with a free trial. Our Katacoda Quickstart Scenario can help you get started too.

Also check out our Public Metrics to play for free with our public cloud provider metrics.