CDN Comparison: CloudFront vs Fastly vs CloudFlare

September 06, 2017

Latency.at measures performance and availability of your sites and services from multiple global locations and provides the results as Prometheus metrics.

To make sure the metrics we provide are not skewed by local issues in the datacenters we operate, we continuously monitor a cached static target with a 60s TTL behind the CloudFront, CloudFlare and Fastly CDNs. Of course we’re eating our own dogfood and do so by using Latency.at as you would.

The origin is a simple, static HTML page served as an S3 website in eu-central (Frankfurt). The CDN configuration is as simple as it can be, with CloudFlare being the exception: there we have to configure CloudFront as the origin, since using custom Host headers is only possible with the CloudFlare Enterprise plan.

Although we monitor this to ensure proper operation of our service, we can also draw some useful conclusions about these CDNs.

Overall latency

Let’s start by comparing the average from all our probes for each CDN over the past week:

overview-7d.jpg

Let’s average over 3h to make the graph more readable:

overview-7d-3h.png
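To make the smoothing step concrete, here is a minimal Python sketch of a trailing-window average, which is roughly what averaging over 3h does to each point on the graph. The function names and the sample layout are our own assumptions for illustration and not part of Latency.at or Prometheus.

```python
from typing import List, Tuple

# Hypothetical sample layout: (unix timestamp in seconds, latency in seconds)
Sample = Tuple[float, float]


def avg_over_window(samples: List[Sample], now: float, window_s: float) -> float:
    """Mean of all values whose timestamp lies in (now - window_s, now]."""
    values = [v for t, v in samples if now - window_s < t <= now]
    if not values:
        raise ValueError("no samples in window")
    return sum(values) / len(values)


def smooth(samples: List[Sample], window_s: float) -> List[Sample]:
    """Replace every point with the trailing-window mean ending at that point."""
    return [(t, avg_over_window(samples, t, window_s)) for t, _ in samples]
```

With a 3h window you would pass window_s=3 * 3600. A larger window hides short spikes, which is why the averaged graph is easier to read but says less about worst-case behaviour.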

Across all regions all CDNs perform similarly, while Fastly has by far the most consistent performance. But which CDN would work best for you? Usually you don’t care about average performance but more about providing acceptable performance for all users. So which one has the best worst-case performance? For that we can look at the max across all probes of the 99th percentile of all requests, based on a 15-minute window. We’re using these recording rules for this:

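# Prometheus 2.0 rule format (the group name is arbitrary)
groups:
  - name: latency
    rules:
      # Duration of successful probes only; failed probes are dropped.
      - record: probe_duration_seconds:success
        expr: probe_duration_seconds * (probe_success == 1)
      # 99th percentile of successful probe durations over 15 minutes.
      - record: probe_duration_seconds:15m:99th
        expr: quantile_over_time(0.99, probe_duration_seconds:success[15m])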
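For readers who want to see what these rules compute, the following Python sketch reproduces the idea outside of Prometheus: drop failed probes, take the 99th percentile per probe, then take the max across probes. The quantile uses linear interpolation between the two closest ranks, which to our understanding matches how Prometheus calculates quantiles. The function names, probe names and data layout are assumptions made for this example.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical layout: one (duration in seconds, probe succeeded) pair per request
Request = Tuple[float, bool]


def quantile(q: float, values: List[float]) -> float:
    """q-quantile using linear interpolation between the two closest ranks."""
    if not values:
        return math.nan
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = min(len(ordered) - 1, lower + 1)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def successful_durations(requests: List[Request]) -> List[float]:
    """Keep only durations of successful probes, like the first rule."""
    return [duration for duration, ok in requests if ok]


def worst_p99(per_probe: Dict[str, List[Request]]) -> float:
    """Max across probes of the per-probe 99th percentile."""
    p99s = [quantile(0.99, successful_durations(r)) for r in per_probe.values()]
    p99s = [p for p in p99s if not math.isnan(p)]
    return max(p99s) if p99s else math.nan
```

Note that a single slow request in a small window dominates the result, which is exactly the point of this view: it surfaces the worst experience, not the typical one.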

overview-max-7d.jpg

Fastly again looks like a good choice. These slow requests are usually requests for expired objects, so we effectively measure fetches from the origin. Since CloudFlare fetches from CloudFront, it is possible that the results for CloudFlare are a bit skewed, so we will focus on CloudFront vs Fastly in our comparison.

CloudFront

cloudfront-avg-7d.png

CloudFront performance is quite inconsistent but similar across regions. Unsurprisingly, because of cold requests, average performance is best in Frankfurt, where the S3 bucket is located, and worse in regions far from the bucket.

Let’s have a look at the results from the Bangalore probe, one of the worst-performing regions, and look at the various connection phases for requests in the last hour. Since the stacking of the graph works better, we use the Latency.at Demo Dashboard for these.

cloudfront-detail-1h.png

As we would expect, processing takes the most time. That’s the time between the connection being established and the first byte of the response arriving.

If an object is cached, this should be really low since it can be returned from the edge cache. If it’s not cached or the TTL has expired, the edge needs to check the freshness of the origin object in Frankfurt.
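The caching behaviour described here can be sketched as a tiny edge cache in Python. This is only a model under our own assumptions, not how any of these CDNs is implemented: the class, the callback names and the result labels are made up for illustration, and freshness is checked by comparing an ETag-like version string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Entry:
    body: bytes
    etag: str
    stored_at: float  # time the entry was stored or last revalidated


class EdgeCache:
    """Toy TTL cache that revalidates expired entries instead of refetching."""

    def __init__(
        self,
        ttl_s: float,
        fetch: Callable[[str], Tuple[bytes, str]],
        revalidate: Callable[[str, str], Optional[Tuple[bytes, str]]],
    ) -> None:
        self.ttl_s = ttl_s
        self.fetch = fetch            # full fetch from origin: (body, etag)
        self.revalidate = revalidate  # returns None if the etag is still current
        self.entries: Dict[str, Entry] = {}

    def get(self, key: str, now: float) -> Tuple[bytes, str]:
        entry = self.entries.get(key)
        if entry is None:
            # Never seen: full round trip to the origin.
            body, etag = self.fetch(key)
            self.entries[key] = Entry(body, etag, now)
            return body, "miss"
        if now - entry.stored_at < self.ttl_s:
            # Still fresh: served from the edge, no origin traffic.
            return entry.body, "hit"
        # Expired: ask the origin whether the object changed.
        fresh = self.revalidate(key, entry.etag)
        if fresh is None:
            entry.stored_at = now
            return entry.body, "refresh_hit"
        body, etag = fresh
        self.entries[key] = Entry(body, etag, now)
        return body, "refresh_miss"
```

Only the "hit" path avoids the origin entirely, so with a 60s TTL we would expect at most one slow request per minute per edge location, and fast responses in between.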

Surprisingly, we see quite high processing times even while the objects should be cached. On the other hand, fetching the object after the cache has expired is sometimes quite fast, with processing times of merely 120ms. It seems like CloudFront isn’t very good at delivering cached content with consistent performance but quite fast at verifying whether an object in S3 has changed.

When sending a HEAD request from the probe directly to the origin, without hitting the CDN, it takes around 350ms. So somehow the CloudFront edge caches have a faster way to get to the origin.
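If you want to repeat this kind of direct measurement yourself, a rough Python sketch using only the standard library could look like the following. It is not what our probes run; the function and field names are our own, and the "processing" value is only an approximation of time to first byte, since getresponse() returns once the status line and headers have been read.

```python
import http.client
import time
from typing import NamedTuple


class HeadTiming(NamedTuple):
    status: int
    connect_s: float     # name resolution plus TCP connect
    processing_s: float  # request sent until response headers are read


def time_head(host: str, path: str = "/", port: int = 80,
              timeout: float = 10.0) -> HeadTiming:
    """Send one plain-HTTP HEAD request and time its phases."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        t0 = time.perf_counter()
        conn.connect()
        t1 = time.perf_counter()
        conn.request("HEAD", path)
        resp = conn.getresponse()
        t2 = time.perf_counter()
        resp.read()  # a HEAD response has no body; this just finishes the exchange
        return HeadTiming(resp.status, t1 - t0, t2 - t1)
    finally:
        conn.close()
```

Calling time_head("origin.example.com") with your own origin host in place of the placeholder gives a single data point. A single request is noisy, so repeat it a few times before comparing it to the CDN numbers.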

Fastly

Let’s compare this to Fastly:

fastly-avg-7d.png

The latency is incredibly consistent, again with regions closer to the bucket being faster. Let’s have a closer look at data from the Bangalore probe here too.

fastly-detail-1h.png

Requests for cached objects finish within 80ms, similar to the fastest requests at CloudFront. But once the object expires, fetching from the origin takes about 500ms. This is about the same time it takes for the probe to reach the S3 bucket website directly, yet slower than CloudFront.

Conclusion

Take the results with a grain of salt. This is not a thorough benchmark, which would have to test more content types, origins and configurations. Yet it shows that performance between CDNs can differ a lot. Fastly provides very consistent performance with no surprises across all our probes. CloudFront’s performance, on the other hand, is quite unpredictable: at best it performs similarly to Fastly for cached objects and is much faster when it needs to connect to the S3 origin. Unfortunately, more often than not it’s much slower, and on average Fastly provides better performance.

What surprised me the most is how fast CloudFront can fetch from S3: revalidating an object (called RefreshHit in CloudFront) and returning the response to a client takes ~120ms. One would expect the CloudFront edge cache to send a HEAD request to the origin when an expired object is requested. Doing that from our prober takes ~400ms, and even a simple ping round trip takes ~140ms. So CloudFront is doing something to optimize this. What exactly, we don’t know, but using CloudFront is something to consider when hosting content on S3.

No matter which CDN you use, if you care about the real performance of your site around the world, you need to monitor it. And if you’re using Prometheus, we built Latency.at exactly for this. Check it out, we have a free trial!

Follow us on Twitter and let us know what we should blog about next time. Maybe you have a suggestion for a service we should take a closer look at?