The Ins and Outs of Content Delivery Networks (CDN)

Published 26.01.2021

Author Hrittik Roy

Categories Engineering

It’s no secret that everybody’s day-to-day operations would slow down without caching. For example, the time each component of a web application takes to load adds up quickly and might drive our users towards alternatives. Ask any solution architect, and they will never forget to mention Content Delivery Networks while designing a system. CDNs are a fundamental piece of any modern HTTP app, and there’s no doubt about it.

But weren’t we talking about caching? How did CDNs suddenly enter the picture? In this post, we will walk through how these two terms are related, just like squares are to rectangles, and by the end you should have answers to most of the questions you might have about CDNs!

What is a CDN?

A CDN (Content Delivery Network) is a geographically dispersed network of proxy servers and their data centers. The network’s main objective is to provide low latency and high availability for heavy static files or ‘content’ like HTML pages, JavaScript files, stylesheets, images, and videos. If you’re confused about why we only cache heavy static files, you can read the first part of this blog to understand caching in depth. Trust me, it won’t take more than 3 minutes, and you’ll be clear on caching and why it’s considered a good approach.

If CDNs still sound tricky, think of one as a big black box that sits somewhere between your server and your users and makes your application load faster. The black box uses caching to do the magic. You see, CDNs rely on caching, but caching doesn’t need a CDN, just like all squares are rectangles but not vice versa.

Why use the black box?

Performance 👨‍💻

By placing content on a CDN server close to website visitors (among other optimizations), a CDN gives visitors quicker page loads. Since users are more likely to click away from a slow-loading site, a CDN can decrease bounce rates and increase the amount of time people spend on the site. In other words, a quicker website means more visitors stay and hang around longer.

Akamai Reveals 2 Seconds As The New Threshold Of Acceptability For Web Page Response Time

Cost Optimization 💲

Bandwidth usage is a major hosting expense for websites. By caching, CDNs reduce the amount of data the origin server must provide, thus lowering hosting costs for website owners.
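To make the saving concrete, here is a rough back-of-the-envelope sketch. The monthly traffic, the cache hit ratio, and the price per GB are made-up numbers for illustration, not figures from any provider, and the sketch ignores whatever the CDN itself charges.

```python
def origin_egress_gb(total_gb: float, hit_ratio: float) -> float:
    """Traffic the origin still serves when the CDN answers `hit_ratio` of it."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return total_gb * (1.0 - hit_ratio)


def origin_bandwidth_cost(total_gb: float, hit_ratio: float, price_per_gb: float) -> float:
    """Bandwidth bill at the origin for the traffic the CDN could not absorb."""
    return origin_egress_gb(total_gb, hit_ratio) * price_per_gb


# Assumed numbers, purely illustrative.
monthly_traffic_gb = 10_000
price_per_gb = 0.08

without_cdn = origin_bandwidth_cost(monthly_traffic_gb, 0.0, price_per_gb)
with_cdn = origin_bandwidth_cost(monthly_traffic_gb, 0.9, price_per_gb)

print(f"origin cost without CDN: {without_cdn:.2f}")    # 800.00
print(f"origin cost at 90% hit ratio: {with_cdn:.2f}")  # 80.00
```

The higher the share of requests the CDN can answer from its cache, the less data the origin has to send out.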

DDoS Mitigation 🔐

A distributed denial-of-service (DDoS) attack is a malicious attempt to disrupt the normal traffic of a targeted server, service, or network by overwhelming the target or its surrounding infrastructure with a flood of Internet traffic.

The game is simple: whoever has more network capacity wins. CDNs help us mitigate a DDoS attack by absorbing very high request rates and serving them from their cache. For a CDN, serving cached content is far cheaper than it is for the origin server, which may need to render the page every time it’s requested.

Availability 🌎

Large quantities of traffic (possibly even a DDoS attack) or hardware failures can interrupt the normal operation of a website. Because of its distributed nature, a CDN can handle more traffic and withstand hardware failure better than many origin servers.

CDNs absorb all the increased traffic/DDoS without affecting the origin server.

What’s inside the black box?

The black box consists of a lot of servers that cache content. Whenever you visit a website, a few things happen in the background. The browser understands your request and then fetches the website’s IP address via a DNS server.

The DNS server directs the user to the nearest CDN PoP (Point of Presence), and the PoP then provides the user with the requested files. If the PoP doesn’t have a file (aka a cache miss), it requests the file from the origin server, stores a copy (aka caching), and serves the file to the user.
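As a rough illustration of what “nearest” could mean, the sketch below picks the PoP with the smallest great-circle distance to the user. The PoP names and coordinates are hypothetical, and plain geographic distance is a simplifying assumption: real CDNs often rely on techniques such as anycast routing or latency measurements instead.

```python
import math
from typing import Dict, Tuple

Location = Tuple[float, float]  # (latitude, longitude) in degrees

# Hypothetical PoP locations, for illustration only.
POPS: Dict[str, Location] = {
    "atlanta": (33.75, -84.39),
    "frankfurt": (50.11, 8.68),
    "singapore": (1.35, 103.82),
}

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a: Location, b: Location) -> float:
    """Haversine distance between two points on the Earth's surface."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_pop(user: Location) -> str:
    """Name of the PoP geographically closest to the user."""
    return min(POPS, key=lambda name: great_circle_km(user, POPS[name]))


new_york = (40.71, -74.01)
print(nearest_pop(new_york))  # atlanta
```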

When a subsequent request from a similar location is made, the CDN server can quickly serve the file from its cache (aka cache hit).
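The miss-then-hit flow described above can be sketched as a tiny pull-through cache. This is a minimal sketch, not how any particular CDN is implemented: the `PullPoP` class, the file paths, and the dict standing in for the origin server are all hypothetical, and details like expiry and eviction are left out.

```python
from typing import Callable, Dict, List


class PullPoP:
    """A toy PoP that caches files lazily, on first request (pull-based)."""

    def __init__(self, name: str, fetch_from_origin: Callable[[str], bytes]) -> None:
        self.name = name
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        if path in self._cache:
            self.hits += 1        # cache hit: served straight from the PoP
            return self._cache[path]
        self.misses += 1          # cache miss: go back to the origin
        body = self._fetch_from_origin(path)
        self._cache[path] = body  # keep a copy for later requests
        return body


# Hypothetical origin: a dict standing in for the real web server.
ORIGIN_FILES: Dict[str, bytes] = {
    "/logo.png": b"<png bytes>",
    "/app.js": b"console.log('hi')",
}
origin_requests: List[str] = []


def fetch_from_origin(path: str) -> bytes:
    origin_requests.append(path)  # record every trip to the origin
    return ORIGIN_FILES[path]


pop = PullPoP("pop-a", fetch_from_origin)
pop.get("/logo.png")  # first request: miss, fetched from the origin
pop.get("/logo.png")  # second request: hit, the origin is not contacted
print(pop.misses, pop.hits, len(origin_requests))  # 1 1 1
```

Only the first request for a file reaches the origin; every later request for the same file is answered by the PoP.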

What’s inside the black box? Pt.2

Apart from caching content after it’s requested for the first time, there’s another approach: the ‘PUSH’ based approach. As the name says, we push our static files to all the CDN servers ahead of time, where they wait until they are requested.

Pull-based: each PoP caches content after its first request, ready for subsequent requests.

Push-based: content is cached at every PoP before the first request.

As you might have guessed, the push-based approach is fast even for a first-time user. The ‘PULL’ based approach, where we cache a file after the initial request, is slow for the first user but not for subsequent users.
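The difference for the first visitor can be shown with a small self-contained sketch. Everything here is hypothetical (the `PoP` class, the `push_to_all` and `serve` helpers, the file contents); it only illustrates the two strategies, not a real CDN API.

```python
from typing import Dict, List, Tuple

# Hypothetical origin content; a dict stands in for the real server.
ORIGIN_FILES: Dict[str, bytes] = {
    "/index.html": b"<h1>Hello</h1>",
    "/style.css": b"body { margin: 0 }",
}


class PoP:
    """A toy PoP holding an in-memory cache of path -> file body."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[str, bytes] = {}


def push_to_all(pops: List[PoP], files: Dict[str, bytes]) -> None:
    """Push-based: copy every file to every PoP before anyone requests it."""
    for pop in pops:
        pop.cache.update(files)


def serve(pop: PoP, path: str, origin: Dict[str, bytes]) -> Tuple[bytes, bool]:
    """Serve a file and report whether it came from the PoP's cache.

    On a miss, the PoP pulls the file from the origin and keeps a copy.
    """
    if path in pop.cache:
        return pop.cache[path], True
    body = origin[path]
    pop.cache[path] = body
    return body, False


pushed = PoP("pushed-pop")
pulled = PoP("pulled-pop")
push_to_all([pushed], ORIGIN_FILES)

_, first_push_hit = serve(pushed, "/index.html", ORIGIN_FILES)
_, first_pull_hit = serve(pulled, "/index.html", ORIGIN_FILES)
_, second_pull_hit = serve(pulled, "/index.html", ORIGIN_FILES)

print(first_push_hit, first_pull_hit, second_pull_hit)  # True False True
```

With push, even the very first request is a cache hit; with pull, the first request pays for the trip to the origin and only later requests benefit.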

Black box on Steroids

Cloudflare’s edge caching optimizes the pull-based approach further by caching content on servers as close to users as possible. Cloudflare describes it as:

For example: A user has been to your site before, but the caching rule on your browser says, “This page has been updated.” This means the request will have to go to the origin server. If the user is in New York, and the origin server is in Singapore, that’s a long call. However, a CDN knows to retrieve the content from your site every time it’s updated. The updated site is now cached at the closest CDN server in Atlanta. This reduces load time significantly for subsequent visits.

Edge caching serves content from the PoP closest to the user.

Moreover, Cloudflare has dedicated data centers, rented from ISPs, in more areas around the world than a traditional CDN has. Cloudflare calls this additional caching layer caching on the edge.

Netflix uses a similar strategy where they use ISPs as PoPs to cache their content and deliver a better experience.

Final Thoughts 🌟

CDNs have become essential to any modern infrastructure. The ‘PULL’ and ‘PUSH’ based approaches suit different use cases, users, and budgets. A pull-based model is straightforward to set up because the provider does the main work of caching at the PoPs where content is needed. People prefer pull-based CDNs for that reason, and because they are cheaper than the alternative. Sometimes being fast is a requirement, and a push-based approach fulfills that. It’s very user-dependent, and I am attaching a link to explore the difference in depth.

I hope that, with the end of this series, you understand every bit of caching and CDNs and are excited to try and implement them in your application or website.

Happy Optimizing! ✨
