The Ins and Outs of Content Delivery Networks (CDN)

by | Jan 26, 2021 | Engineering

It is no secret that everyday operations on the web would slow to a crawl without caching. For example, the time required for each component of a web application to load adds up quickly and might drive our users towards alternatives. Ask any solution architect, and they will never forget to mention Content Delivery Networks while designing a system. CDNs are a fundamental piece of any modern HTTP app, and there’s no doubt about it.

But weren’t we talking about caching? How did CDNs suddenly appear in the picture? In this post, we will walk through how these two terms are interrelated, just like squares are with rectangles, and by the end, you should have answers to most of the questions you have about CDNs!

What is CDN?

A CDN (Content Delivery Network) is a geographically dispersed network of proxy servers and their data centers. The network’s main objective is to provide low latency and high availability for heavy static files or ‘content’ like HTML pages, JavaScript files, stylesheets, images, and videos. If you’re confused about why we only cache heavy static files, you can read the first part of this blog to understand caching in depth. Trust me, it won’t take more than 3 mins, and you’ll be clear on caching and why it’s considered a good approach.

If CDNs still sound tricky, think of one as a big black box that sits somewhere between your server and your users and makes your application load faster. The black box uses caching to do the magic. You see, CDNs utilize caching, but caching doesn’t need a CDN, just like all squares are rectangles but not vice versa.

Why use the black box?

Performance 👨‍💻

By placing content closer to website visitors on a nearby CDN server (among other optimizations), a CDN gives visitors quicker page loads. Since users are more likely to click away from a slow-loading site, a CDN can decrease bounce rates and increase the amount of time people spend on the site. In other words, a quicker website means more visitors stay, and stay longer.

Akamai Reveals 2 Seconds As The New Threshold Of Acceptability For Web Page Response Time

Cost Optimization 💲

A major expense for websites is the bandwidth cost of hosting. By caching, CDNs reduce the amount of data that the origin server must serve, thus lowering hosting costs for website owners.
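To make the saving concrete, here is a rough back-of-the-envelope sketch. The traffic volume, cache hit ratio, and price per GB are made-up numbers for illustration, and the function names are hypothetical. It also ignores whatever the CDN itself charges, so treat it as an estimate of origin bandwidth only.

```python
def origin_egress_gb(total_gb: float, cache_hit_ratio: float) -> float:
    """Data the origin still serves when the CDN answers a share of traffic from cache."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_gb * (1.0 - cache_hit_ratio)


def monthly_origin_cost(total_gb: float, cache_hit_ratio: float, price_per_gb: float) -> float:
    """Origin bandwidth bill for the traffic the CDN could not serve from cache."""
    return origin_egress_gb(total_gb, cache_hit_ratio) * price_per_gb


# Assumed numbers: 10,000 GB of monthly traffic, $0.09 per GB of origin egress.
without_cdn = monthly_origin_cost(10_000, 0.0, 0.09)  # every byte comes from the origin
with_cdn = monthly_origin_cost(10_000, 0.9, 0.09)     # 90% of bytes served from cache

print(f"origin bandwidth bill without a CDN: ${without_cdn:.2f}")
print(f"origin bandwidth bill with a CDN:    ${with_cdn:.2f}")
```

The higher the cache hit ratio, the less the origin has to serve, which is why heavy static files that rarely change benefit the most.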

DDoS Mitigation 🔐

A distributed denial-of-service (DDoS) attack is a malicious attempt to interrupt a targeted server, service, or network’s usual traffic by overwhelming the target or its surrounding infrastructure with a flood of Internet traffic.

The game is simple: whoever has more network capacity wins. CDNs help us mitigate a DDoS attack by absorbing very high request rates and serving them from their cache. For a CDN, serving cached content is far cheaper than it is for an origin server that needs to render the page every time it’s requested.

Availability 🌎

Large quantities of traffic (which might even be a DDoS attack) or hardware failures can interrupt the normal operation of a website. Because of its distributed nature, a CDN can handle more traffic and withstand hardware failure better than many origin servers.

CDNs absorb all the increased traffic/DDoS without affecting the origin server.

What’s inside the black box?

The black box consists of a lot of servers that cache content. Whenever you visit a website, a few things happen in the background. The browser takes the URL you requested and then looks up the website’s IP address via a DNS server.

The DNS server directs the user to the nearest CDN PoP (Point of Presence), and that PoP provides the user with the requested files. If the PoP doesn’t have a file (aka a cache miss), it requests the file from the origin server, stores a copy of it (aka caching), and serves the requested file to the user.
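As a toy illustration of "go to the nearest PoP", the sketch below picks the closest PoP by straight-line distance on the globe. The PoP names and coordinates are hypothetical, and geographic distance is only a stand-in here: a real provider's routing logic is its own and usually takes network conditions into account.

```python
import math

# Hypothetical PoP locations as (latitude, longitude) in degrees.
POPS = {
    "atlanta": (33.75, -84.39),
    "frankfurt": (50.11, 8.68),
    "singapore": (1.35, 103.82),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


def nearest_pop(user_location, pops=POPS):
    """Return the name of the PoP with the smallest distance to the user."""
    return min(pops, key=lambda name: haversine_km(user_location, pops[name]))


new_york = (40.71, -74.01)
print(nearest_pop(new_york))  # atlanta
```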

When a subsequent request is made from a similar location, the CDN server can quickly serve the file from its cache (aka a cache hit).
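The miss-then-hit flow described above can be sketched in a few lines. `Origin` and `PullPoP` are hypothetical stand-ins invented for this example, not any provider's API, and the cache is a plain dictionary with no expiry.

```python
class Origin:
    """Stand-in for the origin server; counts how often it is asked for a file."""

    def __init__(self, files):
        self.files = files
        self.requests = 0

    def fetch(self, path):
        self.requests += 1
        return self.files[path]


class PullPoP:
    """A single PoP that fills its cache lazily, on the first request for a file."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path in self.cache:
            return self.cache[path], "hit"   # served straight from the PoP
        body = self.origin.fetch(path)       # cache miss: go back to the origin
        self.cache[path] = body              # keep a copy for the next visitor
        return body, "miss"


origin = Origin({"/logo.png": b"PNG bytes"})
pop = PullPoP(origin)

print(pop.get("/logo.png")[1])  # miss (first visitor pays the trip to the origin)
print(pop.get("/logo.png")[1])  # hit  (later visitors are served from the PoP)
print(origin.requests)          # 1
```

However many visitors arrive after the first one, the origin is asked for the file only once.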

What’s inside the black box? Pt.2

Apart from caching a file after it’s requested for the first time, there’s another approach: the ‘PUSH’ based approach. As the name says, we push our static files to all the CDN servers up front, and they wait there until they are requested.

Pull: content is cached at each PoP after its first request, ready for subsequent requests.

Push: content is cached at every PoP before the first request.

As you might have guessed, the push-based approach is fast even for a first-time user. The ‘PULL’ based approach, where we cache a file after the initial request, is slow for the first user but not for subsequent users.
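For contrast, here is a minimal sketch of the push-based idea, where files are copied to every PoP before anyone asks for them. `PushPoP` and `push_to_all` are hypothetical names made up for this example. Real providers expose their own upload mechanisms.

```python
class PushPoP:
    """A PoP that only serves what has been pushed to it ahead of time."""

    def __init__(self, name):
        self.name = name
        self.cache = {}

    def store(self, path, body):
        self.cache[path] = body

    def get(self, path):
        if path not in self.cache:
            return None, "miss"  # nothing was pushed for this path
        return self.cache[path], "hit"


def push_to_all(pops, files):
    """Copy every file to every PoP up front (the deploy-time step)."""
    for pop in pops:
        for path, body in files.items():
            pop.store(path, body)


pops = [PushPoP("atlanta"), PushPoP("frankfurt"), PushPoP("singapore")]
push_to_all(pops, {"/app.js": b"console.log('hi')"})

# Even the very first visitor at any PoP gets a cache hit.
for pop in pops:
    print(pop.name, pop.get("/app.js")[1])
```

The trade-off is that every file occupies space on every PoP whether or not anyone in that region ever requests it.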

Black box on Steroids

Cloudflare’s edge caching optimizes the pull-based approach further by placing PoPs as close to users as possible. Cloudflare describes it as:

For example: A user has been to your site before, but the caching rule on your browser says, “This page has been updated.” This means the request will have to go to the origin server. If the user is in New York, and the origin server is in Singapore, that’s a long call. However, a CDN knows to retrieve the content from your site every time it’s updated. The updated site is now cached at the closest CDN server in Atlanta. This reduces load time significantly for subsequent visits.

Edge caching places PoPs close to users.

Moreover, Cloudflare has dedicated data centers, rented from ISPs, in more areas around the world than a traditional CDN has. This additional caching layer, called caching on the edge, is provided by Cloudflare.

Netflix uses a similar strategy where they use ISPs as PoPs to cache their content and deliver a better experience.

Final Thoughts 🌟

CDNs have become essential to any modern infrastructure. The ‘PULL’ and ‘PUSH’ based approaches suit different use cases and budgets. A pull-based model is straightforward to set up, as the main work of caching at the needed PoPs is done by the provider. People prefer pull-based CDNs for that reason, and because they are cheaper than the alternative. Sometimes being fast for every request is a requirement, and a push-based approach fulfills that. It depends heavily on your use case, and I am attaching a link to explore the difference in depth.

I hope that by the end of this series, you understand caching and CDNs well and are excited to try and implement them in your application or website.

Happy Optimizing! ✨
