The Ins and Outs of Content Delivery Networks (CDN)

by | 26.01.2021 | Engineering

It is no secret that everyday operations would slow to a crawl without caching. In a web application, for example, the time each component takes to load adds up quickly and can drive users towards alternatives. Ask any solution architect, and they will not forget to mention Content Delivery Networks while designing a system. CDNs are a fundamental piece of any modern HTTP app, and there’s no doubt about it.

But weren’t we talking about caching? How did CDNs suddenly enter the picture? In this post, we will walk through how the two terms are related, just like squares and rectangles, and by the end you should have answers to most of the questions you have about CDNs.

What is a CDN?

A CDN (Content Delivery Network) is a geographically dispersed network of proxy servers and their data centers. The network’s main objective is to provide low latency and high availability for heavy static files or ‘content’ like HTML pages, JavaScript files, stylesheets, images, and videos. If you’re confused about why we only cache heavy static files, you can read the first part of this blog to understand caching in depth. Trust me, it won’t take more than 3 minutes, and you’ll be clear on caching and why it’s considered a good approach.

If CDNs still sound tricky, think of one as a big black box that sits somewhere between your server and your users and makes your application load faster. The black box uses caching to do the magic. You see, CDNs rely on caching, but caching doesn’t need a CDN, just like all squares are rectangles but not vice versa.

Why use the black box?

Performance 👨‍💻

A CDN serves content from a server close to each visitor (among other optimizations), so pages load faster. Because users are more likely to click away from a slow-loading site, a CDN tends to decrease bounce rates and increase the time people spend on the site. In other words, a quicker website means more people stay, and stay longer.

Akamai Reveals 2 Seconds As The New Threshold Of Acceptability For Web Page Response Time

Cost Optimization 💲

Bandwidth is a major hosting expense for websites. By caching content, CDNs reduce the amount of data the origin server must serve, which lowers hosting costs for website owners.
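To make the saving concrete, here is a minimal sketch of the arithmetic. It assumes the cache hit ratio is measured by bytes rather than by request count, and the traffic figures are made-up illustration values, not measurements. It also ignores what the CDN itself charges.

```python
def origin_egress_gb(total_gb: float, cache_hit_ratio: float) -> float:
    """Data the origin still serves when the CDN answers a share of traffic from cache.

    cache_hit_ratio is assumed to be a byte hit ratio between 0 and 1.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_gb * (1.0 - cache_hit_ratio)


# Hypothetical numbers: 10,000 GB of monthly traffic, 75% served from cache.
print(origin_egress_gb(10_000, 0.75))  # → 2500.0
```

With these assumed numbers, the origin serves a quarter of the traffic it would serve without a CDN.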

DDoS Mitigation 🔐

A distributed denial-of-service (DDoS) attack is a malicious attempt to interrupt a targeted server, service, or network’s usual traffic by overwhelming the target or its surrounding infrastructure with a flood of Internet traffic.

The game is simple: whoever has more network capacity wins. CDNs help us mitigate a DDoS attack by absorbing very high request rates and serving them from their cache. For a CDN, serving cached content is far cheaper than it is for an origin server that has to render the page every time it’s requested.

Availability 🌎

Large quantities of traffic (possibly even a DDoS attack) or hardware failures can interrupt the normal operation of a website. Because of its distributed nature, a CDN can handle more traffic and withstand hardware failure better than many origin servers.

CDNs absorb all the increased traffic/DDoS without affecting the origin server.
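The benefit of a distributed network can be sketched as "send the user to the closest PoP that is still up". The snippet below is only an illustration: the `PoP` class, the `pick_pop` helper, and the city coordinates are assumptions made for this example, and real CDNs typically route on measured latency or anycast rather than raw distance.

```python
import math
from dataclasses import dataclass


@dataclass
class PoP:
    name: str
    lat: float
    lon: float
    healthy: bool = True


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def pick_pop(pops, user_lat, user_lon):
    """Return the closest PoP that is still healthy."""
    candidates = [p for p in pops if p.healthy]
    if not candidates:
        raise RuntimeError("no healthy PoP available")
    return min(candidates, key=lambda p: distance_km(user_lat, user_lon, p.lat, p.lon))


# Approximate coordinates, for illustration only.
pops = [
    PoP("Atlanta", 33.75, -84.39),
    PoP("Frankfurt", 50.11, 8.68),
    PoP("Singapore", 1.35, 103.82),
]
new_york = (40.71, -74.01)

print(pick_pop(pops, *new_york).name)  # → Atlanta
pops[0].healthy = False                # simulate a hardware failure in Atlanta
print(pick_pop(pops, *new_york).name)  # → Frankfurt
```

A single origin server has no such fallback: if it fails, the site is down.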

What’s inside the black box?

The black box consists of a lot of servers that cache content. Whenever you visit a website, a few things happen in the background. The browser takes your request and first looks up the website’s IP address via a DNS server.

The DNS server directs the user to the nearest CDN PoP (Point of Presence), and the PoP provides the requested files. If the PoP doesn’t have a file (aka a cache miss), it requests the file from the origin server, stores a copy locally (aka caching), and serves the file to the user.

When a subsequent request comes in from a similar location, the CDN server can quickly serve the file from its cache (aka a cache hit).
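The miss-then-hit flow described above can be sketched in a few lines. This is a simplified model, not how any particular CDN is implemented: `PullThroughPoP` and `fetch_from_origin` are hypothetical names, and the sketch leaves out expiry, eviction, and cache invalidation.

```python
class PullThroughPoP:
    """A toy PoP that fetches from the origin on a miss and caches the result."""

    def __init__(self, fetch_from_origin):
        self._fetch_from_origin = fetch_from_origin  # callable: path -> content
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self._cache:
            # Cache hit: serve the stored copy without contacting the origin.
            self.hits += 1
            return self._cache[path]
        # Cache miss: ask the origin, keep a copy, then serve it.
        self.misses += 1
        body = self._fetch_from_origin(path)
        self._cache[path] = body
        return body


origin_calls = []


def fetch_from_origin(path):
    """Stand-in for a slow request to the origin server."""
    origin_calls.append(path)
    return f"contents of {path}"


pop = PullThroughPoP(fetch_from_origin)
pop.get("/logo.png")  # first request: miss, goes to the origin
pop.get("/logo.png")  # second request: hit, served from the PoP
print(pop.misses, pop.hits, len(origin_calls))  # → 1 1 1
```

However many users request the file afterwards, the origin is contacted only once per PoP in this model.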

What’s inside the black box? Pt.2

Apart from caching a file after it’s requested for the first time, there’s another approach: the ‘PUSH’ based approach. As the name says, we push our static files to all the CDN servers up front, and they wait there until they are requested.

Pull-based: content is cached at each PoP after the first request, ready for subsequent requests.

Push-based: content is cached at every PoP before the first request.

As you might have guessed, the push-based approach is fast even for a first-time user. The ‘PULL’ based approach, where we cache a file only after the initial request, is slow for the first user but not for subsequent users.
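The difference for the first visitor can be shown with a small sketch. The function names and the dictionary-based PoP caches are assumptions made for this illustration. A real push setup would upload files through the provider’s own tooling.

```python
def push_to_pops(pop_caches, files):
    """Push model: copy every file to every PoP before anyone asks for it."""
    for cache in pop_caches.values():
        cache.update(files)


def serve(pop_caches, pop_name, path, fetch_from_origin):
    """Serve a file from one PoP, falling back to the origin on a miss (pull model)."""
    cache = pop_caches[pop_name]
    if path in cache:
        return cache[path], "hit"
    body = fetch_from_origin(path)
    cache[path] = body
    return body, "miss"


def fetch_from_origin(path):
    """Stand-in for a slow request to the origin server."""
    return f"contents of {path}"


# Pull only: the first visitor pays for the trip to the origin.
pull_pops = {"atlanta": {}, "singapore": {}}
print(serve(pull_pops, "atlanta", "/app.js", fetch_from_origin)[1])  # → miss
print(serve(pull_pops, "atlanta", "/app.js", fetch_from_origin)[1])  # → hit

# Push: files are already at every PoP, so even the first visitor gets a hit.
push_pops = {"atlanta": {}, "singapore": {}}
push_to_pops(push_pops, {"/app.js": "contents of /app.js"})
print(serve(push_pops, "singapore", "/app.js", fetch_from_origin)[1])  # → hit
```

The trade-off is visible in `push_to_pops`: every file is stored at every PoP whether or not anyone there ever requests it.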

Black box on Steroids

Cloudflare’s edge caching optimizes the pull-based approach further by utilizing users as PoPs. Cloudflare describes it as:

For example: A user has been to your site before, but the caching rule on your browser says, “This page has been updated.” This means the request will have to go to the origin server. If the user is in New York, and the origin server is in Singapore, that’s a long call. However, a CDN knows to retrieve the content from your site every time it’s updated. The updated site is now cached at the closest CDN server in Atlanta. This reduces load time significantly for subsequent visits.

Edge Caching uses users as PoPs.

Moreover, Cloudflare has dedicated data centers, rented from ISPs, in more areas around the world than a traditional CDN. This additional caching layer, which Cloudflare provides, is called caching on the edge.

Netflix uses a similar strategy where they use ISPs as PoPs to cache their content and deliver a better experience.

Final Thoughts 🌟

CDNs have become essential to any modern infrastructure. The ‘PULL’ and ‘PUSH’ based approaches suit different use cases and budgets. A pull-based model is straightforward to set up because the provider does the main work of caching at the PoPs that need it. People tend to prefer the pull-based model for that reason, and because it is cheaper than the alternative. Sometimes being fast on the very first request is a requirement, and a push-based approach fulfills that. It’s very user-dependent, and I am attaching a link to explore the difference in depth.

I hope that with the end of this series, you understand caching and CDNs well and are excited to try them in your application or website.

Happy Optimizing! ✨
