It's no secret that day-to-day operations on the web would slow to a crawl without caching. The time each component of a web application takes to load adds up quickly and can drive our users towards alternatives. Ask any solution architect, and they will never forget to mention Content Delivery Networks while designing a system. CDNs are a fundamental piece of any modern HTTP app, no doubt about it.
But weren't we talking about caching? How did CDNs suddenly enter the picture? In this post, we will walk through how these two terms are interrelated, just like squares are with rectangles, and by the end you will have answers to every question about CDNs you might ever have!
What is a CDN?
If CDNs still sound tricky, think of one as a big black box that sits somewhere between your server and your users and makes your application load fast, and the black box uses caching to do the magic. You see, CDNs utilize caching, but caching doesn't need a CDN, just like all squares are rectangles but not vice versa.
Why use the black box?
By serving content from a nearby CDN server (among other optimizations), visitors get quicker page load times. A CDN will decrease bounce rates and increase the amount of time people spend on the site, since users are quick to click away from a slow-loading page. In other words, a quicker website means more people stick around for longer.
Akamai Reveals 2 Seconds As The New Threshold Of Acceptability For Web Page Response Time
Cost Optimization 💲
A major expense for websites is the bandwidth usage cost of hosting. By caching, CDNs minimize the amount of data an origin server must provide, thus lowering hosting costs for website owners.
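To make the saving concrete, here is a back-of-the-envelope sketch. All the numbers (traffic volume, cache hit ratio, per-GB price) are assumptions for illustration only, not real CDN or cloud pricing:

```python
# Back-of-the-envelope cost sketch (all numbers are assumed, not real pricing):
# the higher the CDN's cache hit ratio, the less data the origin must serve.

total_gb = 10_000      # monthly traffic served to users, in GB (assumed)
hit_ratio = 0.9        # fraction of requests answered from the CDN cache (assumed)
cost_per_gb = 0.09     # hypothetical origin egress price, in $/GB

origin_gb = total_gb * (1 - hit_ratio)                    # only cache misses reach the origin
saved_usd = round(total_gb * hit_ratio * cost_per_gb, 2)  # bandwidth cost avoided

print(round(origin_gb), "GB served by the origin;", saved_usd, "USD saved")
# → 1000 GB served by the origin; 810.0 USD saved
```

With a 90% hit ratio, nine out of ten gigabytes never touch the origin server at all.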
DDoS Mitigation 🔐
A distributed denial-of-service (DDoS) attack is a malicious attempt to disrupt the normal traffic of a targeted server, service, or network by overwhelming the target or its surrounding infrastructure with a flood of Internet traffic.
The game is simple: whoever has more network capacity wins. CDNs help us mitigate a DDoS attack by absorbing very high request rates and serving them from their cache. For a CDN, serving cached content is effortless compared to the origin server, which needs to render the page every time it's requested.
Large quantities of traffic (which might even be a DDoS) or hardware failures can interrupt the normal operation of a website. Thanks to its distributed nature, a CDN can manage more traffic and withstand hardware failure better than a single origin server.
CDNs absorb all the increased traffic/DDoS without affecting the origin server.
What’s inside the black box?
The black box consists of a lot of servers that cache content. Whenever you visit a website, a few things happen in the background. The browser parses your request and then fetches the website's IP address via a DNS server.
The DNS server directs the user to the nearest CDN PoP (Point of Presence), and the PoP provides the user with the requested files. If the PoP doesn't have a file (aka a cache miss), the server requests the file from the origin server, stores a copy of it (aka caching), and serves it to the user.
When a subsequent request from a similar location is made, the CDN server can quickly serve the file from its cache (aka cache hit).
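The hit/miss flow above can be sketched in a few lines of Python. This is a toy model, not any real CDN's API: the `PoP` class and the in-memory origin are hypothetical stand-ins.

```python
# A minimal sketch of a PULL-based CDN edge cache (hypothetical names):
# on a cache miss the PoP fetches from the origin, stores a copy, and
# serves it; subsequent requests for the same path are cache hits.

ORIGIN = {"/index.html": "<html>hello</html>"}  # stand-in for the origin server

class PoP:
    def __init__(self):
        self.cache = {}

    def fetch_from_origin(self, path):
        # In reality this would be an HTTP request to the origin server.
        return ORIGIN[path]

    def get(self, path):
        if path in self.cache:                # cache hit: serve from the PoP
            return self.cache[path], "HIT"
        body = self.fetch_from_origin(path)   # cache miss: go to the origin
        self.cache[path] = body               # store a copy (caching)
        return body, "MISS"

pop = PoP()
print(pop.get("/index.html"))  # → ('<html>hello</html>', 'MISS')
print(pop.get("/index.html"))  # → ('<html>hello</html>', 'HIT')
```

Only the first visitor near a given PoP pays the round trip to the origin; everyone after them is served from the local copy.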
What’s inside the black box? Pt.2
Apart from caching content after it's requested for the first time, there's another approach: a 'PUSH' based approach. As the name suggests, we push our static files to all the CDN servers and wait until they are requested.
PULL: content is cached at a PoP after the first request and served from cache for subsequent requests.
PUSH: content is cached at every PoP before the first request.
As you might have guessed, this makes things fast even for a first-time user. The 'PULL' based approach, where a file is cached after the initial request, is slow for the first user but not for subsequent ones.
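A minimal sketch of the 'PUSH' approach, with hypothetical helper names: the files are copied to every PoP up front, so the first request from any location is already a hit.

```python
# PUSH-based sketch (hypothetical names): static files are distributed to
# every PoP ahead of time, trading upfront storage for first-request speed.

pops = [{"city": c, "cache": {}} for c in ("new-york", "singapore", "london")]

def push(path, body):
    """Copy a static file to every PoP before anyone requests it."""
    for pop in pops:
        pop["cache"][path] = body

push("/logo.png", b"<png bytes>")

# Even the very first request, from any city, finds the file in cache:
assert all("/logo.png" in pop["cache"] for pop in pops)
```

The trade-off is visible in the loop: every push costs bandwidth and storage at every PoP, whether or not users there ever request the file.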
Black box on Steroids
Cloudflare's edge caching optimizes the pull-based approach further by utilizing users as PoPs. Cloudflare describes it as:
For example: A user has been to your site before, but the caching rule on your browser says, “This page has been updated.” This means the request will have to go to the origin server. If the user is in New York, and the origin server is in Singapore, that’s a long call. However, a CDN knows to retrieve the content from your site every time it’s updated. The updated site is now cached at the closest CDN server in Atlanta. This reduces load time significantly for subsequent visits.
Edge Caching uses users as PoPs.
Moreover, Cloudflare has dedicated data centers, rented to them by ISPs, in more areas around the world than a traditional CDN. This additional caching layer, called caching on the edge, is what Cloudflare provides.
Netflix uses a similar strategy, using ISPs as PoPs to cache its content and deliver a better experience.
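The "caching rule on your browser" in the quote above is usually expressed through standard HTTP caching headers such as Cache-Control. Here is a deliberately simplified sketch of the freshness check (the `max-age` directive is real; the helper function is a toy reduction of the full rules in RFC 9111):

```python
# Simplified HTTP freshness check: a cached copy may be reused until it is
# older than the Cache-Control max-age; after that, the client (browser or
# CDN PoP) must revalidate with the origin. Toy version of RFC 9111 logic.

import time

def is_fresh(cached_at, cache_control):
    """Return True if a response cached at `cached_at` (unix seconds) can
    still be served without going back to the origin."""
    max_age = 0  # no max-age directive means no freshness lifetime here
    for directive in cache_control.split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            max_age = int(directive.split("=", 1)[1])
    return (time.time() - cached_at) < max_age

now = time.time()
print(is_fresh(now - 30, "public, max-age=3600"))    # → True: still fresh
print(is_fresh(now - 7200, "public, max-age=3600"))  # → False: must revalidate
```

In practice, browsers and CDNs also honor directives like `no-cache`, `s-maxage`, and validators such as `ETag`, which this sketch omits.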
Final Thoughts 🌟
CDNs have become essential to any modern infrastructure. The 'PULL' and 'PUSH' based approaches suit different use cases and budgets. A pull-based model is straightforward to set up, since the provider does the main work of caching at the needed PoPs; people prefer pull for that reason, and because it's cheaper than the alternative. Sometimes being fast from the very first request is a requirement, and a push-based approach fulfills that. It's very use-case-dependent, and I am attaching a link to explore the difference in depth.
I hope that by the end of this series, you understand every bit of caching and CDNs and are excited to implement them in your application or website.
Happy Optimizing! ✨