
Kubernetes Stateful Friend: What’s more to etcd?

by | 30.08.2021 | Engineering

The Kubernetes control plane consists of several components, and one of them is etcd. Anyone starting to learn k8s comes across it and quickly memorizes that it’s the key-value store that gives Kubernetes persistent storage. But what more is there to it? Why do we need it? The answers to these questions are pretty scattered, so in this post we will go through them with a beginner’s mindset. Nothing significantly advanced, just enough for you to grasp the importance and features of etcd and move forward on your cloud native journey.

Let’s start!

What is etcd?

etcd (pronounced “et-cee-dee” and not “e-t-c-d”) is an open source distributed key-value store for storing and managing the important data that distributed systems require to function. It is best known for managing the configuration, state, and metadata for Kubernetes, a popular container orchestration technology.

The name etcd combines the Unix configuration directory “/etc” with “d” for “distributed”.

What is a key-value store?

The data model of etcd is simple, relying on keys and values rather than arbitrary data associations. When compared to standard SQL databases, this helps to maintain relatively predictable performance.
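To make the data model concrete, here is a minimal sketch in Python of a flat, sorted key space with prefix lookups. It is a toy model and not etcd’s implementation: the class name `TinyKV` and the example keys are assumptions made up for illustration.

```python
import bisect


class TinyKV:
    """Toy in-memory key-value store that keeps its keys in sorted order."""

    def __init__(self):
        self._keys = []   # sorted list of keys
        self._data = {}   # key -> value

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def get_prefix(self, prefix):
        # Keys are sorted, so all keys sharing a prefix sit next to each other.
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            result.append((key, self._data[key]))
        return result


kv = TinyKV()
kv.put("/registry/pods/default/web-1", "Running")
kv.put("/registry/pods/default/web-2", "Pending")
kv.put("/registry/services/default/web", "ClusterIP")
print(kv.get_prefix("/registry/pods/"))
```

There are no joins or arbitrary relations here: every operation is a lookup by key or by key range, which is what keeps the cost of each request easy to reason about.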

etcd and Kubernetes

If you spend time looking at the Kubernetes control plane, you’ll notice that etcd is where the Kubernetes API server keeps all of the information about a cluster’s state; in fact, it’s the sole stateful element of the entire control plane, meaning it is the only one that persists state.

Kubernetes Components and etcd as persistence store (Image source: Kubernetes.io)

Kubernetes monitors this data and uses etcd’s “watch” function to update itself when changes occur. The “watch” function can trigger a response when the values reflecting the cluster’s actual and desired states diverge.
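The idea of reacting when actual and desired state diverge can be sketched in a few lines of Python. This is only an illustration of the comparison step: the function `diff_states`, the workload names, and the action labels are hypothetical and are not Kubernetes or etcd APIs.

```python
def diff_states(desired, actual):
    """Return the actions needed to move `actual` toward `desired`.

    Both arguments map a workload name to a replica count.
    """
    actions = []
    for name, replicas in sorted(desired.items()):
        current = actual.get(name, 0)
        if current < replicas:
            actions.append(("scale_up", name, replicas - current))
        elif current > replicas:
            actions.append(("scale_down", name, current - replicas))
    # Anything running that is no longer desired should go away.
    for name in sorted(set(actual) - set(desired)):
        actions.append(("delete", name, actual[name]))
    return actions


desired = {"web": 3, "api": 2}
actual = {"web": 1, "api": 2, "old-job": 1}
print(diff_states(desired, actual))
# [('scale_up', 'web', 2), ('delete', 'old-job', 1)]
```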

Why does Kubernetes use etcd?

There are various databases in the ecosystem, so have you wondered why etcd fits so well? The primary reason is that it is part of the CNCF ecosystem, where tools work well with each other, and it has grown to suit Kubernetes. But what else?

Distributed Database

The main advantage of combining Kubernetes and etcd is that etcd is itself a distributed database, so it works in tandem with Kubernetes clusters. Because the cluster’s state lives in etcd, a healthy etcd is critical for cluster health. The Kubernetes community has leveraged it widely for managing cluster state, which enables more automation for dynamic workloads.

Change Notification

With etcd, clients can subscribe to changes to a specific key or a set of keys. Kubernetes makes great use of these change notifications, and it’s one of the features that Kubernauts love.
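Here is a small Python sketch of the subscription idea, assuming a single in-process store with callbacks. Real etcd watches are streamed to clients over the network; the class `WatchableKV` and its method names are made up for this example.

```python
class WatchableKV:
    """Toy store that notifies subscribers about changes under a key prefix."""

    def __init__(self):
        self._data = {}
        self._watchers = []   # list of (prefix, callback) pairs

    def watch_prefix(self, prefix, callback):
        self._watchers.append((prefix, callback))

    def put(self, key, value):
        self._data[key] = value
        self._notify("PUT", key, value)

    def delete(self, key):
        if key in self._data:
            del self._data[key]
            self._notify("DELETE", key, None)

    def _notify(self, event_type, key, value):
        for prefix, callback in self._watchers:
            if key.startswith(prefix):
                callback(event_type, key, value)


events = []
store = WatchableKV()
store.watch_prefix("/pods/", lambda *event: events.append(event))

store.put("/pods/web-1", "Running")
store.put("/services/web", "ClusterIP")   # outside the watched prefix
store.delete("/pods/web-1")
print(events)
# [('PUT', '/pods/web-1', 'Running'), ('DELETE', '/pods/web-1', None)]
```

The subscriber never polls; it only hears about the keys it asked for, which is the property Kubernetes components rely on.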

Highly Available

etcd is deployed for high availability as a cluster with an odd number of nodes, three or more. The etcd cluster is made up of nodes that share nothing. One node serves as the cluster’s leader, while the others serve as followers. At run-time, the leader node is elected using the Raft algorithm. This eliminates single points of failure caused by network connectivity issues, power outages, hardware failures, unanticipated maintenance, and so on.
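Why an odd number? A majority-based cluster needs more than half of its members to agree, so a quick bit of arithmetic shows what each cluster size buys you. The helper names `quorum` and `fault_tolerance` below are just for this sketch.

```python
def quorum(members):
    """Smallest number of nodes that forms a majority."""
    return members // 2 + 1


def fault_tolerance(members):
    """How many nodes can fail while a majority is still reachable."""
    return members - quorum(members)


print("nodes  quorum  tolerated failures")
for n in range(1, 8):
    print(f"{n:>5}  {quorum(n):>6}  {fault_tolerance(n):>18}")
```

A cluster of 3 tolerates 1 failure, and so does a cluster of 4. Going from an odd size to the next even size adds a node without adding fault tolerance, which is why sizes like 3, 5, or 7 are the usual choice.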

Reliable

etcd immediately saves each request, which arrives as a gRPC request, to a log file (the write-ahead log) and periodically creates a snapshot file to keep the log file from growing too large. The snapshot file is sorted by key and contains key-value pairs arranged in a B+ tree structure.

The etcd cluster can restore data from log files and snapshots and resume the service if it crashes or stops due to a problem.
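The log-then-snapshot pattern can be sketched in Python as follows. This is a simplified model under loud assumptions: the “files” are plain in-memory values, entries are JSON lines, and a snapshot is taken after a fixed number of writes. The class `DurableKV` and the `snapshot_every` parameter are hypothetical and do not reflect etcd’s on-disk format.

```python
import json


class DurableKV:
    """Toy store that logs every write before applying it."""

    def __init__(self, snapshot_every=3):
        self.state = {}
        self.wal = []          # stand-in for the write-ahead log file
        self.snapshot = "{}"   # stand-in for the snapshot file
        self.snapshot_every = snapshot_every

    def put(self, key, value):
        # 1. Record the request in the log first.
        self.wal.append(json.dumps({"key": key, "value": value}))
        # 2. Only then apply it to the in-memory state.
        self.state[key] = value
        # 3. Snapshot and truncate the log so it cannot grow forever.
        if len(self.wal) >= self.snapshot_every:
            self.snapshot = json.dumps(self.state, sort_keys=True)
            self.wal.clear()

    @classmethod
    def recover(cls, snapshot, wal):
        """Rebuild state from the last snapshot plus the log written after it."""
        node = cls()
        node.state = json.loads(snapshot)
        for line in wal:
            entry = json.loads(line)
            node.state[entry["key"]] = entry["value"]
        return node


node = DurableKV(snapshot_every=3)
for i in range(4):
    node.put(f"k{i}", i)

# Pretend the process crashed: only the snapshot and the log survive.
restored = DurableKV.recover(node.snapshot, node.wal)
print(restored.state == node.state)   # True
```

After four writes the snapshot holds the first three keys and the log holds only the fourth, yet replaying the log on top of the snapshot gives back the full state.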

Reliably consistent

By default, each read from etcd returns the most up-to-date data in the cluster. When a write request is received, the leader node proposes it to the followers. If the majority of nodes agree, the leader commits the request and asks the followers to commit it as well. Any node in the cluster can receive a request from an etcd client. If a client sends a request to a follower, the request will be forwarded to the leader node.
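The request flow above can be sketched as a toy Python model. It assumes a fixed leader and skips everything that makes Raft hard (elections, terms, log repair), so treat it as an illustration of forwarding and majority commit only. `Node`, `make_cluster`, and `handle_put` are names invented for this example.

```python
class Node:
    """Toy cluster member with a fixed leader."""

    def __init__(self, name):
        self.name = name
        self.leader = None     # set by make_cluster
        self.followers = []    # only used on the leader
        self.up = True
        self.log = []          # committed entries

    def handle_put(self, entry):
        # A follower does not decide anything; it forwards to the leader.
        if self.leader is not self:
            return self.leader.handle_put(entry)

        cluster_size = len(self.followers) + 1
        reachable = [f for f in self.followers if f.up]
        acks = 1 + len(reachable)            # the leader counts itself
        if acks < cluster_size // 2 + 1:
            return False                     # no majority, nothing is committed

        self.log.append(entry)
        for follower in reachable:
            follower.log.append(entry)
        return True


def make_cluster(names):
    nodes = [Node(name) for name in names]
    leader = nodes[0]
    for node in nodes:
        node.leader = leader
    leader.followers = nodes[1:]
    return nodes


a, b, c = make_cluster(["a", "b", "c"])
print(b.handle_put("x=1"))   # True: sent to a follower, forwarded, 3 of 3 agree
c.up = False
print(b.handle_put("x=2"))   # True: 2 of 3 is still a majority
b.up = False
print(a.handle_put("x=3"))   # False: the leader alone is not a majority
```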

Speed

The key-value store is benchmarked at 10,000 writes per second, but etcd’s performance depends primarily on storage disk speed, so SSDs are strongly recommended for etcd deployments.

Security

etcd stores Kubernetes secrets and other highly sensitive configuration data, so it needs to be secure by design. It has some excellent built-in security at its core to protect the data.

Secure Transport

etcd supports automatic Transport Layer Security (TLS) and optional Secure Sockets Layer (SSL) client certificate authentication.

etcd access with mutually authenticated TLS (Source: Rafay)

RBACs (Role-Based Access Controls)

Within a deployment, etcd supports role-based access controls, ensuring that team members who work with it have only the least-privileged level of access required to do their work.

Isolation

etcd supports serializable isolation by MVCC (Multi-version Concurrency Control).
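To see what multi-versioning means in practice, here is a minimal Python sketch in which every write creates a new revision and old values stay readable. It is an assumption-heavy toy (no deletes, no compaction, no transactions), and `MVCCStore` is a made-up name, not an etcd API.

```python
class MVCCStore:
    """Toy multi-version store: writes never overwrite, they add a revision."""

    def __init__(self):
        self.revision = 0
        self._history = {}   # key -> list of (revision, value), oldest first

    def put(self, key, value):
        self.revision += 1
        self._history.setdefault(key, []).append((self.revision, value))
        return self.revision

    def get(self, key, revision=None):
        """Read the value of `key` as of `revision` (default: latest)."""
        if revision is None:
            revision = self.revision
        value = None
        for rev, val in self._history.get(key, []):
            if rev > revision:
                break
            value = val
        return value


mvcc = MVCCStore()
r1 = mvcc.put("replicas", 1)
r2 = mvcc.put("replicas", 3)

print(mvcc.get("replicas"))       # 3, the latest value
print(mvcc.get("replicas", r1))   # 1, the value as of the first revision
```

A reader that pins a revision keeps seeing a consistent view of the data even while writers keep adding newer revisions, which is the basis for isolation without blocking.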

Simple

Using standard HTTP/JSON tools, any program, from simple web apps to extremely complicated container orchestration engines like Kubernetes, can read or write data to etcd. It’s effortless to use, and you can play with it in the virtual lab here.
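As a sketch of what such a request looks like, the Python snippet below builds (but does not send) a put request for etcd’s v3 JSON gateway, where keys and values are base64-encoded. The endpoint `http://localhost:2379` and the `/v3/kv/put` path are assumptions about a local, reasonably recent etcd; older releases expose the gateway under a different path prefix, so check the documentation for your version. `build_put_request` is a hypothetical helper.

```python
import base64
import json


def b64(text):
    """Base64-encode a string the way the JSON gateway expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_put_request(key, value, endpoint="http://localhost:2379"):
    """Return the URL and JSON body for storing `value` under `key`."""
    url = f"{endpoint}/v3/kv/put"
    body = json.dumps({"key": b64(key), "value": b64(value)})
    return url, body


url, body = build_put_request("foo", "bar")
print(url)    # http://localhost:2379/v3/kv/put
print(body)   # {"key": "Zm9v", "value": "YmFy"}
```

The resulting URL and body can be sent as a POST with any HTTP client, for example curl.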

Final Thoughts

etcd was released even before Kubernetes, and in 2014 Google adopted the database for configuration management. They are now a perfect fit, as you might have noticed from your first contact with the orchestration tool. One thing to point out, though, is that etcd exists without Kubernetes, and it has more applications than just being a part of the control plane, for example in Rook and CoreDNS.

I hope you enjoyed this article, and there are a few more introductory articles which you might want to go through here:

Also, feel free to subscribe to our newsletter, where we talk about updates on the tools, deep dives on powerful cloud native tools and aspirations/philosophy every week.

