Rook: Cloud-Native Storage Orchestration

by | 03.10.2021 | Engineering

Wouldn’t it be nice if there were a simpler way to run large applications without the added headaches of infrastructure needs and resource availability? This idea drove the creation of containerized systems that package all the elements needed to run large applications. Much as on a cloud-native platform, users can simply package their applications in containers and deploy them.

One of the younger players leading the charge against resource-intensive application platforms is Rook. A recent graduate of the CNCF (Cloud Native Computing Foundation), Rook is an open source cloud-native storage orchestrator for Kubernetes. It provides the platform, framework and support for a diverse set of storage solutions.

What’s A Container?

Compared with virtual machines, containers take a different approach: they rely on operating-system-level isolation to deploy and run applications. What makes containerization attractive to companies is the ability to run applications directly on the host OS without a separate virtual machine for each one. A container usually holds everything needed to run an application, including libraries, packages and other infrastructure-independent software. The host operating system enforces limits on each container, so that a single container cannot consume all of the host’s physical resources.

Strimzi Overview Source: Admin Magazine

Containers virtualize a single application and create isolated instances that others can access. Because resources are allotted per container, a failure in one container is not replicated elsewhere and does not slow down the whole host; it affects only that container. This isolation also eliminates compatibility problems between containerized applications running on the same operating system.

So What’s With All The Hype For Rook?

Where does Rook come into all this? Rook turns storage software into self-managing, self-scaling, and self-healing storage services. It does this by automating many of the important processes of storage including deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring, and resource management.
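Self-healing and self-scaling of this kind are usually built on a control loop that compares the desired state with the observed state and acts on the difference. The sketch below illustrates that general idea only; it is not Rook's actual code, and `StorageState` and `reconcile` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StorageState:
    """Simplified, hypothetical view of a storage cluster."""
    healthy_daemons: int


def reconcile(desired: StorageState, observed: StorageState) -> List[str]:
    """Return the actions needed to move the observed state toward the desired one."""
    actions: List[str] = []
    missing = desired.healthy_daemons - observed.healthy_daemons
    if missing > 0:
        # Healing or scaling up: start replacements for failed or missing daemons.
        actions.extend(f"start daemon #{i + 1}" for i in range(missing))
    elif missing < 0:
        # Scaling down: stop the surplus daemons.
        actions.extend(f"stop daemon #{i + 1}" for i in range(-missing))
    return actions


if __name__ == "__main__":
    # Three daemons are wanted, only one is healthy: two must be started.
    print(reconcile(StorageState(3), StorageState(1)))
```

A real operator would run such a comparison continuously and react to cluster events rather than to a single call.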

Rook relies on the facilities of the underlying cloud-native container platform to perform management, scheduling and deployment. It further helps users to schedule workloads and manage lifecycles and resources, while integrating with the platform's user experience. Rook currently holds graduated status under the CNCF and continues to receive community support in making it more user friendly.

All this is achieved by Rook running on Kubernetes worker nodes and using the disks specified by the user. Rook can further create separate pools containing only SSDs (solid-state drives) for applications that require more IOPS (input/output operations per second), while keeping a fallback pool for all other storage types.
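With the Ceph backend, a pool of this kind is typically declared as a `CephBlockPool` custom resource whose `deviceClass` field restricts it to one class of device. The sketch below builds such a manifest as a Python dictionary and prints it as JSON, which `kubectl` also accepts. The pool name `fast-pool`, the `rook-ceph` namespace and the replica count are assumptions chosen for illustration.

```python
import json


def ssd_block_pool(name: str, replicas: int = 3) -> dict:
    """Build a CephBlockPool manifest that only uses SSD-class devices."""
    return {
        "apiVersion": "ceph.rook.io/v1",
        "kind": "CephBlockPool",
        # Assumption: the cluster lives in the "rook-ceph" namespace.
        "metadata": {"name": name, "namespace": "rook-ceph"},
        "spec": {
            # Spread replicas across different hosts.
            "failureDomain": "host",
            "replicated": {"size": replicas},
            # Only place this pool's data on devices classified as SSDs.
            "deviceClass": "ssd",
        },
    }


if __name__ == "__main__":
    # "fast-pool" is a hypothetical name for this example.
    print(json.dumps(ssd_block_pool("fast-pool"), indent=2))
```

The printed manifest could be saved to a file and applied with `kubectl apply -f`, provided the Rook operator and its custom resource definitions are already installed.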

For every disk that it manages, Rook starts a pod that manages that disk, and it starts further pods for each containerized service. One of these containerized services is the RADOS Gateway, which provides object storage and is compatible with the AWS S3 and Swift APIs. Rook deploys storage either in a hyper-converged format (where it runs on the same nodes as other applications) or in a hyperscale format (where complete nodes are dedicated to Ceph services only). This gives users two deployment models to choose from.

What is Ceph?

Ceph is a massively scalable and high-performance distributed storage system with no single point of failure. Ceph is a software-defined storage (SDS) system, which means it can run on any hardware that meets its requirements.

Common Rook Architecture Source: CloudOps

The hyper-converged solution helps in making better use of resources and in lowering infrastructure costs. If the system does run out of storage, Rook automatically adds more disks and initializes a separate storage class, based on the pools.

The Tools of The Trade: Storage Type Options With Rook

This is where the ‘meat and potatoes’ of the platform comes into play. Users will be amazed at the array of options to hold data and configure storage with Rook. Here is just a glimpse of some of the most discussed types:-

  • Block:- Allows users to create storage blocks which are further consumed by pods for user access.
  • Object:- Stores data of many types as objects, which helps users keep data that is accessible from inside as well as outside the cluster.
  • Shared File System:- As the name implies, a shared file system holds data that can be shared across multiple pods, depending on user requirements. Access control and resource sharing can also be configured separately.
  • Ceph Dashboard:- As Rook was originally designed to be run with Ceph, the cluster dashboard helps users view the current status of the system and make changes accordingly.
  • Tools:- A toolbox container that houses the full suite of Ceph clients. This is where users can debug and troubleshoot Rook clusters for dealing with container breakdowns or handling clusters that are consuming excessive resources.
  • Monitoring:- All Rook clusters have exporters, which are applications to monitor the status of the containers through Prometheus.
  • Teardown:- The cleanup procedure for removing a cluster and all of its resources once testing is done.
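To make the block storage option above more concrete, the following sketch builds a standard Kubernetes `PersistentVolumeClaim` that a pod could mount as a volume. The storage class name `rook-ceph-block` follows the naming used in Rook's example manifests but is an assumption here, as are the claim name and size; a matching storage class must already exist in the cluster.

```python
import json


def block_volume_claim(name: str, size: str = "1Gi",
                       storage_class: str = "rook-ceph-block") -> dict:
    """Build a PersistentVolumeClaim for a Rook-provisioned block volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # Assumption: a storage class with this name is backed by Rook.
            "storageClassName": storage_class,
            # Block volumes are normally mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # "app-data" is a hypothetical claim name for this example.
    print(json.dumps(block_volume_claim("app-data", size="5Gi"), indent=2))
```

A pod then refers to the claim by name in its `volumes` section, and the volume is provisioned on demand.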
Orchestrator Overview for Rook Source: Intel

All That Works Well With Rook

Rook was created as a software-defined storage solution that allows users to run native apps on a containerized platform. In simpler words, it’s a great tool to have in one’s arsenal to create separate streams or pools of storage from a range of storage types.

Beyond just storage applications, it also allows resources to be moved across a number of on-site and cloud platforms. Some critics may point to limited support for some of these storage types, but that is a challenge every young project faces and is not unique to Rook.

Rook is also great at scaling storage horizontally and vertically, with fast, automatic provisioning of volumes for pods. Long gone are the days of manually taking systems down to fix corrupted disks, of slow manual deployments and of constant resource constraints.

Rook is also highly compatible with other data services, such as Apache Cassandra NoSQL databases and the CockroachDB and YugabyteDB distributed cloud-native SQL databases. At its core, Rook runs as a Kubernetes operator that monitors resources to ensure that storage is

…And All That Doesn’t

While Rook has few notable limitations, it does take a completely different approach to storage, one that diverges sharply from virtual machines. Companies that have built their infrastructure and cloud environments on VMs will likely face a hard transition to tools like Rook.

Rook is, after all, a new addition to the industry and has yet to see wider adoption, as containerization still remains unfamiliar to bigger stakeholders. There is also a gap in available resources on how Rook can be applied in larger organizations that deal with multiple data streams or applications. That said, it’s up to the community to highlight the often overlooked advantages of Rook to the public.

Final Thoughts and Review

If you still haven’t been hooked by Rook, it doesn’t hurt to dive into the community documentation or even try running it. Rook can be deployed with just a few commands from the Kubernetes command line and requires fairly few packages to get going.
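As a rough idea of what those few commands look like, the sketch below assembles the `kubectl create` calls for a Ceph-backed cluster and prints them without running anything. The manifest file names match those in the examples directory of the Rook repository for recent releases, but treat them as assumptions: names and locations vary between versions, so check the documentation for the release you use.

```python
from typing import List

# Assumption: these example manifests come from the Rook repository and
# are present in the current working directory.
OPERATOR_MANIFESTS = ["crds.yaml", "common.yaml", "operator.yaml"]
CLUSTER_MANIFEST = "cluster.yaml"


def deploy_commands() -> List[List[str]]:
    """Return the kubectl invocations for a basic Rook deployment, in order."""
    create_operator = ["kubectl", "create"]
    for manifest in OPERATOR_MANIFESTS:
        create_operator += ["-f", manifest]
    # The cluster itself is created only after the operator is installed.
    create_cluster = ["kubectl", "create", "-f", CLUSTER_MANIFEST]
    return [create_operator, create_cluster]


if __name__ == "__main__":
    # Print the commands instead of executing them, so nothing is changed.
    for command in deploy_commands():
        print(" ".join(command))
```

In practice it is worth waiting until the operator pod is running before creating the cluster.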

Containerization as part of a storage strategy is already advancing the industry by leaps and bounds, but it has yet to be formalized in a way that would make it the ‘gold standard’ for how data is handled. Containers provide a great many advantages, especially in environments where resources are stretched to their limits. Rook stands as a shining beacon of a platform that can bring many of these advantages to more users but, as usual, is marred by the same issues as any young contender in the industry.

But the fact that it hasn’t yet reached the popularity of far larger projects shouldn’t deter people from using it. Take a gander through our other articles and tune in next time, as we discuss another unknown or lesser-used platform in detail.

Happy Learning!
