You will learn

  • Create a persistent volume.
  • Create an application that keeps its state in the volume.
  • Use the StatefulSet, PersistentVolumeClaim, and PersistentVolume objects.

Persistent volumes and data in Kubernetes

In the world of Kubernetes, we have to reckon with the fact that a container can start on any node and can be terminated at any time. This is not a problem if the container only processes requests and sends responses, e.g. performs some calculation. Complications arise when the container stores important data.

If the output of the application depends only on its input, we say that the application is stateless - nothing but the input affects the output. An example is an image compression service: it produces a compressed image from the uncompressed image on its input and needs no additional data. If a container has no state, it basically does not matter on which node it runs.

If the container stores and reads data and that data affects the processing result, we say that it has state. State is the set of data that is important for the result of processing. An example is a database: it modifies or reads a set of files in a directory according to commands. Clearly, problems will arise if the database stops working on one node and moves to another while the files that represent its state are not moved with it.

Mostly this problem is solved by keeping the important state files in a dedicated location, such as a network file system accessible from all nodes. If we can properly separate the state from the rest of the container, the pod can easily be restarted on another node, because the important data remains in one place.

Persistent volumes

Persistent volumes (PersistentVolumes) are a way to separate the state of an application from a container.

PersistentVolume is a type of object that expresses one place where data can be stored - an entire disk, a directory, a network drive, or a shared directory on a network file system.

We know several types of persistent volumes depending on what source they represent:

  • hostPath: a directory on the current node (usable only for a single-node cluster)
  • localPersistentVolume: a directory or disk on a specific node
  • iSCSI: a disk on a network storage server using the iSCSI protocol
  • NFS: a UNIX shared network directory
  • GlusterFS: a distributed file system
  • RBD: a block device of the Ceph distributed file system
  • various other options for cloud providers (Azure, Amazon, Google, ...)

Official documentation of persistent volumes

We choose the type of persistent volume according to the disk capacity we have available. A persistent volume object can be created manually, or automatically on demand.

Persistent volume in Docker Engine

If we are installing Kubernetes using the Docker Engine, we will implement the persistent volume using hostPath.

See tutorials.

Create a desktoppv1.yaml configuration file with a persistent volume.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: desktoppv1
  labels:
    type: local
spec:
  storageClassName: local
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  hostPath:
    path: "/tmp/desktoppv1"

Persistent volumes do not belong to any namespace; they are cluster-wide objects.

Here's how to create a persistent volume:

kubectl apply -f desktoppv1.yaml
kubectl get pv

The PersistentVolume object expresses that space is available in a specific directory on the virtual machine where Kubernetes is running; it "wraps" that space as a Kubernetes object. If we want to use this space in a container, we must create a mapping.

A StorageClass object is used to configure automatic allocation. If the system is configured for this, a part of the available disk capacity is automatically reserved as required.

If not, then the system waits for the administrator to create the space (directory or disk) manually and create an object of type PersistentVolume.
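As an illustration, a manually administered storage class might be declared as follows. This is only a sketch: the hostPath example in this exercise works even without an explicit StorageClass object, because a claim with storageClassName "local" binds to a volume with the same class name by name matching alone.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # Must match the storageClassName used in the
  # PersistentVolume and PersistentVolumeClaim objects
  name: local
# No automatic provisioning - the administrator creates
# PersistentVolume objects manually
provisioner: kubernetes.io/no-provisioner
# Delay binding until a pod actually uses the claim
volumeBindingMode: WaitForFirstConsumer
```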

StatefulSet

The mapping between the persistent volume and Pod is provided by an object of type PersistentVolumeClaim. This object expresses a specific application request for storage space. If there is a PersistentVolume that matches the request, a mapping will be created.

We can create this object separately, or use a special object of type StatefulSet. It contains rules for creating and deleting pods on any node so that access to their state is maintained. If the application data is located on a local disk, Kubernetes ensures that the pod runs only on the node where the volume is located. The files in the persistent volume are thus accessible transparently: as a rule, the container does not need to know where and how the files it works with are stored.
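If we wanted to create the request separately, a minimal PersistentVolumeClaim could look like the following sketch (the name mypvc is only illustrative). It would bind to the desktoppv1 volume created earlier, because the storage class, access mode, and size all match.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  # Must match the storageClassName of the volume
  storageClassName: local
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      # At most the capacity of the volume
      storage: 1Gi
```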

Let's try an example of StatefulSet and demonstrate deploying a relational database.

The StatefulSet object will be similar to the Deployment object, but it will also contain a template for creating the request to a persistent volume of type PersistentVolumeClaim:

Create a file postgres_ss.yaml and write in it:

# Used API
apiVersion: apps/v1
# Object type
kind: StatefulSet
# Object name
metadata:
  name: postgres
# Object specification
spec:
  # link to the pod according to the pod label
  selector:
    matchLabels:
      app: postgres
  # service name
  serviceName: postgres
  # Number of instances of pod
  replicas: 1
  # Pod template
  template:
    metadata:
      # Pod label
      labels:
        app: postgres
    spec:
      # Pod containers
      containers:
        # container name
      - name: postgres
        # Image name
        image: postgres:10.5
        # open container port
        ports:
          - name: postgres
            containerPort: 5432
            protocol: TCP
        # Container configuration environment variables
        env:
            # Database user name
          - name: POSTGRES_USER
            value: postgres
            # Name of the database
          - name: POSTGRES_DB
            value: postgres
            # Password to connect to the database
          - name: POSTGRES_PASSWORD
            value: verysecret
        # Persistent volume requirements
        volumeMounts:
          - mountPath: /var/lib/postgresql/data
            # Name of the volume request
            name: postgrespvc
  volumeClaimTemplates:
  - metadata:
      # the name of the persistent volume request
      # Must be the same as the volume name
      # in volumeMounts in Pod
      name: postgrespvc
    spec:
      accessModes: ["ReadWriteOnce"]
      # Automatic assignment of a persistent volume;
      # the storage class determines which volume is allocated
      storageClassName: "local"
      resources:
        requests:
          # Persistent volume size requirements
          storage: 1Gi

In the volumeClaimTemplates section, we write templates for persistent volume requests of type PersistentVolumeClaim (there can be more than one). In this case, we have declared an interest in storage with a size of at least 1 GB (storage: 1Gi) that can be mounted for reading and writing by a single node at a time (accessModes: ["ReadWriteOnce"]). We have requested allocation of the persistent volume from the local storage class (storageClassName: local).

Create an object of type StatefulSet:

kubectl apply -f postgres_ss.yaml -n cv7

When you create an object of type StatefulSet, an object of type PersistentVolumeClaim is automatically created, which represents the request to create a persistent volume. Let's see what happens in our cluster:

# Most important objects, but PersistentVolumeClaim not visible
kubectl get all -n cv7
kubectl describe statefulset/postgres -n cv7
# The PersistentVolumeClaim must be queried separately
kubectl get pvc -n cv7
# Find out the name of the persistent volume request
kubectl describe statefulsets/postgres -n cv7
kubectl describe pvc/postgrespvc-postgres-0 -n cv7
# Let's see the state of the persistent volume
kubectl describe pv/desktoppv1
# Persistent volumes do not have a namespace

The StatefulSet object, like Deployment, creates ReplicaSet and Pod objects. In addition, it creates a request to create a persistent volume of type PersistentVolumeClaim according to the specified template. When there is a suitable object of type PersistentVolume (persistent volume) for the object PersistentVolumeClaim, it is possible to create a mapping and start a new pod managed by the StatefulSet object.

The database should run. If not, use the get, describe, or logs commands to find the cause.
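For example, the cause of a failing pod can usually be found like this (the pod name postgres-0 follows the StatefulSet naming convention name-ordinal):

```shell
# List pods and check their STATUS column
kubectl get pods -n cv7
# Detailed events for the first pod of the StatefulSet
# (e.g. failed scheduling or volume binding)
kubectl describe pod/postgres-0 -n cv7
# Container log output (e.g. PostgreSQL startup messages)
kubectl logs postgres-0 -n cv7
```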

Exposing the Service

If the persistent volume is OK, we can test the functionality of the new StatefulSet object.

We express the presence of a new service in the cluster using a Service object. It determines under which name and port the service will be available.

Create a service for the postgresql database so that other objects can use it.

Save the service configuration e.g. to the file postgres-service.yaml.

apiVersion: v1
kind: Service
metadata:
  name: postgresservice
spec:
  selector:
    app: postgres
  type: ClusterIP
  ports:
    - protocol: TCP
      # Service port
      port: 5432
      # Container port
      targetPort: 5432

This service called postgresservice will only be available within the cluster on port 5432.
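We can verify the service from inside the cluster, e.g. by starting a temporary client pod. This is only a sketch; the pod name psql-client is arbitrary.

```shell
# Run a throwaway pod with the psql client and connect
# to the database through the service DNS name
kubectl run psql-client --rm -it --image=postgres:10.5 -n cv7 \
  --env="PGPASSWORD=verysecret" -- psql -h postgresservice -U postgres
```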

Graphical interface to the database

Let's create an interface with which we can connect to the database.

The "pgadmin" web interface will run as a separate Deployment object with its own service. When we connect to it using a web browser, we can manage the database service, which is reachable inside the cluster under the DNS name postgresservice.

The "pgadmin" interface will be accessible from the browser on port 30881.

In this example, we place both the service and the deployment in one file; the configurations are separated by ---.

File pgadmin-deploymentservice.yaml:

apiVersion: v1
kind: Service
metadata:
  name: pgadmin
spec:
  selector:
    app: pgadmin
  # The service type changes from ClusterIP to NodePort
  type: NodePort
  ports:
    - protocol: TCP
      port: 8800
      targetPort: 80
      # Port visible on each node
      nodePort: 30881
---
# See the API version for documentation
apiVersion: apps/v1
# Object type
kind: Deployment
# About the object
metadata:
  # Object name
  name: pgadmin-deployment
# object specification
spec:
  # The number of pods to create
  replicas: 1
  # The selector creates a Deployment and Pod link
  # Selects those PODs that have the tag pgadmin
  selector:
    matchLabels:
      app: pgadmin
  # POD template
  template:
    metadata:
      # POD label - to connect Deployment and Pod
      labels:
        app: pgadmin
    spec:
      # POD containers
      containers:
      # Only one pgadmin container
      - name: pgadmin
        # Image name and version
        image: dpage/pgadmin4
        ports:
        # POD has port 80 open
        - containerPort: 80
        env:
        - name: PGADMIN_DEFAULT_EMAIL
          value: admin@admin.sk
        - name: PGADMIN_DEFAULT_PASSWORD
          value: verysecret

We apply the configuration, and after a while the cluster should be running a web application through which we can connect to the database and insert some data.
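A quick way to check that the interface is up, assuming the cluster node is reachable as localhost (as with a local Docker Engine installation):

```shell
# Check that the deployment and service are running
kubectl get deployment,service -n cv7
# pgadmin answers on the NodePort; it typically responds
# with a redirect to its login page
curl -i http://localhost:30881/
```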

We use environment variables to specify the login e-mail and password.


Stateful applications, virtual disks in Kubernetes