Docker Swarm volumes

Containers are ephemeral. Data written inside a container lives in the container's writable layer, which is discarded together with the container, so even if the container is set up with automatic restart, a replacement container will not have access to the data created inside the old one. To keep the data of Docker containers persistent, we need to create volumes that live outside of the ephemeral containers. Don’t map volumes that store important information to the host. Use platform storage, like Amazon EBS, GCE Persistent Disk, or Azure Disk Volume, to store all your important data. Set up an automatic backup process to create volume snapshots regularly, so even if the host goes away, your data is safe.
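As a minimal sketch of the snapshot step, assuming the data lives on an Amazon EBS volume and the AWS CLI is configured on the machine, a small shell function can request a snapshot. The function name snapshot_volume and the volume ID in the example call are hypothetical.

```shell
# Sketch: request a snapshot of one EBS volume.
# snapshot_volume is a hypothetical helper name; pass the real volume ID as $1.
snapshot_volume() {
  volume_id=$1
  aws ec2 create-snapshot \
    --volume-id "$volume_id" \
    --description "Scheduled backup of $volume_id"
}

# Example call (the ID is a placeholder):
# snapshot_volume vol-0123456789abcdef0
```

To make the backup regular, a function like this could be called from a scheduled job such as cron. How often to run it depends on how much data you can afford to lose.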

Docker Swarm uses volume drivers to connect to external resources. One of the most versatile is REX-Ray.

We will add a volume to MongoDB running in Docker Swarm.

Install the REX-Ray plugin on the Docker host

docker plugin install rexray/ebs EBS_ACCESSKEY=aws_access_key EBS_SECRETKEY=aws_secret_key

Create a 1 GB Docker volume with REX-Ray and gp2 EBS backing.

docker volume create --driver rexray/ebs --opt size=1 --opt volumetype=gp2 --name ebs_volume

Launch a new MongoDB service and map the volume to the MongoDB data directory

docker service create --network my_overlay_network --replicas 1 --mount type=volume,src=ebs_volume,target=/data/db --name mongodb mongo
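To check the result, a short sketch like the one below can help. It assumes the volume and service names used in the steps above (ebs_volume and mongodb). The function name check_mongodb_volume is hypothetical.

```shell
# Sketch: verify the volume and the service created in the steps above.
check_mongodb_volume() {
  # Print the driver of the volume; rexray/ebs is expected here
  docker volume inspect --format '{{ .Driver }}' ebs_volume
  # List the tasks of the service and the nodes they run on
  docker service ps mongodb
}

# Example call on a Swarm manager node:
# check_mongodb_volume
```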

Kubernetes volumes

To understand the types of available volumes, read the official Kubernetes documentation on Volumes.

The official documentation on Kubernetes Persistent Volumes and Persistent Volume Claims is at Persistent Volumes.

Migrating to CSI drivers from in-tree plugins

Kubernetes is moving away from in-tree plugins, which had to be checked into the Kubernetes code repository, to out-of-tree volume plugins like CSI (Container Storage Interface) drivers and FlexVolume. These drivers can be developed by third parties independently of Kubernetes and have to be installed and configured on the node. See Migrating to CSI drivers from in-tree plugins.

Mount propagation

Mount propagation allows containers to share volumes with other containers within the pod or with containers in other pods on the same host. For more information see Mount propagation

Access Mode

Access mode specifies the way volumes are accessed from one or multiple pods.

  • ReadWriteOnce (RWO) – the volume can be mounted as read-write by a single node only
  • ReadOnlyMany (ROX) – the volume can be mounted read-only by many nodes
  • ReadWriteMany (RWX) – the volume can be mounted as read-write by many nodes

For the latest access mode limitations for each volume type see Access Modes

Volume Types

persistentVolumeClaim

persistentVolumeClaim volumes mount a PersistentVolume into a Pod without the Pod knowing the underlying storage backing, like a GCE PersistentDisk or an iSCSI volume. CSI (Container Storage Interface) volume types do not support direct reference from Pods; they can only be referenced via the PersistentVolumeClaim object.
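The sketch below shows what such an indirect reference can look like. It writes a claim and a Pod that uses the claim into one manifest file. The file name mongodb-pvc.yml and the object names mongo-claim and mongodb-on-claim are hypothetical, and the claim can only be bound if the cluster has a matching PersistentVolume or a default storage class, which is an assumption here.

```shell
# Sketch: a PersistentVolumeClaim and a Pod that mounts it.
# All names in the manifest are hypothetical.
cat > mongodb-pvc.yml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: mongodb-on-claim
spec:
  containers:
  - image: mongo
    name: mongo-pod
    volumeMounts:
    - mountPath: /data/db
      name: mongo-volume
  volumes:
  - name: mongo-volume
    # The Pod only names the claim, not the storage behind it
    persistentVolumeClaim:
      claimName: mongo-claim
EOF

# Create both objects with:
# kubectl create -f mongodb-pvc.yml
```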

nfs

nfs (Network File System) volumes can be mounted simultaneously by multiple Pods for writing.
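A minimal sketch of a Pod that mounts an NFS export is below. The server name nfs.example.com, the exported path, the file name nfs-pod.yml, and the object names are all hypothetical placeholders.

```shell
# Sketch: a Pod that mounts an NFS export.
# Server, path, and names are hypothetical placeholders.
cat > nfs-pod.yml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nfs-example
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: nfs-test-container
    volumeMounts:
    - mountPath: /shared
      name: nfs-volume
  volumes:
  - name: nfs-volume
    nfs:
      server: nfs.example.com
      path: /exports/shared
EOF

# Create the Pod with:
# kubectl create -f nfs-pod.yml
```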

awsElasticBlockStore

awsElasticBlockStore (AWS EBS) volumes have the following limitations:

  • the Pods have to run on AWS EC2 instance nodes
  • the EC2 instances have to be in the same region and availability zone as the EBS volume
  • an EBS volume can be attached to only one EC2 instance at a time
  • the access mode can only be ReadWriteOnce

awsElasticBlockStore volume examples

Kubernetes can mount a volume directly to a pod. In this example we mount an AWS EBS volume to the Mongo database pod.

Create a 1 GB volume from the command line

aws ec2 create-volume --availability-zone=us-east-1a --size=1 --volume-type=gp2
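The Pod definition needs the ID of the new volume. As a sketch, the same command can print only the ID when the AWS CLI global options --query and --output are added. The function name create_ebs_volume is hypothetical.

```shell
# Sketch: create the volume and print only its ID.
create_ebs_volume() {
  aws ec2 create-volume \
    --availability-zone=us-east-1a \
    --size=1 \
    --volume-type=gp2 \
    --query VolumeId \
    --output text
}

# Example: keep the ID in a variable for the Pod definition
# VOLUME_ID=$(create_ebs_volume)
```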

Create the mongodb.yml file

apiVersion: v1
kind: Pod
metadata:
  name: mongodb-on-ebs
spec:
  containers:
  - image: mongo
    name: mongo-pod
    volumeMounts:
    - mountPath: /data/db
      name: mongo-volume
  volumes:
  - name: mongo-volume
    awsElasticBlockStore:
      volumeID: <THE_VOLUME_ID>
      fsType: ext4

Create the MongoDB pod

kubectl create -f mongodb.yml

gcePersistentDisk

gcePersistentDisk can be read simultaneously by multiple Pods but written only by one at a time, so it is a good choice as a common configuration source.
Starting in Kubernetes version 1.10 a beta feature allows the creation of Regional Persistent Disks that are available from multiple zones of the same region.

The gcePersistentDisk limitations are:

  • the Pods have to run on GCE VM nodes
  • the nodes have to be in the same GCE project and zone as the Persistent Disk
  • the disk can be mounted to multiple Pods for reading, but only one Pod can write to it. If the Pod is controlled by a ReplicationController, the access mode has to be read-only, or the replica count has to be 0 or 1
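For the shared read-only case, a sketch of a Pod that mounts the disk without write access is below. It assumes a Persistent Disk named my-gce-disk already exists. The file name gce-pd-readonly.yml and the object names are hypothetical.

```shell
# Sketch: a Pod that mounts a GCE Persistent Disk read-only,
# so several such Pods can share the same disk.
cat > gce-pd-readonly.yml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gce-pd-readonly
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: config-reader
    volumeMounts:
    - mountPath: /config
      name: config-volume
      readOnly: true
  volumes:
  - name: config-volume
    gcePersistentDisk:
      # The GCE Persistent Disk has to exist
      pdName: my-gce-disk
      fsType: ext4
      readOnly: true
EOF

# Create the Pod with:
# kubectl create -f gce-pd-readonly.yml
```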

gcePersistentDisk volume example

Create the Persistent Disk (accessible from one zone only)

gcloud compute disks create --size=200GB --zone=us-central1-a my-gce-disk

To create a Regional Persistent Disk (beta in Kubernetes 1.10)

gcloud beta compute disks create --size=200GB my-gce-disk \
    --region us-central1 \
    --replica-zones us-central1-a,us-central1-b

Create the Regional Persistent Volume (beta in Kubernetes 1.10)

apiVersion: v1
kind: PersistentVolume
metadata:
  name: gce-test-volume
  labels:
    failure-domain.beta.kubernetes.io/zone: us-central1-a__us-central1-b
spec:
  capacity:
    storage: 200Gi
  accessModes:
  - ReadWriteOnce
  gcePersistentDisk:
    pdName: my-gce-disk
    fsType: ext4

Then create the Pod definition

apiVersion: v1
kind: Pod
metadata:
  name: gce-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: gce-test-container
    volumeMounts:
    - mountPath: /gce-pd
      name: gce-test-volume
  volumes:
  - name: gce-test-volume
    # The GCE Persistent Disk has to exist
    gcePersistentDisk:
      pdName: my-gce-disk
      fsType: ext4

Working with volumes

List all Persistent Volume Claims

kubectl get pvc
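Two more commands are often useful next to the list of claims. The sketch below wraps them in functions with hypothetical names, list_volumes and claim_phase. The claim name is passed as an argument.

```shell
# Sketch: inspect volumes and claims.
# List the cluster-wide Persistent Volumes behind the claims
list_volumes() {
  kubectl get pv
}

# Print the phase of one claim, for example Pending or Bound
claim_phase() {
  kubectl get pvc "$1" -o jsonpath='{.status.phase}'
}

# Example call (the claim name is a placeholder):
# claim_phase my-claim
```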

Resources to learn Kubernetes

This is a great five-part blog series on Kubernetes, including a post on volumes, by Sebastian Caceres. I recommend reading it even before the official Kubernetes tutorial to get a great overview of how Kubernetes really works.

Kubernetes Tutorials

Resources to learn Docker Swarm

This is a five-part blog series by Sebastian Caceres, and the post on volumes is a great explanation of how Docker Swarm really works. I recommend it even before doing the official Docker Swarm tutorial, to get a peek under the hood.

Kubernetes overview

Kubernetes Hierarchy

  • image
  • container
  • pod (one or more containers deployed together on the same host to share volumes)
  • deployment
  • service

Kubelet

Kubelets run on every host to start and stop pods and communicate with the Docker engine on the host level.

Kube-proxy

Kube-proxies also run on every host to redirect the traffic to specific services and pods.

Container Linux

Container Linux by CoreOS (formerly known as CoreOS Linux, or just CoreOS) is an OS specifically designed to run containers: a lightweight Linux distribution that uses containers to run applications. It does not even have a package manager, but contains the basic GNU Core Utilities for administration. It also includes Kubelet, Docker, etcd, and flannel.

Kubernetes Networking

Flannel

Flannel gives each host a separate IP subnet range to prevent IP address collisions, providing a unique IP address to each container. Flannel is the standard SDN (software-defined network) tool for CoreOS (Container Linux) and is shipped with the distribution.

Calico

Calico provides security in the Kubernetes cluster. By default, any pod in a Kubernetes cluster can communicate with any other pod on any host. Calico restricts inter-pod communication using namespaces and selectors. It allows communication from the host to the pods to enable health checks. Calico has tight integration with Flannel.
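As a sketch of selector-based restriction, the manifest below is a standard Kubernetes NetworkPolicy, which Calico can enforce. The labels app=mongodb and app=frontend, the policy name, and the file name allow-frontend.yml are hypothetical. Once a pod is selected by an ingress policy like this one, other incoming pod traffic to it is denied.

```shell
# Sketch: allow only frontend pods to reach MongoDB pods on port 27017.
# Labels and names are hypothetical.
cat > allow-frontend.yml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-mongodb
  namespace: default
spec:
  # The policy applies to pods labeled app=mongodb
  podSelector:
    matchLabels:
      app: mongodb
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Only pods labeled app=frontend in the same namespace may connect
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 27017
EOF

# Create the policy with:
# kubectl create -f allow-frontend.yml
```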

Canal

As Calico and Flannel nicely fit together, Canal is the combination of the two to provide a comprehensive inter-pod networking solution in the Kubernetes cluster.

Kubernetes commands

  • kubectl get – list resources
  • kubectl describe – show detailed information about a resource
  • kubectl logs – print the logs from a container in a pod
  • kubectl exec – execute a command on a container in a pod

List existing pods

kubectl get pods

Get detailed information on the pods

kubectl describe pods

Start a proxy to access the containers within the pod

kubectl proxy

Get the pod name and store it in the POD_NAME environment variable

export POD_NAME=$(kubectl get pods -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}')
echo Name of the Pod: $POD_NAME
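The template above prints the name of every pod, one per line, so with more than one pod the variable holds several names. A sketch that takes only the first pod is below. The function name first_pod_name is hypothetical, and the call fails if the namespace has no pods.

```shell
# Sketch: print the name of the first pod in the current namespace.
first_pod_name() {
  kubectl get pods -o jsonpath='{.items[0].metadata.name}'
}

# Example: store the single name in the environment variable
# export POD_NAME=$(first_pod_name)
```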

Access an API running in the pod. The name of the pod is in the POD_NAME environment variable

curl http://localhost:8001/api/v1/namespaces/default/pods/$POD_NAME/proxy/

View the STDOUT of the only container of the pod

kubectl logs POD_NAME

View the STDOUT of a specific container of the pod

kubectl logs POD_NAME -c CONTAINER_NAME

View the STDOUT of all containers of the pod

kubectl logs POD_NAME --all-containers=true

Execute a command in the only container of the pod

kubectl exec POD_NAME -- MY_COMMAND

Execute a command in a specific container of the pod

kubectl exec POD_NAME -c CONTAINER_NAME -- MY_COMMAND

Start a Bash session in the container (container name is optional if the pod has only one container)

kubectl exec -ti POD_NAME -c CONTAINER_NAME -- bash

To check an API from the Bash console within the container (use localhost to address it within the container)

curl localhost:8080

Install missing commands on Linux distributions

On some lean systems, mostly in Docker containers, some important commands are not readily available. The table below shows the commands to install them.

To get the name of the Linux distribution execute

cat /etc/os-release

To find the package that contains the command, install apt-file

sudo apt-get install apt-file

Update the file package mapping database

sudo apt-file update

Search for the command at the end of the path

apt-file search --regexp '/MY_COMMAND$'

Select the package that contains the command in the standard path (/usr/bin/)

To get more information on the package

apt-cache show MY_PACKAGE
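The steps above are for Debian and Ubuntu. As a sketch, the function below reads the distribution ID from the os-release file and prints a matching search command. The use of yum provides for RHEL and CentOS is my addition and not part of the steps above. The function name find_package_cmd is hypothetical.

```shell
# Sketch: print the package search command for the current distribution.
# $1: command name to look for
# $2: os-release file to read (defaults to /etc/os-release)
find_package_cmd() {
  cmd=$1
  release_file=${2:-/etc/os-release}
  # os-release is a list of shell variable assignments; read ID in a subshell
  id=$(. "$release_file" && echo "$ID")
  case "$id" in
    ubuntu|debian) echo "apt-file search --regexp '/$cmd\$'" ;;
    rhel|centos|fedora) echo "yum provides '*/$cmd'" ;;
    *) echo "unknown distribution: $id" ;;
  esac
}

# Example call:
# find_package_cmd free
```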

Application install commands

Application | RHEL, CentOS | Ubuntu, Debian
free, kill, pkill, pgrep, pmap, ps, pwdx, skill, slabtop, snice, sysctl, tload, top, uptime, vmstat, w, watch | | apt-get install procps

The advancement of computer programming and personal computer technology

If you really know one programming language, you can learn others too. The most important part is to understand the concepts of computer programming. All languages are built on similar types of instructions; some of them require a semicolon at the end of the line, others don’t. Some of them use curly braces {} to group instructions, others use indentation.

All of them allow you to make decisions, usually with the keyword IF, assign values to variables with =, read the keyboard, write to the screen, and read and write the disk and the network. You loop through items with FOR, FOREACH, and WHILE, and address array elements with [0..]. You only have to learn a few English words and the syntax to use them.

The internet, and especially Stack Overflow, is a great resource to find sample code that does what you need. Avoid assembling your program by copying and pasting code from the internet. Try to understand the examples, and write your own lines to be able to really understand and maintain them.

The list below shows the advancement of personal computer technology. I have added the usual hardware specifications, the most popular operating systems, and important programming languages. The dates are not when the technology was announced, but when the average user started to use it.


1981

IBM Mainframe, magnetic tape and large format magnetic disk storage

  • Fortran

Homemade personal computer with no permanent storage

  • Basic

“Midrange” computer (16 KB solid state or magnetic-core RAM, large format magnetic disk storage )

  • Basic

1984

ZX 81 (3.25 MHz processor, first 1 KB, later 16 KB RAM, compact audio cassette storage)

ZX Spectrum (3.5 MHz processor, 16 KB RAM, compact audio cassette storage)

Commodore 64 (1 MHz processor, 64 KB RAM, 5.25″ floppy disk)

  • Basic for data processing
  • Simons' BASIC for graphical user interface

IBM PC (4.7 MHz processor, first 128 KB, later 256 KB RAM, 5.25″ floppy disk, later 10 or 20 MB 5.25″ hard drive)

DOS

  • dBase
  • Clipper
  • FoxPro

1987

IBM XT ( 4.7-12 MHz processor, 16 MB RAM)

DOS

  • Lotus 123 spreadsheet for engineering calculations
  • LISP for AutoCAD menus

1994

IBM 386 ( 40 MHz processor, 256 MB RAM)

DOS, Windows 3.1

  • FoxPro
  • Visual FoxPro
  • Visual Basic
  • SQL

1998

IBM clone ( 40 MHz processor, 256 MB RAM)

Windows 95

  • PowerBuilder
  • Jaguar for web application server
  • ASP for web UI
  • SQL databases

2000

IBM clone ( 150 MHz processor, 512 MB RAM)

Windows 98, Windows ME, Windows 2000

  • ASP for web UI
  • Visual Basic
  • SQL databases

Linux Debian

  • Bash

2003

( 1 GHz processor, first 1 GB, later 4 GB RAM)

Windows Server 2003

  • C#
  • C++
  • Java
  • SQL databases

2008

( 1 GHz processor, first 4 GB, later 32 GB RAM)

Windows Server 2008


2015

Windows 7 laptop (2 GHz processor, 6 GB RAM, 500 GB HD)

Windows Server 2012 R2 virtual machines in the cloud ( 2 GHz processor, 4 – 32 GB RAM)

  • PowerShell

Linux RedHat 7 ( virtual machines in the cloud 2 GHz processor, 4 – 32 GB RAM)

  • Bash

MacBook Pro laptop (2.5 GHz i7 processor, 16 GB RAM, 1TB SSD storage)

  • Packer
  • Terraform
  • Ruby
  • PowerShell ( on Windows 10 virtual machine )
  • Chef
  • ServerSpec
  • InSpec
  • Chocolatey

2019

MacBook Pro (5 GHz i9 processor, 32 GB RAM, 1TB SSD storage)

  • Docker
  • Kubernetes
  • Golang