first commit
commit
6a75c512c4
8
etcd-inmemory/CHANGELOG
Normal file
@@ -0,0 +1,8 @@
* Wed Feb 15 2023 Erik Brakkee (erik@brakkee.org)
- Cleanup of the scripts. No longer sourcing the .bashrc
  and using the docker wrapper script instead of the docker
  alias.

* Tue Feb 14 2023 Erik Brakkee (erik@brakkee.org)
- Scripts to be able to run etcd in memory.
91
etcd-inmemory/README.md
Normal file
@@ -0,0 +1,91 @@
# Running etcd in memory

This RPM runs etcd in memory for a kubeadm Kubernetes installation. It addresses the problem that, in a home setup,
etcd's frequent disk writes make your disks quite noisy. See also the [post](https://brakkee.org/site/2023/02/14/silencing-kubernetes-at-home/) that describes this setup.

This RPM has been tested on CentOS Stream 8 with kubelet 1.24.10. It will
probably work on other RHEL-like systems as well.

# Disclaimer

This is provided as an example; use it at your own risk. As an
administrator you are responsible at all times for the health of your
Kubernetes cluster and all of its data. Make sure that you have backups and can
restore from backup in case of problems, or try it out on a less critical cluster.

# Requirements

The cluster must use containerd as the container runtime, and etcd must
run as a pod in the Kubernetes cluster, as is the case with a kubeadm cluster
setup.

# Other container runtimes

The RPM contains a docker script that uses nerdctl to adapt to containerd.
It is possible to adapt this setup to another container runtime by replacing
the docker script with another implementation. Currently, only containerd is
supported.

# Building the RPM

Build the RPM using maven and install it on your controller node using yum/dnf.
This will provide the following:
* backups of etcd at 15-minute intervals in the /var/lib/wamblee/etcd
  directory. A number of older backups are also preserved.
* prior to shutdown of the kubelet service, an additional backup is taken.
* prior to startup of the kubelet, a restore is done.

In a production setup you would add a distribution management section to
the pom.xml and configure the maven release plugin to deploy to a
repository (e.g. nexus).

# Setup

After installing the RPM, wait until the first backups appear in /var/lib/wamblee/etcd.
Next, drain the controller node and stop the kubelet.

```
kubectl drain NODENAME --ignore-daemonsets
systemctl stop kubelet
```

Then stop all running containers on the controller node:
```
/opt/wamblee/etcd/bin/docker ps |
  awk 'NR > 1 { print $1}' |
  xargs /opt/wamblee/etcd/bin/docker stop
```

After the above steps, all services in your cluster should still be running.

Now back up the contents of the /var/lib/etcd directory and empty it:
```
cd /var/lib/etcd
tar cvfz ~/etcd.tar.gz .
rm -rf /var/lib/etcd/*
```

Now in /etc/fstab, create an entry to mount /var/lib/etcd in memory:
```
tmpfs /var/lib/etcd tmpfs defaults,noatime,size=2g 0 0
```
Then make sure the /var/lib/etcd directory is empty and mount the ramdisk:
```
rm -rf /var/lib/etcd/*
mount -a
```
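
Before restarting the kubelet, it may be worth confirming that /var/lib/etcd is really backed by tmpfs. The following is a minimal sketch, assuming a Linux `/proc/mounts`-style mount table. The function name `is_tmpfs_mount` and its optional second argument (an alternative mounts file, mainly useful for testing) are illustrative and not part of the RPM.

```shell
#!/bin/bash
# Sketch: report whether a mount point is backed by tmpfs.
# is_tmpfs_mount is a hypothetical helper, not shipped in the RPM.
# Usage: is_tmpfs_mount <mountpoint> [mounts-file]
is_tmpfs_mount() {
  local mountpoint="$1"
  local mounts_file="${2:-/proc/mounts}"
  # Fields per line: device, mount point, filesystem type, options, dump, pass.
  # The last matching entry wins, because later mounts shadow earlier ones.
  awk -v mp="$mountpoint" '
    $2 == mp { fstype = $3 }
    END { exit (fstype == "tmpfs") ? 0 : 1 }
  ' "$mounts_file"
}
```

For example, `is_tmpfs_mount /var/lib/etcd && echo "etcd data is in memory"` prints the message only when the ramdisk is mounted.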

Now you can start the kubelet again using `systemctl start kubelet`. After this,
you should see all the nodes as before: `kubectl get nodes`.

After this, uncordon the controller node:
```
kubectl uncordon NODENAME
```

If anything goes wrong in the above steps, then drain the controller node
(if at all possible), stop the kubelet, stop all containers, unmount
/var/lib/etcd, and then
restore the etcd data from backup and start the kubelet again.
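
The rollback procedure described in the last paragraph of the README can be sketched as a script. This is only an illustration under stated assumptions: the names `run`, `stop_containers`, `rollback_etcd` and the `DRY_RUN` variable are hypothetical, `xargs -r` assumes GNU xargs, and the archive is assumed to be the `~/etcd.tar.gz` created during setup. By default the sketch only prints the commands it would run.

```shell
#!/bin/bash
# Sketch of the rollback steps from the README. Hypothetical helper names.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [[ "$DRY_RUN" -eq 1 ]]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Same pipeline as in the README; -r (GNU xargs) skips the stop when the list is empty.
stop_containers() {
  /opt/wamblee/etcd/bin/docker ps |
    awk 'NR > 1 { print $1 }' |
    xargs -r /opt/wamblee/etcd/bin/docker stop
}

# Usage: rollback_etcd <nodename> <archive>
rollback_etcd() {
  local node="$1" archive="$2"
  # Draining may fail when the API server is unreachable, so do not abort on it.
  run kubectl drain "$node" --ignore-daemonsets || true
  run systemctl stop kubelet
  run stop_containers
  run umount /var/lib/etcd
  # Unpack the pre-migration data into the now disk-backed directory.
  run tar xzf "$archive" -C /var/lib/etcd
  run systemctl start kubelet
}
```

Calling `rollback_etcd NODENAME ~/etcd.tar.gz` prints the six steps in order. Note that the tmpfs line in /etc/fstab would still be active at the next boot unless it is removed as well.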
10
etcd-inmemory/files/etc/cron.d/wamblee-etcd
Normal file
@@ -0,0 +1,10 @@
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
HOME=/root
MAILTO=root

*/15 * * * * root /opt/wamblee/etcd/bin/etcd-cron > /var/log/wamblee-etcd-backup 2>&1
30 0 * * * root /opt/wamblee/etcd/bin/etcdctl defrag --cluster > /var/log/wamblee-etcd-defrag 2>&1
3
etcd-inmemory/files/opt/wamblee/etcd/bin/docker
Executable file
@@ -0,0 +1,3 @@
#!/bin/bash

exec nerdctl -n k8s.io "$@"
33
etcd-inmemory/files/opt/wamblee/etcd/bin/etcd-backup
Executable file
@@ -0,0 +1,33 @@
#!/bin/bash

PATH=/opt/wamblee/etcd/bin:$PATH

if [[ $# -ne 1 ]]
then
  echo "Usage: $0 <backupname>" 1>&2
  exit 1
fi

BACKUP="$1"


IMAGE="$( /opt/wamblee/etcd/bin/docker ps | awk '/\/etcd$/ { print $2}' )"
if [[ -z "$IMAGE" ]]
then
  echo "$0: could not determine the etcd image, cannot create backup" 1>&2
  exit 1
fi

docker run --network host \
  -v /etc/kubernetes/pki/etcd:/etc/kubernetes/pki/etcd \
  -v /var/lib/wamblee/etcd:/var/lib/wamblee/etcd \
  --rm "$IMAGE" sh -c "etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
    --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
    snapshot save /var/lib/wamblee/etcd/$BACKUP"

echo "IMAGE=$IMAGE" > /var/lib/wamblee/etcd/etcdimage

echo "Backup done at /var/lib/wamblee/etcd/$BACKUP"
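
The etcd-backup script finds the etcd image by filtering the `ps` output of the docker wrapper for a line that ends in `/etcd` and printing its second column. The sketch below runs the same awk filter on sample input. The sample listing is made up for illustration and only approximates what the wrapper might print; `etcd_image_from_ps` and `sample_ps_output` are hypothetical names.

```shell
#!/bin/bash
# Extract the etcd image from `ps`-style output on stdin,
# using the same awk filter as etcd-backup.
etcd_image_from_ps() {
  awk '/\/etcd$/ { print $2 }'
}

# Hypothetical sample listing (not captured from a real cluster).
sample_ps_output() {
  cat <<'EOF'
CONTAINER ID    IMAGE                                      COMMAND                   CREATED        STATUS    PORTS    NAMES
0a1b2c3d4e5f    registry.k8s.io/etcd:3.5.6-0               "etcd --advertise-cl..."  2 hours ago    Up                 k8s://kube-system/etcd-node1/etcd
1f2e3d4c5b6a    registry.k8s.io/pause:3.6                  "/pause"                  2 hours ago    Up                 k8s://kube-system/etcd-node1
2b3c4d5e6f70    registry.k8s.io/kube-apiserver:v1.24.10    "kube-apiserver --ad..."  2 hours ago    Up                 k8s://kube-system/kube-apiserver-node1/kube-apiserver
EOF
}

# Only the first data line ends in "/etcd", so only its image is printed.
sample_ps_output | etcd_image_from_ps
```

The filter depends on the container name being the last column and on there being no trailing whitespace, which is why the script treats an empty result as an error.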
20
etcd-inmemory/files/opt/wamblee/etcd/bin/etcd-cron
Executable file
@@ -0,0 +1,20 @@
#!/bin/bash

PATH=/opt/wamblee/etcd/bin:$PATH

DATE="$( date +%Y-%m-%d_%H:%M:%S )"
DIR="$( date +%Y-%m-%d )"

etcd-backup etcd-snapshot-latest.db.tmp
mv /var/lib/wamblee/etcd/etcd-snapshot-latest.db.tmp /var/lib/wamblee/etcd/etcd-snapshot-latest.db

ln /var/lib/wamblee/etcd/etcd-snapshot-latest.db "/var/lib/wamblee/etcd/etcd-backup-$DATE.db"
mkdir -p /var/lib/wamblee/etcd/"$DIR"
if [[ ! -r /var/lib/wamblee/etcd/$DIR/etcd-backup.db ]]
then
  ln /var/lib/wamblee/etcd/etcd-snapshot-latest.db "/var/lib/wamblee/etcd/$DIR/etcd-backup.db"
fi
ls -t /var/lib/wamblee/etcd/etcd-backup* | awk 'NR > 10' | xargs rm -f
find /var/lib/wamblee/etcd -mindepth 1 -mtime +31 | xargs rm -rf
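
The retention step in etcd-cron keeps the ten most recent timestamped snapshots by listing them newest first and deleting everything past line ten. Below is a minimal sketch of that pipeline as a function. `prune_old_backups` and its `keep` parameter are hypothetical names introduced for illustration, and the approach assumes file names without whitespace, which holds for the names the script generates.

```shell
#!/bin/bash
# Keep only the <keep> newest etcd-backup* files in <dir>.
# Hypothetical helper mirroring the `ls -t | awk | xargs rm` line in etcd-cron.
# Usage: prune_old_backups <dir> <keep>
prune_old_backups() {
  local dir="$1" keep="$2"
  # ls -t sorts by modification time, newest first; awk passes on
  # every line after the first <keep>, and those files are removed.
  ls -t "$dir"/etcd-backup* | awk -v keep="$keep" 'NR > keep' | xargs rm -f
}
```

For example, `prune_old_backups /var/lib/wamblee/etcd 10` reproduces the behavior of the script. Because the snapshots are hard links, removing a timestamped name does not affect the daily copy that points at the same data.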
42
etcd-inmemory/files/opt/wamblee/etcd/bin/etcd-restore
Executable file
@@ -0,0 +1,42 @@
#!/bin/bash

PATH=/opt/wamblee/etcd/bin:$PATH

if [[ $# -ne 1 ]]
then
  (
    echo "Usage: $0 <backupname>"
    echo " <backupname> must be a relative path to a backup below the /var/lib/wamblee/etcd directory"
  ) 1>&2
  exit 1
fi

backup="$1"

. /var/lib/wamblee/etcd/etcdimage
if [[ -z "$IMAGE" ]]
then
  IMAGE="registry.k8s.io/etcd:3.5.6-0"
  echo "ETCD image cannot be determined, using fallback $IMAGE" 1>&2
fi

echo "ETCD image: $IMAGE"

set -e
rm -rf /var/lib/etcd.restored
mkdir -p /var/lib/etcd.restored
# using --network host to work around incompatibility of CNI versions
docker run --rm \
  --network host \
  -v '/var/lib/wamblee/etcd:/var/lib/wamblee/etcd' \
  -v '/var/lib/etcd.restored:/var/lib/etcd.restored' \
  --env ETCDCTL_API=3 \
  "$IMAGE" \
  /bin/sh -c "etcdctl snapshot restore /var/lib/wamblee/etcd/$backup --data-dir /var/lib/etcd.restored/data"

mv /var/lib/etcd.restored/data/* /var/lib/etcd.restored
rmdir /var/lib/etcd.restored/data

echo ""
echo "Restore is available at /var/lib/etcd.restored"
38
etcd-inmemory/files/opt/wamblee/etcd/bin/etcd-restore-to-tmpfs
Executable file
@@ -0,0 +1,38 @@
#!/bin/bash

PATH=/opt/wamblee/etcd/bin:$PATH

echo "$0: verifying that etcd is not running"
if nc -z 127.0.0.1 2379
then
  echo "$0: etcd port 2379 is already open, skipping restore of data" 1>&2
  exit 0
fi

echo "$0: verifying that containerd is running"
if ! systemctl status containerd > /dev/null 2>&1
then
  echo "$0: containerd is not running, cannot perform restore" 1>&2
  exit 1
fi

echo "$0: verifying that /var/lib/etcd is empty"
size="$( du -s /var/lib/etcd | awk '{ print $1}' )"
if [[ "$size" -ne 0 ]]
then
  echo "$0: /var/lib/etcd is not empty, assuming data left from previous etcd" 1>&2
  exit 0
fi

backupfile="$( cd /var/lib/wamblee/etcd; ls -rt *.db | tail -1 )"
echo "$0: Using backup file '$backupfile' for restore"

etcd-restore "$backupfile"
if [[ $? -ne 0 ]]
then
  echo "$0: restore of etcd failed" 1>&2
  exit 1
fi
echo "$0: restore of etcd data finished"

rsync -avz /var/lib/etcd.restored/ /var/lib/etcd/
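
The emptiness check in etcd-restore-to-tmpfs relies on `du -s` reporting zero for an empty tmpfs directory. On some disk filesystems an empty directory still occupies a block, so the same check could report a non-zero size there. A filesystem-independent alternative might look like the sketch below. It assumes GNU find (for `-quit`), and `dir_is_empty` is a hypothetical helper, not part of the RPM.

```shell
#!/bin/bash
# Return 0 if <dir> has no entries, 1 if it has any, 2 if it is not a directory.
# Hypothetical helper; assumes GNU find for -quit.
dir_is_empty() {
  local dir="$1"
  [[ -d "$dir" ]] || return 2
  # -mindepth 1 skips the directory itself; -print -quit stops at the
  # first entry found, so large directories are not scanned completely.
  [[ -z "$(find "$dir" -mindepth 1 -print -quit)" ]]
}
```

In the script this could replace the `du`-based test, for example `if ! dir_is_empty /var/lib/etcd; then ... exit 0; fi`.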
25
etcd-inmemory/files/opt/wamblee/etcd/bin/etcdctl
Executable file
@@ -0,0 +1,25 @@
#!/bin/bash

PATH=/opt/wamblee/etcd/bin:$PATH

. /var/lib/wamblee/etcd/etcdimage
if [[ -z "$IMAGE" ]]
then
  echo "ETCD image cannot be determined" 1>&2
  exit 1
fi

echo "ETCD image: $IMAGE"

docker run --rm \
  --network host \
  -v /etc/kubernetes/pki/etcd:/etc/kubernetes/pki/etcd \
  -v /var/lib/wamblee/etcd:/var/lib/wamblee/etcd \
  "$IMAGE" \
  etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
    --key=/etc/kubernetes/pki/etcd/healthcheck-client.key "$@"
@@ -0,0 +1,5 @@

[Service]
ExecStop=/opt/wamblee/etcd/bin/etcd-cron

@@ -0,0 +1,7 @@

[Unit]
After=containerd.service

[Service]
ExecStartPre=-/opt/wamblee/etcd/bin/etcd-restore-to-tmpfs
106
etcd-inmemory/pom.xml
Normal file
@@ -0,0 +1,106 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd ">

    <!--
    parent pom with distribution management
    <parent>
        <groupId>org.brakkee</groupId>
        <artifactId>root</artifactId>
        <version>1.0.2</version>
    </parent>
    -->

    <modelVersion>4.0.0</modelVersion>

    <packaging>rpm</packaging>
    <groupId>org.brakkee.blog</groupId>
    <artifactId>etcd-inmemory</artifactId>
    <version>1.0.1-SNAPSHOT</version>
    <name>etcd-inmemory</name>
    <description>running etcd in-memory</description>
    <organization>
        <name>org.brakkee</name>
    </organization>

    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>rpm-maven-plugin</artifactId>
                <version>2.0.1</version>
                <extensions>true</extensions>
                <configuration>
                    <changelogFile>CHANGELOG</changelogFile>
                    <copyright>Apache License 2.0, 2010</copyright>
                    <group>org.wamblee.server</group>
                    <packager>Erik Brakkee</packager>
                    <needarch>x86_64</needarch>

                    <mappings>
                        <!-- kubelet service hooks: pre-start restore, post stop backup -->
                        <mapping>
                            <directory>/usr/lib/systemd/system/kubelet.service.d</directory>
                            <filemode>444</filemode>
                            <username>root</username>
                            <groupname>root</groupname>
                            <directoryIncluded>false</directoryIncluded>
                            <sources>
                                <source>
                                    <location>files/usr/lib/systemd/system/kubelet.service.d</location>
                                </source>
                            </sources>
                        </mapping>

                        <!-- triggering backups at regular intervals -->
                        <mapping>
                            <directory>/etc/cron.d</directory>
                            <filemode>644</filemode>
                            <username>root</username>
                            <groupname>root</groupname>
                            <directoryIncluded>false</directoryIncluded>
                            <configuration>true</configuration>
                            <sources>
                                <source>
                                    <location>files/etc/cron.d</location>
                                </source>
                            </sources>
                        </mapping>

                        <!-- backup and restore scripts -->
                        <mapping>
                            <directory>/opt/wamblee/etcd/bin</directory>
                            <filemode>555</filemode>
                            <username>root</username>
                            <groupname>root</groupname>
                            <directoryIncluded>false</directoryIncluded>
                            <sources>
                                <source>
                                    <location>files/opt/wamblee/etcd/bin</location>
                                </source>
                            </sources>
                        </mapping>

                        <!-- backup location for etcd -->
                        <mapping>
                            <directory>/var/lib/wamblee/etcd</directory>
                            <filemode>755</filemode>
                            <username>root</username>
                            <groupname>root</groupname>
                        </mapping>
                    </mappings>
                    <requires>
                        <require>nerdctl</require>
                    </requires>
                    <provides>
                        <provide>etcd-inmemory</provide>
                    </provides>
                    <postinstallScriptlet>
                        <script><![CDATA[
systemctl daemon-reload
                        ]]></script>
                    </postinstallScriptlet>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>