Series index
none06.hatenadiary.org
Overview
Build a Kubernetes environment. This article covers everything up to building the foundation that will host the containers.
Background
My hands-on experience so far is limited to on-premises physical and virtual machines, which has been making me anxious lately. So this time I tried Kubernetes as a way to learn about containers and container orchestration.
Environment
Overview diagram
Note: the red areas mark the scope of this article.
Configuration
Prerequisites
As a rule, I follow the official documentation.
kubernetes.io
Note: I wasn't going to write this up, since it would just be a rehash of the docs if everything went smoothly. It did not go smoothly, hence this article...
Build
Only the control-plane node is shown here. As described later, the worker nodes are built by cloning the control-plane node.
[root@k8sctrpln01 ~]# cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
[root@k8sctrpln01 ~]# free -h
total used free shared buff/cache available
Mem: 2.0G 128M 1.7G 8.6M 123M 1.7G
Swap: 2.0G 0B 2.0G
[root@k8sctrpln01 ~]# cat /proc/cpuinfo | grep processor
processor : 0
processor : 1
- Verify that the control-plane node and the worker nodes can reach each other. The worker nodes are not built yet, so this is skipped.
- Verify that hostnames, MAC addresses, and product_uuid do not collide between nodes.
Covered below.
- Open the required ports.
Covered below.
- Disable swap.
[root@k8sctrpln01 ~]# swapon -s
Filename Type Size Used Priority
/dev/vda2 partition 2097148 0 -2
[root@k8sctrpln01 ~]# swapoff -a
[root@k8sctrpln01 ~]# swapon -s
[root@k8sctrpln01 ~]# cp -p /etc/fstab{,_`date +%Y%m%d`}
[root@k8sctrpln01 ~]# ls -l /etc/fstab*
-rw-r--r--. 1 root root 501 2月 26 2020 /etc/fstab
-rw-r--r-- 1 root root 501 2月 26 2020 /etc/fstab_20220529
[root@k8sctrpln01 ~]# sed -i /swap/s/^/#/g /etc/fstab
[root@k8sctrpln01 ~]# diff /etc/fstab{,_`date +%Y%m%d`}
11c11
< #UUID=ea60423d-e182-4843-a7fa-393d738a20d1 swap swap defaults 0 0
---
> UUID=ea60423d-e182-4843-a7fa-393d738a20d1 swap swap defaults 0 0
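The backup-and-comment-out steps above can be wrapped in one small, idempotent sketch. It follows the transcript (dated backup, sed edit) but takes the fstab path as a parameter so it can be exercised on a copy first; on a live node you would also run swapoff -a, which is deliberately left out here.

```shell
# Sketch: comment out swap entries in an fstab, keeping a dated backup.
# Parameterized so it can be tried on a copy before touching /etc/fstab.
disable_swap_in_fstab() {
  fstab="$1"
  cp -p "$fstab" "${fstab}_$(date +%Y%m%d)"   # dated backup, as in the transcript
  sed -i '/swap/s/^[^#]/#&/' "$fstab"         # comment swap lines not already commented
}

# Demonstration on a temporary copy, not the real /etc/fstab:
tmpfstab=$(mktemp)
printf '%s\n' \
  'UUID=ea60423d-e182-4843-a7fa-393d738a20d1 swap swap defaults 0 0' \
  '/dev/vda1 / xfs defaults 0 0' > "$tmpfstab"
disable_swap_in_fstab "$tmpfstab"
grep swap "$tmpfstab"
```

Because the sed only prefixes lines that are not already commented, running the function twice leaves the file unchanged.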
[root@k8sctrpln01 ~]# uname -n
k8sctrpln01
[root@k8sctrpln01 ~]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:f9:d5:4d brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:25:ca:02 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:15:69:ca brd ff:ff:ff:ff:ff:ff
[root@k8sctrpln01 ~]# cat /sys/class/dmi/id/product_uuid
BB935834-A266-4B61-A591-4589BFE06A62
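The uniqueness check for hostnames, MAC addresses, and product_uuid across nodes boils down to collecting the values and looking for duplicates with sort | uniq -d. A sketch, where a literal list (values taken from the transcript) stands in for the ssh loop that would gather them from each node:

```shell
# Sketch: detect duplicated node identifiers. In practice the list would come
# from each node, e.g.:
#   for h in k8sctrpln01 k8sworker01; do ssh "$h" 'cat /sys/class/dmi/id/product_uuid'; done
# (hostnames are the ones used in this series). Here a literal list stands in.
dupes=$(printf '%s\n' \
  'BB935834-A266-4B61-A591-4589BFE06A62' \
  'BB935834-A266-4B61-A591-4589BFE06A62' \
  '52:54:00:f9:d5:4d' \
  | sort | uniq -d)
if [ -n "$dupes" ]; then
  echo "duplicate identifiers:"
  echo "$dupes"
else
  echo "all identifiers unique"
fi
```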
Only one interface is in use, so no routing considerations are needed.
[root@k8sctrpln01 ~]# lsmod | grep ^br_netfilter
[root@k8sctrpln01 ~]# modprobe br_netfilter
[root@k8sctrpln01 ~]# lsmod | grep ^br_netfilter
br_netfilter 22256 0
[root@k8sctrpln01 ~]# sysctl -n net.bridge.bridge-nf-call-ip6tables -n net.bridge.bridge-nf-call-iptables
1
1
[root@k8sctrpln01 ~]# cat <<EOF > /etc/sysctl.d/k8s.conf
> net.bridge.bridge-nf-call-ip6tables = 1
> net.bridge.bridge-nf-call-iptables = 1
> EOF
[root@k8sctrpln01 ~]# sysctl --system
* Applying /usr/lib/sysctl.d/00-system.conf ...
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
* Applying /usr/lib/sysctl.d/10-default-yama-scope.conf ...
kernel.yama.ptrace_scope = 0
* Applying /usr/lib/sysctl.d/50-default.conf ...
kernel.sysrq = 16
kernel.core_uses_pid = 1
kernel.kptr_restrict = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.all.promote_secondaries = 1
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
* Applying /etc/sysctl.d/99-sysctl.conf ...
* Applying /etc/sysctl.d/disable_ipv6.conf ...
net.ipv6.conf.all.disable_ipv6 = 1
* Applying /etc/sysctl.d/k8s.conf ...
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
* Applying /etc/sysctl.conf ...
[root@k8sctrpln01 ~]# sysctl -n net.bridge.bridge-nf-call-ip6tables -n net.bridge.bridge-nf-call-iptables
1
1
CentOS 7 does not ship with nftables, so this item does not apply.
[root@k8sctrpln01 ~]# grep -R -e 6443 -e 2379 -e 2380 -e 10250 -e 10251 -e 10252 /lib/firewalld/services/
/lib/firewalld/services/etcd-client.xml: <port port="2379" protocol="tcp"/>
/lib/firewalld/services/etcd-server.xml: <port port="2380" protocol="tcp"/>
[root@k8sctrpln01 ~]# firewall-cmd --list-services
dhcpv6-client snmp ssh
[root@k8sctrpln01 ~]# firewall-cmd --list-ports
[root@k8sctrpln01 ~]# firewall-cmd --add-service={etcd-client,etcd-server} --permanent
success
[root@k8sctrpln01 ~]# firewall-cmd --add-port={6443/tcp,10250/tcp,10251/tcp,10252/tcp} --permanent
success
[root@k8sctrpln01 ~]# firewall-cmd --reload
success
[root@k8sctrpln01 ~]# firewall-cmd --list-services
dhcpv6-client etcd-client etcd-server snmp ssh
[root@k8sctrpln01 ~]# firewall-cmd --list-ports
6443/tcp 10250/tcp 10251/tcp 10252/tcp
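The port list above can be driven from one variable so the control-plane and worker firewall rules stay consistent. A dry-run sketch that prints the firewall-cmd invocations instead of executing them (10251/10252 are the kube-scheduler and kube-controller-manager ports from the documentation of this era; drop the echo to apply for real):

```shell
# Sketch: derive the firewall-cmd invocations from one list of control-plane
# ports. Dry run: the commands are printed, not executed.
CP_PORTS="6443/tcp 10250/tcp 10251/tcp 10252/tcp"
plan=$(for p in $CP_PORTS; do
  echo "firewall-cmd --add-port=$p --permanent"
done)
echo "$plan"
echo "firewall-cmd --reload"
```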
I chose containerd.
Actually, the official documentation says:
If both Docker and containerd are detected, Docker takes precedence.
so initially I had chosen Docker. However, from Kubernetes v1.24 onward containerd apparently takes precedence instead, so I rebuilt with containerd.
github.com
As a side note, I only noticed this while troubleshooting kubeadm init. The official documentation had not caught up, so there was no way to catch it beforehand.
[root@k8sctrpln01 ~]# kubeadm init --pod-network-cidr 10.244.0.0/16 --v=5
I0528 00:23:59.899926 26554 initconfiguration.go:117] detected and using CRI socket: unix:///var/run/containerd/containerd.sock
[root@k8sctrpln01 ~]# cat > /etc/modules-load.d/containerd.conf <<EOF
> overlay
> br_netfilter
> EOF
[root@k8sctrpln01 ~]# lsmod | grep -e ^overlay -e ^br_netfilter
br_netfilter 22256 0
[root@k8sctrpln01 ~]# modprobe overlay
[root@k8sctrpln01 ~]# modprobe br_netfilter
[root@k8sctrpln01 ~]# lsmod | grep -e ^overlay -e ^br_netfilter
overlay 91659 0
br_netfilter 22256 0
[root@k8sctrpln01 ~]# cat > /etc/sysctl.d/99-kubernetes-cri.conf <<EOF
> net.bridge.bridge-nf-call-iptables = 1
> net.ipv4.ip_forward = 1
> net.bridge.bridge-nf-call-ip6tables = 1
> EOF
[root@k8sctrpln01 ~]# sysctl -n net.bridge.bridge-nf-call-iptables -n net.ipv4.ip_forward -n net.bridge.bridge-nf-call-ip6tables
1
0
1
[root@k8sctrpln01 ~]# sysctl --system
* Applying /usr/lib/sysctl.d/00-system.conf ...
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
* Applying /usr/lib/sysctl.d/10-default-yama-scope.conf ...
kernel.yama.ptrace_scope = 0
* Applying /usr/lib/sysctl.d/50-default.conf ...
kernel.sysrq = 16
kernel.core_uses_pid = 1
kernel.kptr_restrict = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.all.promote_secondaries = 1
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
* Applying /etc/sysctl.d/99-kubernetes-cri.conf ...
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
* Applying /etc/sysctl.d/99-sysctl.conf ...
* Applying /etc/sysctl.d/disable_ipv6.conf ...
net.ipv6.conf.all.disable_ipv6 = 1
* Applying /etc/sysctl.d/k8s.conf ...
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
* Applying /etc/sysctl.conf ...
[root@k8sctrpln01 ~]# sysctl -n net.bridge.bridge-nf-call-iptables -n net.ipv4.ip_forward -n net.bridge.bridge-nf-call-ip6tables
1
1
1
[root@k8sctrpln01 ~]# yum install -y yum-utils device-mapper-persistent-data lvm2
---snip---
[root@k8sctrpln01 ~]# yum list installed -q yum-utils device-mapper-persistent-data lvm2
インストール済みパッケージ
device-mapper-persistent-data.x86_64 0.8.5-3.el7_9.2 @updates
lvm2.x86_64 7:2.02.187-6.el7_9.5 @updates
yum-utils.noarch 1.1.31-54.el7_8 @base
[root@k8sctrpln01 ~]# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
読み込んだプラグイン:fastestmirror
adding repo from: https://download.docker.com/linux/centos/docker-ce.repo
grabbing file https://download.docker.com/linux/centos/docker-ce.repo to /etc/yum.repos.d/docker-ce.repo
repo saved to /etc/yum.repos.d/docker-ce.repo
[root@k8sctrpln01 ~]# yum update -y
---snip---
[root@k8sctrpln01 ~]# yum install -y containerd.io
---snip---
[root@k8sctrpln01 ~]# yum list -q containerd.io
インストール済みパッケージ
containerd.io.x86_64 1.6.4-3.1.el7 @docker-ce-stable
[root@k8sctrpln01 ~]# mkdir -p /etc/containerd
[root@k8sctrpln01 ~]# mv /etc/containerd/config.toml /etc/containerd/config.toml_org
[root@k8sctrpln01 ~]# containerd config default > /etc/containerd/config.toml
[root@k8sctrpln01 ~]# systemctl restart containerd
[root@k8sctrpln01 ~]# systemctl is-active containerd
active
The official documentation does not enable the service at boot, but when I tried rebooting the OS with it left disabled, various services failed to start, so I enable it here.
[root@k8sctrpln01 ~]# systemctl is-enabled containerd
disabled
[root@k8sctrpln01 ~]# systemctl enable containerd
Created symlink from /etc/systemd/system/multi-user.target.wants/containerd.service to /usr/lib/systemd/system/containerd.service.
[root@k8sctrpln01 ~]# systemctl is-enabled containerd
enabled
To use the systemd cgroup driver, set plugins.cri.systemd_cgroup = true in /etc/containerd/config.toml.
Grepping for plugins.cri.systemd_cgroup found nothing, and I was stuck not knowing which section it belongs in, but the structure apparently looks like the following. Hard to tell from the docs.
[root@k8sctrpln01 ~]# cat -n /etc/containerd/config.toml | grep systemd_cgroup
67 systemd_cgroup = false
[root@k8sctrpln01 ~]# cat -n /etc/containerd/config.toml
---snip---
45 [plugins."io.containerd.grpc.v1.cri"]
46 device_ownership_from_security_context = false
47 disable_apparmor = false
48 disable_cgroup = false
49 disable_hugetlb_controller = true
50 disable_proc_mount = false
51 disable_tcp_service = true
52 enable_selinux = false
53 enable_tls_streaming = false
54 enable_unprivileged_icmp = false
55 enable_unprivileged_ports = false
56 ignore_image_defined_volumes = false
57 max_concurrent_downloads = 3
58 max_container_log_line_size = 16384
59 netns_mounts_under_state_dir = false
60 restrict_oom_score_adj = false
61 sandbox_image = "k8s.gcr.io/pause:3.6"
62 selinux_category_range = 1024
63 stats_collect_period = 10
64 stream_idle_timeout = "4h0m0s"
65 stream_server_address = "127.0.0.1"
66 stream_server_port = "0"
67 systemd_cgroup = false
68 tolerate_missing_hugetlb_controller = true
69 unset_seccomp_profile = ""
---snip---
[root@k8sctrpln01 ~]# sed -i 's/systemd_cgroup = false/systemd_cgroup = true/g' /etc/containerd/config.toml
[root@k8sctrpln01 ~]# cat -n /etc/containerd/config.toml | grep systemd_cgroup
67 systemd_cgroup = true
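One hedged aside that may explain the confusion here: in containerd 1.6's v2 config format, the field the CRI plugin actually honors is SystemdCgroup = true under the runc options table, while the systemd_cgroup key edited above is a legacy option. A sketch of the same style of sed edit, demonstrated on a stub file rather than the live /etc/containerd/config.toml:

```shell
# Sketch: enable the systemd cgroup driver in a containerd v2-format config.
# Stub file only; the section header and field name are the containerd 1.6
# runc-options form, offered as an assumption to verify against your config.
tmpconf=$(mktemp)
cat > "$tmpconf" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' "$tmpconf"
grep SystemdCgroup "$tmpconf"
```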
When using kubeadm, manually configure the cgroup driver for kubelet.
This is configured later.
[root@k8sctrpln01 ~]# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
> [kubernetes]
> name=Kubernetes
> baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
> enabled=1
> gpgcheck=1
> repo_gpgcheck=1
> gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
> EOF
[root@k8sctrpln01 ~]# setenforce 0
[root@k8sctrpln01 ~]# getenforce
Permissive
[root@k8sctrpln01 ~]# sed -i '/^SELINUX=/s/.*/SELINUX=permissive/' /etc/selinux/config
[root@k8sctrpln01 ~]# grep "^SELINUX=" /etc/selinux/config
SELINUX=permissive
[root@k8sctrpln01 ~]# yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
読み込んだプラグイン:fastestmirror
Loading mirror speeds from cached hostfile
* base: ftp.iij.ad.jp
* extras: ftp.iij.ad.jp
* updates: ftp.iij.ad.jp
base | 3.6 kB 00:00:00
extras | 2.9 kB 00:00:00
kubernetes/signature | 844 B 00:00:00
https://packages.cloud.google.com/yum/doc/yum-key.gpg から鍵を取得中です。
Importing GPG key 0x6B4097C2:
Userid : "Rapture Automatic Signing Key (cloud-rapture-signing-key-2022-03-07-08_01_01.pub)"
Fingerprint: e936 7157 4236 81a4 7ec3 93c3 7325 816a 6b40 97c2
From : https://packages.cloud.google.com/yum/doc/yum-key.gpg
https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg から鍵を取得中です。
kubernetes/signature | 1.4 kB 00:00:00 !!!
https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64/repodata/repomd.xml: [Errno -1] repomd.xml signature could not be verified for kubernetes
他のミラーを試します。
One of the configured repositories failed (Kubernetes),
and yum doesn't have enough cached data to continue. At this point the only
safe thing yum can do is fail. There are a few ways to work "fix" this:
1. Contact the upstream for the repository and get them to fix the problem.
2. Reconfigure the baseurl/etc. for the repository, to point to a working
upstream. This is most often useful if you are using a newer
distribution release than is supported by the repository (and the
packages for the previous distribution release still work).
3. Run the command with the repository temporarily disabled
yum --disablerepo=kubernetes ...
4. Disable the repository permanently, so yum won't use it by default. Yum
will then just ignore the repository until you permanently enable it
again or use --enablerepo for temporary usage:
yum-config-manager --disable kubernetes
or
subscription-manager repos --disable=kubernetes
5. Configure the failing repository to be skipped, if it is unavailable.
Note that yum will try to contact the repo. when it runs most commands,
so will have to try and fail each time (and thus. yum will be be much
slower). If it is a very temporary problem though, this is often a nice
compromise:
yum-config-manager --save --setopt=kubernetes.skip_if_unavailable=true
failure: repodata/repomd.xml from kubernetes: [Errno 256] No more mirrors to try.
https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64/repodata/repomd.xml: [Errno -1] repomd.xml signature could not be verified for kubernetes
It looks like the GPG check failed.
github.com
Skip the check with repo_gpgcheck=0.
[root@k8sctrpln01 ~]# sed -i 's/^repo_gpgcheck=1/repo_gpgcheck=0/g' /etc/yum.repos.d/kubernetes.repo
[root@k8sctrpln01 ~]# yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
[root@k8sctrpln01 ~]# yum list -q kubelet kubeadm kubectl
インストール済みパッケージ
kubeadm.x86_64 1.24.1-0 @kubernetes
kubectl.x86_64 1.24.1-0 @kubernetes
kubelet.x86_64 1.24.1-0 @kubernetes
[root@k8sctrpln01 ~]# systemctl enable --now kubelet
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /usr/lib/systemd/system/kubelet.service.
[root@k8sctrpln01 ~]# systemctl is-enabled kubelet
enabled
[root@k8sctrpln01 ~]# systemctl is-active kubelet
activating
It is correct that at this point the service is not yet active:
kubelet now restarts every few seconds in a crash loop, waiting for kubeadm to tell it what to do.
[root@k8sctrpln01 ~]# cat /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS=
[root@k8sctrpln01 ~]# sed -i "s/^KUBELET_EXTRA_ARGS=/KUBELET_EXTRA_ARGS=--cgroup-driver=systemd/g" /etc/sysconfig/kubelet
[root@k8sctrpln01 ~]# cat /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS=--cgroup-driver=systemd
[root@k8sctrpln01 ~]# systemctl daemon-reload
[root@k8sctrpln01 ~]# systemctl restart kubelet
On to the next page.
Already checked above.
Yep.
Already done above.
[root@k8sctrpln01 ~]# kubeadm init --pod-network-cidr 10.244.0.0/16
W0529 05:32:54.020100 15143 version.go:103] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get "https://dl.k8s.io/release/stable-1.txt": dial tcp: lookup dl.k8s.io on 172.16.0.2:53: server misbehaving
W0529 05:32:54.020248 15143 version.go:104] falling back to the local client version: v1.24.1
[init] Using Kubernetes version: v1.24.1
[preflight] Running pre-flight checks
[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.24.1: output: E0529 05:32:55.259199 15185 remote_image.go:238] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"k8s.gcr.io/kube-apiserver:v1.24.1\": failed to resolve reference \"k8s.gcr.io/kube-apiserver:v1.24.1\": failed to do request: Head \"https://k8s.gcr.io/v2/kube-apiserver/manifests/v1.24.1\": dial tcp: lookup k8s.gcr.io on 172.16.0.2:53: server misbehaving" image="k8s.gcr.io/kube-apiserver:v1.24.1"
time="2022-05-29T05:32:55+09:00" level=fatal msg="pulling image: rpc error: code = Unknown desc = failed to pull and unpack image \"k8s.gcr.io/kube-apiserver:v1.24.1\": failed to resolve reference \"k8s.gcr.io/kube-apiserver:v1.24.1\": failed to do request: Head \"https://k8s.gcr.io/v2/kube-apiserver/manifests/v1.24.1\": dial tcp: lookup k8s.gcr.io on 172.16.0.2:53: server misbehaving"
, error: exit status 1
---snip---
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
No luck. Error excerpt:
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.24.1: output: E0529 05:32:55.259199 15185 remote_image.go:238] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"k8s.gcr.io/kube-apiserver:v1.24.1\": failed to resolve reference \"k8s.gcr.io/kube-apiserver:v1.24.1\": failed to do request: Head \"https://k8s.gcr.io/v2/kube-apiserver/manifests/v1.24.1\": dial tcp: lookup k8s.gcr.io on 172.16.0.2:53: server misbehaving" image="k8s.gcr.io/kube-apiserver:v1.24.1"
Narrowing it down further:
dial tcp: lookup k8s.gcr.io on 172.16.0.2:53: server misbehaving
Name resolution of k8s.gcr.io is failing.
The cause was a missing proxy configuration. This environment reaches the internet through a proxy server, so the setting was required.
Following this reference,
the problem was solved by setting the proxy in the containerd service's startup parameters.
[root@k8sctrpln01 ~]# mkdir /etc/systemd/system/containerd.service.d
[root@k8sctrpln01 ~]# cat <<EOF > /etc/systemd/system/containerd.service.d/http-proxy.conf
[Service]
Environment='HTTP_PROXY=http://proxy01:3128'
Environment='HTTPS_PROXY=http://proxy01:3128'
EOF
[root@k8sctrpln01 ~]# systemctl daemon-reload
[root@k8sctrpln01 ~]# systemctl restart containerd
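The drop-in above can be generated by a small helper, which also makes it easy to add a NO_PROXY line so cluster-internal traffic bypasses the proxy. Unit name, proxy URL, and the NO_PROXY CIDRs are parameters/assumptions (10.244.0.0/16 is the pod CIDR used in this article, 10.96.0.0/12 the default service CIDR). Demonstrated into a temporary directory rather than /etc/systemd/system:

```shell
# Sketch: write a systemd proxy drop-in for a unit. destdir is a parameter so
# the function can be demonstrated outside /etc/systemd/system.
write_proxy_dropin() {
  unit="$1"; proxy="$2"; destdir="$3"
  mkdir -p "$destdir/$unit.d"
  cat > "$destdir/$unit.d/http-proxy.conf" <<EOF
[Service]
Environment="HTTP_PROXY=$proxy"
Environment="HTTPS_PROXY=$proxy"
Environment="NO_PROXY=localhost,127.0.0.1,10.96.0.0/12,10.244.0.0/16"
EOF
}

tmpdir=$(mktemp -d)
write_proxy_dropin containerd.service http://proxy01:3128 "$tmpdir"
cat "$tmpdir/containerd.service.d/http-proxy.conf"
# On a real host, follow with: systemctl daemon-reload && systemctl restart containerd
```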
[root@k8sctrpln01 ~]# kubeadm init --pod-network-cidr 10.244.0.0/16
W0529 06:12:34.627517 17351 version.go:103] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get "https://dl.k8s.io/release/stable-1.txt": dial tcp: lookup dl.k8s.io on 172.16.0.2:53: server misbehaving
W0529 06:12:34.627588 17351 version.go:104] falling back to the local client version: v1.24.1
[init] Using Kubernetes version: v1.24.1
[preflight] Running pre-flight checks
[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR CRI]: container runtime is not running: output: E0529 06:12:34.648929 17359 remote_runtime.go:925] "Status from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
time="2022-05-29T06:12:34+09:00" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
But a different error appeared. Excerpt:
[ERROR CRI]: container runtime is not running: output: E0529 06:12:34.648929 17359 remote_runtime.go:925] "Status from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
First, some investigation of my own. The containerd service is running.
[root@k8sctrpln01 ~]# systemctl is-active containerd
active
However, crictl, containerd's CLI utility, does not work.
[root@k8sctrpln01 ~]# crictl version
WARN[0000] runtime connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
ERRO[0000] unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"
E0529 06:36:10.222225 18569 remote_runtime.go:168] "Version from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
FATA[0000] getting the runtime version: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService
The WARN went away once the endpoint was specified, but the error remained.
[root@k8sctrpln01 ~]# crictl --runtime-endpoint unix:///run/containerd/containerd.sock version
E0529 06:36:44.521051 18602 remote_runtime.go:168] "Version from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
FATA[0000] getting the runtime version: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService
Googling turned up articles saying to just delete /etc/containerd/config.toml. Unconvincing, but with no other fix in sight, I went ahead.
github.com
[root@k8sctrpln01 ~]# rm /etc/containerd/config.toml
rm: 通常ファイル `/etc/containerd/config.toml' を削除しますか? y
[root@k8sctrpln01 ~]# systemctl restart containerd
[root@k8sctrpln01 ~]# systemctl is-active containerd
active
The crictl command now works.
[root@k8sctrpln01 ~]# crictl --runtime-endpoint unix:///run/containerd/containerd.sock version
Version: 0.1.0
RuntimeName: containerd
RuntimeVersion: 1.6.4
RuntimeApiVersion: v1
Note: I had gone to the trouble of setting systemd_cgroup = true in /etc/containerd/config.toml; so what was that all about...
Back to the main task: retrying kubeadm init. Finally got through.
[root@k8sctrpln01 ~]# kubeadm init --pod-network-cidr 10.244.0.0/16
W0529 07:07:11.390743 20256 version.go:103] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get "https://dl.k8s.io/release/stable-1.txt": dial tcp: lookup dl.k8s.io on 172.16.0.2:53: server misbehaving
W0529 07:07:11.390884 20256 version.go:104] falling back to the local client version: v1.24.1
[init] Using Kubernetes version: v1.24.1
[preflight] Running pre-flight checks
[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8sctrpln01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.16.0.11]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8sctrpln01 localhost] and IPs [172.16.0.11 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8sctrpln01 localhost] and IPs [172.16.0.11 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 39.505506 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-check] Initial timeout of 40s passed.
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8sctrpln01 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node k8sctrpln01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 1alk8n.8haf7p9cszxh2kw1
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.16.0.11:6443 --token 1alk8n.8haf7p9cszxh2kw1 \
--discovery-token-ca-cert-hash sha256:fc839a2ff1689c9af8f302670a51c1c44e5a16b7d89304931b7cc9052a192caf
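If the join command above scrolls away, its pieces can be regenerated later; the discovery hash in particular is the SHA-256 of the cluster CA's DER-encoded public key. A sketch, demonstrated against a throwaway self-signed certificate (on a real control-plane node the input would be /etc/kubernetes/pki/ca.crt):

```shell
# Sketch: recompute the --discovery-token-ca-cert-hash value from a CA cert.
ca_cert_hash() {
  openssl x509 -pubkey -noout -in "$1" 2>/dev/null \
    | openssl pkey -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | awk '{print "sha256:" $NF}'
}

# Demonstration with a throwaway self-signed cert, not the cluster CA:
tmpdir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -subj /CN=demo-ca \
  -keyout "$tmpdir/ca.key" -out "$tmpdir/ca.crt" 2>/dev/null
ca_cert_hash "$tmpdir/ca.crt"
```

If the bootstrap token has expired too, kubeadm token create --print-join-command regenerates the whole join command in one go.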
If the control plane is to be a multi-node cluster, extra steps are needed here. This is a single-node setup, so skipped.
[root@k8sctrpln01 ~]# mkdir -p $HOME/.kube
[root@k8sctrpln01 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8sctrpln01 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
[root@k8sctrpln01 ~]# export KUBECONFIG=/etc/kubernetes/admin.conf
Until a Pod network add-on is installed, coredns stays stuck in Pending.
[root@k8sctrpln01 ~]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d4b75cb6d-66hl2 0/1 Pending 0 3m1s
kube-system coredns-6d4b75cb6d-vnw7j 0/1 Pending 0 3m1s
kube-system etcd-k8sctrpln01 1/1 Running 0 3m30s
kube-system kube-apiserver-k8sctrpln01 1/1 Running 0 3m16s
kube-system kube-controller-manager-k8sctrpln01 1/1 Running 0 3m31s
kube-system kube-proxy-cktd6 1/1 Running 0 3m1s
kube-system kube-scheduler-k8sctrpln01 1/1 Running 0 3m30s
There are several add-ons to choose from; with no particular justification, I use flannel.
The download URL is printed in the kubeadm init output (and easy to miss). Also, as described below, the latest flannel, v0.18.0, did not work correctly, so I used v0.16.3.
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Which leads to the following.
github.com
Install it by following "Deploying flannel manually".
[root@k8sctrpln01 ~]# kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
Unable to connect to the server: dial tcp: lookup raw.githubusercontent.com on 172.16.0.2:53: server misbehaving
The proxy wall strikes again.
[root@k8sctrpln01 ~]# cat <<EOF > /etc/profile.d/k8s.sh
> export http_proxy=http://proxy01:3128
> export https_proxy=http://proxy01:3128
> EOF
[root@k8sctrpln01 ~]# source /etc/profile.d/k8s.sh
[root@k8sctrpln01 ~]# env | grep proxy
http_proxy=http://proxy01:3128
https_proxy=http://proxy01:3128
Installing again.
[root@k8sctrpln01 ~]# kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
First kube-flannel transitions from Init to Running. Then coredns moves to ContainerCreating, but it never reaches Running. And kube-flannel ends up in Error.
[root@k8sctrpln01 ~]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d4b75cb6d-66hl2 0/1 ContainerCreating 0 5m21s
kube-system coredns-6d4b75cb6d-vnw7j 0/1 ContainerCreating 0 5m21s
kube-system etcd-k8sctrpln01 1/1 Running 0 5m50s
kube-system kube-apiserver-k8sctrpln01 1/1 Running 0 5m36s
kube-system kube-controller-manager-k8sctrpln01 1/1 Running 0 5m51s
kube-system kube-flannel-ds-vdwnm 0/1 Error 1 (9s ago) 56s
kube-system kube-proxy-cktd6 1/1 Running 0 5m21s
kube-system kube-scheduler-k8sctrpln01 1/1 Running 0 5m50s
Check the kube-flannel Pod's logs.
[root@k8sctrpln01 ~]# kubectl -n kube-system logs kube-flannel-ds-vdwnm
Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
I0528 22:14:46.576077 1 main.go:207] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
W0528 22:14:46.576246 1 client_config.go:614] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0528 22:14:46.773989 1 kube.go:121] Waiting 10m0s for node controller to sync
I0528 22:14:46.774494 1 kube.go:398] Starting kube subnet manager
I0528 22:14:47.774698 1 kube.go:128] Node controller sync successful
I0528 22:14:47.774727 1 main.go:227] Created subnet manager: Kubernetes Subnet Manager - k8sctrpln01
I0528 22:14:47.774733 1 main.go:230] Installing signal handlers
I0528 22:14:47.774886 1 main.go:463] Found network config - Backend type: vxlan
I0528 22:14:47.775008 1 match.go:195] Determining IP address of default interface
E0528 22:14:47.775135 1 main.go:270] Failed to find any valid interface to use: failed to get default interface: protocol not available
The direct cause appears to be that flannel's automatic detection of the interface to use fails with "Failed to find any valid interface to use: failed to get default interface: protocol not available". I investigated and tried various things, but could not resolve it.
(Tried creating /run/flannel/subnet.env by hand, running kubeadm reset, editing kube-flannel.yml to pin the interface, and so on.)
As an isolation step I tried an older kube-flannel.yml I happened to have on hand, and that fixed it. Good enough for now...
[root@k8sctrpln01 ~]# curl -s https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml | grep image | grep -v "#"
image: rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
image: rancher/mirrored-flannelcni-flannel:v0.18.0
image: rancher/mirrored-flannelcni-flannel:v0.18.0
[root@k8sctrpln01 ~]# cat kube-flannel.yml | grep image | grep -v "#"
image: rancher/mirrored-flannelcni-flannel-cni-plugin:v1.0.1
image: rancher/mirrored-flannelcni-flannel:v0.16.3
image: rancher/mirrored-flannelcni-flannel:v0.16.3
[root@k8sctrpln01 ~]# kubectl apply -f kube-flannel.yml
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged configured
clusterrole.rbac.authorization.k8s.io/flannel unchanged
clusterrolebinding.rbac.authorization.k8s.io/flannel unchanged
serviceaccount/flannel unchanged
configmap/kube-flannel-cfg unchanged
daemonset.apps/kube-flannel-ds configured
[root@k8sctrpln01 ~]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d4b75cb6d-66hl2 1/1 Running 0 20m
kube-system coredns-6d4b75cb6d-vnw7j 1/1 Running 0 20m
kube-system etcd-k8sctrpln01 1/1 Running 0 20m
kube-system kube-apiserver-k8sctrpln01 1/1 Running 0 20m
kube-system kube-controller-manager-k8sctrpln01 1/1 Running 0 20m
kube-system kube-flannel-ds-vdzrd 1/1 Running 0 68s
kube-system kube-proxy-cktd6 1/1 Running 0 20m
kube-system kube-scheduler-k8sctrpln01 1/1 Running 0 20m
Note that the control-plane node is already Ready at this point.
[root@k8sctrpln01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8sctrpln01 Ready control-plane 21m v1.24.1
Pods should run only on the worker nodes, so the control-plane node's taint is left in place and this step is skipped.
Since the nodes in this environment are virtual guest machines, each worker node is built by cloning the control-plane node and then reconfiguring the clone for its worker role. The steps below start from the point where the clone exists and its hostname and IP addresses have already been changed.
First, reset the Kubernetes state inherited from the clone.
[root@k8sworker01 ~]# yes | kubeadm reset
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0527 20:08:44.001097 2944 preflight.go:55] [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: [preflight] Running pre-flight checks
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
W0527 20:09:23.058909 2944 cleanupnode.go:93] [reset] Failed to remove containers: [failed to remove running container 32ffc7801249a8077520b4876f5cd27f49929bb048ccb679d2387f012a9a2da8: output: E0527 20:08:53.857270 3260 remote_runtime.go:274] "RemovePodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" podSandboxID="32ffc7801249a8077520b4876f5cd27f49929bb048ccb679d2387f012a9a2da8"
removing the pod sandbox "32ffc7801249a8077520b4876f5cd27f49929bb048ccb679d2387f012a9a2da8": rpc error: code = DeadlineExceeded desc = context deadline exceeded
, error: exit status 1, failed to remove running container f10a6ab2bba52dd9212ae35d13b4347a939d81df8d65a75b6e8509d3d7908784: output: E0527 20:08:57.084978 3347 remote_runtime.go:274] "RemovePodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" podSandboxID="f10a6ab2bba52dd9212ae35d13b4347a939d81df8d65a75b6e8509d3d7908784"
removing the pod sandbox "f10a6ab2bba52dd9212ae35d13b4347a939d81df8d65a75b6e8509d3d7908784": rpc error: code = DeadlineExceeded desc = context deadline exceeded
, error: exit status 1, failed to stop running pod 208e167da6667e0dcf577d1cca0c57f44d8b698eaaaad96941d7bfdac26100f5: output: E0527 20:08:59.095956 3352 remote_runtime.go:248] "StopPodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = failed to stop container \"0807cc39ba94fdbce032437b78e9ff4306bf3eed3b33b21292bedacacd1a5a86\": an error occurs during waiting for container \"0807cc39ba94fdbce032437b78e9ff4306bf3eed3b33b21292bedacacd1a5a86\" to be killed: wait container \"0807cc39ba94fdbce032437b78e9ff4306bf3eed3b33b21292bedacacd1a5a86\": context deadline exceeded" podSandboxID="208e167da6667e0dcf577d1cca0c57f44d8b698eaaaad96941d7bfdac26100f5"
time="2022-05-27T20:08:59+09:00" level=fatal msg="stopping the pod sandbox \"208e167da6667e0dcf577d1cca0c57f44d8b698eaaaad96941d7bfdac26100f5\": rpc error: code = DeadlineExceeded desc = failed to stop container \"0807cc39ba94fdbce032437b78e9ff4306bf3eed3b33b21292bedacacd1a5a86\": an error occurs during waiting for container \"0807cc39ba94fdbce032437b78e9ff4306bf3eed3b33b21292bedacacd1a5a86\" to be killed: wait container \"0807cc39ba94fdbce032437b78e9ff4306bf3eed3b33b21292bedacacd1a5a86\": context deadline exceeded"
, error: exit status 1, failed to stop running pod 9d3c0e0634883d65e2dde44ecf40c2c7d31a2dc24a2063e48af923433840e2f8: output: E0527 20:09:01.107616 3388 remote_runtime.go:248] "StopPodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" podSandboxID="9d3c0e0634883d65e2dde44ecf40c2c7d31a2dc24a2063e48af923433840e2f8"
time="2022-05-27T20:09:01+09:00" level=fatal msg="stopping the pod sandbox \"9d3c0e0634883d65e2dde44ecf40c2c7d31a2dc24a2063e48af923433840e2f8\": rpc error: code = DeadlineExceeded desc = context deadline exceeded"
, error: exit status 1, failed to remove running container 830472a335e3e7438337679bcfc7a16813f336eff1905e4f9111a56819ee5623: output: E0527 20:09:16.298865 3753 remote_runtime.go:274] "RemovePodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" podSandboxID="830472a335e3e7438337679bcfc7a16813f336eff1905e4f9111a56819ee5623"
removing the pod sandbox "830472a335e3e7438337679bcfc7a16813f336eff1905e4f9111a56819ee5623": rpc error: code = DeadlineExceeded desc = context deadline exceeded
, error: exit status 1, failed to remove running container e0ce40c26ba17869724471443fcc7b0af25356e883fd1ab6753c2c2dcd0ecae4: output: E0527 20:09:23.058159 3807 remote_runtime.go:274] "RemovePodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" podSandboxID="e0ce40c26ba17869724471443fcc7b0af25356e883fd1ab6753c2c2dcd0ecae4"
removing the pod sandbox "e0ce40c26ba17869724471443fcc7b0af25356e883fd1ab6753c2c2dcd0ecae4": rpc error: code = DeadlineExceeded desc = context deadline exceeded
, error: exit status 1]
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
Manual cleanup is also required.
[root@k8sworker01 ~]# ls -l $HOME/.kube/config
-rw-------. 1 root root 5635 5月 27 17:03 /root/.kube/config
[root@k8sworker01 ~]# rm -f $HOME/.kube/config
[root@k8sworker01 ~]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:17:78:f8 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:8d:4c:c5 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:07:f1:85 brd ff:ff:ff:ff:ff:ff
5: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/ether 12:d2:03:a5:d2:5a brd ff:ff:ff:ff:ff:ff
6: cni0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
link/ether 6a:55:d4:30:30:97 brd ff:ff:ff:ff:ff:ff
[root@k8sworker01 ~]# ip link delete cni0
[root@k8sworker01 ~]# ip link delete flannel.1
[root@k8sworker01 ~]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:17:78:f8 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:8d:4c:c5 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:07:f1:85 brd ff:ff:ff:ff:ff:ff
Configure the firewall for the worker node role.
[root@k8sworker01 ~]# firewall-cmd --list-services
dhcpv6-client etcd-client etcd-server snmp ssh
[root@k8sworker01 ~]# firewall-cmd --list-ports
6443/tcp 10250/tcp 10251/tcp 10252/tcp
[root@k8sworker01 ~]# firewall-cmd --add-port=30000-32767/tcp --permanent
success
[root@k8sworker01 ~]# firewall-cmd --remove-service={etcd-client,etcd-server} --permanent
success
[root@k8sworker01 ~]# firewall-cmd --remove-port={6443/tcp,10251/tcp,10252/tcp} --permanent
success
[root@k8sworker01 ~]# firewall-cmd --reload
success
[root@k8sworker01 ~]# firewall-cmd --list-services
dhcpv6-client snmp ssh
[root@k8sworker01 ~]# firewall-cmd --list-ports
10250/tcp 30000-32767/tcp
The standard output of kubeadm init on the control-plane node includes the kubeadm join command that adds a worker node to the cluster. Using that command as-is is fine, but its token expires after 24 hours and renewing it is a hassle, so here I create a token with no expiry instead.
zaki-hmkc.hatenablog.com
[root@k8sctrpln01 ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
pbwo6h.lk8hjvpfuqroxf9u 20h 2022-05-28T08:27:45Z authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
[root@k8sctrpln01 ~]# kubeadm token delete 1alk8n.8haf7p9cszxh2kw1
bootstrap token "1alk8n" deleted
[root@k8sctrpln01 ~]# kubeadm token create --ttl 0 --print-join-command
kubeadm join 172.16.0.11:6443 --token o133rw.oyma9inaboa1ra9w --discovery-token-ca-cert-hash sha256:fc839a2ff1689c9af8f302670a51c1c44e5a16b7d89304931b7cc9052a192caf
[root@k8sctrpln01 ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
o133rw.oyma9inaboa1ra9w <forever> <never> authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
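As an aside, the sha256:... value passed to --discovery-token-ca-cert-hash is just the SHA-256 digest of the cluster CA's DER-encoded public key, normally computed from /etc/kubernetes/pki/ca.crt. The sketch below generates a throwaway self-signed CA under /tmp so the derivation can be tried anywhere; the file paths and subject name are illustrative, not from this cluster:

```shell
# Generate a throwaway self-signed CA (stand-in for /etc/kubernetes/pki/ca.crt).
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=kubernetes" -days 1 \
        -keyout /tmp/ca.key -out /tmp/ca.crt 2>/dev/null

# Derive the discovery hash: SHA-256 over the DER-encoded public key.
openssl x509 -pubkey -in /tmp/ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex \
  | sed 's/^.* //'
```

Running this against the real ca.crt on the control-plane node reproduces the hash printed by kubeadm token create --print-join-command.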
[root@k8sworker01 ~]# kubeadm join 172.16.0.11:6443 --token o133rw.oyma9inaboa1ra9w --discovery-token-ca-cert-hash sha256:fc839a2ff1689c9af8f302670a51c1c44e5a16b7d89304931b7cc9052a192caf
[preflight] Running pre-flight checks
[WARNING HTTPProxy]: Connection to "https://172.16.0.11" uses proxy "http://proxy01:3128". If that is not intended, adjust your proxy settings
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Confirm that the worker node has joined the cluster.
[root@k8sctrpln01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8sctrpln01 Ready control-plane 65m v1.24.1
k8sworker01 Ready <none> 3m23s v1.24.1
While I was at it, I joined a second worker node as well.
[root@k8sctrpln01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8sctrpln01 Ready control-plane 69m v1.24.1
k8sworker01 Ready <none> 6m42s v1.24.1
k8sworker02 Ready <none> 55s v1.24.1
As a bonus: the -o wide option is handy in all sorts of situations.
[root@k8sctrpln01 ~]# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8sctrpln01 Ready control-plane 69m v1.24.1 172.16.0.11 <none> CentOS Linux 7 (Core) 3.10.0-1160.66.1.el7.x86_64 containerd://1.6.4
k8sworker01 Ready <none> 6m53s v1.24.1 172.16.0.12 <none> CentOS Linux 7 (Core) 3.10.0-1160.66.1.el7.x86_64 containerd://1.6.4
k8sworker02 Ready <none> 66s v1.24.1 172.16.0.13 <none> CentOS Linux 7 (Core) 3.10.0-1160.66.1.el7.x86_64 containerd://1.6.4
Skipping for now.
Skipping for now.
In principle unnecessary; do this only as needed, e.g. when rebuilding the environment.
From what I can see, this does not appear to be required.
Afterword
This was my second time building this environment, and it still took about half a day.
Every problem I hit traced back to package versions. That really makes you hesitate to upgrade...
References
Linked inline in the body.