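Preface

In December 2020, the Kubernetes community announced that dockershim would be deprecated starting with version 1.20, and many media outlets ran with the headline that Kubernetes was dropping Docker. I think that framing is misleading, and perhaps it was just clickbait.

dockershim is a Kubernetes component whose job is to drive Docker. Docker appeared in 2013 and Kubernetes in 2014, so Docker was not designed with orchestration in mind and could not have anticipated a giant like Kubernetes (had it known, it might not have lost ground so quickly...). Kubernetes, however, was built with Docker as its container runtime from the start, and much of its logic targeted Docker specifically. As the community matured and wanted to support more container runtimes, the Docker-specific logic was split out into dockershim.

Because of this, any change in Kubernetes or in Docker means dockershim has to be updated to keep the integration working. Yet driving Docker through dockershim ultimately means driving Docker's underlying runtime, containerd, and containerd itself supports CRI (Container Runtime Interface). So why go through Docker at all? Why not talk to containerd directly over CRI? That is probably one of the reasons the community wanted to deprecate dockershim.

Now let's look at how much deprecating dockershim actually affects users and maintainers.

For end users there is essentially no impact, because the upper layers already hide these details; they can simply keep using the cluster. The impact falls mostly on us "YAML engineers", since we are the ones who pick the container runtime. If we stay on Docker, will future upgrades break? If we move off Docker, will maintenance cost, complexity, and learning cost go up? In fact we are overthinking it, and things are far less complicated. Those who like Docker can keep using Docker, and those who want containerd can use containerd; the changes are small, and the deployment steps follow later in this post. Moreover, it is only the Kubernetes community that will stop maintaining dockershim. Mirantis and Docker have agreed to maintain the dockershim component together, so dockershim can still serve as the bridge to Docker; it simply moves from being built into Kubernetes to being a standalone component.

So what is containerd?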

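containerd is a project that was split out of Docker. It aims to provide a container runtime for Kubernetes and manages the lifecycle of images and containers, and it can run on its own without Docker. Its features include:

Supports the OCI image specification
Supports the OCI runtime specification, i.e. runc
Supports pulling images
Supports container network management
Storage supports multi-tenancy
Supports container runtime and container lifecycle management
Supports managing network namespaces

The main command-line differences between Docker and containerd (driven here through crictl, the command-line client for CRI runtimes) are as follows: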

Function	Docker	Containerd (crictl)
List local images	docker images	crictl images
Pull an image	docker pull	crictl pull
Push an image	docker push	none
Remove a local image	docker rmi	crictl rmi
Inspect an image	docker inspect IMAGE-ID	crictl inspecti IMAGE-ID
List containers	docker ps	crictl ps
Create a container	docker create	crictl create
Start a container	docker start	crictl start
Stop a container	docker stop	crictl stop
Remove a container	docker rm	crictl rm
Inspect a container	docker inspect	crictl inspect
attach	docker attach	crictl attach
exec	docker exec	crictl exec
logs	docker logs	crictl logs
stats	docker stats	crictl stats

As you can see, usage is much the same.

The rest of this post walks through installing a K8S cluster with kubeadm, using containerd as the container runtime.

Environment

Host nodes

IP address	OS	Kernel
192.168.0.5	CentOS7.6	3.10
192.168.0.125	CentOS7.6	3.10

Software versions

Software	Version
kubernetes	1.20.5
containerd	1.4.4

Environment preparation

(1) Add the hosts entries on every node:
$ cat /etc/hosts

192.168.0.5 k8s-master
192.168.0.125 k8s-node01
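
The listing above shows the final content of /etc/hosts rather than the edit itself. One possible way to add the entries without duplicating them on repeated runs is sketched below; the function name add_host_entry and the HOSTS_FILE variable are illustrative, not part of the original steps.

```shell
# Append an "IP hostname" entry to a hosts file, skipping it if already present.
# Usage: add_host_entry <hosts-file> <ip> <hostname>
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  # -x matches the whole line, -F treats the entry as a fixed string
  if ! grep -qxF "$ip $name" "$file" 2>/dev/null; then
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}

# HOSTS_FILE defaults to a scratch file so the sketch is safe to try;
# set HOSTS_FILE=/etc/hosts (as root) to apply it for real.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"
add_host_entry "$HOSTS_FILE" 192.168.0.5 k8s-master
add_host_entry "$HOSTS_FILE" 192.168.0.125 k8s-node01
cat "$HOSTS_FILE"
```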

(2) Disable the firewall:

$ systemctl stop firewalld
$ systemctl disable firewalld

(3) Disable SELinux:

$ setenforce 0
$ cat /etc/selinux/config
SELINUX=disabled
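
setenforce 0 only lasts until the next reboot; the persistent part is the SELINUX=disabled line in /etc/selinux/config. The original shows the resulting file but not the edit, so here is a minimal sketch of one way to make it. The function name disable_selinux_in is hypothetical.

```shell
# Rewrite the SELINUX= line of an SELinux config file to "disabled".
# Usage: disable_selinux_in <config-file>
disable_selinux_in() {
  local file="$1" tmp
  tmp="$(mktemp)"
  # Only the uncommented SELINUX= line is replaced; SELINUXTYPE= is left untouched.
  # Writing back with cat keeps the original file's owner and permissions.
  sed 's/^SELINUX=.*/SELINUX=disabled/' "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

On a real node this would be called as disable_selinux_in /etc/selinux/config, as root.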

(4) Create the file /etc/sysctl.d/k8s.conf with the following content:

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1

(5) Run the following commands to apply the changes:

$ modprobe br_netfilter
$ sysctl -p /etc/sysctl.d/k8s.conf
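
Steps (4) and (5) can be scripted and then verified. The sketch below writes the three parameters to a file and compares each one with the live value under /proc/sys. Both function names are illustrative, and the optional second argument of check_sysctl_file exists only so the check can be tried against a scratch directory.

```shell
# Write the three kernel parameters from step (4) to a sysctl drop-in file.
# Usage: write_k8s_sysctl <target-file>
write_k8s_sysctl() {
  cat > "$1" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
}

# Compare each "key = value" line of a sysctl file with the live value under a
# proc-style tree (default /proc/sys). Prints mismatches; non-zero exit if any.
# Usage: check_sysctl_file <file> [root]
check_sysctl_file() {
  local file="$1" root="${2:-/proc/sys}" rc=0 key value path current
  while IFS='=' read -r key value; do
    key="$(printf '%s' "$key" | tr -d '[:space:]')"
    value="$(printf '%s' "$value" | tr -d '[:space:]')"
    case "$key" in ''|'#'*) continue ;; esac      # skip blanks and comments
    path="$root/$(printf '%s' "$key" | tr . /)"   # net.ipv4.x -> net/ipv4/x
    current="$(cat "$path" 2>/dev/null || echo missing)"
    if [ "$current" != "$value" ]; then
      echo "$key: want $value, got $current"
      rc=1
    fi
  done < "$file"
  return "$rc"
}
```

After modprobe br_netfilter and sysctl -p, running check_sysctl_file /etc/sysctl.d/k8s.conf should print nothing. A "missing" result for the bridge keys usually means br_netfilter is not loaded.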

(6) Load the ipvs kernel modules:

$ cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
$ chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4

The script above creates /etc/sysconfig/modules/ipvs.modules so that the required modules are loaded automatically after a reboot. Run lsmod | grep -e ip_vs -e nf_conntrack_ipv4 to check that the kernel modules have been loaded.
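
The grep shows what is loaded, but a missing module is easy to overlook. As a sketch, the helper below prints only the required modules that are absent from lsmod output; the name missing_modules is made up for this example.

```shell
# Print the required modules that are absent from `lsmod`-style input on stdin.
# Usage: lsmod | missing_modules ip_vs ip_vs_rr ...
missing_modules() {
  local loaded mod
  # The first column of lsmod output is the module name; skip the header row.
  loaded="$(awk 'NR > 1 { print $1 }')"
  for mod in "$@"; do
    printf '%s\n' "$loaded" | grep -qxF "$mod" || echo "$mod"
  done
}
```

With the module list from step (6), lsmod | missing_modules ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4 prints nothing when everything is loaded. On newer kernels than the 3.10 used here, nf_conntrack_ipv4 has been merged into nf_conntrack, so that name would be reported as missing.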

(7) Install the ipset package:

$ yum install ipset -y

To make the ipvs proxy rules easier to inspect, it is also worth installing the management tool ipvsadm:

$ yum install ipvsadm -y

(8) Synchronize the server time:

$ yum install chrony -y
$ systemctl enable chronyd
$ systemctl start chronyd
$ chronyc sources

(9) Turn off the swap partition:

$ swapoff -a

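(10) Edit /etc/fstab and comment out the swap entry so that swap is not re-enabled at boot, then use free -m to confirm that swap is off. Also adjust swappiness by adding the following line to /etc/sysctl.d/k8s.conf: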

vm.swappiness=0

Run sysctl -p /etc/sysctl.d/k8s.conf to apply the change.
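
The /etc/fstab edit in step (10) is described but not shown. A minimal sketch of one way to do it follows, assuming the usual six-column fstab layout in which the third field is the filesystem type. The function name comment_out_swap is hypothetical.

```shell
# Comment out every active swap entry in an fstab-style file.
# Usage: comment_out_swap <fstab-file>
comment_out_swap() {
  local file="$1" tmp
  tmp="$(mktemp)"
  # An active swap entry is an uncommented line whose third field is "swap".
  awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

Running it twice is harmless, because lines that are already commented are skipped. It is worth keeping a backup of /etc/fstab before calling comment_out_swap /etc/fstab on a real node.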

(11) Next, install containerd:

$ yum install -y yum-utils \
 device-mapper-persistent-data \
 lvm2
$ yum-config-manager \
 --add-repo \
 https://download.docker.com/linux/centos/docker-ce.repo
$ yum list | grep containerd

Pick a version to install. Here we install 1.4.4, the latest at the time of writing:

$ yum install containerd.io-1.4.4 -y

(12) Create the containerd configuration file:

mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
# Point image pulls at the Aliyun mirrors and switch runc to the systemd cgroup driver
sed -i "s#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g"  /etc/containerd/config.toml
sed -i '/containerd.runtimes.runc.options/a\ \ \ \ \ \ \ \ \ \ \ \ SystemdCgroup = true' /etc/containerd/config.toml
sed -i "s#https://registry-1.docker.io#https://registry.cn-hangzhou.aliyuncs.com#g"  /etc/containerd/config.toml

(13) Start containerd:

systemctl daemon-reload
systemctl enable containerd
systemctl restart containerd
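
The sed commands in step (12) fail silently if the default config layout differs from what they expect, for example on another containerd version. As a sketch, the check below confirms the two outcomes that matter for the rest of this guide; the function name check_containerd_config is illustrative.

```shell
# Check a containerd config.toml for the results the sed edits aim to produce.
# Usage: check_containerd_config <config.toml>
check_containerd_config() {
  local file="$1" rc=0
  # runc must use the systemd cgroup driver to match the kubelet setting
  if ! grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$file"; then
    echo "SystemdCgroup = true not found"
    rc=1
  fi
  # no image reference should still point at k8s.gcr.io
  if grep -q 'k8s\.gcr\.io' "$file"; then
    echo "k8s.gcr.io is still referenced"
    rc=1
  fi
  return "$rc"
}
```

check_containerd_config /etc/containerd/config.toml prints nothing and returns 0 when both edits took effect.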

With containerd installed and the environment configured, we can now install kubeadm. Here we install from a yum repository, using the Aliyun mirror:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
 http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Then install kubeadm, kubelet, and kubectl (I installed the latest version at the time; pin whatever version you need):

$  yum install -y kubelet-1.20.5 kubeadm-1.20.5 kubectl-1.20.5

Set the runtime endpoint for crictl:

$ crictl config runtime-endpoint /run/containerd/containerd.sock
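
crictl keeps this setting in /etc/crictl.yaml, so the file can also be written directly. The sketch below does that and sets the image endpoint as well. The function name is hypothetical, the timeout value is an arbitrary choice, and the endpoints use the full unix:// URL form rather than the bare path shown above.

```shell
# Write a crictl config that points both endpoints at one CRI socket.
# Usage: write_crictl_config <target-file> [socket-path]
write_crictl_config() {
  local file="$1" sock="${2:-/run/containerd/containerd.sock}"
  cat > "$file" <<EOF
runtime-endpoint: unix://$sock
image-endpoint: unix://$sock
timeout: 10
debug: false
EOF
}
```

On a node this would be write_crictl_config /etc/crictl.yaml, after which crictl ps and crictl images talk to containerd without extra flags.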

As you can see, we installed v1.20.5. Next, enable kubelet at boot:

$ systemctl daemon-reload
$ systemctl enable kubelet && systemctl start kubelet

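Everything up to this point must be done on every node.

Initialize the cluster

Initialize the master

Next, on the master node, prepare the kubeadm init configuration. You can export the default init configuration with: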

$ kubeadm config print init-defaults > kubeadm.yaml

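Then adjust it to your needs, for example change imageRepository and set the kube-proxy mode to ipvs. Note that because we use containerd as the runtime, with runc configured for the systemd cgroup driver, the kubelet's cgroupDriver must also be set to systemd when initializing the node: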

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.0.5 
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock 
  name: k8s-master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.20.5
networking:
  dnsDomain: cluster.local
  podSubnet: 172.16.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd

Then initialize the cluster with this configuration file:

$ kubeadm init --config=kubeadm.yaml

[init] Using Kubernetes version: v1.20.5
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.5]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [192.168.0.5 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [192.168.0.5 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 70.001862 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.5:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:446623b965cdb0289c687e74af53f9e9c2063e854a42ee36be9aa249d3f0ccec

Copy the kubeconfig file:

$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

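Add nodes

Make sure the preparation steps above have been completed on the node first. Copy $HOME/.kube/config from the master node to the same path on the node (this is only needed if you want to run kubectl there), install kubeadm, kubelet, and kubectl, and then run the join command printed at the end of the init output: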

# kubeadm join 192.168.0.5:6443 --token abcdef.0123456789abcdef \
>     --discovery-token-ca-cert-hash sha256:446623b965cdb0289c687e74af53f9e9c2063e854a42ee36be9aa249d3f0ccec 
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

If you have lost the join command, regenerate it with kubeadm token create --print-join-command.

After the join succeeds, list the nodes:

$ kubectl get no
NAME         STATUS     ROLES                  AGE   VERSION
k8s-master   NotReady   control-plane,master   29m   v1.20.5
k8s-node01   NotReady   <none>                 28m   v1.20.5

The nodes are NotReady because no network plugin has been installed yet. You can choose a network plugin from the documentation at https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ . Here we install Calico:

$ wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml

Because some nodes have multiple NICs, the internal NIC has to be specified in the manifest:

$ vi calico.yaml

......
spec:
  containers:
  - env:
    - name: DATASTORE_TYPE
      value: kubernetes
    - name: IP_AUTODETECTION_METHOD # add this environment variable to the DaemonSet
      value: interface=eth0 # name of the internal NIC
    - name: WAIT_FOR_DATASTORE
      value: "true"
    - name: CALICO_IPV4POOL_CIDR # podSubnet was set to a 172 range at init time, so this must be changed to match
      value: "172.16.0.0/16"

......
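
The pool CIDR in calico.yaml has to agree with podSubnet in kubeadm.yaml, and a mismatch is easy to miss in a manifest this long. The sketch below compares the two with awk. It is a line-oriented check rather than a real YAML parser, and the three function names are made up for this example.

```shell
# Print the podSubnet value from a kubeadm config file.
pod_subnet() {
  awk '$1 == "podSubnet:" { print $2; exit }' "$1"
}

# Print the value of the uncommented CALICO_IPV4POOL_CIDR env entry,
# assuming its "value:" line directly follows the "- name:" line.
calico_pool_cidr() {
  awk '
    found { gsub(/"/, "", $2); print $2; exit }
    $1 == "-" && $2 == "name:" && $3 == "CALICO_IPV4POOL_CIDR" { found = 1 }
  ' "$1"
}

# Succeed only if both values exist and are identical.
# Usage: cidrs_match <kubeadm.yaml> <calico.yaml>
cidrs_match() {
  local want got
  want="$(pod_subnet "$1")"
  got="$(calico_pool_cidr "$2")"
  [ -n "$want" ] && [ "$want" = "$got" ]
}
```

For example, cidrs_match kubeadm.yaml calico.yaml || echo "CIDR mismatch" can be run before applying the manifest. An entry that is still commented out in calico.yaml is treated as absent.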

Install the Calico network plugin:

$ kubectl apply -f calico.yaml

After a while, check the Pod status:

# kubectl get pod -n kube-system 
NAME                                      READY   STATUS              RESTARTS   AGE
calico-kube-controllers-bcc6f659f-zmw8n   0/1     ContainerCreating   0          7m58s
calico-node-c4vv7                         1/1     Running             0          7m58s
calico-node-dtw7g                         0/1     PodInitializing     0          7m58s
coredns-54d67798b7-mrj2b                  1/1     Running             0          46m
coredns-54d67798b7-p667d                  1/1     Running             0          46m
etcd-k8s-master                           1/1     Running             0          46m
kube-apiserver-k8s-master                 1/1     Running             0          46m
kube-controller-manager-k8s-master        1/1     Running             0          46m
kube-proxy-clf4s                          1/1     Running             0          45m
kube-proxy-mt7tt                          1/1     Running             0          46m
kube-scheduler-k8s-master                 1/1     Running             0          46m

Once the network plugin is running, the nodes become Ready:

# kubectl get nodes 
NAME         STATUS   ROLES                  AGE   VERSION
k8s-master   Ready    control-plane,master   47m   v1.20.5
k8s-node01   Ready    <none>                 46m   v1.20.5
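
To check node status in a script instead of by eye, the STATUS column can be filtered with awk. This is a small sketch; the function name not_ready_nodes is illustrative.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready",
# reading `kubectl get nodes` output from stdin.
# Usage: kubectl get nodes | not_ready_nodes
not_ready_nodes() {
  # NR > 1 skips the header row; $1 is NAME and $2 is STATUS
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

Note that a cordoned node (status Ready,SchedulingDisabled) is also reported by this filter. For simply blocking until all nodes are up, kubectl wait --for=condition=Ready nodes --all --timeout=300s is an alternative.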

Additional nodes can be added the same way.

Configure command auto-completion

yum install -y bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc

Pitfalls

On version 1.20 and later, creating a PVC backed by NFS storage fails with the following error:

I0323 08:41:25.264754       1 controller.go:987] provision "default/test-nfs-pvc2" class "nfs-client-storageclass": started
E0323 08:41:25.267631       1 controller.go:1004] provision "default/test-nfs-pvc2" class "nfs-client-storageclass": unexpected error getting claim reference: selfLink was empty, can't make reference

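This happens because Kubernetes 1.20 no longer populates selfLink by default (the RemoveSelfLink feature gate is enabled by default from this release). The workaround is to turn it back on by adding the following flag to kube-apiserver.yaml: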

$ vim /etc/kubernetes/manifests/kube-apiserver.yaml
# add this line
- --feature-gates=RemoveSelfLink=false
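
If the same change has to be made on several control-plane nodes, the edit can be scripted. The sketch below inserts the flag right after the "- kube-apiserver" command entry and refuses to touch a manifest that already has a --feature-gates flag, since gates must then be merged by hand. The function name add_selflink_gate is hypothetical.

```shell
# Insert the RemoveSelfLink=false feature gate after the "- kube-apiserver"
# command entry of a static Pod manifest.
# Usage: add_selflink_gate <manifest>
add_selflink_gate() {
  local file="$1" tmp
  if grep -q -- '--feature-gates=' "$file"; then
    echo "a --feature-gates flag already exists in $file; edit it by hand" >&2
    return 1
  fi
  tmp="$(mktemp)"   # created outside the manifests directory on purpose
  awk '
    { print }
    $1 == "-" && $2 == "kube-apiserver" && NF == 2 {
      indent = $0; sub(/-.*/, "", indent)   # keep the leading whitespace
      print indent "- --feature-gates=RemoveSelfLink=false"
    }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

The temporary file is deliberately kept out of /etc/kubernetes/manifests, because the kubelet treats files in that directory as Pod manifests.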

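Then re-apply the manifest so the change takes effect. Because this file is a static Pod manifest, the kubelet normally restarts kube-apiserver on its own once the file is saved, so this step may not be needed: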

kubectl apply -f /etc/kubernetes/manifests/kube-apiserver.yaml

