kubernetes worker 节点运行如下组件:
- kube-nginx;
- containerd;
- kubelet;
- kube-proxy;
- cilium;
graph TB
subgraph "Worker Node"
subgraph "高可用代理层"
nginx["kube-nginx
(127.0.0.1:8443)"]
end
subgraph "核心组件"
kubelet["kubelet
(10250)"]
kube-proxy["kube-proxy
(IPVS模式)"]
end
subgraph "容器运行时"
containerd["containerd"]
runc["runc"]
crictl["crictl / nerdctl"]
end
subgraph "网络组件"
cilium["cilium
(CNI插件)"]
end
subgraph "运行的Pod"
pod["Pod容器"]
end
end
subgraph "Master Nodes"
api1["kube-apiserver
(Node1:6443)"]
api2["kube-apiserver
(Node2:6443)"]
end
%% 连接关系
nginx -->|"负载均衡
健康检查"| api1
nginx -->|"负载均衡
健康检查"| api2
kubelet -->|"通过本地代理访问"| nginx
kubelet -->|"管理容器"| containerd
kube-proxy -->|"watch Service/Endpoint"| nginx
kube-proxy -->|"创建IPVS规则
实现Service负载均衡"| pod
containerd -->|"运行容器"| runc
runc -->|"创建"| pod
crictl -.->|"CLI工具"| containerd
cilium -->|"配置Pod网络"| pod
cilium -->|"CNI接口"| kubelet
style nginx fill:#e1f5fe
style kubelet fill:#fff3e0
style kube-proxy fill:#fff3e0
style containerd fill:#e8f5e9
style cilium fill:#fce4ec
style pod fill:#e1bee7
注意:如果没有特殊指明,本文档的所有操作均在 k8s-01 节点上执行。
部署 kube-apiserver 高可用组件#
本小节讲解使用 nginx 4 层透明代理功能实现 Kubernetes worker 节点组件高可用访问 kube-apiserver 集群的步骤。
注意:如果没有特殊指明,本小节的所有操作均在 k8s-01 节点上执行。
基于 nginx 代理的 kube-apiserver 高可用方案#
- 控制节点的 kube-controller-manager、kube-scheduler 是多实例部署且连接本机的 kube-apiserver,所以只要有一个实例正常,就可以保证高可用;
- 集群内的 Pod 使用 Kubernetes 服务域名 kubernetes 访问 kube-apiserver,kube-dns 会自动解析出多个 kube-apiserver 节点的 IP,所以也是高可用的;
- 在每个节点起一个 nginx 进程,后端对接多个 apiserver 实例,nginx 对它们做健康检查和负载均衡;
- kubelet、kube-proxy 通过本地的 nginx(监听 127.0.0.1)访问 kube-apiserver,从而实现 kube-apiserver 的高可用;
下载和编译 Nginx#
下载源码:
cd /opt/k8s/work
wget http://nginx.org/download/nginx-1.27.2.tar.gz
tar -xzvf nginx-1.27.2.tar.gz
配置编译参数:
cd /opt/k8s/work/nginx-1.27.2
mkdir nginx-prefix
apt install -y gcc make
./configure --with-stream --without-http --prefix=$(pwd)/nginx-prefix --without-http_uwsgi_module --without-http_scgi_module --without-http_fastcgi_module
- --with-stream:开启 4 层透明转发(TCP Proxy)功能;
- --without-xxx:关闭所有其他功能,这样生成的动态链接二进制程序依赖最小;
输出:
Configuration summary
+ PCRE library is not used
+ OpenSSL library is not used
+ zlib library is not used
nginx path prefix: "/opt/k8s/work/nginx-1.27.2/nginx-prefix"
nginx binary file: "/opt/k8s/work/nginx-1.27.2/nginx-prefix/sbin/nginx"
nginx modules path: "/opt/k8s/work/nginx-1.27.2/nginx-prefix/modules"
nginx configuration prefix: "/opt/k8s/work/nginx-1.27.2/nginx-prefix/conf"
nginx configuration file: "/opt/k8s/work/nginx-1.27.2/nginx-prefix/conf/nginx.conf"
nginx pid file: "/opt/k8s/work/nginx-1.27.2/nginx-prefix/logs/nginx.pid"
nginx error log file: "/opt/k8s/work/nginx-1.27.2/nginx-prefix/logs/error.log"
nginx http access log file: "/opt/k8s/work/nginx-1.27.2/nginx-prefix/logs/access.log"
nginx http client request body temporary files: "client_body_temp"
nginx http proxy temporary files: "proxy_temp"
编译和安装:
cd /opt/k8s/work/nginx-1.27.2
make && make install
验证编译的 Nginx#
cd /opt/k8s/work/nginx-1.27.2
./nginx-prefix/sbin/nginx -v
输出:
nginx version: nginx/1.27.2
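除了查看版本号,还可以用 -V 参数确认 stream 模块确实已编译进二进制(可选检查,沿用上文的编译目录):
cd /opt/k8s/work/nginx-1.27.2
# nginx -V 会把编译参数输出到标准错误,这里过滤出 --with-stream
./nginx-prefix/sbin/nginx -V 2>&1 | grep -o -- '--with-stream'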
安装和部署 Nginx#
创建目录结构:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "mkdir -p /opt/k8s/kube-nginx/{conf,logs,sbin}"
done
拷贝二进制程序:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "mkdir -p /opt/k8s/kube-nginx/{conf,logs,sbin}"
scp /opt/k8s/work/nginx-1.27.2/nginx-prefix/sbin/nginx root@${node_ip}:/opt/k8s/kube-nginx/sbin/kube-nginx
ssh root@${node_ip} "chmod a+x /opt/k8s/kube-nginx/sbin/*"
done
- 重命名二进制文件为 kube-nginx;
配置 nginx,开启 4 层透明转发功能:
cd /opt/k8s/work
cat > kube-nginx.conf << \EOF
worker_processes 1;
events {
worker_connections 1024;
}
stream {
upstream backend {
hash $remote_addr consistent;
server 10.37.91.93:6443 max_fails=3 fail_timeout=30s;
server 10.37.43.62:6443 max_fails=3 fail_timeout=30s;
}
server {
listen 127.0.0.1:8443;
proxy_connect_timeout 1s;
proxy_pass backend;
}
}
EOF
- upstream backend 中的 server 列表为集群中各 kube-apiserver 的节点 IP,需要根据实际情况修改,也可以参考下面的示例自动生成;
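示例:假设 environment.sh 中的 NODE_IPS 即为运行 kube-apiserver 的 master 节点 IP(本文环境如此,否则请换成实际的 master IP 列表变量),可以用下面的脚本生成 server 行,再粘贴进 upstream backend:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for ip in ${NODE_IPS[@]}
do
echo "        server ${ip}:6443        max_fails=3 fail_timeout=30s;"
done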
分发配置文件:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp kube-nginx.conf root@${node_ip}:/opt/k8s/kube-nginx/conf/kube-nginx.conf
done
配置 systemd unit 文件,启动服务#
配置 kube-nginx systemd unit 文件:
cd /opt/k8s/work
cat > kube-nginx.service <<EOF
[Unit]
Description=kube-apiserver nginx proxy
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=forking
ExecStartPre=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx -t
ExecStart=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx
ExecReload=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx -s reload
PrivateTmp=true
Restart=always
RestartSec=5
StartLimitInterval=0
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
分发 systemd unit 文件:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp kube-nginx.service root@${node_ip}:/etc/systemd/system/
done
启动 kube-nginx 服务:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-nginx && systemctl restart kube-nginx"
done
检查 kube-nginx 服务运行状态#
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "systemctl status kube-nginx |grep 'Active:'"
done
确保状态为 active (running)。否则查看日志,确认原因:
journalctl -u kube-nginx
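确认服务正常后,还可以在任一节点上通过本地代理访问 kube-apiserver 的健康检查接口,验证 4 层转发链路是否可用(假设 kube-apiserver 已经部署,且 CA 证书位于 /etc/kubernetes/cert/ca.pem,路径请按实际环境调整):
# 通过 kube-nginx 监听的 127.0.0.1:8443 访问后端 kube-apiserver
curl --cacert /etc/kubernetes/cert/ca.pem https://127.0.0.1:8443/healthz
# 预期输出 ok;若返回 401/403,说明 TCP 转发已生效,只是匿名访问被拒绝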
部署 containerd 组件#
containerd 实现了 kubernetes 的 Container Runtime Interface (CRI) 接口,提供容器运行时核心功能,如镜像管理、容器管理等,相比 dockerd 更加简单、健壮和可移植。当前企业中用的最多的容器运行时也是 containerd。
注意:
- 如果没有特殊指明,本小节的所有操作均在 k8s-01 节点上执行。
- 如果想使用 docker,请参考 附录D:部署 Docker;
- docker 需要与 flannel 配合使用,且先安装 flannel;
下载和分发二进制文件#
下载二进制文件:
cd /opt/k8s/work
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.31.1/crictl-v1.31.1-linux-amd64.tar.gz \
https://github.com/opencontainers/runc/releases/download/v1.1.15/runc.amd64 \
https://github.com/containernetworking/plugins/releases/download/v1.5.1/cni-plugins-linux-amd64-v1.5.1.tgz \
https://github.com/containerd/containerd/releases/download/v1.7.22/containerd-1.7.22-linux-amd64.tar.gz
解压下载的压缩包:
cd /opt/k8s/work
mkdir containerd
tar -xvf containerd-1.7.22-linux-amd64.tar.gz -C containerd
tar -xvf crictl-v1.31.1-linux-amd64.tar.gz
mkdir cni-plugins
sudo tar -xvf cni-plugins-linux-amd64-v1.5.1.tgz -C cni-plugins
sudo mv runc.amd64 runc
分发二进制文件到所有 worker 节点:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp containerd/bin/* crictl cni-plugins/* runc root@${node_ip}:/opt/k8s/bin
ssh root@${node_ip} "chmod a+x /opt/k8s/bin/* && mkdir -p /etc/cni/net.d"
done
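分发完成后,可以顺手确认各节点上的二进制能正常执行(可选检查):
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "/opt/k8s/bin/containerd --version && /opt/k8s/bin/runc --version | head -1"
done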
创建和分发 containerd 配置文件#
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat << EOF | sudo tee containerd-config.toml
version = 2
root = "${CONTAINERD_DIR}/root"
state = "${CONTAINERD_DIR}/state"
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "registry.cn-beijing.aliyuncs.com/zhoujun/pause-amd64:3.1"
config_path = "/etc/containerd/certs.d"
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/k8s/bin"
conf_dir = "/etc/cni/net.d"
[plugins."io.containerd.runtime.v1.linux"]
shim = "containerd-shim"
runtime = "runc"
runtime_root = ""
no_shim = false
shim_debug = false
EOF
分发 containerd 配置文件:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "mkdir -p /etc/containerd/ ${CONTAINERD_DIR}/{root,state}"
scp containerd-config.toml root@${node_ip}:/etc/containerd/config.toml
done
创建 containerd systemd unit 文件#
cd /opt/k8s/work
cat <<EOF | sudo tee containerd.service
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target
[Service]
Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin"
ExecStartPre=/sbin/modprobe overlay
ExecStart=/opt/k8s/bin/containerd
Restart=always
RestartSec=5
Delegate=yes
KillMode=process
OOMScoreAdjust=-999
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
[Install]
WantedBy=multi-user.target
EOF
分发 systemd unit 文件,启动 containerd 服务#
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp containerd.service root@${node_ip}:/etc/systemd/system
ssh root@${node_ip} "systemctl enable containerd && systemctl restart containerd"
done
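启动后检查 containerd 服务状态,确保为 active (running):
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "systemctl status containerd | grep 'Active:'"
done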
创建和分发 crictl 配置文件#
crictl 是兼容 CRI 容器运行时的命令行工具,提供类似于 docker 命令的功能。具体可参考官方文档。
cd /opt/k8s/work
cat << EOF | sudo tee crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
分发到所有 worker 节点:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp crictl.yaml root@${node_ip}:/etc/crictl.yaml
done
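分发完成后,可以在任一节点上用 crictl 验证能否通过该配置连接 containerd(假设 containerd 已按上文启动):
source /opt/k8s/bin/environment.sh
ssh root@${NODE_IPS[0]} "/opt/k8s/bin/crictl version"
# 正常会输出 crictl 客户端版本以及 containerd 的 RuntimeName、RuntimeVersion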
安装 nerdctl 工具#
安装完 containerd 之后,会默认安装 crictl、ctr 命令,但是 crictl、ctr 命令在功能和使用习惯上与 docker 命令还是有一些缺失和差异。所以,这里推荐使用 nerdctl 命令来操作容器镜像。nerdctl 是 containerd 组织下的一个子项目,目的是为了兼容 Docker CLI,即可以使用 nerdctl 命令替代 docker 命令,它只支持 containerd 容器运行时。
nerdctl 命令安装和部署命令如下:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
wget https://github.com/containerd/nerdctl/releases/download/v1.7.7/nerdctl-1.7.7-linux-amd64.tar.gz
tar -xvzf nerdctl-1.7.7-linux-amd64.tar.gz
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp nerdctl root@${node_ip}:/opt/k8s/bin
done
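分发完成后,可以验证 nerdctl 能否与 containerd 通信(nerdctl 默认连接 /run/containerd/containerd.sock,与上文 crictl 的配置一致):
source /opt/k8s/bin/environment.sh
ssh root@${NODE_IPS[0]} "/opt/k8s/bin/nerdctl version"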
配置 containerd 镜像加速#
cat << 'END' | sudo tee containerd-image-mirror.sh
#!/bin/bash
# docker hub 镜像加速
mkdir -p /etc/containerd/certs.d/docker.io
cat > /etc/containerd/certs.d/docker.io/hosts.toml << EOF
server = "https://docker.io"
[host."https://registry.docker-cn.com"]
capabilities = ["pull", "resolve"]
[host."http://hub-mirror.c.163.com"]
capabilities = ["pull", "resolve"]
[host."https://docker.mirrors.ustc.edu.cn"]
capabilities = ["pull", "resolve"]
[host."https://dockerpull.com"]
capabilities = ["pull", "resolve"]
[host."https://docker.anyhub.us.kg"]
capabilities = ["pull", "resolve"]
[host."https://dockerhub.jobcher.com"]
capabilities = ["pull", "resolve"]
[host."https://dockerhub.icu"]
capabilities = ["pull", "resolve"]
[host."https://dockerproxy.com"]
capabilities = ["pull", "resolve"]
[host."https://docker.m.daocloud.io"]
capabilities = ["pull", "resolve"]
[host."https://reg-mirror.qiniu.com"]
capabilities = ["pull", "resolve"]
EOF
# registry.k8s.io 镜像加速
mkdir -p /etc/containerd/certs.d/registry.k8s.io
tee /etc/containerd/certs.d/registry.k8s.io/hosts.toml << 'EOF'
server = "https://registry.k8s.io"
[host."https://k8s.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
# docker.elastic.co 镜像加速
mkdir -p /etc/containerd/certs.d/docker.elastic.co
tee /etc/containerd/certs.d/docker.elastic.co/hosts.toml << 'EOF'
server = "https://docker.elastic.co"
[host."https://elastic.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
# gcr.io 镜像加速
mkdir -p /etc/containerd/certs.d/gcr.io
tee /etc/containerd/certs.d/gcr.io/hosts.toml << 'EOF'
server = "https://gcr.io"
[host."https://gcr.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
# ghcr.io 镜像加速
mkdir -p /etc/containerd/certs.d/ghcr.io
tee /etc/containerd/certs.d/ghcr.io/hosts.toml << 'EOF'
server = "https://ghcr.io"
[host."https://ghcr.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
# k8s.gcr.io 镜像加速
mkdir -p /etc/containerd/certs.d/k8s.gcr.io
tee /etc/containerd/certs.d/k8s.gcr.io/hosts.toml << 'EOF'
server = "https://k8s.gcr.io"
[host."https://k8s-gcr.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
# mcr.m.daocloud.io 镜像加速
mkdir -p /etc/containerd/certs.d/mcr.microsoft.com
tee /etc/containerd/certs.d/mcr.microsoft.com/hosts.toml << 'EOF'
server = "https://mcr.microsoft.com"
[host."https://mcr.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
# nvcr.io 镜像加速
mkdir -p /etc/containerd/certs.d/nvcr.io
tee /etc/containerd/certs.d/nvcr.io/hosts.toml << 'EOF'
server = "https://nvcr.io"
[host."https://nvcr.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
# quay.io 镜像加速
mkdir -p /etc/containerd/certs.d/quay.io
tee /etc/containerd/certs.d/quay.io/hosts.toml << 'EOF'
server = "https://quay.io"
[host."https://quay.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
# registry.jujucharms.com 镜像加速
mkdir -p /etc/containerd/certs.d/registry.jujucharms.com
tee /etc/containerd/certs.d/registry.jujucharms.com/hosts.toml << 'EOF'
server = "https://registry.jujucharms.com"
[host."https://jujucharms.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
# rocks.canonical.com 镜像加速
mkdir -p /etc/containerd/certs.d/rocks.canonical.com
tee /etc/containerd/certs.d/rocks.canonical.com/hosts.toml << 'EOF'
server = "https://rocks.canonical.com"
[host."https://rocks-canonical.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
END
chmod +x containerd-image-mirror.sh
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp containerd-image-mirror.sh root@${node_ip}:/opt/k8s/bin
ssh root@${node_ip} "/opt/k8s/bin/containerd-image-mirror.sh"
done
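脚本执行完成后,可以检查各节点上生成的镜像加速配置目录;由于 config.toml 中已开启 config_path,这些配置无需重启 containerd 即可生效:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "ls /etc/containerd/certs.d"
done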
containerd 镜像加速配置说明(仅介绍,无需操作)#
在安装部署 containerd 的过程中,我们配置了镜像加速。这里,我再来详细介绍下 containerd 配置镜像加速的方式。
镜像配置建议
在 Kubernetes 中创建 Pod 时,很多镜像需要从国外的镜像仓库中下载,但这些镜像仓库很可能因为众所周知的原因无法直接访问。这时候,一个通用的解决办法便是配置一个镜像加速地址,通过这个地址来下载镜像。
网上大多数配置 containerd 镜像加速的文章都是直接修改 /etc/containerd/config.toml 配置文件,这种方式在较新版本的 containerd 中已经被废弃,将来肯定会被移除,只不过现在还可以使用而已。另外,这种方式有一个不好的地方就是,每次修改 /etc/containerd/config.toml 配置文件,都需要执行 systemctl restart containerd.service 命令重启 containerd。
新版本的 containerd 建议将镜像仓库配置放在一个单独的文件夹当中,并在 /etc/containerd/config.toml 配置文件中打开 config_path 配置,指向镜像仓库配置目录即可。这种方式只在第一次修改 /etc/containerd/config.toml、打开 config_path 配置时需要重启 containerd,后续增加镜像仓库配置都无需重启 containerd,非常方便。带 config_path 的 containerd 配置示例如下:
version = 2
root = "/data/k8s/containerd/root"
state = "/data/k8s/containerd/state"
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "registry.cn-beijing.aliyuncs.com/zhoujun/pause-amd64:3.1"
config_path = "/etc/containerd/certs.d"
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/k8s/bin"
conf_dir = "/etc/cni/net.d"
[plugins."io.containerd.runtime.v1.linux"]
shim = "containerd-shim"
runtime = "runc"
runtime_root = ""
no_shim = false
shim_debug = false
若我们在 /etc/containerd/config.toml 配置文件中指定 config_path = "/etc/containerd/certs.d",那么 containerd 镜像仓库配置目录的结构如下:
$ tree /etc/containerd/certs.d
/etc/containerd/certs.d/
├── 192.168.11.20
│   └── hosts.toml
└── docker.io
    └── hosts.toml
可以看到,第一级目录为镜像仓库的域名或者 IP 地址,第二级为 hosts.toml 文件。hosts.toml 文件中的内容仅支持:server、capabilities、ca、client、skip_verify、[header]、override_path。hosts.toml 文件示例如下:
[host."https://mirror.registry"]
capabilities = ["pull"]
ca = "/etc/certs/mirror.pem"
skip_verify = false
[host."https://mirror.registry".header]
x-custom-2 = ["value1", "value2"]
[host."https://mirror-bak.registry/us"]
capabilities = ["pull"]
skip_verify = true
[host."http://mirror.registry"]
capabilities = ["pull"]
[host."https://test-1.registry"]
capabilities = ["pull", "resolve", "push"]
ca = ["/etc/certs/test-1-ca.pem", "/etc/certs/special.pem"]
client = [["/etc/certs/client.cert", "/etc/certs/client.key"],
["/etc/certs/client.pem", ""]]
[host."https://test-2.registry"]
client = "/etc/certs/client.pem"
[host."https://test-3.registry"]
client = ["/etc/certs/client-1.pem", "/etc/certs/client-2.pem"]
[host."https://non-compliant-mirror.registry/v2/upstream"]
capabilities = ["pull"]
override_path = true
特别需要注意的是,hosts.toml 中可以配置多个镜像仓库,containerd 下载镜像时会按配置顺序依次尝试,只有当上一个仓库下载失败时才会使用下一个镜像仓库。因此,配置顺序的原则是:下载速度越快的镜像仓库,越应该放在前面。
镜像加速配置
下面是一个镜像加速配置示例(亲测可用):
# docker hub 镜像加速
mkdir -p /etc/containerd/certs.d/docker.io
cat > /etc/containerd/certs.d/docker.io/hosts.toml << EOF
server = "https://docker.io"
[host."https://registry.docker-cn.com"]
capabilities = ["pull", "resolve"]
[host."http://hub-mirror.c.163.com"]
capabilities = ["pull", "resolve"]
[host."https://docker.mirrors.ustc.edu.cn"]
capabilities = ["pull", "resolve"]
[host."https://dockerpull.com"]
capabilities = ["pull", "resolve"]
[host."https://docker.anyhub.us.kg"]
capabilities = ["pull", "resolve"]
[host."https://dockerhub.jobcher.com"]
capabilities = ["pull", "resolve"]
[host."https://dockerhub.icu"]
capabilities = ["pull", "resolve"]
[host."https://dockerproxy.com"]
capabilities = ["pull", "resolve"]
[host."https://docker.m.daocloud.io"]
capabilities = ["pull", "resolve"]
[host."https://reg-mirror.qiniu.com"]
capabilities = ["pull", "resolve"]
EOF
# registry.k8s.io 镜像加速
mkdir -p /etc/containerd/certs.d/registry.k8s.io
tee /etc/containerd/certs.d/registry.k8s.io/hosts.toml << 'EOF'
server = "https://registry.k8s.io"
[host."https://k8s.m.daocloud.io"]
capabilities = ["pull", "resolve", "push"]
EOF
注意,除了 docker.io 仓库,其余仓库都使用了 daocloud 的镜像加速地址。daocloud 镜像仓库并非支持所有镜像的下载,其支持的镜像列表可以参考:daocloud 镜像仓库支持列表。
镜像仓库加速验证
如果我们配置了镜像加速,可以通过下面的方式来验证是否配置成功。
对于 nerdctl 命令来说,会自动使用 /etc/containerd/certs.d 目录下的配置镜像加速,但是对于 ctr 命令,需要指定 --hosts-dir=/etc/containerd/certs.d。举个例子:ctr -n k8s.io i pull --hosts-dir=/etc/containerd/certs.d registry.k8s.io/sig-storage/csi-provisioner:v3.5.0,如果要确定此命令是否真的使用了镜像加速,可以增加 --debug=true 参数,譬如:ctr --debug=true -n k8s.io i pull --hosts-dir=/etc/containerd/certs.d registry.k8s.io/sig-storage/csi-provisioner:v3.5.0。
registry.k8s.io镜像仓库验证:
# nerdctl --debug=true image pull registry.k8s.io/sig-storage/csi-provisioner:v3.5.0
DEBU[0000] verifying process skipped
...
layer-sha256:fe5ca6266f04366c8e7f605aa82997d71320183e99962fa76b3209fdfbb8b58:
done |++++++++++++++++++++++++++++++++++++++|
elapsed: 5.5 s
total: 27.0 M (4.9 MiB/s)
# nerdctl images
REPOSITORY                                     TAG       IMAGE ID        CREATED           PLATFORM       SIZE        BLOB SIZE
registry.k8s.io/sig-storage/csi-provisioner    v3.5.0    d078dc174323    21 seconds ago    linux/amd64    66.1 MiB    27.0 MiB
k8s.gcr.io镜像仓库验证:
# nerdctl --debug=true image pull k8s.gcr.io/kube-apiserver:v1.17.3
DEBU[0000] verifying process skipped
...
layer-sha256:597de8ba0c30cdd0b372023aa2ea3ca9b3affbcba5ac8db922f57d6cb67db7c8:
done |++++++++++++++++++++++++++++++++++++++|
elapsed: 133.0s
total: 48.3 M (371.8 KiB/s)
# nerdctl images
REPOSITORY                                     TAG        IMAGE ID        CREATED               PLATFORM       SIZE         BLOB SIZE
k8s.gcr.io/kube-apiserver                      v1.17.3    33400ea29255    About a minute ago    linux/amd64    167.3 MiB    48.3 MiB
registry.k8s.io/sig-storage/csi-provisioner    v3.5.0     d078dc174323    6 minutes ago         linux/amd64    66.1 MiB     27.0 MiB
docker.io镜像仓库验证:
# nerdctl --debug=true image pull docker.io/library/ubuntu:20.04
DEBU[0000] verifying process skipped
...
layer-sha256:846c0b181fff0c667d9444f8378e8fcfa13116da8d308bf21673e8e8db580:
done |++++++++++++++++++++++++++++++++++++++|
elapsed: 2.6 s
total: 27.3 M (10.5 MiB/s)
# nerdctl images
REPOSITORY                                     TAG        IMAGE ID        CREATED           PLATFORM       SIZE         BLOB SIZE
ubuntu                                         20.04      b872b0383a21    40 seconds ago    linux/amd64    75.8 MiB     27.3 MiB
k8s.gcr.io/kube-apiserver                      v1.17.3    33400ea29255    9 minutes ago     linux/amd64    167.3 MiB    48.3 MiB
registry.k8s.io/sig-storage/csi-provisioner    v3.5.0     d078dc174323    14 minutes ago    linux/amd64    66.1 MiB     27.0 MiB
部署 kubelet 组件#
kubelet 运行在每个 worker 节点上,接收 kube-apiserver 发送的请求,管理 Pod 容器,执行交互式命令,如 exec、run、logs 等。
kubelet 启动时自动向 kube-apiserver 注册节点信息,内置的 cadvisor 统计和监控节点的资源使用情况。
为确保安全,部署时关闭了 kubelet 的非安全 http 端口,对请求进行认证和授权,拒绝未授权的访问(如 apiserver、heapster 的请求)。
注意:如果没有特殊指明,本小节的所有操作均在 k8s-01 节点上执行。
下载和分发 kubelet 二进制文件#
在 部署 Master 节点组件 课程中,我们已经下载和分发了 kubelet 文件。
创建 kubelet bootstrap kubeconfig 文件#
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
echo ">>> ${node_name}"
# 创建 token
export BOOTSTRAP_TOKEN=$(kubeadm token create \
--description kubelet-bootstrap-token \
--groups system:bootstrappers:${node_name} \
--kubeconfig ~/.kube/config)
# 设置集群参数
kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/cert/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
# 设置客户端认证参数
kubectl config set-credentials kubelet-bootstrap \
--token=${BOOTSTRAP_TOKEN} \
--kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
# 设置上下文参数
kubectl config set-context default \
--cluster=kubernetes \
--user=kubelet-bootstrap \
--kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
# 设置默认上下文
kubectl config use-context default --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
done
- 向 kubeconfig 写入的是 token,bootstrap 结束后 kube-controller-manager 为 kubelet 创建 client 和 server 证书;
查看 kubeadm 为各节点创建的 token:
$ kubeadm token list --kubeconfig ~/.kube/config
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
ue3uta.z5mtucvzcwzzw8nq 23h 2024-10-10T14:26:54Z authentication,signing kubelet-bootstrap-token system:bootstrappers:k8s-02
y6vwww.3g32e8ghmqyoz9jr 23h 2024-10-10T14:26:53Z authentication,signing kubelet-bootstrap-token system:bootstrappers:k8s-01
- token 有效期为 1 天,超期后将不能再被用来 bootstrap kubelet,且会被 kube-controller-manager 的 tokencleaner 清理;
- kube-apiserver 接收 kubelet 的 bootstrap token 后,将请求的 user 设置为 system:bootstrap:<token-id>,group 设置为 system:bootstrappers,后续将为这个 group 设置 ClusterRoleBinding;
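分发之前,可以先查看生成的 bootstrap kubeconfig,确认 server 地址等已正确写入(以 k8s-01 为例,token 等敏感字段会被隐藏显示):
cd /opt/k8s/work
kubectl config view --kubeconfig=kubelet-bootstrap-k8s-01.kubeconfig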
分发 bootstrap kubeconfig 文件到所有 worker 节点#
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
echo ">>> ${node_name}"
scp kubelet-bootstrap-${node_name}.kubeconfig root@${node_name}:/etc/kubernetes/kubelet-bootstrap.kubeconfig
done
创建和分发 kubelet 参数配置文件#
从 v1.10 开始,部分 kubelet 参数需在配置文件中配置,kubelet --help 会提示:
DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag
创建 kubelet 参数配置文件模板(可配置项参考 KubeletConfiguration):
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kubelet-config.yaml.template <<EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: "##NODE_IP##"
staticPodPath: ""
syncFrequency: 1m
fileCheckFrequency: 20s
httpCheckFrequency: 20s
staticPodURL: ""
port: 10250
readOnlyPort: 0
rotateCertificates: true
serverTLSBootstrap: true
authentication:
anonymous:
enabled: false
webhook:
enabled: true
x509:
clientCAFile: "/etc/kubernetes/cert/ca.pem"
authorization:
mode: Webhook
registryPullQPS: 0
registryBurst: 20
eventRecordQPS: 0
eventBurst: 20
enableDebuggingHandlers: true
enableContentionProfiling: true
healthzPort: 10248
healthzBindAddress: "##NODE_IP##"
clusterDomain: "${CLUSTER_DNS_DOMAIN}"
clusterDNS:
- "${CLUSTER_DNS_SVC_IP}"
nodeStatusUpdateFrequency: 10s
nodeStatusReportFrequency: 1m
imageMinimumGCAge: 2m
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
volumeStatsAggPeriod: 1m
kubeletCgroups: ""
systemCgroups: ""
cgroupRoot: ""
cgroupsPerQOS: true
cgroupDriver: cgroupfs
runtimeRequestTimeout: 10m
hairpinMode: promiscuous-bridge
maxPods: 220
podCIDR: "${CLUSTER_CIDR}"
podPidsLimit: -1
resolvConf: /etc/resolv.conf
maxOpenFiles: 1000000
kubeAPIQPS: 1000
kubeAPIBurst: 2000
serializeImagePulls: false
evictionHard:
memory.available: "100Mi"
nodefs.available: "10%"
nodefs.inodesFree: "5%"
imagefs.available: "15%"
evictionSoft: {}
enableControllerAttachDetach: true
failSwapOn: true
containerLogMaxSize: 20Mi
containerLogMaxFiles: 10
systemReserved: {}
kubeReserved: {}
systemReservedCgroup: ""
kubeReservedCgroup: ""
enforceNodeAllocatable: ["pods"]
EOF
- address:kubelet 安全端口(https,10250)监听的地址,不能为 127.0.0.1,否则 kube-apiserver、heapster 等不能调用 kubelet 的 API;
- readOnlyPort=0:关闭只读端口(默认 10255),等效于未指定;
- authentication.anonymous.enabled:设置为 false,不允许匿名访问 10250 端口;
- authentication.x509.clientCAFile:指定签名客户端证书的 CA 证书,开启 HTTPS 证书认证;
- authentication.webhook.enabled=true:开启 HTTPS bearer token 认证;
- 对于未通过 x509 证书和 webhook 认证的请求(kube-apiserver 或其他客户端),将被拒绝,提示 Unauthorized;
- authorization.mode=Webhook:kubelet 使用 SubjectAccessReview API 查询 kube-apiserver 某 user、group 是否具有操作资源的权限(RBAC);
- featureGates.RotateKubeletClientCertificate、featureGates.RotateKubeletServerCertificate:自动 rotate 证书,证书的有效期取决于 kube-controller-manager 的 --experimental-cluster-signing-duration 参数;
- 需要以 root 账户运行;
为各节点创建和分发 kubelet 配置文件:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
sed -e "s/##NODE_IP##/${node_ip}/" kubelet-config.yaml.template > kubelet-config-${node_ip}.yaml.template
scp kubelet-config-${node_ip}.yaml.template root@${node_ip}:/etc/kubernetes/kubelet-config.yaml
done
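分发完成后,可以抽查某个节点上渲染后的配置,确认 ##NODE_IP## 占位符已被替换成节点 IP(可选检查):
source /opt/k8s/bin/environment.sh
ssh root@${NODE_IPS[0]} "grep -E 'address|healthzBindAddress' /etc/kubernetes/kubelet-config.yaml"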
创建和分发 kubelet systemd unit 文件#
创建 kubelet systemd unit 文件模板:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kubelet.service.template <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=containerd.service
Requires=containerd.service
[Service]
WorkingDirectory=${K8S_DIR}/kubelet
ExecStart=/opt/k8s/bin/kubelet \\
--bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \\
--cert-dir=/etc/kubernetes/cert \\
--container-runtime-endpoint=unix:///var/run/containerd/containerd.sock \\
--root-dir=${K8S_DIR}/kubelet \\
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
--config=/etc/kubernetes/kubelet-config.yaml \\
--hostname-override=##NODE_NAME## \\
--volume-plugin-dir=${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/ \\
--v=2
Restart=always
RestartSec=5
StartLimitInterval=0
[Install]
WantedBy=multi-user.target
EOF
- 如果设置了 --hostname-override 选项,则 kube-proxy 也需要设置该选项,否则会出现找不到 Node 的情况;
- --bootstrap-kubeconfig:指向 bootstrap kubeconfig 文件,kubelet 使用该文件中的用户名和 token 向 kube-apiserver 发送 TLS Bootstrapping 请求;
- K8S approve kubelet 的 CSR 请求后,在 --cert-dir 目录创建证书和私钥文件,然后写入 --kubeconfig 文件;
- --pod-infra-container-image 不使用 redhat 的 pod-infrastructure:latest 镜像,它不能回收容器的僵尸进程;
为各节点创建和分发 kubelet systemd unit 文件:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
echo ">>> ${node_name}"
sed -e "s/##NODE_NAME##/${node_name}/" kubelet.service.template > kubelet-${node_name}.service
scp kubelet-${node_name}.service root@${node_name}:/etc/systemd/system/kubelet.service
done
授予 kube-apiserver 访问 kubelet API 的权限#
在执行 kubectl exec、run、logs 等命令时,apiserver 会将请求转发到 kubelet 的 https 端口。这里定义 RBAC 规则,授权 apiserver 所用证书(kubernetes.pem)对应的用户名(CN:kubernetes-master)访问 kubelet API 的权限:
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes-master
Bootstrap Token Auth 和授予权限#
kubelet 启动时查找 --kubeconfig 参数对应的文件是否存在,如果不存在则使用 --bootstrap-kubeconfig 指定的 kubeconfig 文件向 kube-apiserver 发送证书签名请求(CSR)。
kube-apiserver 收到 CSR 请求后,对其中的 Token 进行认证,认证通过后将请求的 user 设置为 system:bootstrap:<token-id>,group 设置为 system:bootstrappers。
默认情况下,这个 user 和 group 没有创建 CSR 的权限,kubelet 会启动失败,错误日志如下:
$ sudo journalctl -u kubelet -a |grep -A 2 'certificatesigningrequests'
May 26 12:13:41 k8s-01 kubelet[128468]: I0526 12:13:41.798230 128468 certificate_manager.go:366] Rotating certificates
May 26 12:13:41 k8s-01 kubelet[128468]: E0526 12:13:41.801997 128468 certificate_manager.go:385] Failed while requesting a signed certificate from the master: cannot create certificate signing request:
certificatesigningrequests.certificates.k8s.io is forbidden: User "system:bootstrap:82jfrm" cannot create resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
解决办法是:创建一个 clusterrolebinding,将 group system:bootstrappers 和 clusterrole system:node-bootstrapper 绑定:
kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --group=system:bootstrappers
自动 approve CSR 请求,生成 kubelet client 证书#
kubelet 创建 CSR 请求后,下一步需要该 CSR 被 approve,有两种方式:
- kube-controller-manager 自动 approve;
- 手动使用命令 kubectl certificate approve;
CSR 被 approve 后,kubelet 向 kube-controller-manager 请求创建 client 证书,kube-controller-manager 中的 csrapproving controller 使用 SubjectAccessReview API 来检查 kubelet 请求(对应的 group 是 system:bootstrappers)是否具有相应的权限。
创建三个 ClusterRoleBinding,分别授予 group system:bootstrappers 和 group system:nodes 进行 approve client、renew client、renew server 证书的权限(server csr 是手动 approve 的,见后文):
cd /opt/k8s/work
cat > csr-crb.yaml <<EOF
# Approve all CSRs for the group "system:bootstrappers"
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: auto-approve-csrs-for-group
subjects:
- kind: Group
name: system:bootstrappers
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
apiGroup: rbac.authorization.k8s.io
---
# To let a node of the group "system:nodes" renew its own credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: node-client-cert-renewal
subjects:
- kind: Group
name: system:nodes
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
apiGroup: rbac.authorization.k8s.io
---
# A ClusterRole which instructs the CSR approver to approve a node requesting a
# serving cert matching its client cert.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: approve-node-server-renewal-csr
rules:
- apiGroups: ["certificates.k8s.io"]
resources: ["certificatesigningrequests/selfnodeserver"]
verbs: ["create"]
---
# To let a node of the group "system:nodes" renew its own server credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: node-server-cert-renewal
subjects:
- kind: Group
name: system:nodes
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: approve-node-server-renewal-csr
apiGroup: rbac.authorization.k8s.io
EOF
kubectl apply -f csr-crb.yaml
- auto-approve-csrs-for-group:自动 approve node 的第一次 CSR; 注意第一次 CSR 时,请求的 Group 为 system:bootstrappers;
- node-client-cert-renewal:自动 approve node 后续过期的 client 证书,自动生成的证书 Group 为 system:nodes;
- node-server-cert-renewal:自动 approve node 后续过期的 server 证书,自动生成的证书 Group 为 system:nodes;
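应用之后,可以确认上述 ClusterRole 和 ClusterRoleBinding 已经创建(可选检查):
kubectl get clusterrole approve-node-server-renewal-csr
kubectl get clusterrolebinding auto-approve-csrs-for-group node-client-cert-renewal node-server-cert-renewal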
启动 kubelet 服务#
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/"
ssh root@${node_ip} "/usr/sbin/swapoff -a"
ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kubelet && systemctl restart kubelet"
done
- 启动服务前必须先创建工作目录;
- 关闭 swap 分区,否则 kubelet 会启动失败;
kubelet 启动后使用 --bootstrap-kubeconfig 向 kube-apiserver 发送 CSR 请求,当这个 CSR 被 approve 后,kube-controller-manager 为 kubelet 创建 TLS 客户端证书、私钥以及 --kubeconfig 参数指定的 kubeconfig 文件。
注意:kube-controller-manager 需要配置 --cluster-signing-cert-file 和 --cluster-signing-key-file 参数,才会为 TLS Bootstrap 创建证书和私钥。
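可以在 master 节点上粗略确认 kube-controller-manager 的启动参数中带有这两个标志(以下写法假设它以二进制 + systemd 方式运行,且参数以 --flag=value 形式传入):
source /opt/k8s/bin/environment.sh
ssh root@${NODE_IPS[0]} "ps -ef | grep -o 'cluster-signing-[a-z-]*-file=[^ ]*' | sort -u"
# 预期输出 cluster-signing-cert-file=... 和 cluster-signing-key-file=... 两行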
查看 kubelet 情况#
稍等一会,各节点的 kubelet client CSR 都被自动 approved:
$ kubectl get csr
NAME        AGE   SIGNERNAME                                    REQUESTOR                 REQUESTEDDURATION   CONDITION
csr-42hxv   65s   kubernetes.io/kubelet-serving                 system:node:k8s-01        <none>              Pending
csr-dtpqw   64s   kubernetes.io/kubelet-serving                 system:node:k8s-02        <none>              Pending
csr-q76zk   65s   kubernetes.io/kube-apiserver-client-kubelet   system:bootstrap:y6vwww   <none>              Approved,Issued
csr-r5jhg   64s   kubernetes.io/kube-apiserver-client-kubelet   system:bootstrap:ue3uta   <none>              Approved,Issued
- Pending 的 CSR 用于创建 kubelet server 证书,需要手动 approve,参考后文。
所有节点均注册(NotReady 状态是预期的,后续安装了网络插件后就好):
$ kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-01 NotReady <none> 3m12s v1.31.1
k8s-02 NotReady <none> 3m11s v1.31.1
kube-controller-manager 为各 node 生成了 kubeconfig 文件和公私钥:
$ ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2258 Oct 9 22:36 /etc/kubernetes/kubelet.kubeconfig
$ ls -l /etc/kubernetes/cert/kubelet-client-*
-rw------- 1 root root 1224 Oct 9 22:40 /etc/kubernetes/cert/kubelet-client-2024-10-09-22-40-38.pem
lrwxrwxrwx 1 root root 59 Oct 9 22:40 /etc/kubernetes/cert/kubelet-client-current.pem -> /etc/kubernetes/cert/kubelet-client-2024-10-09-22-40-38.pem
- 没有自动生成 kubelet server 证书;
手动 approve server cert csr#
基于安全性考虑,CSR approving controllers 不会自动 approve kubelet server 证书签名请求,需要手动 approve:
$ kubectl get csr
NAME        AGE     SIGNERNAME                                    REQUESTOR                 REQUESTEDDURATION   CONDITION
csr-42hxv   4m31s   kubernetes.io/kubelet-serving                 system:node:k8s-01        <none>              Pending
csr-dtpqw   4m30s   kubernetes.io/kubelet-serving                 system:node:k8s-02        <none>              Pending
csr-q76zk   4m31s   kubernetes.io/kube-apiserver-client-kubelet   system:bootstrap:y6vwww   <none>              Approved,Issued
csr-r5jhg   4m30s   kubernetes.io/kube-apiserver-client-kubelet   system:bootstrap:ue3uta   <none>              Approved,Issued
# 手动 approve
$ kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
# 自动生成了 server 证书
$ ls -l /etc/kubernetes/cert/kubelet-*
-rw------- 1 root root 1224 Oct 9 22:40 /etc/kubernetes/cert/kubelet-client-2024-10-09-22-40-38.pem
lrwxrwxrwx 1 root root 59 Oct 9 22:40 /etc/kubernetes/cert/kubelet-client-current.pem -> /etc/kubernetes/cert/kubelet-client-2024-10-09-22-40-38.pem
-rw------- 1 root root 1261 Oct 9 22:46 /etc/kubernetes/cert/kubelet-server-2024-10-09-22-46-00.pem
lrwxrwxrwx 1 root root 59 Oct 9 22:46 /etc/kubernetes/cert/kubelet-server-current.pem -> /etc/kubernetes/cert/kubelet-server-2024-10-09-22-46-00.pem
kubelet api 认证和授权#
kubelet 配置了如下认证参数:
- authentication.anonymous.enabled:设置为 false,不允许匿名访问 10250 端口;
- authentication.x509.clientCAFile:指定签名客户端证书的 CA 证书,开启 HTTPS 证书认证;
- authentication.webhook.enabled=true:开启 HTTPS bearer token 认证;
同时配置了如下授权参数:
- authorization.mode=Webhook:开启 RBAC 授权;
kubelet 收到请求后,使用 clientCAFile 对证书签名进行认证,或者查询 bearer token 是否有效。如果两者都没通过,则拒绝请求,提示 Unauthorized:
$ curl -s --cacert /etc/kubernetes/cert/ca.pem https://10.37.91.93:10250/metrics
Unauthorized
$ curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer 123456" https://10.37.91.93:10250/metrics
Unauthorized
通过认证后,kubelet 使用 SubjectAccessReview API 向 kube-apiserver 发送请求,查询证书或 token 对应的 user、group 是否有操作资源的权限(RBAC);
证书认证和授权#
# 权限不足的证书;
$ curl -s --cacert /etc/kubernetes/cert/ca.pem --cert /etc/kubernetes/cert/kube-controller-manager.pem --key /etc/kubernetes/cert/kube-controller-manager-key.pem https://10.37.91.93:10250/metrics
Forbidden (user=system:kube-controller-manager, verb=get, resource=nodes, subresource=metrics)
# 使用部署 kubectl 命令行工具时创建的、具有最高权限的 admin 证书;
$ curl -s --cacert /etc/kubernetes/cert/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://10.37.91.93:10250/metrics|head
# HELP aggregator_discovery_aggregation_count_total [ALPHA] Counter of number of times discovery was aggregated
# TYPE aggregator_discovery_aggregation_count_total counter
aggregator_discovery_aggregation_count_total 0
# HELP aggregator_discovery_aggregation_count [ALPHA] Counter of number of times discovery was aggregated
# TYPE aggregator_discovery_aggregation_count counter
aggregator_discovery_aggregation_count 0
# HELP apiserver_audit_event_total [ALPHA] Counter of audit events generated and sent to the audit backend.
# TYPE apiserver_audit_event_total counter
apiserver_audit_event_total 0
# HELP apiserver_audit_requests_rejected_total [ALPHA] Counter of apiserver requests rejected due to an error in audit logging backend.
- --cacert、--cert、--key 的参数值必须是文件路径,如上面的 /opt/k8s/work/admin.pem,否则返回 401 Unauthorized;
Bearer token 认证和授权#
创建一个 ServiceAccount,将它和 ClusterRole system:kubelet-api-admin 绑定,从而具有调用 kubelet API 的权限:
kubectl create sa kubelet-api-test
kubectl create clusterrolebinding kubelet-api-test --clusterrole=system:kubelet-api-admin --serviceaccount=default:kubelet-api-test
cat > kubelet-api-test-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
name: kubelet-api-test
annotations:
kubernetes.io/service-account.name: kubelet-api-test
type: kubernetes.io/service-account-token
EOF
kubectl apply -f kubelet-api-test-secret.yaml
SECRET=$(kubectl get secrets | grep kubelet-api-test | awk '{print $1}')
TOKEN=$(kubectl describe secret ${SECRET} | grep -E '^token' | awk '{print $2}')
echo ${TOKEN}
使用 token 访问 /metrics:
$ curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer ${TOKEN}" https://10.37.91.93:10250/metrics | head
# HELP aggregator_discovery_aggregation_count_total [ALPHA] Counter of number of times discovery was aggregated
# TYPE aggregator_discovery_aggregation_count_total counter
aggregator_discovery_aggregation_count_total 0
# HELP aggregator_discovery_aggregation_count [ALPHA] Counter of number of times discovery was aggregated
# TYPE aggregator_discovery_aggregation_count counter
aggregator_discovery_aggregation_count 0
# HELP apiserver_audit_event_total [ALPHA] Counter of audit events generated and sent to the audit backend.
# TYPE apiserver_audit_event_total counter
apiserver_audit_event_total 0
# HELP apiserver_audit_requests_rejected_total [ALPHA] Counter of apiserver requests rejected due to an error in audit logging backend.
cadvisor 和 metrics
浏览器访问 https://10.37.91.93:10250/metrics 和 https://10.37.91.93:10250/metrics/cadvisor 分别返回 kubelet 和 cadvisor 的 metrics。
注意:
- kubelet-config.yaml 中设置 authentication.anonymous.enabled 为 false,不允许匿名证书访问 10250 的 https 服务;
- 参考 附录C:浏览器访问kube-apiserver安全端口,创建和导入相关证书,然后访问上面的 10250 端口;
参考#
- kubelet 认证和授权:https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-authentication-authorization/
部署 kube-proxy 组件#
kube-proxy 运行在所有 worker 节点上,它监听 apiserver 中 service 和 endpoint 的变化情况,创建路由规则以提供服务 IP 和负载均衡功能。
本文档讲解部署 ipvs 模式的 kube-proxy 过程。
注意:如果没有特殊指明,本小节的所有操作均在 k8s-01 节点上执行,然后远程分发文件和执行命令。
下载和分发 kube-proxy 二进制文件#
在 部署 Master 节点组件 课程中,我们已经下载和分发了 kube-proxy 文件。
创建 kube-proxy 证书#
创建证书签名请求:
cd /opt/k8s/work
cat > kube-proxy-csr.json <<EOF
{
"CN": "system:kube-proxy",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "superproj"
}
]
}
EOF
- CN:指定该证书的 User 为 system:kube-proxy;
- 预定义的 RoleBinding system:node-proxier 将 User system:kube-proxy 与 Role system:node-proxier 绑定,该 Role 授予了调用 kube-apiserver Proxy 相关 API 的权限;
- 该证书只会被 kube-proxy 当做 client 证书使用,所以 hosts 字段为空;
生成证书和私钥:
cd /opt/k8s/work
cfssl gencert -ca=/opt/k8s/work/ca.pem \
-ca-key=/opt/k8s/work/ca-key.pem \
-config=/opt/k8s/work/ca-config.json \
-profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
ls kube-proxy*
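生成之后,可以用 cfssl 查看证书的 subject 字段,确认 CN 为 system:kube-proxy(可选检查):
cfssl certinfo -cert kube-proxy.pem | grep -E '"common_name"|"organization"'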
创建和分发 kubeconfig 文件#
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/k8s/work/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy \
--client-certificate=kube-proxy.pem \
--client-key=kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-proxy \
--kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
分发 kubeconfig 文件:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
echo ">>> ${node_name}"
scp kube-proxy.kubeconfig root@${node_name}:/etc/kubernetes/
done
创建 kube-proxy 配置文件#
从 v1.10 开始,kube-proxy 部分参数可以在配置文件中配置。可以使用 --write-config-to 选项生成该配置文件,或者参考 KubeProxyConfiguration。
创建 kube-proxy config 文件模板:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kube-proxy-config.yaml.template <<EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
burst: 200
kubeconfig: "/etc/kubernetes/kube-proxy.kubeconfig"
qps: 100
bindAddress: ##NODE_IP##
healthzBindAddress: ##NODE_IP##:10256
metricsBindAddress: ##NODE_IP##:10249
enableProfiling: true
clusterCIDR: ${CLUSTER_CIDR}
hostnameOverride: ##NODE_NAME##
mode: "ipvs"
portRange: ""
iptables:
masqueradeAll: false
ipvs:
scheduler: rr
excludeCIDRs: []
EOF
- bindAddress: 监听地址;
- clientConnection.kubeconfig: 连接 apiserver 的 kubeconfig 文件;
- clusterCIDR: kube-proxy 根据 --cluster-cidr 判断集群内部和外部流量,指定 --cluster-cidr 或 --masquerade-all 选项后 kube-proxy 才会对访问 Service IP 的请求做 SNAT;
- hostnameOverride: 参数值必须与 kubelet 的值一致,否则 kube-proxy 启动后会找不到该 Node,从而不会创建任何 ipvs 规则;
- mode: 使用 ipvs 模式;
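由于 mode 设置为 ipvs,分发配置前可以先确认各节点内核能够加载 IPVS 相关模块(可选检查,加载 ip_vs_rr 时会连带加载 ip_vs):
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "modprobe ip_vs_rr && lsmod | grep ip_vs"
done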
为各节点创建和分发 kube-proxy 配置文件:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for (( i=0; i < ${#NODE_NAMES[@]}; i++ ))
do
echo ">>> ${NODE_NAMES[i]}"
sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-proxy-config.yaml.template > kube-proxy-config-${NODE_NAMES[i]}.yaml.template
scp kube-proxy-config-${NODE_NAMES[i]}.yaml.template root@${NODE_NAMES[i]}:/etc/kubernetes/kube-proxy-config.yaml
done
创建和分发 kube-proxy systemd unit 文件#
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
[Service]
WorkingDirectory=${K8S_DIR}/kube-proxy
ExecStart=/opt/k8s/bin/kube-proxy \\
--config=/etc/kubernetes/kube-proxy-config.yaml \\
--v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
分发 kube-proxy systemd unit 文件:
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
echo ">>> ${node_name}"
scp kube-proxy.service root@${node_name}:/etc/systemd/system/
done
启动 kube-proxy 服务#
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-proxy"
ssh root@${node_ip} "modprobe ip_vs_rr"
ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-proxy && systemctl restart kube-proxy"
done
检查启动结果#
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "systemctl status kube-proxy|grep Active"
done
确保状态为 active (running),否则查看日志,确认原因:
journalctl -u kube-proxy
查看监听端口#
$ sudo netstat -lnpt|grep kube-prox
tcp 0 0 10.37.91.93:10256 0.0.0.0:* LISTEN 711192/kube-proxy
tcp 0 0 10.37.91.93:10249 0.0.0.0:* LISTEN 711192/kube-proxy
- 10249: http prometheus metrics port;
- 10256: http healthz port;
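可以访问 healthz 端口进一步确认 kube-proxy 自身的健康状态(IP 请替换为实际节点地址):
curl -i http://10.37.91.93:10256/healthz
# 预期返回 HTTP/1.1 200 OK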
查看 ipvs 路由规则#
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "apt install -y ipvsadm"
ssh root@${node_ip} "/usr/sbin/ipvsadm -ln"
done
预期输出:
>>> 10.37.91.93
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.254.0.1:443 rr
-> 10.37.43.62:6443 Masq 1 0 0
-> 10.37.91.93:6443 Masq 1 0 0
>>> 10.37.43.62
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.254.0.1:443 rr
-> 10.37.43.62:6443 Masq 1 0 0
-> 10.37.91.93:6443 Masq 1 0 0
可见所有通过 https 访问 Kubernetes service kubernetes 的请求都转发到 kube-apiserver 节点的 6443 端口。
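也可以在节点上直接访问 kubernetes Service 的 VIP 做一次端到端验证(未携带客户端证书时可能返回 401/403,但这同样说明请求已经过 IPVS 转发到了 kube-apiserver):
curl -k https://10.254.0.1:443/version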