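A summary of the errors I ran into while installing Kubernetes (k8s) 1.28:
📅 2024/9/28
Today was supposed to be a lovely day, but the loveliness vanished the moment I got out of bed and did not come back until I had fixed these errors...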
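1️⃣ When initializing with kubeadm init --config, the kubelet kept failing with "The kubelet is not running" and similar messages
After a long investigation, the cause was most likely a mistake in my earlier preparation steps or in the containerd installation. My advice: follow the installation steps carefully, or simply reinstall from scratch (and give up on trying to debug it).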
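If you would rather check before reinstalling, here is a minimal sketch of two preflight checks. Both are assumptions on my part about common causes, not the confirmed root cause of the problem above: that containerd should use the systemd cgroup driver, and that swap should be off. The function names and default paths are illustrative.

```shell
#!/bin/sh
# Sketch only: the two checks below are common culprits, not a confirmed diagnosis.
# For the real error text, look at: systemctl status kubelet / journalctl -xeu kubelet

# Succeeds if the given containerd config sets "SystemdCgroup = true".
containerd_uses_systemd_cgroup() {
    grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1" 2>/dev/null
}

# Succeeds if a /proc/swaps-style file lists at least one active swap device
# (line 1 is the header, so any further line means swap is on).
swap_is_enabled() {
    awk 'NR > 1 { found = 1 } END { exit !found }' "$1"
}

# Prints one line per check. Both paths can be overridden for testing.
preflight() {
    config="${1:-/etc/containerd/config.toml}"
    swaps="${2:-/proc/swaps}"
    if containerd_uses_systemd_cgroup "$config"; then
        echo "containerd: SystemdCgroup = true"
    else
        echo "containerd: SystemdCgroup is not true in $config"
    fi
    if swap_is_enabled "$swaps"; then
        echo "swap: enabled (kubelet fails to start with swap on by default)"
    else
        echo "swap: disabled"
    fi
}
```

Calling preflight with no arguments on the affected machine checks the default paths.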
2️⃣ A bizarre error when joining a node with kubeadm join
My k8s join command was:
kubeadm join k8s-master:6443 --token fw94hy.vvvs2uu3qfsezeif --discovery-token-ca-cert-hash sha256:62a464c9f62b47e897a983ca86246dbc0c1fba82d736d38f663ddc15a4c4931b
After it finished, the node showed up as NotReady on the master. Checking the kubelet on the node, I found this error:
"Failed to ensure lease exists, will retry" err="Get \"https://[::1]:6443/api/v1/namespaces/kube-node-lease/resourcequotas\": dial tcp [::1]:6443: connect: cannot assign requested address" interval="7s"
Next I looked at kube-apiserver on the master and found this error:
E0928 07:02:57.794032 1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: Get "https://[::1]:6443/apis/node.k8s.io/v1/runtimeclasses?resourceVersion=4502": dial tcp [::1]:6443: connect: cannot assign requested address
It is strange that anything would try to connect to https://[::1]:6443, an IPv6 address, especially since I had disabled IPv6 in my configuration:
net.ipv6.conf.all.disable_ipv6=1
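To gather facts about where [::1] might come from, a small sketch like the following can help. It only reports how the name localhost is mapped in a hosts file and what the kernel's IPv6 flags are. It does not establish the root cause, and treating these two places as relevant is my assumption. The function names are illustrative.

```shell
#!/bin/sh
# Print every address that a hosts-format file maps to the name "localhost".
localhost_addrs() {
    awk '
        { sub(/#.*/, "") }    # strip comments
        { for (i = 2; i <= NF; i++) if ($i == "localhost") print $1 }
    ' "${1:-/etc/hosts}"
}

# Print the kernel IPv6 disable flags (1 = disabled) for all, default and lo.
ipv6_flags() {
    for scope in all default lo; do
        f="/proc/sys/net/ipv6/conf/$scope/disable_ipv6"
        [ -r "$f" ] && echo "$scope: $(cat "$f")"
    done
    return 0
}
```

If localhost_addrs prints ::1 while ipv6_flags reports IPv6 as disabled, a client that resolves localhost to ::1 has no usable address to connect from, which would be consistent with "cannot assign requested address".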
Here is my kubeadm-config file:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.200.10
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: k8s-master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
scheduler: {}
Here is the kubelet.conf file generated on the node after running kubeadm join:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURCVENDQWUyZ0F3SUJBZ0lJTlJ0RWdMcmtCeXN3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TkRBNU1qZ3dOVE16TWpaYUZ3MHpOREE1TWpZd05UTTRNalphTUJVeApFekFSQmdOVkJBTVRDbXQxWW1WeWJtVjBaWE13Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLCkFvSUJBUURlclNvb0hadEkwN1VwNG9VM2t1NnBnQWFQS21TM2xQMHpxS1p2S2s1aXNiNnc4V1hUd0xiRWhick4KcEx4STlldElIYTU0aTlPUitkOEplUFRFTCtwc2haUU9GMUM4YXRFVDM2OUxuQjM4Z3NIY3lxN1pER0RyZ3ZEcgo2SG43YVFSd0c0QXJVR3BsUUdjMzJDS1BaVVM1VTlaYjJCdmRwZmxSMll0SWQwdi85SHpPRkcydExsbFpiOUFoCkpCaElleHpaVUpFZkN3TzFKcVF0dGFjNlVweTlhcElLWEhNZjhwcUZuLy9hbFdLYUUyL0R2M2RDTFlGanR1dmMKb1JjZmVRTUdXcTdIby9MK29WZFBnaWFPNVBlaUFzNkhxNkxiS0tjWDE1eXdSNDE5b2MydzB5Yk5pN1A5dUZEZgptRndvY01zVUFCaGNPbzhHeFhINEVkMEE5R0puQWdNQkFBR2pXVEJYTUE0R0ExVWREd0VCL3dRRUF3SUNwREFQCkJnTlZIUk1CQWY4RUJUQURBUUgvTUIwR0ExVWREZ1FXQkJUSEsxRXZYRHVaeE1oam1qN3RneUZWc3pPN1REQVYKQmdOVkhSRUVEakFNZ2dwcmRXSmxjbTVsZEdWek1BMEdDU3FHU0liM0RRRUJDd1VBQTRJQkFRQmJlclhmYlovSQp1VGFwcTUvN1Y2bjZpdzlEZnNsWGUrSTF6MUZKRmY3Q3QzMklFcWtodVJQRmpTc09KK0ZabGJlV2swYUNENkVmClpJMmkxNHdlejdtL1hCV1B5UkpLWHQwRDJzM3F5bFlwanAxRzByYXpUR2ZJQ0ZyNkFSYWtFWEdLMFlLQkIrZzYKdEhRMGZ5Y1k3UG1jYmZBSHJtUE05VmIyOXNxOHV5V0ZqU2VGSktrVFlJTy9nL1VBOXMzenBBUEFTcFo0aDJ4MwpVMU0rOUwwSU8vNVFKMXBjQzZsZDlsb054R21MQkJPMWJXaGxIcytvMDJDRzlWWjFXQktqSFFNOGJjZE9ITUlYCm9MY1VzVlpSblhmaVFYNG9JNm04UEpMQ25LaUU1VGRzNE9OcFRLN1psY0NZUW9DOVl4WVV2OEt4aUFMdzRmTkQKMWJUTC9tbitic0VKCi0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    server: https://192.168.200.10:6443
  name: default-cluster
contexts:
- context:
    cluster: default-cluster
    namespace: default
    user: default-auth
  name: default-context
current-context: default-context
kind: Config
preferences: {}
users:
- name: default-auth
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
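The kubelet.conf above points at 192.168.200.10, not at a loopback address, which is what makes the [::1] errors so puzzling. A minimal sketch for confirming that on a node follows. The function names are illustrative, and the default path /etc/kubernetes/kubelet.conf is where kubeadm normally writes this file.

```shell
#!/bin/sh
# Print the API server URL(s) recorded in a kubeconfig-style file.
kubeconfig_server() {
    awk '$1 == "server:" { print $2 }' "${1:-/etc/kubernetes/kubelet.conf}"
}

# Succeed only if the file has at least one server entry and none of them
# points at a loopback host ([::1], 127.x.x.x or localhost).
server_is_not_loopback() {
    servers=$(kubeconfig_server "$1")
    [ -n "$servers" ] || return 1
    for s in $servers; do
        case "$s" in
            https://\[::1\]*|https://127.*|https://localhost*) return 1 ;;
        esac
    done
    return 0
}
```

On the node, kubeconfig_server with no argument should print https://192.168.200.10:6443 for the file shown above.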
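😄 Solution:
I ran kubeadm reset on both the node and the master, deleted the files it told me to remove, and then ran kubeadm init again.
But! Do not install the CNI plugin yet. First join the node to the k8s cluster, and only then deploy the CNI plugin.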
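The order of operations can be sketched roughly as below. This is a sketch under assumptions, not the exact commands I typed: the config file name kubeadm-config.yaml and the CNI manifest path are placeholders, and the helper names are made up for illustration. By default it only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command;
# set DRY_RUN=0 to actually run them as root.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on every machine (master and node) to wipe the previous attempt.
reset_host() {
    run kubeadm reset -f
    # kubeadm reset does not clean CNI configuration; it tells you to remove it.
    run rm -rf /etc/cni/net.d
}

# Run on the master: initialize again from the config file.
# $1: path to the kubeadm config (placeholder name by default).
init_master() {
    run kubeadm init --config "${1:-kubeadm-config.yaml}"
}

# Run on the master, but only AFTER the node has joined.
# $1: manifest of whichever CNI plugin you use (placeholder path).
deploy_cni() {
    run kubectl get nodes
    run kubectl apply -f "$1"
}
```

Between init_master and deploy_cni, run the kubeadm join command printed by kubeadm init on the node.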