A summary of the errors I hit while installing Kubernetes (k8s) 1.28:

📅 2024/9/28

Today was supposed to be a lovely day, but the loveliness evaporated the moment I got out of bed, and it didn't come back until I had fixed every one of these errors.....

1️⃣ When initializing with kubeadm init --config, kubelet kept failing with errors like "The kubelet is not running"

After a long round of troubleshooting, the root cause appeared to be mistakes I made earlier while preparing the node and installing containerd. My advice: follow the installation steps carefully the first time, or just reinstall from scratch (and give up any hope of debugging it in place).
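For anyone stuck at the same point, here is a rough checklist I would run through before reinstalling (a sketch; paths assume a default containerd package install):

```shell
# Read kubelet's own errors instead of kubeadm's generic summary.
journalctl -xeu kubelet --no-pager | tail -n 20

# kubeadm (since v1.22) defaults kubelet to the systemd cgroup driver,
# so containerd must match: /etc/containerd/config.toml should contain
# "SystemdCgroup = true" under the runc runtime options.
grep -n 'SystemdCgroup' /etc/containerd/config.toml

# Some distro packages ship config.toml with the CRI plugin disabled
# (disabled_plugins = ["cri"]), which also breaks kubeadm init.
grep -n 'disabled_plugins' /etc/containerd/config.toml

# After fixing the config, restart both services and retry the init.
systemctl restart containerd kubelet
```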

2️⃣ A bizarre error when joining a node with kubeadm join

My kubeadm join command was:

```shell
kubeadm join k8s-master:6443 --token fw94hy.vvvs2uu3qfsezeif --discovery-token-ca-cert-hash sha256:62a464c9f62b47e897a983ca86246dbc0c1fba82d736d38f663ddc15a4c4931b
```

After the command finished, the node showed as NotReady on the master. Checking the node's kubelet logs again gave:

```
"Failed to ensure lease exists, will retry" err="Get \"https://[::1]:6443/api/v1/namespaces/kube-node-lease/resourcequotas\": dial tcp [::1]:6443: connect: cannot assign requested address" interval="7s"
```

The master's kube-apiserver log then showed:

```
E0928 07:02:57.794032       1 reflector.go:147] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: Get "https://[::1]:6443/apis/node.k8s.io/v1/runtimeclasses?resourceVersion=4502": dial tcp [::1]:6443: connect: cannot assign requested address
```

Strange: why would it try to connect to https://[::1]:6443, an IPv6 loopback address, when I had explicitly disabled IPv6 in my configuration?

```shell
net.ipv6.conf.all.disable_ipv6=1
```
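My guess at the mechanism (an assumption on my part, not something the logs confirm): the sysctl only disables the IPv6 interface; it does not stop the resolver from answering with `::1`. If anything on the node resolves a name such as `localhost` or `k8s-master` to `::1`, the client still calls connect() on [::1]:6443 and fails with exactly "cannot assign requested address". Worth checking on the node:

```shell
# What does the API server's host name resolve to on this node?
getent hosts k8s-master

# Any IPv6 loopback entries left in /etc/hosts?
grep -n '::1' /etc/hosts

# Confirm the sysctl is actually applied (1 = disabled).
sysctl net.ipv6.conf.all.disable_ipv6
```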

Here is my kubeadm-config file:

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.200.10
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: k8s-master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
scheduler: {}
```
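Since so much of my pain traced back to this file, a cheap sanity check (a sketch; the file name kubeadm-config.yaml is my assumption) is to let kubeadm validate and render everything without touching the node:

```shell
# Validate the config and render all manifests without changing the node.
kubeadm init --config kubeadm-config.yaml --dry-run

# Print the fully defaulted InitConfiguration for comparison with yours
# (check advertiseAddress, criSocket, and podSubnet in particular).
kubeadm config print init-defaults
```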

And here is the kubelet.conf file generated on the worker node after running kubeadm join:

```yaml
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURCVENDQWUyZ0F3SUJBZ0lJTlJ0RWdMcmtCeXN3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TkRBNU1qZ3dOVE16TWpaYUZ3MHpOREE1TWpZd05UTTRNalphTUJVeApFekFSQmdOVkJBTVRDbXQxWW1WeWJtVjBaWE13Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLCkFvSUJBUURlclNvb0hadEkwN1VwNG9VM2t1NnBnQWFQS21TM2xQMHpxS1p2S2s1aXNiNnc4V1hUd0xiRWhick4KcEx4STlldElIYTU0aTlPUitkOEplUFRFTCtwc2haUU9GMUM4YXRFVDM2OUxuQjM4Z3NIY3lxN1pER0RyZ3ZEcgo2SG43YVFSd0c0QXJVR3BsUUdjMzJDS1BaVVM1VTlaYjJCdmRwZmxSMll0SWQwdi85SHpPRkcydExsbFpiOUFoCkpCaElleHpaVUpFZkN3TzFKcVF0dGFjNlVweTlhcElLWEhNZjhwcUZuLy9hbFdLYUUyL0R2M2RDTFlGanR1dmMKb1JjZmVRTUdXcTdIby9MK29WZFBnaWFPNVBlaUFzNkhxNkxiS0tjWDE1eXdSNDE5b2MydzB5Yk5pN1A5dUZEZgptRndvY01zVUFCaGNPbzhHeFhINEVkMEE5R0puQWdNQkFBR2pXVEJYTUE0R0ExVWREd0VCL3dRRUF3SUNwREFQCkJnTlZIUk1CQWY4RUJUQURBUUgvTUIwR0ExVWREZ1FXQkJUSEsxRXZYRHVaeE1oam1qN3RneUZWc3pPN1REQVYKQmdOVkhSRUVEakFNZ2dwcmRXSmxjbTVsZEdWek1BMEdDU3FHU0liM0RRRUJDd1VBQTRJQkFRQmJlclhmYlovSQp1VGFwcTUvN1Y2bjZpdzlEZnNsWGUrSTF6MUZKRmY3Q3QzMklFcWtodVJQRmpTc09KK0ZabGJlV2swYUNENkVmClpJMmkxNHdlejdtL1hCV1B5UkpLWHQwRDJzM3F5bFlwanAxRzByYXpUR2ZJQ0ZyNkFSYWtFWEdLMFlLQkIrZzYKdEhRMGZ5Y1k3UG1jYmZBSHJtUE05VmIyOXNxOHV5V0ZqU2VGSktrVFlJTy9nL1VBOXMzenBBUEFTcFo0aDJ4MwpVMU0rOUwwSU8vNVFKMXBjQzZsZDlsb054R21MQkJPMWJXaGxIcytvMDJDRzlWWjFXQktqSFFNOGJjZE9ITUlYCm9MY1VzVlpSblhmaVFYNG9JNm04UEpMQ25LaUU1VGRzNE9OcFRLN1psY0NZUW9DOVl4WVV2OEt4aUFMdzRmTkQKMWJUTC9tbitic0VKCi0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    server: https://192.168.200.10:6443
  name: default-cluster
contexts:
- context:
    cluster: default-cluster
    namespace: default
    user: default-auth
  name: default-context
current-context: default-context
kind: Config
preferences: {}
users:
- name: default-auth
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
```
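One thing worth verifying from this file (a sketch): the `server:` line is the only endpoint kubelet should be dialing, so confirm it is the IPv4 address and that the node can actually reach it:

```shell
# Which endpoint does the node's kubelet believe in?
grep 'server:' /etc/kubernetes/kubelet.conf

# /healthz is served to unauthenticated clients by default, so this should
# print "ok" if the API server is reachable over IPv4.
curl -k --connect-timeout 3 https://192.168.200.10:6443/healthz
```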

😄 Solution:

I ran kubeadm reset on both the worker node and the master, deleted the files it told me to remove, and then ran kubeadm init again. The key point: do NOT install the CNI plugin yet. Join the worker node to the cluster first, and only then deploy the CNI plugin.