Error Marking Master: Timed Out Waiting for the Condition [Kubernetes]

I would recommend bootstrapping the Kubernetes cluster as described in the official documentation. I went through the steps to build a cluster on the same CentOS version, CentOS Linux release 7.5.1804 (Core), and will share them with you; hopefully they help you get past this issue during installation.

First wipe your current cluster installation:

# kubeadm reset -f && rm -rf /etc/kubernetes/

Add the Kubernetes repo for the subsequent kubeadm, kubelet and kubectl installation:

# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF

Check whether SELinux is in permissive mode:

# getenforce
Permissive
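
If getenforce reports Enforcing instead, the official kubeadm guide switches SELinux to permissive mode; roughly like this (adjust to your own security policy):

# setenforce 0
# sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config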

Ensure net.bridge.bridge-nf-call-iptables is set to 1 in your sysctl:

# cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# sysctl --system
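
The bridge sysctls only exist once the br_netfilter kernel module is loaded; if sysctl --system complains about missing keys, loading the module first (an extra step from the official docs, not in the original answer) should help:

# modprobe br_netfilter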

Install required Kubernetes components and start services:

# yum update && yum upgrade && yum install -y docker kubelet kubeadm kubectl --disableexcludes=kubernetes

# systemctl start docker kubelet && systemctl enable docker kubelet

Deploy the cluster via kubeadm:

# kubeadm init --pod-network-cidr=10.244.0.0/16

I prefer to install Flannel as the main CNI in my cluster. Since there are some prerequisites for a proper Pod network installation, I passed the --pod-network-cidr=10.244.0.0/16 flag to the kubeadm init command.

Create the Kubernetes home directory for your user and store the config file there:

$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
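
At this point kubectl should be able to reach the cluster; a quick check (the node will typically report NotReady until the Pod network below is installed):

$ kubectl get nodes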

Install the Pod network; in my case it was Flannel:

$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml

Finally, check the status of the Kubernetes core Pods:

$ kubectl get pods --all-namespaces

NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE
kube-system   coredns-576cbf47c7-4x7zq             1/1     Running   0          36m
kube-system   coredns-576cbf47c7-666jm             1/1     Running   0          36m
kube-system   etcd-centos-7-5                      1/1     Running   0          35m
kube-system   kube-apiserver-centos-7-5            1/1     Running   0          35m
kube-system   kube-controller-manager-centos-7-5   1/1     Running   0          35m
kube-system   kube-flannel-ds-amd64-2bmw9          1/1     Running   0          33m
kube-system   kube-proxy-pcgw8                     1/1     Running   0          36m
kube-system   kube-scheduler-centos-7-5            1/1     Running   0          35m

In case you still have any doubts, just leave a comment below this answer.

Kubernetes on worker node - kubelet.service not starting

The node joined the cluster after I commented out the entries in the /etc/resolv.conf file; once the node had joined the cluster successfully, I un-commented them again. Now all the namespaces and nodes on my master are running fine.
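
Purely as an illustration of that workaround, commenting the nameserver entries out and back in could look something like this (which entries actually need commenting depends on your /etc/resolv.conf):

$ sudo sed -i 's/^nameserver/#nameserver/' /etc/resolv.conf   # before kubeadm join
$ sudo sed -i 's/^#nameserver/nameserver/' /etc/resolv.conf   # after the node has joined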

Kubernetes API server does not automatically start after master reboots

Kubelet is not starting because the port is already in use, and hence it is not able to create the pod for the API server.
Use the following command to find out which process is holding port 10250:

[root@master admin]# ss -lntp | grep 10250
LISTEN   0   128   :::10250   :::*   users:(("kubelet",pid=23373,fd=20))

It will give you the PID and the name of the process holding the port. If it is an unwanted process, you can always kill it, and the port becomes available for kubelet to use.
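
For example, assuming the PID from the ss output above (23373) belongs to a stale process you actually want gone:

# kill -9 23373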

After killing the process, run the above command again; it should return no output.

Just to be on the safe side, run kubeadm reset and then kubeadm init, and it should go through.

Edit:

Using snap stop kubelet did the trick of stopping kubelet on the node.

Running K8s cluster after source code compilation

Eventually I figured out that the latest commit on the repo is not a good state to start from. When you do yum install kubeadm kubectl kubelet, the binaries you get are compiled from a stable branch tag, which matches the binary versions.

I found that yum install was giving me the v1.14.0 binaries, so I checked out the branch with the same tag, and that seems to have fixed the issue.
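
A rough sketch of that, assuming the Kubernetes source tree lives under $GOPATH/src/k8s.io/kubernetes (the path and build step are assumptions, not from the original answer):

$ kubeadm version -o short             # version the packaged binaries report, e.g. v1.14.0
$ cd $GOPATH/src/k8s.io/kubernetes
$ git checkout v1.14.0                 # check out the matching release tag
$ make                                 # rebuild the binaries from that tag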

kubeadm v1.18.2 with crio version 1.18.2 failing to start master node from private repo on Centos7 / RH7

So the problem is not exactly a bug in CRI-O, as we (and the CRI-O dev team) initially thought; rather, there are quite a few configurations that need to be applied if you want to use CRI-O as the CRI for Kubernetes and also want to use a private repo.

So I will not put the CRI-O configurations here, as they are already documented in the ticket I raised with the team:
Kubernetes v1.18.2 with crio version 1.18.2 failing to sync with kubelet on RH7
#3915.

The first configuration someone should apply is the container registries from which the images will be pulled:

$ cat /etc/containers/registries.conf
[[registry]]
prefix = "k8s.gcr.io"
insecure = false
blocked = false
location = "k8s.gcr.io"

[[registry.mirror]]
location = "my.private.repo"

CRI-O recommends that this configuration be passed as a flag to the kubelet (haircommander/cri-o-kubeadm), but for me it was not working with only that configuration.

I went back to the Kubernetes manual, where it is recommended not to pass that flag to the kubelet but instead to set it in the file /var/lib/kubelet/config.yaml at run time. For me this is not possible, as the node needs to start with the CRI-O socket and not any other socket (ref Configure cgroup driver used by kubelet on control-plane node).

So I managed to get it up and running by passing this flag in my config file, sample below:

$ cat /tmp/config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  name: node.name
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
controlPlaneEndpoint: 1.2.3.4:6443
imageRepository: my.private.repo
kind: ClusterConfiguration
kubernetesVersion: v1.18.2
networking:
  dnsDomain: cluster.local
  podSubnet: 10.85.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd

Then the user can simply start the master / worker node with the --config <file.yml> flag, and the node will be launched successfully.
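
On the master node, with the sample above saved as /tmp/config.yaml, that boils down to something like:

# kubeadm init --config /tmp/config.yaml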

Hope all the information here will help someone else.


