A cluster is a set of nodes (physical or virtual machines) running Kubernetes agents, managed by the control plane. Kubernetes v1.23 supports clusters with up to 5000 nodes. More specifically, Kubernetes is designed to accommodate configurations that meet all of the following criteria:

* No more than 110 pods per node
* No more than 5,000 nodes
* No more than 150,000 total pods
* No more than 300,000 total containers
You can scale your cluster by adding or removing nodes. The way you do this depends on how your cluster is deployed.
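For example, if your nodes are managed declaratively with Cluster API, scaling a worker pool is a matter of changing a replica count. The manifest below is a minimal sketch only; the names, namespace, version, and the Docker infrastructure provider are illustrative assumptions, not something this page prescribes.

```yaml
# Illustrative Cluster API MachineDeployment for a worker node pool.
# All names and the infrastructure provider are assumptions.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: example-workers
  namespace: default
spec:
  clusterName: example-cluster
  replicas: 50              # scale the pool up or down by editing this value
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: example-cluster
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: example-cluster
    spec:
      clusterName: example-cluster
      version: v1.23.0
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: example-workers
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate   # provider-specific; illustrative
        name: example-workers
```

With a manifest like this applied, adding or removing nodes comes down to editing `spec.replicas`; managed Kubernetes offerings and other installers expose equivalent node-pool controls.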
To avoid running into cloud provider quota issues, when creating a cluster with many nodes, consider:

* requesting a quota increase for cloud resources such as compute instances, CPUs, storage volumes, and in-use IP addresses
* gating scale-up actions so that new nodes are brought up in batches, with a pause between batches, because some cloud providers rate limit the creation of new instances
For a large cluster, you need a control plane with sufficient compute and other resources.
Typically you would run one or two control plane instances per failure zone, scaling those instances vertically first and then scaling horizontally after reaching the point of diminishing returns on (vertical) scaling.
You should run at least one instance per failure zone to provide fault-tolerance. Kubernetes nodes do not automatically steer traffic towards control-plane endpoints that are in the same failure zone; however, your cloud provider might have its own mechanisms to do this.
For example, when using a managed load balancer, you configure the load balancer so that traffic originating from kubelets and Pods in failure zone A is sent only to the control plane hosts that are also in zone A. If a single control plane host, or the control plane endpoint for failure zone A, goes offline, all of the control plane traffic for nodes in zone A is then sent between zones. Running multiple control plane hosts in each zone makes that outcome less likely.
To improve performance of large clusters, you can store Event objects in a separate dedicated etcd instance.
When creating a cluster, you can (using custom tooling):

* start and configure an additional etcd instance
* configure the kube-apiserver to use it for storing events
See Operating etcd clusters for Kubernetes and Set up a High Availability etcd cluster with kubeadm for details on configuring and managing etcd for a large cluster.
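As a sketch of the second step, the kube-apiserver supports an `--etcd-servers-overrides` flag that routes a chosen resource to a different etcd. The fragment below is heavily trimmed (a real kube-apiserver manifest needs many more flags), and the etcd endpoints are assumptions.

```yaml
# Trimmed, illustrative fragment of a kube-apiserver static Pod manifest.
# The etcd endpoints shown are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    image: k8s.gcr.io/kube-apiserver:v1.23.0
    command:
    - kube-apiserver
    # Primary etcd cluster used for all resources...
    - --etcd-servers=https://10.0.0.10:2379
    # ...except Event objects, which are stored in a dedicated etcd instance.
    - --etcd-servers-overrides=/events#https://10.0.0.20:2379
```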
Kubernetes resource limits help to minimize the impact of memory leaks and other ways that pods and containers can affect other components. These resource limits apply to addon resources just as they apply to application workloads.
For example, you can set CPU and memory limits for a logging component:
```yaml
  ...
  containers:
  - name: fluentd-cloud-logging
    image: fluent/fluentd-kubernetes-daemonset:v1
    resources:
      limits:
        cpu: 100m
        memory: 200Mi
```
Addons' default limits are typically based on data collected from experience running each addon on small or medium Kubernetes clusters. When running on large clusters, addons often consume more of some resources than their default limits. If a large cluster is deployed without adjusting these values, the addon(s) may continuously get killed because they keep hitting the memory limit. Alternatively, the addon may run but with poor performance due to CPU time slice restrictions.
To avoid running into cluster addon resource issues, when creating a cluster with many nodes, consider the following:

* The VerticalPodAutoscaler is a custom resource that you can deploy into your cluster to help you manage resource requests and limits for pods. Visit Vertical Pod Autoscaler to learn more about VerticalPodAutoscaler and how you can use it to scale cluster components, including cluster-critical addons; a minimal sketch follows this list.
* The cluster autoscaler integrates with a number of cloud providers to help you run the right number of nodes for the level of resource demand in your cluster.
* The addon resizer helps you resize addons automatically as your cluster's scale changes.
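As a minimal sketch of a VerticalPodAutoscaler object, the manifest below targets the fluentd DaemonSet from the earlier example; the target name and namespace are assumptions, and the VPA components must already be installed in the cluster.

```yaml
# Minimal VerticalPodAutoscaler sketch; target name and namespace are assumptions.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: fluentd-cloud-logging
  namespace: kube-system
spec:
  targetRef:
    apiVersion: apps/v1
    kind: DaemonSet
    name: fluentd-cloud-logging
  updatePolicy:
    # "Off" only publishes recommendations; "Auto" lets the VPA apply them.
    updateMode: "Off"
```

Running with `updateMode: "Off"` first lets you review the recommended requests and limits before allowing the autoscaler to change addon pods in place.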