Labels and nodeSelector
Labels are key-value pairs attached to Kubernetes objects. Represented in JSON, labels attached to metadata look like this:

"metadata": {
  "labels": {
    "key1": "value1",
    "key2": "value2"
  }
}

In YAML:

metadata:
  labels:
    key1: "value1"
    key2: "value2"
Labels are mainly used to represent attributes of objects that are meaningful to users.
You can also set labels on nodes. For example, the core components of Kubernetes run in the kube-system namespace, and we can view the labels of the nodes they run on:
kubectl get nodes --namespace=kube-system --show-labels
beta.kubernetes.io/arch=amd64, beta.kubernetes.io/os=linux, kubernetes.io/arch=amd64, ...
We can also manually label a Node.
kubectl label nodes <node-name> <label-key>=<label-value>
For example, we set a disksize label on a node, indicating whether the node's disk is large enough.

kubectl label nodes <node-name> disksize=big
Then, when writing the YAML file, if we want this pod to run on a Node with a large disk, we can write:

nodeSelector:
  disksize: big
Along the same lines, here is an official example: label a Node to indicate that its disk is an SSD.

kubectl label nodes <node-name> disktype=ssd

Then add the selection to the nodeSelector of the YAML file:
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: ssd
Labels can be used in many places: you can add a Label to a Node to identify it, and nodeSelector then picks a suitable Node to run the Pod on; Labels can also be used in an object's metadata to describe it. Labels added to metadata can later be used to filter objects when querying with commands.
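For example, here is a minimal Deployment sketch whose metadata carries an app=nginx Label (the name nginx-deployment and the replica count are only illustrative); the Label added here is what later queries filter on:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx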
Query the pod's Label:
kubectl get pods --show-labels
Find pods that meet a condition (look at the LABELS field; you can filter on the labels shown there):
kubectl get pods -l app=nginx
Label selection
In the previous step we learned about nodeSelector, which helps us choose the right Node to run a pod. In fact, Kubernetes label selection is rich and diverse, for example:
nodeSelector:
  disktype: ssd
  disksize: big
This node selector is an equality-based selection; the expression is equivalent to disktype=ssd && disksize=big.
There are two kinds of label selectors: equality-based and set-based. The equality-based operators are =, ==, and != (three in total); = and == are no different. When there are multiple requirements (multiple labels), they are combined as if with an && operator, but the selector has no || logical-or operator.

YAML only supports the {key}: {value} form; on the command line we can use the three operators above.
kubectl get nodes -l disktype=ssd,disksize!=big    # multiple conditions are separated with a comma ",", not "&&"
Set-based selection supports three operators: in, notin, and exists. But don't take this as choosing from a collection; an example follows below.

Suppose there are three Nodes whose disksize is big, medium, or small respectively, and we need to deploy a pod that can run on the big or medium ones; then:
... -l disksize in (big,medium)
... -l disksize notin (small)    # do not run on the small node
exists is similar to !=, except that exists matches as long as the label exists, regardless of its value.

-l disksize    # equivalent to -l disksize in (big,medium,small)
We can also wrap the selection expression in single quotes ''.
kubectl get pods -l 'app=nginx'
The YAML nodeSelector and command-line selection have been covered above; here we introduce the selector field in YAML.
We mentioned earlier that we add a Label to a Deployment's metadata, i.e. the pods get the Label, and that kubectl get pods can filter pods by Label. Similarly, when we create a Service or use a ReplicationController, we can also use labels to select the appropriate pods.

Suppose we have deployed nginx; then when querying with kubectl get pods --show-labels, the LABELS of its pods will contain app=nginx, and we can select them with:
selector:
  app: nginx
Full version:
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 6666
status:
  loadBalancer:
    ingress:
    - ip: 192.0.2.127
The selector also supports the matchLabels and matchExpressions selection methods:

matchLabels is a map of {key, value} pairs. A single {key, value} in the matchLabels map is equivalent to an element of matchExpressions whose key field is "key", whose operator is "In", and whose values array contains only "value".

matchExpressions is a list of pod selector requirements. Valid operators include In, NotIn, Exists, and DoesNotExist. For In and NotIn, the values set must be non-empty. All requirements from matchLabels and matchExpressions are combined with a logical AND -- they must all be satisfied in order to match.
Examples are as follows:
selector:
  matchLabels:
    component: redis
  matchExpressions:
  - {key: tier, operator: In, values: [cache]}
  - {key: environment, operator: NotIn, values: [dev]}
I won't go into these selection rules in more detail here; the above is enough for now. Readers can consult the official documentation for more complex operations: https://kubernetes.io/zh/docs/concepts/overview/working-with-objects/labels/
Affinity and anti-affinity
We learned about nodeSelector earlier: by selecting suitable Labels with nodeSelector, we can express the constraints we want.

Affinity is similar to nodeSelector: it constrains which nodes a pod can be scheduled onto according to the labels on the nodes.
There are two types of affinity:

- requiredDuringSchedulingIgnoredDuringExecution: a hard requirement; the rules must be met for a pod to be scheduled onto a node.
- preferredDuringSchedulingIgnoredDuringExecution: a soft preference; the scheduler tries to satisfy it but does not guarantee it.
Here is an official example:
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - e2e-az1
            - e2e-az2
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: k8s.gcr.io/pause:2.0
The required affinity constraint above is roughly equivalent to:

... -l kubernetes.io/e2e-az-name in (e2e-az1,e2e-az2)
affinity opens the affinity settings, nodeAffinity declares node affinity, and only then come the rules themselves, which are divided into those that must be satisfied and those satisfied as much as possible.
If we set multiple nodeSelectorTerms:

requiredDuringSchedulingIgnoredDuringExecution:
  nodeSelectorTerms:
  - ...
  - ...

then the pod can be scheduled onto a node as long as one of the terms is satisfied.
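For illustration, a hedged sketch with two nodeSelectorTerms (reusing the disktype and disksize labels from earlier; the values are assumptions): a node matching either term is acceptable:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      # term 1: nodes with an SSD disk
      - matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
      # term 2: nodes with a big disk
      - matchExpressions:
        - key: disksize
          operator: In
          values:
          - big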
If you specify both nodeSelector and nodeAffinity, both must be satisfied before the pod can be scheduled onto a candidate node.
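A rough sketch of specifying both at once (labels reused from earlier examples, purely illustrative); the pod is only scheduled onto nodes that carry disktype=ssd and also satisfy the affinity term:

spec:
  # both of the following must be satisfied
  nodeSelector:
    disktype: ssd
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disksize
            operator: In
            values:
            - big
  containers:
  - name: nginx
    image: nginx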
The node affinity syntax supports the following operators: In, NotIn, Exists, DoesNotExist, Gt, and Lt.
The legal operators for pod affinity and anti-affinity are In, NotIn, Exists, and DoesNotExist.
Affinity is set via the -Affinity suffix, for example node affinity is nodeAffinity; anti-affinity is set via the -AntiAffinity suffix, for example podAntiAffinity.
Anti-affinity works the same way as affinity: both have the requiredDuringSchedulingIgnoredDuringExecution hard restriction and the preferredDuringSchedulingIgnoredDuringExecution soft restriction; anti-affinity simply expresses the opposite, so a pod is not scheduled where the conditions match.
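As a hedged sketch of anti-affinity in practice (the app=nginx label and the hostname topology key are assumptions for illustration), the podAntiAffinity rule below keeps two pods carrying the same label off the same node:

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    # do not co-locate with pods labeled app=nginx on the same hostname
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - nginx
      topologyKey: kubernetes.io/hostname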
This concludes the explanation of affinity and anti-affinity. Both have many configuration options and can get complex; readers can refer to the official documentation, so they are not described further here.
Taints and tolerations
As mentioned earlier, a pod selects a suitable node, and a service selects suitable pods; these objects are chosen through their Labels. Here we introduce taints and tolerations, which let objects repel being chosen.
A node taint can repel a specific class of pods, while a toleration means a pod can tolerate a node's taint.

Once a taint is added to a node, pods will not be scheduled onto that node unless they declare that they tolerate the taint.

The system tries to avoid scheduling Pods onto nodes whose taints they cannot tolerate, but this is not mandatory. Kubernetes processes multiple taints and tolerations like a filter: it iterates over all the taints on a node and filters out those for which the Pod has a matching toleration.
But if you only have one worker node and set a taint on it, pods can still only end up running on that node.
The format for adding a taint:
kubectl taint node [node] key=value:[effect]
Update or overwrite a taint:
kubectl taint node [node] key=value:[effect] --overwrite=true
Use kubectl taint to add a taint to a node.
kubectl taint nodes node1 key1=value1:NoSchedule
Remove the taint:
kubectl taint nodes node1 key1=value1:NoSchedule-
In the command above, the taint sets a key1=value1 label and sets the effect of this taint to NoSchedule.
The effect of a taint is called its effect; a node taint can be given one of the following three effects:

- NoSchedule: pods that cannot tolerate this taint are not scheduled onto the node; existing pods are not affected.
- PreferNoSchedule: Kubernetes tries to avoid scheduling pods that cannot tolerate this taint onto the node.
- NoExecute: if a pod is already running on the node, it is evicted from the node; if it is not yet running on the node, it is not scheduled onto the node.
However, some pods created by the system tolerate all NoExecute and NoSchedule taints, so they are not evicted. For example, the master node normally cannot have pods deployed onto it, yet there are many system pods in the kube-system namespace. Of course, by modifying the taint, user pods can be deployed onto the master node.
Query the taints of nodes:
kubectl describe nodes | grep Taints
Taints: node-role.kubernetes.io/master:NoSchedule
Taints: key1=value1:NoSchedule

The first is the system's default taint on the master.
We remove the taint from the master:
kubectl taint node instance-1 node-role.kubernetes.io/master:NoSchedule-
Then deploy an nginx pod.
kubectl create deployment nginxtaint --image=nginx:latest --replicas=3
View the pods:
kubectl get pods -o wide
As a result, the author found that all three replicas were on the master node.

To keep the cluster safe, we need to restore the master's taint.
kubectl taint node instance-1 node-role.kubernetes.io/master:NoSchedule
When certain conditions are true, the node controller automatically adds a taint to the node. The currently built-in taints include:
- node.kubernetes.io/not-ready: the node is not ready. This corresponds to the node condition Ready having the value "False".
- node.kubernetes.io/unreachable: the node controller cannot reach the node. This corresponds to the node condition Ready having the value "Unknown".
- node.kubernetes.io/out-of-disk: the node's disk is exhausted.
- node.kubernetes.io/memory-pressure: the node is under memory pressure.
- node.kubernetes.io/disk-pressure: the node is under disk pressure.
- node.kubernetes.io/network-unavailable: the node's network is unavailable.
- node.kubernetes.io/unschedulable: the node is unschedulable.
- node.cloudprovider.kubernetes.io/uninitialized: if an "external" cloud provider driver is specified when the kubelet starts, this taint is added to the node to mark it as unusable. After a controller in cloud-controller-manager initializes the node, the kubelet removes this taint.
Tolerations
A node can set taints to repel pods, but a pod can also set tolerations to tolerate a node's taints.
tolerations:
- key: "key1"
  operator: "Exists"
  effect: "NoSchedule"
A value can also be set:
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"
The default value of operator is Equal.
A toleration "matches" a taint when they have the same key and effect, and:
- If operator is Exists, the toleration must not specify a value; if the node has a taint whose key is key1 and whose effect is NoSchedule, the taint is tolerated.
- If operator is Equal, then their value must be equal.

If effect is left empty, it matches any effect: as long as the key is key1, the taint can be tolerated on any node.
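A hedged sketch of leaving effect empty (key1 is the example key used above); this toleration matches taints with key key1 regardless of their effect:

tolerations:
# no effect specified: matches NoSchedule, PreferNoSchedule, and NoExecute taints with key1
- key: "key1"
  operator: "Exists"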
And if:

tolerations:
- operator: "Exists"

this means the pod can tolerate any taint: whatever key, value, and effect the node sets, this pod won't mind.
If you want to deploy pods on the master, you can modify the pod's tolerations:
spec:
  tolerations:
  # this toleration is to have the daemonset runnable on master nodes
  # remove it if your masters can't run pods
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
DaemonSet
In Kubernetes there are three kinds of -Set: ReplicaSet, DaemonSet, and StatefulSet. The workload types include Deployment, ReplicaSet, DaemonSet, StatefulSet, and so on (that is, these controllers exist).
Deployment was introduced earlier; kind: ReplicaSet is generally unnecessary, since you can use kind: Deployment plus replicas: instead.
kind: DaemonSet also needs to be described with a YAML file, and overall it looks much like a Deployment.
A DaemonSet ensures that each node runs only one replica of a pod. Taking an nginx pod as an example: when a new Node joins the cluster, a pod is automatically deployed onto that Node; when the node is removed from the cluster, the pod on that Node is reclaimed; and if the DaemonSet configuration is deleted, all the pods it created are deleted as well.
Some typical uses of DaemonSet:
- Run the cluster daemon on each node
- Run the log collection daemon on each node
- Run the monitoring daemon on each node
In the YAML, to configure a DaemonSet, you can use tolerations; configuration example:

kind: DaemonSet
... ...
Everything else is consistent with Deployment.
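For reference, here is a minimal DaemonSet sketch (the nginx image, the names, and the master toleration are illustrative, not from the original text):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nginx-daemonset
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      tolerations:
      # allow running on master nodes; remove if you do not want this
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: nginx
        image: nginx:latest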
That is all the content of this article. I hope it is helpful to everyone's study, and I hope you will continue to support me.