K8s Affinity Questions

Scheduling process of the default Kubernetes scheduler

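The scheduling process runs in three steps:

  • Predicates (filtering): rule out nodes that cannot run the pod
  • Priorities (scoring): rank the remaining nodes
  • Select: pick the highest-ranked node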
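
The three steps above can be sketched as a tiny pipeline. This is a toy model, not the real scheduler: the `Node` class, the `free_cpu` field, and the "more free CPU is better" scoring rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: float                      # hypothetical resource figure, in cores
    labels: dict = field(default_factory=dict)


def schedule(cpu_request, node_selector, nodes):
    """Return the name of the chosen node, or None if the pod would stay Pending."""
    # 1. Predicates: drop every node that cannot run the pod at all.
    feasible = [
        n for n in nodes
        if n.free_cpu >= cpu_request
        and all(n.labels.get(k) == v for k, v in node_selector.items())
    ]
    if not feasible:
        return None

    # 2. Priorities: score each surviving node (toy rule: more free CPU is better).
    scores = {n.name: n.free_cpu - cpu_request for n in feasible}

    # 3. Select: take the highest score; ties are broken by name here for determinism.
    return max(sorted(scores), key=scores.get)


nodes = [
    Node("node-a", free_cpu=1.0, labels={"zone": "foo"}),
    Node("node-b", free_cpu=4.0, labels={"zone": "foo"}),
    Node("node-c", free_cpu=8.0, labels={"zone": "bar"}),
]
chosen = schedule(2.0, {"zone": "foo"}, nodes)    # node-a lacks CPU, node-c lacks the label
pending = schedule(2.0, {"zone": "baz"}, nodes)   # no node passes the predicates
print(chosen, pending)
```

The point of the split is that predicates are a hard yes/no gate, while priorities only order the nodes that passed the gate.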

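Difference between node affinity and pod affinity

An example: suppose Xiao Ming has to be assigned to a class (Xiao Ming is the pod, the class is the node)

  • Node affinity: tell Xiao Ming directly, "you go to grade one"
  • Pod affinity: look among the other children for one the same age as Xiao Ming, find Xiao Zhang, see that Xiao Zhang is in grade one, and therefore send Xiao Ming to grade one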

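Node affinity: hard affinity

  • requiredDuringSchedulingIgnoredDuringExecution: defines hard node affinity
  • nodeSelectorTerms: a list of node selector terms; multiple terms are ORed, so a node qualifies as soon as one term matches
  • matchExpressions: the match rules; multiple rules are ANDed, so a term matches only when every matchExpressions rule under that term matches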

apiVersion: v1
kind: Pod
metadata:
  name: with-required-nodeaffinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - {key: zone, operator: In, values: ["foo"]}
  containers:
  - name: nginx
    image: nginx

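  • Works like nodeSelector but uses match expressions; it can be seen as the next-generation node selector
  • When no node satisfies the hard affinity rule, the pod stays in the Pending state
  • In the predicates phase, hard node affinity is evaluated by the MatchNodeSelector predicate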
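
The OR/AND relationship between nodeSelectorTerms and matchExpressions can be sketched in a few lines. This is a simplified illustration, not the scheduler's actual code: the function names are made up, the second term (zone=bar) and the ssd rule are added only to show the OR and AND cases, and only four operators are handled.

```python
def expression_matches(labels, expr):
    """Evaluate one matchExpressions entry against a node's labels."""
    key, op, values = expr["key"], expr["operator"], expr.get("values", [])
    if op == "In":
        return labels.get(key) in values
    if op == "NotIn":
        return labels.get(key) not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not covered by this sketch: {op}")


def node_matches(labels, node_selector_terms):
    """Terms are ORed; the expressions inside one term are ANDed."""
    return any(
        all(expression_matches(labels, e) for e in term["matchExpressions"])
        for term in node_selector_terms
    )


# Hypothetical rule set: (zone=foo AND ssd exists) OR (zone=bar)
terms = [
    {"matchExpressions": [
        {"key": "zone", "operator": "In", "values": ["foo"]},
        {"key": "ssd", "operator": "Exists"},
    ]},
    {"matchExpressions": [
        {"key": "zone", "operator": "In", "values": ["bar"]},
    ]},
]

results = {
    "foo+ssd": node_matches({"zone": "foo", "ssd": "true"}, terms),  # first term matches
    "foo only": node_matches({"zone": "foo"}, terms),                # first term fails on ssd
    "bar only": node_matches({"zone": "bar"}, terms),                # second term matches
}
print(results)
```

A node labelled only zone=foo is rejected because the first term needs both rules, and the second term asks for a different zone.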

Node affinity: soft affinity

Characteristic: the pod can still be scheduled when the condition is not met


apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy-with-node-affinity
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 60
            preference:
              matchExpressions:
              - {key: zone, operator: In, values: ["foo"]}
          - weight: 30
            preference:
              matchExpressions:
              - {key: ssd, operator: Exists, values: []}
      containers:
      - name: nginx
        image: nginx

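  • Because the nodes in the cluster carry different labels, they end up with the following priorities:

[Figure: node priorities resulting from the different label combinations]

  • In the priorities phase, soft node affinity is evaluated by the NodeAffinityPriority function
  • Note: NodeAffinityPriority is not decisive on its own, because the priorities phase also calls other functions, for example SelectorSpreadPriority (which spreads pods across nodes to reduce the impact of a node failure)
  • As the replica count grows, the distribution of pods across nodes tends to follow the node affinity weights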
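
A rough sketch of how the two weights (60 and 30) from the manifest above would add up on nodes with different label sets. The node names and labels are made up for illustration, and this shows only the raw weight sum: the real scheduler normalizes that sum and combines it with the other scoring functions, so treat the numbers as relative rankings only.

```python
def preference_matches(labels, preference):
    """All expressions of one preference must hold (only In and Exists covered)."""
    for expr in preference["matchExpressions"]:
        key, op = expr["key"], expr["operator"]
        if op == "In":
            ok = labels.get(key) in expr["values"]
        elif op == "Exists":
            ok = key in labels
        else:
            raise ValueError(f"operator not covered by this sketch: {op}")
        if not ok:
            return False
    return True


def raw_affinity_score(labels, preferred_terms):
    """Sum the weights of every preferred term the node satisfies."""
    return sum(
        term["weight"] for term in preferred_terms
        if preference_matches(labels, term["preference"])
    )


# Same two preferences as in the Deployment above.
preferred = [
    {"weight": 60, "preference": {"matchExpressions": [
        {"key": "zone", "operator": "In", "values": ["foo"]}]}},
    {"weight": 30, "preference": {"matchExpressions": [
        {"key": "ssd", "operator": "Exists"}]}},
]

# Hypothetical nodes covering the four label combinations.
nodes = {
    "node-1": {"zone": "foo", "ssd": "true"},
    "node-2": {"zone": "foo"},
    "node-3": {"ssd": "true"},
    "node-4": {},
}

scores = {name: raw_affinity_score(labels, preferred) for name, labels in nodes.items()}
print(scores)
```

Even node-4, which matches nothing, remains a valid target; it simply ranks last on this one criterion.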

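Pod affinity (podAffinity)

  • Suppose the requirement is that a new pod must land on the same node as an existing pod (call it A). Node affinity cannot do this, because A's placement may have nothing to do with node labels (A may have been scheduled arbitrarily); only pod affinity can express it

  • Pod affinity: an affinity relationship between a pod and some already-running pod; it is easiest to explain by example

Create a deployment whose pod carries the label app=tomcat (on kubectl 1.18 and later, kubectl run creates a bare pod rather than a deployment, which works just as well here):

kubectl run tomcat -l app=tomcat --image tomcat:alpine

Then create a pod that must be placed together with the pod above, using pod affinity: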

apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity-1
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["tomcat"]}
        topologyKey: kubernetes.io/hostname
  containers:
  - name: nginx
    image: nginx

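Scheduling logic:

  1. Search for pods using the matchExpressions rule app=tomcat
  2. The tomcat pod is found, which also determines the node it runs on; say it is node A
  3. The topologyKey is kubernetes.io/hostname, so look up the value of the kubernetes.io/hostname label on node A; say it is xxx
  4. Schedule the new pod onto a node with kubernetes.io/hostname=xxx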
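
The four steps can be written out as a small function. This is a simplified sketch under stated assumptions: the function name and data layout are made up, the selector is reduced to plain key=value equality, and namespaces are ignored. The zone label used as a second topologyKey is hypothetical and only shows how a wider topology domain changes the result.

```python
def nodes_satisfying_pod_affinity(nodes, pods, selector, topology_key):
    """nodes: {node_name: labels}; pods: list of (pod_labels, node_name)."""
    # Steps 1-2: find the pods matching the selector and the nodes they run on.
    anchor_nodes = {
        node_name for pod_labels, node_name in pods
        if all(pod_labels.get(k) == v for k, v in selector.items())
    }
    # Step 3: read the value of the topologyKey label on those nodes.
    domains = {
        nodes[n][topology_key] for n in anchor_nodes if topology_key in nodes[n]
    }
    # Step 4: every node whose topologyKey label has one of those values qualifies.
    return sorted(
        name for name, labels in nodes.items()
        if labels.get(topology_key) in domains
    )


# Hypothetical three-node cluster with the tomcat pod running on node-a.
nodes = {
    "node-a": {"kubernetes.io/hostname": "node-a", "zone": "foo"},
    "node-b": {"kubernetes.io/hostname": "node-b", "zone": "foo"},
    "node-c": {"kubernetes.io/hostname": "node-c", "zone": "bar"},
}
pods = [({"app": "tomcat"}, "node-a")]

by_host = nodes_satisfying_pod_affinity(nodes, pods, {"app": "tomcat"}, "kubernetes.io/hostname")
by_zone = nodes_satisfying_pod_affinity(nodes, pods, {"app": "tomcat"}, "zone")
print(by_host, by_zone)
```

With kubernetes.io/hostname each node is its own domain, so "together" means the same node; with a zone-style key it would mean any node in the same zone.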

Hard affinity: requiredDuringSchedulingIgnoredDuringExecution
Soft affinity: preferredDuringSchedulingIgnoredDuringExecution

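Pod anti-affinity (podAntiAffinity)

  • The opposite of pod affinity: the current pod is scheduled onto nodes outside those that satisfy the match condition

  • Use cases:

    • Spreading out instances of the same application
    • Placing pods with different security levels on different nodes
  • In the example below, the match expression equals the pod's own label, so it spreads one application out and keeps identical pods off the same node: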


    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp-with-pod-anti-affinity
    spec:
      replicas: 4
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          name: myapp
          labels:
            app: myapp
        spec:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                  - {key: app, operator: In, values: ["myapp"]}
                topologyKey: kubernetes.io/hostname
          containers:
          - name: nginx
            image: nginx

  • If the cluster has only three nodes, applying the YAML above schedules at most three pods; the fourth stays in the Pending state
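
The three-nodes, four-replicas outcome can be reproduced with a toy simulation. It models only the required anti-affinity rule on kubernetes.io/hostname (each node is its own domain); the node names and replica names are hypothetical, and everything else the real scheduler does is left out.

```python
def place_replicas(node_names, replicas):
    """Place replicas one by one under required anti-affinity on the hostname key.

    A node that already runs a replica is filtered out for every later replica.
    """
    placement = {}    # replica name -> node name
    pending = []      # replicas with no feasible node
    occupied = set()  # nodes that already host a matching pod
    for i in range(replicas):
        replica = f"myapp-{i}"
        free = [n for n in node_names if n not in occupied]
        if free:
            placement[replica] = free[0]
            occupied.add(free[0])
        else:
            pending.append(replica)
    return placement, pending


placement, pending = place_replicas(["node-a", "node-b", "node-c"], replicas=4)
print(placement)
print(pending)
```

Because the rule is a hard (required) one, the fourth replica is not squeezed onto a used node; with the preferred form it would still be scheduled.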

References

This note draws on the following article, and both figures also come from it. Credit to the author.

https://mp.weixin.qq.com/s/AaiX_7j97_V-TeIiUBU73Q
  • Copyrights © 2020-2023 交个朋友之猿天地