Multi-Cluster Service Discovery with CoreDNS

In the earlier article on Karmada multi-cluster service discovery, cross-cluster calls go through a derived Service with the prefix derived-<service name>, which the calling application has to be aware of. This article shows how to let applications use the same Service name whether the call lands in the local cluster or a remote one.
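For context, this is what the naming mismatch looks like with native Karmada MCS (a sketch; the Service list and IPs depend on your environment):

```shell
# On the importing (member2) cluster, native Karmada MCS creates only a
# prefixed derived Service, so callers there must use a different name:
kubectl --kubeconfig=kubeconfig-member2 get svc
# NAME            TYPE        CLUSTER-IP   PORT(S)
# derived-serve   ClusterIP   10.13.x.x    80/TCP
#
# member1 callers use serve.default.svc, while member2 callers must use
# derived-serve.default.svc: the application has to know the difference.
```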

Base environment: Karmada multi-cluster service discovery

Building CoreDNS with the multicluster plugin

Pull the CoreDNS source

git clone https://github.com/coredns/coredns
cd coredns
git checkout v1.9.3

Modify the plugin configuration

# vim plugin.cfg
...
kubernetes:kubernetes
multicluster:github.com/coredns/multicluster
...

Note: there must be no trailing whitespace after the entry.

Build the binary

make

Build the image

docker build -t coredns/coredns:v1.9.3 .

A pre-built image is available in my personal Alibaba Cloud registry: registry.cn-hangzhou.aliyuncs.com/seam/coredns:v1.9.3

Modify the CoreDNS configuration

# kubectl edit configmap coredns -n kube-system
  ....
    Corefile: |
     .:53 {
         errors
         health {
            lameduck 5s
         }
         ready
         multicluster clusterset.local # Add this line.
         kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
         }
         prometheus :9153
         forward . /etc/resolv.conf {
            max_concurrent 1000
         }
         cache 30
         loop
         reload
         loadbalance
     }
...

Modify the CoreDNS RBAC permissions

# kubectl edit clusterrole system:coredns
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: "2023-08-01T02:17:36Z"
  name: system:coredns
  resourceVersion: "3667"
  uid: 845750de-af6d-4e75-b8bf-c564de8f805e
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - list
  - watch
# Add the following rule
- apiGroups:
  - multicluster.x-k8s.io
  resources:
  - serviceimports
  verbs:
  - list
  - watch
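With the image rebuilt and the ConfigMap and ClusterRole updated, roll the CoreDNS Deployment so the changes take effect (a sketch; the `coredns` Deployment name and the registry image are taken from the steps above and may differ in your cluster):

```shell
# Point the Deployment at the rebuilt image containing the multicluster plugin
kubectl -n kube-system set image deployment/coredns \
  coredns=registry.cn-hangzhou.aliyuncs.com/seam/coredns:v1.9.3
# Restart so the new binary and Corefile are picked up, then wait for rollout
kubectl -n kube-system rollout restart deployment/coredns
kubectl -n kube-system rollout status deployment/coredns
```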

Create a Deployment and Service in the member1 cluster

cat << EOF | kubectl --kubeconfig kubeconfig-member1 apply -f -
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: serve
    version: v1
  name: serve
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: serve
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: serve
        version: v1
    spec:
      containers:
      - args:
        - '--message=''hello from cluster  (Node: {{env "NODE_NAME"}} Pod: {{env "POD_NAME"}}
          Address: {{addr}})'''
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        image: 969049650220.dkr.ecr.ap-east-1.amazonaws.com/jeremyot/serve:0a40de8
        imagePullPolicy: IfNotPresent
        name: serve
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: serve
    version: v1
  name: serve
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: serve
    version: v1
  sessionAffinity: None
  type: ClusterIP
---
EOF

Create a ServiceExport in the member1 cluster

cat << EOF | kubectl --kubeconfig=kubeconfig-member1 apply -f -
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  labels:
    app: serve
  name: serve
  namespace: default
EOF

Create a Service and ServiceImport in the member2 cluster

cat << EOF | kubectl --kubeconfig=kubeconfig-member2 apply -f -
apiVersion: v1
kind: Service
metadata:
  labels:
    app: serve
  name: serve
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: serve
  sessionAffinity: None
  type: ClusterIP
---
EOF


cat << EOF | kubectl --kubeconfig=kubeconfig-member2 apply -f -
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceImport
metadata:
  labels:
    app: serve
  name: serve
  namespace: default
spec:
  ips:
  - 10.13.187.164
  ports:
  - port: 80
    protocol: TCP
  type: ClusterSetIP
EOF
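Before deploying the client, you can verify that the clusterset zone resolves with a one-off DNS lookup in member2 (a sketch; the `infoblox/dnstools` image is an assumption, any image with `dig` works):

```shell
# Run a throwaway pod and query the multicluster zone directly
kubectl --kubeconfig=kubeconfig-member2 run dnstest -it --rm --restart=Never \
  --image=infoblox/dnstools -- dig +short serve.default.svc.clusterset.local
# Expect the ServiceImport's clusterset IP (10.13.187.164 in this article)
```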

Create a client Deployment in the member2 cluster that calls the member1 service

cat << EOF | kubectl --kubeconfig=kubeconfig-member2 apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: request
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: request
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: request
    spec:
      containers:
      - args:
        - --duration=3600s
        - --address=serve.default.svc
        image: jeremyot/request:0a40de8
        imagePullPolicy: IfNotPresent
        name: request
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsConfig:
        options:
        - name: timeout
          value: 10s
        - name: retries
          value: "0"
        searches:
        - clusterset.local
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
EOF
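If resolution works end to end, the request pod's log should show replies from the member1 backend (a sketch; the node and pod names below are placeholders):

```shell
kubectl --kubeconfig=kubeconfig-member2 logs -f deployment/request
# e.g. Got: 'hello from cluster (Node: <node-name> Pod: serve-<hash> Address: <pod-ip>)'
```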

The pod resolves cluster.local first, and falls back to clusterset.local when no local record exists.
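This fallback comes from ordinary resolv.conf search-list expansion: the pod's search list ends with clusterset.local (added via dnsConfig above), so that suffix is only tried after the cluster.local candidates fail. A simplified sketch of the mechanism (hypothetical zone data; real resolution also involves ndots and per-candidate timeouts):

```python
def expand(name, search_domains, ndots=5):
    """Simulate glibc resolv.conf search-list expansion for a name."""
    if name.endswith("."):
        return [name]  # absolute name: no search-list expansion
    candidates = [f"{name}.{d}." for d in search_domains]
    # A relative name with fewer dots than ndots is tried as-is only last
    if name.count(".") >= ndots:
        candidates.insert(0, name + ".")
    else:
        candidates.append(name + ".")
    return candidates

def resolve(name, zones, search_domains):
    """Return the first candidate present in the zone data:
    first search-list match wins."""
    for candidate in expand(name, search_domains):
        if candidate in zones:
            return candidate, zones[candidate]
    return None, None

# Search list a pod in the default namespace gets, with clusterset.local
# appended by the dnsConfig in the request Deployment above
search = ["default.svc.cluster.local", "svc.cluster.local",
          "cluster.local", "clusterset.local"]

# Local Service exists: cluster.local answers before clusterset.local is tried
zones = {"serve.default.svc.cluster.local.": "10.11.0.10",       # local svc (hypothetical IP)
         "serve.default.svc.clusterset.local.": "10.13.187.164"} # ServiceImport
print(resolve("serve.default.svc", zones, search))
# -> ('serve.default.svc.cluster.local.', '10.11.0.10')

# Local Service removed: the clusterset.local (ServiceImport) record is used
del zones["serve.default.svc.cluster.local."]
print(resolve("serve.default.svc", zones, search))
# -> ('serve.default.svc.clusterset.local.', '10.13.187.164')
```

The same client address `serve.default.svc` therefore reaches the local backend when it exists and the imported one when it does not, which is exactly the behavior summarized below.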

Summary

The coredns multicluster plugin gives us DNS resolution for ServiceImports. In multi-cluster service discovery, the local cluster's own Service record under cluster.local is resolved first; only when the local Service returns no answer is the ServiceImport record under clusterset.local resolved. This leads to the following cases:

  1. When the local cluster has both a Service and a ServiceImport and the local backend Pods are available, the local service is called;
  2. When the local cluster has both a Service and a ServiceImport but the local backend Pods are unavailable, the call fails;
  3. When the local Service does not exist, the ServiceImport provides cross-cluster access.

Although coredns multicluster supports multi-cluster access through a single domain name that is transparent to applications, open questions remain for when the local cluster's service becomes unavailable:

  • How to switch to another cluster automatically (by deleting the local Service)?
  • How to refresh DNS quickly?

References

1. Karmada proposal on removing the derived Service prefix
2. Design details
