Prometheus Series 6 - Thanos Components in Detail: Store & Receive

栋总侃技术 2021-09-04

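The earlier article, Prometheus Series 4 - Thanos High-Availability Cluster, introduced Thanos, a high-availability cluster solution built on Prometheus. Now that you have a general picture of the Thanos architecture, this part of the series starts to look at each component in depth: what it does and what its startup flags mean. It also provides a set of YAML templates for running the components on k8s.

This installment covers two components: store and receive.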

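MinIO

Thanos can persist metric data in object storage, so we first deploy a MinIO service to act as the object store. For deployment, refer to https://docs.min.io/docs/minio-client-quickstart-guide; it is not covered here.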

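The Thanos components that interact directly with object storage are:

  • sidecar - uploads the metric data collected by Prometheus to object storage;

  • receive - writes the data pushed by Prometheus to object storage;

  • store - lets query read metric data from object storage;

  • compact - compacts the metric data held in object storage;

  • rule - stores newly generated metric data in object storage.

These components access MinIO by reading an object storage configuration file, for example: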

    # bucket_config.yaml
    type: s3
    config:
      bucket: thanos
      endpoint: 10.6.110.11:9000
      access_key: admin
      secret_key: "12345678"  # the MinIO secret key must be at least 8 characters long
      insecure: true

When deploying on k8s, we can store this configuration file in a Secret and let the other components read it from there:

    apiVersion: v1
    kind: Secret
    metadata:
      name: thanos-objectstorage
      namespace: thanos
    type: Opaque
    stringData:
      objectstorage.yaml: |
        type: s3
        config:
          bucket: thanos
          endpoint: 10.6.110.11:9000
          access_key: admin
          secret_key: "12345678"
          insecure: true
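The Secret uses stringData, which accepts the configuration as plain text; Kubernetes stores the value base64-encoded under data. If you would rather write the data field yourself, the encoded value could be produced roughly as in this minimal sketch. The constant OBJSTORE_YAML and the two helper names are illustrative and simply mirror the example configuration.

```python
import base64

# Mirrors the example object storage configuration (illustrative values).
OBJSTORE_YAML = """\
type: s3
config:
  bucket: thanos
  endpoint: 10.6.110.11:9000
  access_key: admin
  secret_key: "12345678"
  insecure: true
"""


def to_secret_data(plain_text: str) -> str:
    """Return the base64 string expected under a Secret's `data` field."""
    return base64.b64encode(plain_text.encode("utf-8")).decode("ascii")


def from_secret_data(encoded: str) -> str:
    """Decode a value read back from a Secret's `data` field."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Paste the printed value under `data: objectstorage.yaml:` in the Secret.
    print(to_secret_data(OBJSTORE_YAML))
```

Either form yields the same Secret; stringData is just the more readable way to author it.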

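Store

store is the component through which query reads historical metric data from object storage. It connects to object storage using the bucket_config.yaml configuration defined above.

Its startup flags are: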

    store
    --data-dir=/var/thanos/store
    --grpc-address=0.0.0.0:10901
    --http-address=0.0.0.0:10902
    --objstore.config-file=/etc/thanos/objectstorage.yaml

  • store - run as the store component

  • --data-dir - directory for locally cached data

  • --grpc-address - listen address (IP and port) of the gRPC service

  • --http-address - listen address (IP and port) of the HTTP service

  • --objstore.config-file - path to the object storage configuration file

On k8s you can define a StorageClass and rely on dynamic provisioning to create the PVC that backs the cache directory:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: thanos-data-db
    provisioner: fuseim.pri/ifs
    parameters:
      archiveOnDelete: "false"

Run the store replicas with a StatefulSet controller:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: thanos-store
      namespace: thanos
      labels:
        app.kubernetes.io/name: thanos-store
    spec:
      replicas: 2
      selector:
        matchLabels:
          app.kubernetes.io/name: thanos-store
      serviceName: thanos-store
      podManagementPolicy: Parallel
      template:
        metadata:
          labels:
            app.kubernetes.io/name: thanos-store
        spec:
          containers:
          - args:
            - store
            - --log.level=debug
            - --data-dir=/var/thanos/store
            - --grpc-address=0.0.0.0:10901
            - --http-address=0.0.0.0:10902
            - --objstore.config-file=/etc/thanos/objectstorage.yaml
            #- --experimental.enable-index-header
            image: registry-dev.uihcloud.cn/library/thanos/thanos:v0.21.1
            livenessProbe:
              failureThreshold: 8
              httpGet:
                path: /-/healthy
                port: 10902
                scheme: HTTP
              periodSeconds: 30
            name: thanos-store
            ports:
            - containerPort: 10901
              name: grpc
            - containerPort: 10902
              name: http
            readinessProbe:
              failureThreshold: 20
              httpGet:
                path: /-/ready
                port: 10902
                scheme: HTTP
              periodSeconds: 5
            terminationMessagePolicy: FallbackToLogsOnError
            volumeMounts:
            - mountPath: /var/thanos/store
              name: data
              readOnly: false
            - name: thanos-objectstorage
              subPath: objectstorage.yaml
              mountPath: /etc/thanos/objectstorage.yaml
          terminationGracePeriodSeconds: 120
          volumes:
          - name: thanos-objectstorage
            secret:
              secretName: thanos-objectstorage
      volumeClaimTemplates:
      - metadata:
          labels:
            app.kubernetes.io/name: thanos-store
          name: data
        spec:
          storageClassName: thanos-data-db
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 20Gi

Also define a Service so that query can reach store from inside the cluster:

    apiVersion: v1
    kind: Service
    metadata:
      name: thanos-store
      namespace: thanos
      labels:
        app.kubernetes.io/name: thanos-store
    spec:
      clusterIP: None
      ports:
      - name: grpc
        port: 10901
        targetPort: 10901
      - name: http
        port: 10902
        targetPort: 10902
      selector:
        app.kubernetes.io/name: thanos-store

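Receive

Before looking at how receive works, we need to understand two concepts: remote_write and tenants.

remote_write

Through its remote write mechanism, Prometheus pushes the metric data it collects to an external HTTP endpoint, much like a webhook. The endpoint address is added to the Prometheus configuration file, and here that address is the HTTP interface exposed by receive.

    remote_write:
      - url: http://10.6.118.123:32291/api/v1/receive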

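Tenants

receive aggregates the metrics uploaded by many Prometheus instances (or clusters), and each Prometheus instance (or cluster) is treated as one tenant.

Because receive collects metric data from multiple tenants, it has to be able to scale out as a cluster. The hashring configuration file plays the key role in defining a receive cluster. Let's walk through a sample hashring file: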

    [
      {
        "hashring": "default",
        "endpoints": [
          "thanos-receive-0.thanos-receive.thanos.svc.cluster.local:10901",
          "thanos-receive-1.thanos-receive.thanos.svc.cluster.local:10901"
        ]
      },
      {
        "hashring": "hashring-0",
        "endpoints": [
          "thanos-receive-2.thanos-receive.thanos.svc.cluster.local:10901"
        ],
        "tenants": [
          "tenant-a"
        ]
      },
      {
        "hashring": "hashring-1",
        "endpoints": [
          "thanos-receive-3.thanos-receive.thanos.svc.cluster.local:10901"
        ],
        "tenants": [
          "tenant-b"
        ]
      }
    ]

This JSON file describes a receive cluster of four replicas: thanos-receive-0, thanos-receive-1, thanos-receive-2 and thanos-receive-3. Metrics for tenant-a are collected by thanos-receive-2, metrics for tenant-b by thanos-receive-3, and all other tenants are handled by thanos-receive-0 and thanos-receive-1.
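To make the tenant-to-hashring mapping concrete, here is a minimal Python sketch of the lookup as this article describes it. It is an illustration, not Thanos's implementation: the function name select_hashring and the rule "an explicit tenant match wins, otherwise fall back to a hashring without a tenants list" are assumptions drawn from the description above.

```python
import json

# The sample hashring file from the article, condensed.
HASHRINGS_JSON = """
[
  {"hashring": "default",
   "endpoints": ["thanos-receive-0.thanos-receive.thanos.svc.cluster.local:10901",
                 "thanos-receive-1.thanos-receive.thanos.svc.cluster.local:10901"]},
  {"hashring": "hashring-0",
   "endpoints": ["thanos-receive-2.thanos-receive.thanos.svc.cluster.local:10901"],
   "tenants": ["tenant-a"]},
  {"hashring": "hashring-1",
   "endpoints": ["thanos-receive-3.thanos-receive.thanos.svc.cluster.local:10901"],
   "tenants": ["tenant-b"]}
]
"""


def select_hashring(hashrings, tenant):
    """Pick the hashring for a tenant, following the article's description.

    Assumption: a hashring that lists the tenant wins; otherwise the first
    hashring without a "tenants" list acts as the catch-all.
    """
    for ring in hashrings:
        if tenant in ring.get("tenants", []):
            return ring
    for ring in hashrings:
        if not ring.get("tenants"):
            return ring
    raise LookupError(f"no hashring accepts tenant {tenant!r}")


if __name__ == "__main__":
    rings = json.loads(HASHRINGS_JSON)
    for tenant in ("tenant-a", "tenant-b", "tenant-c"):
        ring = select_hashring(rings, tenant)
        print(tenant, "->", ring["hashring"], ring["endpoints"])
```

Thanos itself may evaluate the entries in file order, so it is worth checking the receive documentation for your version before relying on a tenant-less hashring that appears first in the file.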


Before starting receive, this configuration can be loaded into Kubernetes as a ConfigMap.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: thanos-receive-hashrings
      namespace: thanos
    data:
      thanos-receive-hashrings.json: |
        [
          {
            "hashring": "default",
            "endpoints": [
              "thanos-receive-0.thanos-receive.thanos.svc.cluster.local:10901",
              "thanos-receive-1.thanos-receive.thanos.svc.cluster.local:10901"
            ]
          },
          {
            "hashring": "hashring-0",
            "endpoints": [
              "thanos-receive-2.thanos-receive.thanos.svc.cluster.local:10901"
            ],
            "tenants": [
              "tenant-a"
            ]
          },
          {
            "hashring": "hashring-1",
            "endpoints": [
              "thanos-receive-3.thanos-receive.thanos.svc.cluster.local:10901"
            ],
            "tenants": [
              "tenant-b"
            ]
          }
        ]

Now let's look at the flags needed to start receive:

    receive
    --receive.replication-factor=1
    --grpc-address=0.0.0.0:10901
    --http-address=0.0.0.0:10902
    --remote-write.address=0.0.0.0:19291
    --objstore.config-file=/etc/thanos/objectstorage.yaml
    --tsdb.path=/var/thanos/receive
    --tsdb.retention=12h
    --label=receive_replica="$(NAME)"
    --label=receive="true"
    --receive.hashrings-file=/etc/thanos/thanos-receive-hashrings.json
    --receive.local-endpoint="$(NAME).thanos-receive.thanos.svc.cluster.local:10901"

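What each flag means:

  • receive - run as the receive component

  • --receive.replication-factor - number of copies kept of the incoming metrics; a value greater than 1 stores the same data on several receive instances

  • --grpc-address - listen address (IP and port) of the gRPC service

  • --http-address - listen address (IP and port) of the HTTP service

  • --remote-write.address - listen address (IP and port) of the remote_write endpoint

  • --objstore.config-file - path to the object storage configuration file

  • --tsdb.path - directory of the local TSDB where received data is held temporarily

  • --tsdb.retention - how long data is retained in the local TSDB before it is cleaned up

  • --label=receive_replica - label added to the data handled by the current replica

  • --receive.hashrings-file - path to the hashring (cluster) configuration file

  • --receive.local-endpoint - address of the current replica as it appears in the hashring file, which is how the replica recognizes itself within the cluster.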

As before, we run the receive replicas with a StatefulSet controller.

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      labels:
        app: thanos-receive
        tenant: default-tenant
        controller.receive.thanos.io: thanos-receive-controller
        controller.receive.thanos.io/hashring: default
        part-of: thanos
      name: thanos-receive
      namespace: thanos
    spec:
      replicas: 4
      selector:
        matchLabels:
          app: thanos-receive
          tenant: default-tenant
          controller.receive.thanos.io: thanos-receive-controller
          controller.receive.thanos.io/hashring: default
          part-of: thanos
      serviceName: thanos-receive
      template:
        metadata:
          labels:
            app: thanos-receive
            tenant: default-tenant
            controller.receive.thanos.io: thanos-receive-controller
            controller.receive.thanos.io/hashring: default
            part-of: thanos
        spec:
          affinity: {}
          containers:
          - args:
            - receive
            - --receive.replication-factor=1
            - --objstore.config=$(OBJSTORE_CONFIG)
            - --tsdb.path=/var/thanos/receive
            - --label=receive_replica="$(NAME)"
            - --receive.local-endpoint=$(NAME).thanos-receive.$(NAMESPACE).svc.cluster.local:10901
            - --tsdb.retention=15d
            - --receive.hashrings-file=/etc/thanos/thanos-receive-hashrings.json
            env:
            - name: NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: OBJSTORE_CONFIG
              valueFrom:
                secretKeyRef:
                  key: objectstorage.yaml
                  name: thanos-objectstorage
            image: registry-dev.uihcloud.cn/library/thanos/thanos:v0.22.0
            livenessProbe:
              failureThreshold: 8
              httpGet:
                path: /-/healthy
                port: 10902
                scheme: HTTP
              periodSeconds: 30
            name: thanos-receive
            ports:
            - containerPort: 10901
              name: grpc
            - containerPort: 10902
              name: http
            - containerPort: 19291
              name: remote-write
            readinessProbe:
              failureThreshold: 20
              httpGet:
                path: /-/ready
                port: 10902
                scheme: HTTP
              periodSeconds: 5
            terminationMessagePolicy: FallbackToLogsOnError
            volumeMounts:
            - mountPath: /var/thanos/receive
              name: data
              readOnly: false
            - mountPath: /etc/thanos/thanos-receive-hashrings.json
              name: thanos-receive-hashrings
              subPath: thanos-receive-hashrings.json
          terminationGracePeriodSeconds: 900
          # Volume backing the hashring mount above; it references the ConfigMap defined earlier.
          volumes:
          - name: thanos-receive-hashrings
            configMap:
              name: thanos-receive-hashrings
      volumeClaimTemplates:
      - metadata:
          labels:
            app.kubernetes.io/name: thanos-receive
          name: data
        spec:
          storageClassName: thanos-receiver-data-db
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 100Gi
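Each replica announces itself through --receive.local-endpoint, so the pod DNS names produced by the StatefulSet and its headless Service have to line up with the endpoints listed in the hashring file. The following Python sketch is one way to check that before deploying. The helper names expected_endpoints and check_hashrings are hypothetical, and the naming pattern is the standard Kubernetes one for StatefulSet pods behind a headless Service.

```python
# Endpoints from the article's sample hashring file.
HASHRINGS = [
    {"hashring": "default", "endpoints": [
        "thanos-receive-0.thanos-receive.thanos.svc.cluster.local:10901",
        "thanos-receive-1.thanos-receive.thanos.svc.cluster.local:10901"]},
    {"hashring": "hashring-0", "tenants": ["tenant-a"], "endpoints": [
        "thanos-receive-2.thanos-receive.thanos.svc.cluster.local:10901"]},
    {"hashring": "hashring-1", "tenants": ["tenant-b"], "endpoints": [
        "thanos-receive-3.thanos-receive.thanos.svc.cluster.local:10901"]},
]


def expected_endpoints(statefulset, service, namespace, replicas, port=10901):
    """Build the per-pod endpoints: <statefulset>-<i>.<service>.<namespace>.svc.cluster.local:<port>."""
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.cluster.local:{port}"
        for i in range(replicas)
    ]


def check_hashrings(hashrings, endpoints):
    """Return (missing, unknown).

    missing: pod endpoints that appear in no hashring.
    unknown: hashring endpoints that no pod would announce.
    """
    configured = {ep for ring in hashrings for ep in ring["endpoints"]}
    missing = sorted(set(endpoints) - configured)
    unknown = sorted(configured - set(endpoints))
    return missing, unknown


if __name__ == "__main__":
    pods = expected_endpoints("thanos-receive", "thanos-receive", "thanos", replicas=4)
    missing, unknown = check_hashrings(HASHRINGS, pods)
    print("missing from hashrings:", missing)
    print("unknown in hashrings:", unknown)
```

With replicas set to 4, as in the StatefulSet, both lists come back empty; changing the replica count without updating the hashring file shows up as a non-empty list.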

Define a Service for access from inside the cluster:

    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: thanos-receive
        tenant: default-tenant
        controller.receive.thanos.io/hashring: default
        part-of: thanos
      name: thanos-receive
      namespace: thanos
    spec:
      clusterIP: None
      ports:
      - name: grpc
        port: 10901
        protocol: TCP
        targetPort: 10901
      - name: http
        port: 10902
        protocol: TCP
        targetPort: 10902
      - name: remote-write
        port: 19291
        targetPort: 19291
        protocol: TCP
      selector:
        app: thanos-receive
        tenant: default-tenant
        controller.receive.thanos.io: thanos-receive-controller
        controller.receive.thanos.io/hashring: default
        part-of: thanos

If receive also has to be reachable from outside the cluster (for example, the upstream Prometheus runs in another cluster or on a different LAN), define a Service that exposes a port externally:

    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: thanos-receive
        tenant: default-tenant
        controller.receive.thanos.io/hashring: default
        part-of: thanos
      name: thanos-receive-node
      namespace: thanos
    spec:
      type: NodePort
      ports:
      - name: grpc
        port: 10901
        protocol: TCP
        targetPort: 10901
      - name: http
        port: 10902
        protocol: TCP
        targetPort: 10902
      - name: remote-write
        port: 19291
        targetPort: 19291
        protocol: TCP
        nodePort: 32291
      selector:
        app: thanos-receive
        tenant: default-tenant
        controller.receive.thanos.io: thanos-receive-controller
        controller.receive.thanos.io/hashring: default
        part-of: thanos


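This gives us a receive component that query can reach and that is also accessible from outside the cluster.

Load balancing for receive is handled inside the component itself. When Prometheus uploads metrics, the request reaches an arbitrary replica through the Service. That replica uses the tenant information carried with the request to decide whether it should handle the data itself; if not, it forwards the data to the responsible replica according to the hashring file.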
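The store-or-forward decision can be pictured with a small Python sketch. This is a deliberately simplified model: it hashes the tenant and the series labels with SHA-256 and takes the result modulo the number of endpoints, which is an assumption made for illustration and not the hashing Thanos actually uses. The function names owners and route are hypothetical.

```python
import hashlib


def owners(endpoints, tenant, series_labels, replication_factor=1):
    """Pick the replicas responsible for one series (simplified model).

    Hash tenant + sorted labels to a starting position on the ring, then
    take the following endpoints for any additional copies.
    """
    if not 1 <= replication_factor <= len(endpoints):
        raise ValueError("replication factor must be between 1 and the ring size")
    label_part = ",".join(f"{k}={v}" for k, v in sorted(series_labels.items()))
    key = f"{tenant}|{label_part}"
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(endpoints)
    return [endpoints[(start + i) % len(endpoints)] for i in range(replication_factor)]


def route(local_endpoint, endpoints, tenant, series_labels, replication_factor=1):
    """Return (store_locally, forward_to) for the replica that received the request."""
    targets = owners(endpoints, tenant, series_labels, replication_factor)
    store_locally = local_endpoint in targets
    forward_to = [ep for ep in targets if ep != local_endpoint]
    return store_locally, forward_to


if __name__ == "__main__":
    ring = [
        "thanos-receive-0.thanos-receive.thanos.svc.cluster.local:10901",
        "thanos-receive-1.thanos-receive.thanos.svc.cluster.local:10901",
    ]
    labels = {"__name__": "up", "job": "node"}
    for local in ring:
        print(local, "->", route(local, ring, "tenant-c", labels))
```

Whichever replica the Service happens to pick, the same series always maps to the same owner, so exactly one replica of the ring stores it when the replication factor is 1.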
