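Adding custom scrape rules to Prometheus Operator

Customizing scrape rules in the chart values file before deployment

In recent versions of the prometheus-operator chart, adding a custom scrape rule means editing the additionalScrapeConfigs section of the chart's values.yaml file.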

# Excerpt from the prometheus-operator chart
## AdditionalScrapeConfigs allows specifying additional Prometheus scrape configurations. Scrape configurations
## are appended to the configurations generated by the Prometheus Operator. Job configurations must have the form
## as specified in the official Prometheus documentation:
## https://prometheus.io/docs/prometheus/latest/configuration/configuration/#<scrape_config>. As scrape configs are
## appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility
## to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible
## scrape configs are going to break Prometheus after the upgrade.
##
## The scrape configuration example below will find master nodes, provided they have the name .*mst.*, relabel the
## port to 2379 and allow etcd scraping provided it is running on all Kubernetes master nodes
##
additionalScrapeConfigs:
- job_name: kubernetes-cadvisor
  kubernetes_sd_configs:
  - role: node
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
  metric_relabel_configs:
    - action: replace
      source_labels: [id]
      regex: '^/machine\.slice/machine-rkt\\x2d([^\\]+)\\.+/([^/]+)\.service$'
      target_label: rkt_container_name
      replacement: '${2}-${1}'
    - action: replace
      source_labels: [id]
      regex: '^/system\.slice/(.+)\.service$'
      target_label: systemd_service_name
      replacement: '${1}'
# - job_name: kube-etcd
#   kubernetes_sd_configs:
#     - role: node
#   scheme: https
#   tls_config:
#     ca_file: /etc/prometheus/secrets/etcd-client-cert/etcd-ca
#     cert_file: /etc/prometheus/secrets/etcd-client-cert/etcd-client
#     key_file: /etc/prometheus/secrets/etcd-client-cert/etcd-client-key
#   relabel_configs:
#   - action: labelmap
#     regex: __meta_kubernetes_node_label_(.+)
#   - source_labels: [__address__]
#     action: replace
#     target_label: __address__
#     regex: ([^:;]+):(\d+)
#     replacement: ${1}:2379
#   - source_labels: [__meta_kubernetes_node_name]
#     action: keep
#     regex: .*mst.*
#   - source_labels: [__meta_kubernetes_node_name]
#     action: replace
#     target_label: node
#     regex: (.*)
#     replacement: ${1}
#   metric_relabel_configs:
#   - regex: (kubernetes_io_hostname|failure_domain_beta_kubernetes_io_region|beta_kubernetes_io_os|beta_kubernetes_io_arch|beta_kubernetes_io_instance_type|failure_domain_beta_kubernetes_io_zone)
#     action: labeldrop

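Customizing scrape rules by editing the Secret after deployment

To change the custom scrape rules of a chart that is already deployed, one option is to edit the chart's values.yaml file and then run helm upgrade. Alternatively, we can directly edit the contents of the Secret that Prometheus is bound to.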

[root@ test]# kubectl get secret -n default
NAME                                                  TYPE     DATA   AGE
alertmanager-monit-prometheus-operator-alertmanager   Opaque   1      3d
monit-prometheus-operator-prometheus-scrape-confg     Opaque   1      3d

# monit-prometheus-operator-prometheus-scrape-confg is the Secret we need to edit
# Its contents look like this:
apiVersion: v1
data:
  additional-scrape-configs.yaml: ICAgIC0gam9iX25hbWU6IGt1YmVybmV0ZXMtY2Fkdmlzb3IKICAgICAga3ViZXJuZXRlc19zZF9jb25maWdzOgogICAgICAtIHJvbGU6IG5vZGUKICAgICAgc2NoZW1lOiBodHRwcwogICAgICB0bHNfY29uZmlnOgogICAgICAgIGNhX2ZpbGU6IC92YXIvcnVuL3NlY3JldHMva3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9jYS5jcnQKICAgICAgYmVhcmVyX3Rva2VuX2ZpbGU6IC92YXIvcnVuL3NlY3JldHMva3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC90b2tlbgogICAgICByZWxhYmVsX2NvbmZpZ3M6CiAgICAgIC0gYWN0aW9uOiBsYWJlbG1hcAogICAgICAgIHJlZ2V4OiBfX21ldGFfa3ViZXJuZXRlc19ub2RlX2xhYmVsXyguKykKICAgICAgLSB0YXJnZXRfbGFiZWw6IF9fYWRkcmVzc19fCiAgICAgICAgcmVwbGFjZW1lbnQ6IGt1YmVybmV0ZXMuZGVmYXVsdC5zdmM6NDQzCiAgICAgIC0gc291cmNlX2xhYmVsczogW19fbWV0YV9rdWJlcm5ldGVzX25vZGVfbmFtZV0KICAgICAgICByZWdleDogKC4rKQogICAgICAgIHRhcmdldF9sYWJlbDogX19tZXRyaWNzX3BhdGhfXwogICAgICAgIHJlcGxhY2VtZW50OiAvYXBpL3YxL25vZGVzLyR7MX0vcHJveHkvbWV0cmljcy9jYWR2aXNvcgogICAgICBtZXRyaWNfcmVsYWJlbF9jb25maWdzOgogICAgICAgIC0gYWN0aW9uOiByZXBsYWNlCiAgICAgICAgICBzb3VyY2VfbGFiZWxzOiBbaWRdCiAgICAgICAgICByZWdleDogJ14vbWFjaGluZVwuc2xpY2UvbWFjaGluZS1ya3RcXHgyZChbXlxcXSspXFwuKy8oW14vXSspXC5zZXJ2aWNlJCcKICAgICAgICAgIHRhcmdldF9sYWJlbDogcmt0X2NvbnRhaW5lcl9uYW1lCiAgICAgICAgICByZXBsYWNlbWVudDogJyR7Mn0tJHsxfScKICAgICAgICAtIGFjdGlvbjogcmVwbGFjZQogICAgICAgICAgc291cmNlX2xhYmVsczogW2lkXQogICAgICAgICAgcmVnZXg6ICdeL3N5c3RlbVwuc2xpY2UvKC4rKVwuc2VydmljZSQnCiAgICAgICAgICB0YXJnZXRfbGFiZWw6IHN5c3RlbWRfc2VydmljZV9uYW1lCiAgICAgICAgICByZXBsYWNlbWVudDogJyR7MX0n
kind: Secret
metadata:
  labels:
    app: prometheus-operator-prometheus-scrape-confg
    chart: prometheus-operator-5.2.0
    heritage: Tiller
    release: monit
  name: monit-prometheus-operator-prometheus-scrape-confg
  namespace: default
type: Opaque

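## Take the base64-encoded string after additional-scrape-configs.yaml: and decode it to recover our custom scrape rules
## Likewise, to change them, base64-encode the plain-text scrape rules and replace the value after additional-scrape-configs.yaml: with the result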
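
The decode, edit, and re-encode cycle can be sketched in shell. This is only a minimal sketch, assuming a POSIX shell with the base64 utility available; the example-job scrape config is a made-up placeholder, and the kubectl commands in the trailing comments are one possible way to read and write the Secret, shown for orientation and not executed here.

```shell
#!/bin/sh
# Hypothetical plain-text scrape rules, used only for illustration.
rules=$(cat <<'EOF'
- job_name: example-job
  static_configs:
  - targets: ['localhost:9100']
EOF
)

# Encode the rules; strip newlines so the value fits on one line
# of the Secret's data field.
encoded=$(printf '%s\n' "$rules" | base64 | tr -d '\n')

# Decode again to confirm the round trip returns the original rules.
decoded=$(printf '%s' "$encoded" | base64 --decode)
printf '%s\n' "$decoded"

# Against a live cluster (Secret name and namespace as in the listing above):
#   kubectl get secret monit-prometheus-operator-prometheus-scrape-confg -n default \
#     -o jsonpath='{.data.additional-scrape-configs\.yaml}' | base64 --decode
#   kubectl patch secret monit-prometheus-operator-prometheus-scrape-confg -n default \
#     --type merge -p "{\"data\":{\"additional-scrape-configs.yaml\":\"$encoded\"}}"
```

Note that base64 is an encoding, not encryption, so anyone who can read the Secret can recover the rules.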

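Customizing scrape rules via the CRD after deployment

Once it has been deployed, prometheus-operator automatically creates the servicemonitors CRD.
We can therefore also define custom scrape rules by creating ServiceMonitor resources.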

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    application: c2-ams
    release: monit  # label that prometheus-operator uses to pick up this ServiceMonitor
  name: c2-ams-monitor
  namespace: default
spec:
  endpoints:
  - path: /metrics  # metrics path of c2-ams
    port: port-8080  # name of the c2-ams Service port to scrape
  selector:
    matchLabels:
      application: c2-ams  # label identifying the application to scrape
  namespaceSelector:
    matchNames:
    - default  # namespace in which the application runs
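Customizing scrape rules via Service annotations after deployment

Scrape rules can also be customized by creating a job that automatically discovers Services from their annotations.

For this automatic discovery to work, first create the following job in Prometheus: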

# Job that discovers scrape targets automatically from Service annotations
- job_name: 'kubernetes-app-metrics'
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # Keep only endpoints whose Service carries both the prometheus.io/scrape: 'true'
  # and prometheus.io/app-metrics: 'true' annotations
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_service_annotation_prometheus_io_app_metrics]
    regex: true;true
    action: keep
  # Replace the default metrics_path with the one the user specified for the process
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_app_metrics_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  # Combine the pod IP with the user-specified metrics port into the address that
  # actually serves the data, and use it to replace the original __address__
  - source_labels: [__meta_kubernetes_pod_ip, __meta_kubernetes_service_annotation_prometheus_io_app_metrics_port]
    action: replace
    target_label: __address__
    regex: (.+);(.+)
    replacement: $1:$2
  # Strip the prefix __meta_kubernetes_service_annotation_prometheus_io_app_info_ from label names
  - action: labelmap
    regex: __meta_kubernetes_service_annotation_prometheus_io_app_info_(.+)

Then add the following annotations to each Service that should be scraped:

apiVersion: v1
kind: Service
metadata:
  labels:
    application: 'c2-apigateway-postgres-export'
    env: 'c2'
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/app-metrics: 'true'
    prometheus.io/app-metrics-port: '9187'
    prometheus.io/app-metrics-path: '/metrics'
    prometheus.io/app-info-namespace: 'default'
    prometheus.io/app-info-name: 'c2-apigateway-postgres'
    prometheus.io/app-info-env: 'c2'
    prometheus.io/app-info-code: 'c2-apigateway-postgres'
  name: 'c2-apigateway-postgres-export-svc'
  namespace: 'default'
spec:
  ports:
  - name: apigateway-port-9187
    port: 9187
    protocol: TCP
    targetPort: 9187
  selector:
    application: 'c2-apigateway-postgres-export'
  type: ClusterIP
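
To see how the annotations on this Service line up with the relabel rules of the discovery job, here is a rough shell imitation of what happens to them. It is not Prometheus code. It assumes the documented Prometheus behaviour that unsupported characters in an annotation name become underscores in the __meta_kubernetes_service_annotation_* label, and that source_labels values are joined with ';' before the regex is applied. The pod IP 10.0.0.5 and the helper name annotation_to_label are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: derive the meta label name for a Service annotation
# by replacing every character outside [a-zA-Z0-9_] with an underscore.
annotation_to_label() {
  printf '__meta_kubernetes_service_annotation_%s\n' \
    "$(printf '%s' "$1" | sed 's/[^a-zA-Z0-9_]/_/g')"
}

port_label=$(annotation_to_label 'prometheus.io/app-metrics-port')
echo "$port_label"

# Imitate the __address__ rule: join the source label values with ';',
# then apply (.+);(.+) with replacement $1:$2.
pod_ip='10.0.0.5'    # made-up pod IP
metrics_port='9187'  # value of prometheus.io/app-metrics-port
address=$(printf '%s;%s' "$pod_ip" "$metrics_port" | sed -E 's/^(.+);(.+)$/\1:\2/')
echo "$address"

# Imitate the labelmap rule: strip the app_info_ prefix to get the final label name.
info_label=$(annotation_to_label 'prometheus.io/app-info-name' \
  | sed -E 's/^__meta_kubernetes_service_annotation_prometheus_io_app_info_(.+)$/\1/')
echo "$info_label"
```

Under these assumptions, the four prometheus.io/app-info-* annotations on the Service would end up as the target labels namespace, name, env and code.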