I have integrated Grafana Loki and Tempo into an EKS cluster. I can query logs from Loki, but the internal trace URLs pointing to Tempo are not being populated in Loki. I deployed Fluent Bit, which pushes logs to Loki.
I access Grafana through an Ingress controller, and an otel-collector is deployed to collect traces from the cluster. Can anyone advise? I have been stuck on this for days.
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Tempo
        type: tempo
        access: browser
        orgId: 1
        uid: tempo
        url: http://tempo:3100
        isDefault: true
        editable: true
      - name: Loki
        type: loki
        access: browser
        orgId: 1
        uid: loki
        url: http://loki:3100
        isDefault: false
        editable: true
        jsonData:
          derivedFields:
            - datasourceName: Tempo
              matcherRegex: "traceID=(\\w+)"
              name: TraceID
              url: "$${__value.raw}"
              datasourceUid: tempo
env:
  JAEGER_AGENT_PORT: 6831
adminUser: admin
adminPassword: password
service:
  type: ClusterIP
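A derived field only fires when the log line actually contains text matching `matcherRegex`. The sketch below (the logger name and trace ID are hypothetical; in a real service the ID would come from the active OpenTelemetry span context) shows the line format that `traceID=(\w+)` expects, and verifies the same regex Grafana applies:

```python
import logging
import re

# The same regex Grafana applies in the derived-field config above.
TRACE_ID_PATTERN = re.compile(r"traceID=(\w+)")

# Hypothetical helper: append the trace ID to each log line so the
# derived field can extract it from Loki.
def log_with_trace_id(logger: logging.Logger, message: str, trace_id: str) -> None:
    logger.info("%s traceID=%s", message, trace_id)

logging.basicConfig(format="%(message)s", level=logging.INFO)
logger = logging.getLogger("demo")
log_with_trace_id(logger, "handling request", "4bf92f3577b34da6a3ce929d0e0e4736")

# Verify the regex would extract the ID from such a line:
line = "handling request traceID=4bf92f3577b34da6a3ce929d0e0e4736"
match = TRACE_ID_PATTERN.search(line)
print(match.group(1))  # -> 4bf92f3577b34da6a3ce929d0e0e4736
```

If your application logs the trace ID under a different key (for example `trace_id=` or a JSON field), the `matcherRegex` has to be adjusted to match that exact format — otherwise no TraceID link is rendered.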
tempo:
  extraArgs:
    "distributor.log-received-traces": true
  receivers:
    zipkin:
    otlp:
      protocols:
        http:
        grpc:
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf
  namespace: monitoring-loki
  labels:
    app: opentelemetry
    component: otel-collector-conf
data:
  otel-collector-config: |
    receivers:
      zipkin:
        endpoint: 0.0.0.0:9411
    exporters:
      otlp:
        endpoint: tempo.tracing.svc.cluster.local:55680
        insecure: true
    service:
      pipelines:
        traces:
          receivers: [zipkin]
          exporters: [otlp]
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  namespace: monitoring-loki
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  ports:
    - name: otlp # Default endpoint for OpenTelemetry receiver.
      port: 55680
      protocol: TCP
      targetPort: 55680
    - name: jaeger-grpc # Default endpoint for Jaeger gRPC receiver.
      port: 14250
    - name: jaeger-thrift-http # Default endpoint for Jaeger HTTP receiver.
      port: 14268
    - name: zipkin # Default endpoint for Zipkin receiver.
      port: 9411
    - name: metrics # Default endpoint for querying metrics.
      port: 8888
    - name: prometheus # Prometheus exporter.
      port: 8889
  selector:
    component: otel-collector
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  namespace: monitoring-loki
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-collector
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-collector
    spec:
      containers:
        - command:
            - "/otelcontribcol"
            - "--config=/conf/otel-collector-config.yaml"
            # Memory ballast size should be max 1/3 to 1/2 of memory.
            - "--mem-ballast-size-mib=683"
            - "--log-level=DEBUG"
          image: otel/opentelemetry-collector-contrib:0.29.0
          name: otel-collector
          ports:
            - containerPort: 55679 # Default endpoint for ZPages.
            - containerPort: 55680 # Default endpoint for OpenTelemetry receiver.
            - containerPort: 14250 # Default endpoint for Jaeger gRPC receiver.
            - containerPort: 14268 # Default endpoint for Jaeger HTTP receiver.
            - containerPort: 9411 # Default endpoint for Zipkin receiver.
            - containerPort: 8888 # Default endpoint for querying metrics.
            - containerPort: 8889 # Prometheus exporter.
          volumeMounts:
            - name: otel-collector-config-vol
              mountPath: /conf
          # livenessProbe:
          #   httpGet:
          #     path: /
          #     port: 13133 # Health Check extension default port.
          # readinessProbe:
          #   httpGet:
          #     path: /
          #     port: 13133 # Health Check extension default port.
      volumes:
        - configMap:
            name: otel-collector-conf
            items:
              - key: otel-collector-config
                path: otel-collector-config.yaml
          name: otel-collector-config-vol
fluent-bit:
  enabled: false
promtail:
  enabled: true
prometheus:
  enabled: false
  alertmanager:
    persistentVolume:
      enabled: true
  server:
    persistentVolume:
      enabled: false
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: monitoring-loki
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush             1
        Log_Level         info
        Daemon            off
        Parsers_File      parsers.conf

    [INPUT]
        Name              tail
        Path              /var/log/*.log
        Parser            docker
        Tag               kube.*
        Refresh_Interval  5
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On

    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off

    [OUTPUT]
        Name                    loki
        Match                   kube.*
        Host                    "loki-svc.monitoring-loki"
        tenant_id               ""
        Port                    "3100"
        label_keys              $trace_id
        auto_kubernetes_labels  on
I think something is missing from your Loki datasource:

derivedFields:
  - datasourceUid: tempo
    matcherRegex: 'traceID=(\w+)'
    name: TraceID
    url: $${__value.raw}
    urlDisplayLabel: Find this trace in Tempo
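Putting that together with the rest of your values, a full Loki datasource block might look like the sketch below. This is an assumption-laden example, not a verified fix: the uids `loki` and `tempo` are taken from your own config, and `access: proxy` is the usual choice for provisioned datasources (with `access: browser`, the URL must be reachable from the user's browser, which `http://loki:3100` is not from outside the cluster):

```yaml
datasources:
  - name: Loki
    type: loki
    access: proxy            # server-side access; "browser" resolves the URL from the client
    orgId: 1
    uid: loki
    url: http://loki:3100
    jsonData:
      derivedFields:
        - datasourceUid: tempo           # internal link, resolved against the Tempo datasource uid
          matcherRegex: 'traceID=(\w+)'  # single quotes: \w needs no double escaping here
          name: TraceID
          url: '$${__value.raw}'         # $$ escapes Grafana's env-var interpolation in Helm values
          urlDisplayLabel: Find this trace in Tempo
```

When `datasourceUid` is set, the `url` is treated as the query sent to that datasource (the raw trace ID), not as an external hyperlink — which is what makes the link open the trace inside Tempo.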