Per-path latency for Kubernetes Istio services in Grafana

Question

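I am running Istio in an AWS EKS cluster. I use the pre-installed Prometheus and Grafana to monitor pods, the Istio mesh, and Istio service workloads.

I have three services running in three different namespaces: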

Service 1: service1.namespace1.svc.cluster.local
Service 2: service2.namespace2.svc.cluster.local
Service 3: service3.namespace3.svc.cluster.local

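I can find the latency of each service endpoint in Grafana's Istio Service Dashboard. However, it only shows latency for the service endpoint as a whole, not per path prefix. The overall endpoint latency looks fine, but I want to check how much time each path within the service endpoint takes.

Say the P50 latency of service1.namespace1.svc.cluster.local is 2.91 ms, but I also want to check the latency of each path. It has four paths: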

service1.namespace1.svc.cluster.local/login => Login path, P50 latency = ?
service1.namespace1.svc.cluster.local/signup => Signup path, P50 latency = ?
service1.namespace1.svc.cluster.local/auth => Auth path, P50 latency = ?
service1.namespace1.svc.cluster.local/list => List path, P50 latency = ?

I am not sure whether this is possible with the Prometheus and Grafana stack. What is the recommended way to achieve it?

istioctl version --remote

client version: 1.5.1
internal-popcraftio-ingressgateway version: 
citadel version: 1.4.3
galley version: 1.4.3
ingressgateway version: 1.5.1
pilot version: 1.4.3
policy version: 1.4.3
sidecar-injector version: 1.4.3
telemetry version: 1.4.3
pilot version: 1.5.1
office-popcraftio-ingressgateway version: 
data plane version: 1.4.3 (83 proxies), 1.5.1 (4 proxies)

Answer 1

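As far as I know, this is not something Istio metrics can provide. However, you should look at the metrics your server framework exposes, if any, so it depends on the application (framework). For example, see Spring Boot (https://docs.spring.io/spring-metrics/docs/current/public/prometheus) or Vert.x (https://vertx.io/docs/vertx-micrometer-metrics/java/#_http_server).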

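One thing to watch out for with metrics based on HTTP paths is that, used carelessly, they can make metric cardinality explode. Suppose some of your paths contain unbounded dynamic values (e.g. /object/123456, where 123456 is an ID). If the path is stored as a Prometheus label, Prometheus will create a separate time series for every ID. That can cause performance problems in Prometheus and may cause your application to run out of memory.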

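I think that is a good reason for Istio not to provide path-based metrics. Frameworks, on the other hand, can know enough to report metrics by path template rather than by actual path (e.g. /object/$ID instead of /object/123456), which solves the cardinality problem.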
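To make the path-template idea concrete, here is a minimal sketch in plain Python. It is not what Istio or any particular framework does. The rule "every purely numeric segment is an ID", the helper `path_template`, and the class `PathLatencies` are all assumptions made up for illustration. A real framework would normally take the template from its router instead of guessing with a regex.

```python
import re
from collections import defaultdict
from statistics import median

# Assumption for illustration: any purely numeric path segment is an ID.
_NUMERIC_SEGMENT = re.compile(r"/\d+(?=/|$)")


def path_template(path: str) -> str:
    """Collapse numeric segments, so /object/123456 becomes /object/$ID."""
    return _NUMERIC_SEGMENT.sub("/$ID", path)


class PathLatencies:
    """Hypothetical in-memory recorder keyed by path template."""

    def __init__(self) -> None:
        self._samples = defaultdict(list)

    def observe(self, path: str, millis: float) -> None:
        # The label is the template, so the number of distinct labels
        # stays bounded no matter how many IDs are requested.
        self._samples[path_template(path)].append(millis)

    def p50(self, template: str) -> float:
        # Median of the raw samples; raises StatisticsError if none exist.
        return median(self._samples.get(template, []))

    def label_count(self) -> int:
        return len(self._samples)


if __name__ == "__main__":
    recorder = PathLatencies()
    recorder.observe("/login", 2.5)
    recorder.observe("/object/1", 2.0)
    recorder.observe("/object/2", 4.0)
    print(recorder.label_count(), recorder.p50("/object/$ID"))
```

In a real service you would probably not keep raw samples like this. Instead you would pass the template as a label value to a histogram from your metrics library and let Prometheus compute the quantile. The part that matters for cardinality is only that the label is the template, not the raw path.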
