Tracing Ingress-Nginx, a Cloud-Native Gateway, in Practice: OpenTelemetry Collection and Guance (观测云) Integration

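    Background

    In a large distributed system, services call each other in complex ways, and distributed tracing helps map how each request flows through them. Modern systems also need real-time monitoring to respond quickly to incidents and failures. Together, these reveal bottlenecks and high-load paths so that they can be optimized.

    Ingress-Nginx is an Ingress controller for Kubernetes that manages external traffic entering the cluster. It is built on Nginx and uses its reverse-proxy and load-balancing capabilities, configured and optimized for the Kubernetes architecture. Its main task is to route external requests to specific Services inside the cluster according to predefined rules, which are set through Kubernetes Ingress resources.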

    Prerequisites

    • Ingress-Nginx version >= 1.10.0
    • The application services are already instrumented with OpenTelemetry to collect trace data
    • K8s cluster version:

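    1. Deploy the sample services

    Here we deploy two Spring Boot services, where service A calls service B. This example uses Java 17 and Maven 3.9.10.

    Because the Ingress-Nginx trace has to be joined with the backend traces, the OTEL agent must be packaged into the business image when that image is built.

    The following Dockerfile packages the agent into the service's container image, giving the service the basic ability to collect trace data.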

    FROM curlimages/curl:latest AS agent-download
    USER root
    RUN curl -Lo /opentelemetry-javaagent.jar \
        https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar
    
    FROM openjdk:17-jdk-slim
    
    WORKDIR /app
    
    COPY --from=agent-download /opentelemetry-javaagent.jar /app/opentelemetry-javaagent.jar
    
    COPY target/serviceb-1.0-SNAPSHOT.jar /app/service-b.jar
    
    ENV OTEL_SERVICE_NAME="service-b" \
        OTEL_EXPORTER_OTLP_ENDPOINT="http://datakit-endpoint:4317" \
        OTEL_TRACES_SAMPLER="parentbased_always_on" \
        OTEL_PROPAGATORS="tracecontext,baggage" \
        OTEL_METRICS_EXPORTER="none" \
        OTEL_LOGS_EXPORTER="none"
    
    # Start the service with the Java agent attached
    CMD ["java", "-javaagent:/app/opentelemetry-javaagent.jar", "-jar", "/app/service-b.jar"]
    

    Create k8s-java-app.yaml to deploy the services:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: service-a
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: service-a
      template:
        metadata:
          labels:
            app: service-a
        spec:
          containers:
            - name: service-a
              image: <your-repo>/service-a:otel-1.0
              ports:
                - containerPort: 9090
              env:
              - name: HOST_IP
                valueFrom:
                  fieldRef:
                    fieldPath: status.hostIP
              - name: SPRING_MAIN_ALLOW_CIRCULAR_REFERENCES
                value: "true"
              - name: OTEL_SERVICE_NAME
                value: "service-a"
              - name: OTEL_EXPORTER
                value: "otlp"
              - name: OTEL_EXPORTER_OTLP_PROTOCOL
                value: "grpc"
              - name: OTEL_EXPORTER_OTLP_ENDPOINT
                value: "http://$(HOST_IP):4317"
              - name: OTEL_PROPAGATORS
                value: "tracecontext,baggage"
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: service-a
    spec:
      ports:
        - port: 9090
          targetPort: 9090
      selector:
        app: service-a
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: service-b
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: service-b
      template:
        metadata:
          labels:
            app: service-b
        spec:
          containers:
            - name: service-b
              image: <your-repo>/service-b:otel-1.0
              ports:
                - containerPort: 8090
              env:
              - name: HOST_IP
                valueFrom:
                  fieldRef:
                    fieldPath: status.hostIP
              - name: OTEL_SERVICE_NAME
                value: "service-b"
              - name: OTEL_EXPORTER
                value: "otlp"
              - name: OTEL_EXPORTER_OTLP_PROTOCOL
                value: "grpc"
              - name: OTEL_EXPORTER_OTLP_ENDPOINT
                value: "http://$(HOST_IP):4317"
              - name: OTEL_PROPAGATORS
                value: "tracecontext,baggage"
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: service-b
    spec:
      ports:
        - port: 8090
          targetPort: 8090
      selector:
        app: service-b
    

    2. Install Ingress Nginx

    Create an ingress-nginx.yaml file:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: ingress-nginx
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: ingress-nginx
      namespace: ingress-nginx
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: ingress-nginx
    rules:
      - apiGroups:
          - ""
        resources:
          - configmaps
          - endpoints
          - nodes
          - pods
          - secrets
          - services
        verbs:
          - list
          - watch
          - get
      - apiGroups:
          - "discovery.k8s.io"
        resources:
          - endpointslices
        verbs:
          - list
          - watch
      - apiGroups:
          - "coordination.k8s.io"
        resources:
          - leases
        verbs:
          - get
          - watch
          - list
          - create
          - update
      - apiGroups:
          - "networking.k8s.io"
        resources:
          - ingresses
          - ingressclasses
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - "networking.k8s.io"
        resources:
          - ingresses/status
        verbs:
          - update
      - apiGroups:
          - "extensions"
        resources:
          - ingresses
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - "extensions"
        resources:
          - ingresses/status
        verbs:
          - update
      - apiGroups:
          - ""
        resources:
          - events
        verbs:
          - create
          - patch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: ingress-nginx
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: ingress-nginx
    subjects:
      - kind: ServiceAccount
        name: ingress-nginx
        namespace: ingress-nginx
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: ingress-nginx-controller
      namespace: ingress-nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: ingress-nginx
      template:
        metadata:
          labels:
            app: ingress-nginx
        spec:
          hostNetwork: true
          serviceAccountName: ingress-nginx
          containers:
          - name: controller
            image: registry.k8s.io/ingress-nginx/controller:v1.10.0
            args:
            - /nginx-ingress-controller
            - --publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
            - --election-id=ingress-controller-leader
            - --controller-class=k8s.io/ingress-nginx
            - --ingress-class=nginx
            - --configmap=ingress-nginx/ingress-nginx-controller
            env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://$(HOST_IP):4317"
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            ports:
            - name: http
              containerPort: 80
            - name: https
              containerPort: 443
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: ingress-nginx-controller
      namespace: ingress-nginx
    spec:
      type: NodePort
      ports:
      - name: http
        port: 80
        targetPort: 80
      - name: https
        port: 443
        targetPort: 443
      selector:
        app: ingress-nginx
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ingress-nginx-controller
      namespace: ingress-nginx
      labels:
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/part-of: ingress-nginx
    data:
      enable-opentelemetry: "true"
      otel-sampler: AlwaysOn
      opentelemetry-operation-name: "HTTP $request_method $service_name $uri $opentelemetry_trace_id"
      opentelemetry-trust-incoming-span: "true"
      # Defaults
      # otel-service-name: "nginx"
      # otel-sampler-ratio: 0.01
    

    Apply the configuration:

    kubectl apply -f ingress-nginx.yaml
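
    The manifest above installs the controller but does not define an IngressClass or any routing rule. As a minimal sketch, the resources below would register the nginx class named by the --ingress-class and --controller-class flags and route a hypothetical host to service-a. The host demo.example.com, the resource name service-a-ingress, and the choice to expose service-a on / are assumptions for illustration and are not part of the original setup.

    ```yaml
    # IngressClass matching the controller flags in ingress-nginx.yaml
    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: nginx
    spec:
      controller: k8s.io/ingress-nginx
    ---
    # Hypothetical Ingress: routes demo.example.com to service-a on port 9090.
    # It lives in the default namespace, next to the service-a Service.
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: service-a-ingress
    spec:
      ingressClassName: nginx
      rules:
        - host: demo.example.com   # assumption: replace with your own domain
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: service-a
                    port:
                      number: 9090
    ```

    Requests sent to that host through the controller then pass through Nginx before reaching service-a, which is the hop that the following sections trace.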
    

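    3. Configure trace collection for Ingress-Nginx

    3.1 Enable the OTEL collector in DataKit

    In datakit.yaml, enable the cluster's OTEL collector by mounting its configuration from a ConfigMap (CM).

    Add the following under volumeMounts: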

            - mountPath: /usr/local/datakit/conf.d/opentelemetry/opentelemetry.conf
              name: datakit-conf
              subPath: opentelemetry.conf
    

    Add the collector to the ConfigMap:

        opentelemetry.conf: |-
            [[inputs.opentelemetry]]
              [inputs.opentelemetry.http]
               enable = true
               http_status_ok = 200
               trace_api = "/otel/v1/traces"
              [inputs.opentelemetry.grpc]
               trace_enable = true
               metric_enable = true
               addr = "0.0.0.0:4317"
    

    Re-apply datakit.yaml to restart DataKit:

    kubectl apply -f datakit.yaml
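
    For orientation, the two fragments above would sit in datakit.yaml roughly as sketched below. The ConfigMap name datakit-conf comes from the volumeMount shown earlier. The datakit namespace, the container name datakit, and the volumes stanza are assumptions based on a typical DataKit DaemonSet manifest and may differ in your file.

    ```yaml
    # ConfigMap holding the collector configuration (namespace is an assumption)
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: datakit-conf
      namespace: datakit
    data:
      opentelemetry.conf: |-
        [[inputs.opentelemetry]]
          [inputs.opentelemetry.http]
            enable = true
            http_status_ok = 200
            trace_api = "/otel/v1/traces"
          [inputs.opentelemetry.grpc]
            trace_enable = true
            metric_enable = true
            addr = "0.0.0.0:4317"
    ---
    # Excerpt of the DataKit DaemonSet pod spec, NOT a complete manifest.
    # Assumption: a volume named "datakit-conf" is backed by the ConfigMap above.
    spec:
      template:
        spec:
          containers:
            - name: datakit
              volumeMounts:
                - mountPath: /usr/local/datakit/conf.d/opentelemetry/opentelemetry.conf
                  name: datakit-conf
                  subPath: opentelemetry.conf
          volumes:
            - name: datakit-conf
              configMap:
                name: datakit-conf
    ```

    The subPath mount places only opentelemetry.conf into the collector's conf.d directory, so other files in that directory are left untouched.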
    

    3.2 Collect trace data with the OTEL agent

    The agent is packaged into each business service's container image by the Dockerfile shown in section 1, which gives the service the basic ability to collect trace data.

    Set the following environment variables in the service's deployment YAML:

              - name: OTEL_EXPORTER
                value: "otlp"
              - name: OTEL_EXPORTER_OTLP_PROTOCOL
                value: "grpc"
              - name: OTEL_EXPORTER_OTLP_ENDPOINT
                value: "http://$(HOST_IP):4317"
              - name: OTEL_PROPAGATORS
                value: "tracecontext,baggage"
    

    3.3 Edit the ingress-controller ConfigMap

    If the ingress-controller already has a ConfigMap, add the following four lines to it:

    enable-opentelemetry: "true"
    otel-sampler: AlwaysOn
    opentelemetry-operation-name: "HTTP $request_method $service_name $uri $opentelemetry_trace_id"
    opentelemetry-trust-incoming-span: "true"
    

    Apply the corresponding ingress YAML and restart the ingress-controller.
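
    The ingress-nginx documentation also lists ConfigMap keys for the collector address and for sampling. The sketch below shows how those keys might be combined with the settings above. It is an illustration, not the configuration used in this article: <datakit-host> is a placeholder for an address where DataKit's OTLP gRPC listener is reachable, and the 10% sampling ratio is an arbitrary example value.

    ```yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ingress-nginx-controller
      namespace: ingress-nginx
    data:
      enable-opentelemetry: "true"
      opentelemetry-trust-incoming-span: "true"
      otel-service-name: "nginx"
      # Assumption: replace <datakit-host> with your DataKit address
      otlp-collector-host: "<datakit-host>"
      otlp-collector-port: "4317"
      # Example only: sample about 10% of new traces, but follow the
      # sampling decision of an incoming parent span when one exists
      otel-sampler: "TraceIdRatioBased"
      otel-sampler-ratio: "0.1"
      otel-sampler-parent-based: "true"
    ```

    ConfigMap values must be strings, which is why the port and the ratio are quoted.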

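    3.4 Add ingress-controller environment variables

    In the Deployment section of ingress-nginx.yaml, the manifest that deploys the ingress-controller, add the OTEL settings under spec.template.spec.containers.env. Make sure the OTLP port (4317 in this example) is open.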

            - name: OTEL_EXPORTER
              value: "otlp"
            - name: OTEL_EXPORTER_OTLP_PROTOCOL
              value: "grpc"
    
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://$(HOST_IP):4317"
    
            - name: OTEL_SERVICE_NAME
              value: "nginx"
            - name: OTEL_TRACES_SAMPLER
              value: "always_on"
            - name: OTEL_PROPAGATORS
              value: "tracecontext,baggage"
    

    Re-apply ingress-nginx.yaml and restart the ingress-controller container.

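    Guance

    Access the ingress domain again to generate data.

    In the Guance console, open Application Performance Monitoring (「应用性能监测」) to confirm that Ingress-Nginx trace data is being reported.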
