[BUG] Python OTLP exporter hardcodes gRPC, ignores OTEL_EXPORTER_OTLP_PROTOCOL #1681
…upport

The OTLP span and log exporters were hardcoded to use gRPC, ignoring the standard OTEL_EXPORTER_OTLP_TRACES_PROTOCOL and OTEL_EXPORTER_OTLP_PROTOCOL environment variables. This prevented exporting traces to backends that only support OTLP over HTTP, such as Langfuse.

Replace the top-level gRPC imports with factory functions that dynamically select the exporter based on the protocol env vars, following the OpenTelemetry specification precedence (signal-specific > general > default "grpc").

Signed-off-by: shmuelarditi <shmuelrdt@gmail.com>
Pull request overview
Fixes the Python agent’s OTLP exporter selection so it can honor OTEL_EXPORTER_OTLP_TRACES_PROTOCOL / OTEL_EXPORTER_OTLP_PROTOCOL (instead of always using the gRPC exporter), enabling OTLP over HTTP/protobuf for backends that don’t support gRPC.
Changes:
- Replaces hardcoded gRPC OTLP exporter imports with protocol-resolving factory helpers (`_create_span_exporter`, `_create_log_exporter`).
- Adds `_resolve_otlp_protocol()` to implement env-var precedence (signal-specific > general > default `grpc`).
- Updates `configure()` to build span/log exporters via the new factories.
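The precedence described above can be sketched in isolation. This is a minimal illustration, not the PR's code: the function name `resolve_otlp_protocol` and the injected `env` mapping are assumptions made so the logic can be exercised without touching the process environment, and treating whitespace-only values as unset is an extra assumption. The `strip().lower()` normalization follows what a later commit in this PR describes.

```python
from typing import Mapping

DEFAULT_PROTOCOL = "grpc"


def resolve_otlp_protocol(signal: str, env: Mapping[str, str]) -> str:
    """Return the OTLP protocol for `signal` (for example "TRACES" or "LOGS").

    Precedence: signal-specific variable, then the general variable,
    then the default "grpc". Empty or whitespace-only values count as
    unset (an assumption of this sketch).
    """
    candidates = (
        f"OTEL_EXPORTER_OTLP_{signal}_PROTOCOL",  # signal-specific
        "OTEL_EXPORTER_OTLP_PROTOCOL",            # general
    )
    for name in candidates:
        # Normalize so that " HTTP/Protobuf " matches "http/protobuf".
        value = env.get(name, "").strip().lower()
        if value:
            return value
    return DEFAULT_PROTOCOL


if __name__ == "__main__":
    # Only the general variable is set, so it applies to traces too.
    print(resolve_otlp_protocol("TRACES", {"OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf"}))
```

Passing `os.environ` as `env` would give the behavior against the real environment; a plain dict keeps the sketch easy to test.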
| def _create_log_exporter(**kwargs): | ||
| """Create an OTLPLogExporter using the protocol from env vars.""" | ||
| protocol = _resolve_otlp_protocol("LOGS") | ||
| if protocol == "http/protobuf": | ||
| from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter | ||
| else: | ||
| from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter | ||
| logging.info("Using %s protocol for log exporter", protocol) | ||
| return OTLPLogExporter(**kwargs) |
| def _resolve_otlp_protocol(signal: str) -> str: | ||
| """Resolve the OTLP protocol from signal-specific or general env vars. | ||
|
|
||
| Follows the OpenTelemetry specification precedence: | ||
| signal-specific (e.g. OTEL_EXPORTER_OTLP_TRACES_PROTOCOL) > general > default (grpc). | ||
| """ | ||
| return ( | ||
| os.getenv(f"OTEL_EXPORTER_OTLP_{signal}_PROTOCOL") | ||
| or os.getenv("OTEL_EXPORTER_OTLP_PROTOCOL") | ||
| or "grpc" | ||
| ) | ||
|
|
||
|
|
||
| def _create_span_exporter(**kwargs): | ||
| """Create an OTLPSpanExporter using the protocol from env vars.""" | ||
| protocol = _resolve_otlp_protocol("TRACES") | ||
| if protocol == "http/protobuf": | ||
| from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter | ||
| else: | ||
| from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter | ||
| logging.info("Using %s protocol for trace exporter", protocol) | ||
| return OTLPSpanExporter(**kwargs) |
| # Check standard OTEL env vars: signal-specific endpoint first, then general endpoint | ||
| trace_endpoint = ( | ||
| os.getenv("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT") | ||
| or os.getenv("OTEL_TRACING_EXPORTER_OTLP_ENDPOINT") # Backward compatibility | ||
| or os.getenv("OTEL_EXPORTER_OTLP_ENDPOINT") | ||
| ) | ||
| trace_timeout_seconds = _resolve_otlp_timeout_seconds("TRACES") | ||
| logging.info("Trace endpoint: %s", trace_endpoint or "<default>") | ||
| if trace_endpoint: | ||
| - processor = BatchSpanProcessor(OTLPSpanExporter(endpoint=trace_endpoint, timeout=trace_timeout_seconds)) | ||
| + processor = BatchSpanProcessor(_create_span_exporter(endpoint=trace_endpoint, timeout=trace_timeout_seconds)) | ||
| else: | ||
| - processor = BatchSpanProcessor(OTLPSpanExporter(timeout=trace_timeout_seconds)) | ||
| + processor = BatchSpanProcessor(_create_span_exporter(timeout=trace_timeout_seconds)) |
| from fastapi import FastAPI | ||
| from opentelemetry import _logs, trace | ||
| - from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter | ||
| - from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter | ||
| from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor | ||
| from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor | ||
| from opentelemetry.instrumentation.openai import OpenAIInstrumentor |
- Add opentelemetry-exporter-otlp-proto-http to pyproject.toml so the http/protobuf import doesn't fail at runtime
- Normalize protocol value with strip().lower() for robustness
- Update test to monkeypatch _create_log_exporter instead of the removed module-level OTLPLogExporter

Signed-off-by: shmuelarditi <shmuelrdt@gmail.com>
Thanks for the PR! Would you be able to do this for the go runtime as well? See https://github.com/kagent-dev/kagent/blob/main/go/adk/pkg/telemetry/tracing.go#L90
Apply the same protocol-aware exporter selection to the Go ADK
telemetry package. The newTracerProvider and newLoggerProvider
functions now check OTEL_EXPORTER_OTLP_{TRACES,LOGS}_PROTOCOL
(falling back to OTEL_EXPORTER_OTLP_PROTOCOL, then "grpc") and
create the corresponding HTTP or gRPC exporter.
This fixes the controller timeout when exporting to HTTP-only
backends like Langfuse.
Signed-off-by: shmuelarditi <shmuelrdt@gmail.com>
Added the same protocol-aware selection to the Go runtime in go/adk/pkg/telemetry/tracing.go; see the latest commit.
Great, you will need to sign your commits as well for DCO to pass in the CI.
Sets the setuid bit on `/usr/bin/bwrap` in both runtime `Dockerfiles` so the non-root agent process (uid 1001) can create the user + network namespaces that bubblewrap relies on to sandbox skills and executed code. Without this, hosts with `kernel.apparmor_restrict_unprivileged_userns=1` deny bwrap's `RTM_NEWADDR` call when it brings up loopback, making every sandboxed command fail and blocking two CI e2e tests.

The binary already runs inside a `privileged: true` Kubernetes pod, so the container already has full host capabilities; setuid only changes which process inside that pod holds them, and bubblewrap is a small, audited tool specifically designed to be setuid-safe. Privilege mode is dropped before running the user's command.

---------

Signed-off-by: Jet Chiang <pokyuen.jetchiang-ext@solo.io>
)

## Motivation

This will allow most users to switch directly from `runtime: python` to `runtime: go` without worrying about existing LLM provider configs, since everything will be supported on the Go side. That should ease adoption of the new Go runtime.

## Summary

Closes most of the gap between Python and Go identified in kagent-dev#1643:

- TLS and API key passthrough for LLM providers
- Support Ollama and Bedrock using client SDKs, as done earlier in kagent-dev#1540
- Use the Bedrock client instead of the messages API for Anthropic on Bedrock to support all Bedrock runtime models
- Tighten tool config conversion for Anthropic + Bedrock and fix issues like kagent-dev#1645, kagent-dev#1683
- Sanitize ToolName for Bedrock LLMs (kagent-dev#1473), see [bedrock API docs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ToolSpecification.html)
- Refactor embedding models in Python to be separate from the memory service
- Strip approval confirmation synthetic tool calls from LLM requests. These messages are persisted in the task / events store and used by ADK internally, but sending them to the model wastes tokens and can confuse it. In a long session with many HITL events, these internal tool messages consume many unnecessary tokens.

## Testing Plan

- [x] All new unit tests in the Go ADK pass; all old unit tests in Python pass
- [x] Test for no regression with OpenAI and Gemini models in the Go runtime
- [x] Validate with a wide range of use cases such as: builtin tools (ask user, save memory), ADK built-in tools (load memory), MCP tools, Remote A2A (subagent) tools, HITL tools (approvals)
- [x] Bedrock LLM and embedding model in the Go runtime
- [x] OpenAI API key passthrough with A2A `--token` option
- [x] Ollama LLM and embedding in the Go runtime (local models, Gemma 4 + embedding Gemma)
- [x] Ollama with TLS (local https server with self-signed certs)

---------

Signed-off-by: Jet Chiang <pokyuen.jetchiang-ext@solo.io>
)

CVE Scan for postgres:18.3-alpine

```
| CVE ID         | SEVERITY | PACKAGE      | FIXED IN                     | SCANNERS                    |
|----------------|----------|--------------|------------------------------|-----------------------------|
| CVE-2025-68121 | CRITICAL | stdlib       | 1.24.13, 1.25.7, 1.26.0-rc.3 | grype, trivy                |
| CVE-2025-58183 | HIGH     | stdlib       | 1.24.8, 1.25.2               | grype(MEDIUM), trivy(HIGH)  |
| CVE-2025-58187 | HIGH     | stdlib       | 1.24.9, 1.25.3               | grype(HIGH), trivy(MEDIUM)  |
| CVE-2025-58188 | HIGH     | stdlib       | 1.24.8, 1.25.2               | grype(HIGH), trivy(MEDIUM)  |
| CVE-2025-61723 | HIGH     | stdlib       | 1.24.8, 1.25.2               | grype(HIGH), trivy(MEDIUM)  |
| CVE-2025-61725 | HIGH     | stdlib       | 1.24.8, 1.25.2               | grype(HIGH), trivy(MEDIUM)  |
| CVE-2025-61726 | HIGH     | stdlib       | 1.24.12, 1.25.6              | grype, trivy                |
| CVE-2025-61728 | HIGH     | stdlib       | 1.24.12, 1.25.6              | grype(MEDIUM), trivy(HIGH)  |
| CVE-2025-61729 | HIGH     | stdlib       | 1.24.11, 1.25.5              | grype, trivy                |
| CVE-2025-61731 | HIGH     | stdlib       | 1.24.12, 1.25.6              | grype                       |
| CVE-2025-61732 | HIGH     | stdlib       | 1.24.13, 1.25.7              | grype                       |
| CVE-2026-25679 | HIGH     | stdlib       | 1.25.8, 1.26.1               | grype, trivy                |
| CVE-2026-27135 | HIGH     | nghttp2-libs | n/a                          | grype                       |
| CVE-2026-27140 | HIGH     | stdlib       | 1.25.9, 1.26.2               | grype                       |
| CVE-2026-32280 | HIGH     | stdlib       | 1.25.9, 1.26.2               | grype, trivy                |
| CVE-2026-32281 | HIGH     | stdlib       | 1.25.9, 1.26.2               | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-32282 | HIGH     | stdlib       | 1.25.9, 1.26.2               | grype(MEDIUM), trivy(HIGH)  |
| CVE-2026-32283 | HIGH     | stdlib       | 1.25.9, 1.26.2               | grype(HIGH), trivy(UNKNOWN) |
```

CVE Scan for postgres:18

```
| CVE ID         | SEVERITY | PACKAGE                 | FIXED IN                     | SCANNERS                    |
|----------------|----------|-------------------------|------------------------------|-----------------------------|
| CVE-2025-68121 | CRITICAL | stdlib                  | 1.24.13, 1.25.7, 1.26.0-rc.3 | grype, trivy                |
| CVE-2025-13151 | HIGH     | libtasn1-6              | n/a                          | grype(HIGH), trivy(MEDIUM)  |
| CVE-2025-58183 | HIGH     | stdlib                  | 1.24.8, 1.25.2               | grype(MEDIUM), trivy(HIGH)  |
| CVE-2025-58187 | HIGH     | stdlib                  | 1.24.9, 1.25.3               | grype(HIGH), trivy(MEDIUM)  |
| CVE-2025-58188 | HIGH     | stdlib                  | 1.24.8, 1.25.2               | grype(HIGH), trivy(MEDIUM)  |
| CVE-2025-61723 | HIGH     | stdlib                  | 1.24.8, 1.25.2               | grype(HIGH), trivy(MEDIUM)  |
| CVE-2025-61725 | HIGH     | stdlib                  | 1.24.8, 1.25.2               | grype(HIGH), trivy(MEDIUM)  |
| CVE-2025-61726 | HIGH     | stdlib                  | 1.24.12, 1.25.6              | grype, trivy                |
| CVE-2025-61728 | HIGH     | stdlib                  | 1.24.12, 1.25.6              | grype(MEDIUM), trivy(HIGH)  |
| CVE-2025-61729 | HIGH     | stdlib                  | 1.24.11, 1.25.5              | grype, trivy                |
| CVE-2025-61731 | HIGH     | stdlib                  | 1.24.12, 1.25.6              | grype                       |
| CVE-2025-61732 | HIGH     | stdlib                  | 1.24.13, 1.25.7              | grype                       |
| CVE-2025-69720 | HIGH     | libncursesw6            | n/a                          | grype, trivy                |
| CVE-2025-69720 | HIGH     | libtinfo6               | n/a                          | grype, trivy                |
| CVE-2025-69720 | HIGH     | ncurses-base            | n/a                          | grype, trivy                |
| CVE-2025-69720 | HIGH     | ncurses-bin             | n/a                          | grype, trivy                |
| CVE-2026-24882 | HIGH     | dirmngr                 | n/a                          | grype, trivy                |
| CVE-2026-24882 | HIGH     | gnupg                   | n/a                          | grype, trivy                |
| CVE-2026-24882 | HIGH     | gnupg-l10n              | n/a                          | grype, trivy                |
| CVE-2026-24882 | HIGH     | gpg                     | n/a                          | grype, trivy                |
| CVE-2026-24882 | HIGH     | gpg-agent               | n/a                          | grype, trivy                |
| CVE-2026-24882 | HIGH     | gpgconf                 | n/a                          | grype, trivy                |
| CVE-2026-24882 | HIGH     | gpgsm                   | n/a                          | grype, trivy                |
| CVE-2026-25679 | HIGH     | stdlib                  | 1.25.8, 1.26.1               | grype, trivy                |
| CVE-2026-2673  | HIGH     | libssl3t64              | 3.5.5-1~deb13u2              | grype(HIGH), trivy(LOW)     |
| CVE-2026-2673  | HIGH     | openssl                 | 3.5.5-1~deb13u2              | grype(HIGH), trivy(LOW)     |
| CVE-2026-2673  | HIGH     | openssl-provider-legacy | 3.5.5-1~deb13u2              | grype(HIGH), trivy(LOW)     |
| CVE-2026-27140 | HIGH     | stdlib                  | 1.25.9, 1.26.2               | grype                       |
| CVE-2026-28388 | HIGH     | libssl3t64              | 3.5.5-1~deb13u2              | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-28388 | HIGH     | openssl                 | 3.5.5-1~deb13u2              | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-28388 | HIGH     | openssl-provider-legacy | 3.5.5-1~deb13u2              | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-28389 | HIGH     | libssl3t64              | 3.5.5-1~deb13u2              | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-28389 | HIGH     | openssl                 | 3.5.5-1~deb13u2              | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-28389 | HIGH     | openssl-provider-legacy | 3.5.5-1~deb13u2              | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-28390 | HIGH     | libssl3t64              | 3.5.5-1~deb13u2              | grype, trivy                |
| CVE-2026-28390 | HIGH     | openssl                 | 3.5.5-1~deb13u2              | grype, trivy                |
| CVE-2026-28390 | HIGH     | openssl-provider-legacy | 3.5.5-1~deb13u2              | grype, trivy                |
| CVE-2026-29111 | HIGH     | libsystemd0             | n/a                          | grype(MEDIUM), trivy(HIGH)  |
| CVE-2026-29111 | HIGH     | libudev1                | n/a                          | grype(MEDIUM), trivy(HIGH)  |
| CVE-2026-31790 | HIGH     | libssl3t64              | 3.5.5-1~deb13u2              | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-31790 | HIGH     | openssl                 | 3.5.5-1~deb13u2              | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-31790 | HIGH     | openssl-provider-legacy | 3.5.5-1~deb13u2              | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-32280 | HIGH     | stdlib                  | 1.25.9, 1.26.2               | grype, trivy                |
| CVE-2026-32281 | HIGH     | stdlib                  | 1.25.9, 1.26.2               | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-32282 | HIGH     | stdlib                  | 1.25.9, 1.26.2               | grype(MEDIUM), trivy(HIGH)  |
| CVE-2026-32283 | HIGH     | stdlib                  | 1.25.9, 1.26.2               | grype(HIGH), trivy(UNKNOWN) |
| CVE-2026-4046  | HIGH     | libc-bin                | n/a                          | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-4046  | HIGH     | libc-l10n               | n/a                          | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-4046  | HIGH     | libc6                   | n/a                          | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-4046  | HIGH     | locales                 | n/a                          | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-4437  | HIGH     | libc-bin                | n/a                          | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-4437  | HIGH     | libc-l10n               | n/a                          | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-4437  | HIGH     | libc6                   | n/a                          | grype(HIGH), trivy(MEDIUM)  |
| CVE-2026-4437  | HIGH     | locales                 | n/a                          | grype(HIGH), trivy(MEDIUM)  |
```

---------

Signed-off-by: Jonathan Jamroga <jjamroga@gmail.com>
Co-authored-by: Eitan Yarmush <eitan.yarmush@solo.io>
b1a2796 to 80d84d1
@supreme-gg-gg oh yea, done.
Seems like you need to update uv lock and reformat python |
Signed-off-by: shmuelarditi <shmuelar@elementor.com>
@supreme-gg-gg Done! Updated the uv.lock and reformatted the Python code with ruff.
supreme-gg-gg left a comment
lgtm, thanks @shmuelarditi ! I've tested both runtimes with Jaeger
Bug Description
The Python agent tracing code in `python/packages/kagent-core/src/kagent/core/tracing/_utils.py` hardcodes the gRPC OTLP exporter via top-level imports of `OTLPSpanExporter` and `OTLPLogExporter` from the `opentelemetry.exporter.otlp.proto.grpc` package.
The standard `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL` and `OTEL_EXPORTER_OTLP_PROTOCOL` environment variables are ignored. This means agents cannot export traces over HTTP/protobuf, which is required by backends like Langfuse that only support OTLP over HTTP (not gRPC).
Steps to Reproduce
- `otel.tracing.enabled: true` and `otel.tracing.exporter.otlp.protocol: http/protobuf`
- `https://cloud.langfuse.com/api/public/otel`)
- `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf` set correctly
Expected Behavior
The `configure()` function should respect `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL` (or the general `OTEL_EXPORTER_OTLP_PROTOCOL`) and dynamically select the appropriate exporter, per the OpenTelemetry specification:
- `grpc` (default) → `opentelemetry.exporter.otlp.proto.grpc`
- `http/protobuf` → `opentelemetry.exporter.otlp.proto.http`
Fix
This PR replaces the hardcoded gRPC imports with factory functions (`_create_span_exporter`, `_create_log_exporter`) that resolve the protocol from env vars following the OTel spec precedence: signal-specific > general > default (`grpc`).
The fix applies to both the trace and log exporters. Default behavior (gRPC) is unchanged, so this is fully backwards compatible.
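One way to picture the factory approach is a lookup table from signal and protocol to the exporter module, with the import deferred until an exporter is actually requested. The sketch below is an illustration under assumptions, not the PR's implementation: `exporter_target` and `create_exporter` are hypothetical names, while the module paths and class names are the ones shown in this PR's diff. As in the diff, any protocol value other than `http/protobuf` falls back to gRPC.

```python
import importlib
from typing import Dict, Tuple

# (signal, protocol) -> (module path, class name), as named in the PR diff.
_EXPORTERS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("TRACES", "http/protobuf"): (
        "opentelemetry.exporter.otlp.proto.http.trace_exporter", "OTLPSpanExporter"),
    ("TRACES", "grpc"): (
        "opentelemetry.exporter.otlp.proto.grpc.trace_exporter", "OTLPSpanExporter"),
    ("LOGS", "http/protobuf"): (
        "opentelemetry.exporter.otlp.proto.http._log_exporter", "OTLPLogExporter"),
    ("LOGS", "grpc"): (
        "opentelemetry.exporter.otlp.proto.grpc._log_exporter", "OTLPLogExporter"),
}


def exporter_target(signal: str, protocol: str) -> Tuple[str, str]:
    """Return (module path, class name); anything but http/protobuf maps to gRPC."""
    transport = protocol if protocol == "http/protobuf" else "grpc"
    return _EXPORTERS[(signal, transport)]


def create_exporter(signal: str, protocol: str, **kwargs):
    """Import the selected exporter lazily and instantiate it with kwargs."""
    module_path, class_name = exporter_target(signal, protocol)
    module = importlib.import_module(module_path)  # needs the OTLP package installed
    return getattr(module, class_name)(**kwargs)
```

Deferring the import means the module no longer binds to one transport at import time, which is the property the hardcoded top-level gRPC imports lacked.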
Environment
`main` branch)