eval2otel Python SDK Preview

The Python package mirrors the TypeScript eval2otel.v1 contract and can also emit OpenTelemetry spans when the optional OTel extras are installed. Without those extras, it still validates Eval2Otel payloads and returns conversion reports.

pip install -e ".[otel,validation]"

from eval2otel import instrument_all

client = instrument_all()
report = client.process_evaluation({
    "id": "case-1",
    "timestamp": 1700000000000,
    "model": "gpt-4o-mini",
    "system": "openai",
    "operation": "chat",
    "request": {"model": "gpt-4o-mini"},
    "response": {"model": "gpt-4o-mini"},
    "usage": {"inputTokens": 12, "outputTokens": 8},
    "performance": {"duration": 0.25},
    "conversation": {
        "messages": [
            {"role": "user", "content": "What shipped?"},
            {"role": "assistant", "content": "Eval2Otel Python OTLP hooks shipped."}
        ]
    },
    "provenance": {
        "sourceFramework": "deepeval",
        "runId": "nightly",
        "caseId": "case-1"
    }
})

assert report.contract_version == "eval2otel.v1"
client.shutdown()

Zero-Code Instrumentation

The package registers an opentelemetry_instrumentor entry point named eval2otel. In an environment with opentelemetry-instrumentation installed, opentelemetry-instrument can discover Eval2Otel and call the same instrument_all() path used above:

OTEL_SERVICE_NAME=my-ai-service \
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf \
EVAL2OTEL_PROVIDERS=openai,anthropic \
opentelemetry-instrument python main.py

Programmatic use is also available:

from eval2otel import Eval2OtelInstrumentor, get_instrumented_client

Eval2OtelInstrumentor().instrument()
client = get_instrumented_client()

Environment

instrument_all() reads:

OTEL_SERVICE_NAME or EVAL2OTEL_SERVICE_NAME
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT or OTEL_EXPORTER_OTLP_ENDPOINT
OTEL_EXPORTER_OTLP_PROTOCOL
OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT
EVAL2OTEL_SAMPLE_RATE
EVAL2OTEL_REDACT_PII
EVAL2OTEL_PROVIDERS
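
As an illustration of how these variables might be resolved, here is a minimal sketch. It is not the package's implementation: EnvSettings and read_settings are hypothetical names, and the fallback order for the service name, the default sample rate of 1.0, and the rule that only the literal value true enables a flag are all assumptions. Preferring the traces-specific endpoint over the general one follows the usual OpenTelemetry convention.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container for the resolved settings."""
    service_name: Optional[str]
    endpoint: Optional[str]
    protocol: Optional[str]
    capture_content: bool
    sample_rate: float
    redact_pii: bool
    providers: Tuple[str, ...]


def _flag(env: Mapping[str, str], name: str) -> bool:
    # Assumption: only the literal value "true" (case-insensitive) enables a flag.
    return env.get(name, "").strip().lower() == "true"


def read_settings(env: Mapping[str, str] = os.environ) -> EnvSettings:
    # Split the comma-separated provider list and drop empty entries.
    raw_providers = env.get("EVAL2OTEL_PROVIDERS", "")
    providers = tuple(p.strip() for p in raw_providers.split(",") if p.strip())
    return EnvSettings(
        # Assumption: the OTEL_ name wins when both service names are set.
        service_name=env.get("OTEL_SERVICE_NAME") or env.get("EVAL2OTEL_SERVICE_NAME"),
        # The signal-specific endpoint takes precedence over the general one.
        endpoint=env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")
        or env.get("OTEL_EXPORTER_OTLP_ENDPOINT"),
        protocol=env.get("OTEL_EXPORTER_OTLP_PROTOCOL"),
        capture_content=_flag(env, "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"),
        # Assumption: sample everything when no rate is configured.
        sample_rate=float(env.get("EVAL2OTEL_SAMPLE_RATE", "1.0")),
        redact_pii=_flag(env, "EVAL2OTEL_REDACT_PII"),
        providers=providers,
    )
```

Passing a plain dict instead of os.environ keeps the function easy to exercise in tests.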

Content capture is off by default. When OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true, message content is emitted as span events and sampled by EVAL2OTEL_SAMPLE_RATE. When EVAL2OTEL_REDACT_PII=true, the built-in redactor masks common emails, bearer tokens, secret assignments, and long number sequences before content is emitted.
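
The sampling and redaction behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the built-in redactor: the patterns, the placeholder strings, the nine-digit threshold for a "long" number, and the names redact and should_capture are all hypothetical.

```python
import random
import re
from typing import Callable

# Hypothetical patterns; the real redactor may match differently.
_PATTERNS = [
    # Email addresses.
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[EMAIL]"),
    # Bearer tokens such as "Bearer abc.def".
    (re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*"), "Bearer [TOKEN]"),
    # Secret assignments such as "api_key=..." or "password: ...".
    (re.compile(r"(?i)\b(api[_-]?key|secret|password|token)\s*[=:]\s*\S+"), r"\1=[SECRET]"),
    # Assumption: nine or more consecutive digits count as a long number.
    (re.compile(r"\b\d{9,}\b"), "[NUMBER]"),
]


def redact(text: str) -> str:
    """Apply each pattern in order and return the masked text."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


def should_capture(sample_rate: float, rng: Callable[[], float] = random.random) -> bool:
    """Return True for roughly sample_rate of calls (0.0 = never, 1.0 = always)."""
    return rng() < sample_rate


if __name__ == "__main__":
    if should_capture(1.0):
        print(redact("Mail jane@example.com, api_key=sk-12345"))
```

Redacting before the sampling decision or after it makes no difference to the output, but checking should_capture first avoids running the regexes on content that will be dropped.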

Provider Hooks

When provider patching is enabled, the client returned by instrument_all() exposes the results as client.instrumentation_handles. Each handle reports whether the provider package was available, whether a compatible instrumentor was invoked, and, when the provider could not be instrumented, the reason.

Supported provider names:

openai
anthropic
google-generativeai
bedrock
cohere
huggingface

Set EVAL2OTEL_PROVIDERS=openai,anthropic to limit discovery.
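
To make the three pieces of information on a handle concrete, here is a minimal sketch of a provider probe. Everything in it is an assumption for illustration: ProviderHandle, its field names, probe_providers, and the mapping from provider names to import names are hypothetical and may not match the real eval2otel handles. The sketch only checks whether a package is importable and never invokes an instrumentor.

```python
import importlib.util
from dataclasses import dataclass
from typing import Iterable, List, Mapping, Optional

# Hypothetical mapping from provider name to the module that would be patched.
PROVIDER_MODULES = {
    "openai": "openai",
    "anthropic": "anthropic",
    "google-generativeai": "google.generativeai",
    "bedrock": "boto3",
    "cohere": "cohere",
    "huggingface": "transformers",
}


@dataclass
class ProviderHandle:
    """Hypothetical shape of one instrumentation handle."""
    provider: str
    available: bool
    instrumented: bool
    reason: Optional[str] = None


def _is_importable(module_name: str) -> bool:
    # find_spec raises ModuleNotFoundError when a parent package is missing.
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def probe_providers(
    names: Iterable[str],
    modules: Mapping[str, str] = PROVIDER_MODULES,
) -> List[ProviderHandle]:
    handles = []
    for name in names:
        module_name = modules.get(name)
        if module_name is None:
            handles.append(ProviderHandle(name, False, False, "unknown provider name"))
        elif not _is_importable(module_name):
            reason = f"package {module_name!r} is not installed"
            handles.append(ProviderHandle(name, False, False, reason))
        else:
            # This sketch stops at detection; it does not patch anything.
            handles.append(ProviderHandle(name, True, False, "no instrumentor invoked in this sketch"))
    return handles


if __name__ == "__main__":
    for handle in probe_providers(["openai", "anthropic"]):
        print(handle)
```

Recording a reason instead of raising lets one missing provider package leave the others unaffected.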

Typed Validation

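Install the validation extra to use the optional Pydantic models, then pass the validated payload to a client:

from eval2otel import instrument_all
from eval2otel.models import EvalResultModel

client = instrument_all()

# Validate the payload against the Pydantic model before processing it.
payload = EvalResultModel.model_validate({
    "id": "case-1",
    "model": "gpt-4o-mini",
    "operation": "chat",
    "request": {"model": "gpt-4o-mini"},
    "performance": {"duration": 0.25},
})

# Convert the validated model to an eval result and process it.
client.process_evaluation(payload.to_eval_result())
client.shutdown()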

Development

From the repository root:

PYTHONPATH=python python3 -m unittest discover -s python/tests
