OpenTelemetry (OTLP)

Export HDDS traces and metrics to any OTLP-compatible backend via gRPC.

Overview

The hdds-telemetry-otlp crate bridges HDDS's tracing instrumentation to OpenTelemetry OTLP exporters. This enables distributed tracing and metrics collection via gRPC (tonic) to backends like Jaeger, Grafana Tempo, or any OTLP-compatible collector.

What it provides:

  • Automatic export of tracing::info_span! calls as OpenTelemetry spans
  • DDS-specific metric instruments (messages sent/received, write latency, discovery events)
  • Clean shutdown via RAII guard pattern

Quick Start

Add the dependency

[dependencies]
hdds-telemetry-otlp = { version = "0.1" }

Initialize

use hdds_telemetry_otlp::{OtlpConfig, init_tracing};

fn main() {
    let config = OtlpConfig::default();
    let _guard = init_tracing(config).expect("Failed to init OTLP tracing");

    // All tracing::info_span! / tracing::info! calls are now exported
    // as OpenTelemetry spans to the configured OTLP endpoint.

    // _guard must be held alive for the duration of the application.
    // Dropping it triggers a clean shutdown of the pipeline.
}

Configuration

OtlpConfig

use hdds_telemetry_otlp::OtlpConfig;

let config = OtlpConfig {
    endpoint: "http://localhost:4317".to_string(),
    service_name: "my-dds-app".to_string(),
    export_traces: true,
    export_metrics: true,
    batch_timeout_ms: 5000,
};

Configuration Options

| Option | Default | Description |
|--------|---------|-------------|
| endpoint | http://localhost:4317 | OTLP collector endpoint (gRPC) |
| service_name | "hdds" | Service name reported to the collector |
| export_traces | true | Export spans via OTLP |
| export_metrics | true | Export metrics via OTLP |
| batch_timeout_ms | 5000 | Batch export timeout in milliseconds |
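
Since OtlpConfig implements Default (as used in Quick Start) and all fields shown above are public, struct update syntax lets you override only the fields you need. A minimal sketch:

use hdds_telemetry_otlp::OtlpConfig;

// Override only the service name; the remaining fields keep their defaults.
let config = OtlpConfig {
    service_name: "my-dds-app".to_string(),
    ..OtlpConfig::default()
};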

Tracing

When export_traces is enabled, init_tracing() sets up:

  1. An OTLP SpanExporter via gRPC (tonic) pointed at config.endpoint
  2. A SdkTracerProvider with batch span processing
  3. A tracing_opentelemetry::OpenTelemetryLayer wired into tracing_subscriber::Registry
  4. An EnvFilter (defaults to info, configurable via RUST_LOG)

All tracing spans and events in your application (and in HDDS internals) are automatically exported as OpenTelemetry spans.

// These spans appear in your OTLP backend
let _span = tracing::info_span!("dds.write", topic = "SensorData", seq = 42).entered();
tracing::info!("Writing sample to topic SensorData");

Metrics

Pre-Registered Instruments

When export_metrics is enabled, the following DDS instruments are pre-registered:

| Instrument | Type | Description |
|------------|------|-------------|
| dds.messages.sent | Counter (u64) | Total DDS messages sent |
| dds.messages.received | Counter (u64) | Total DDS messages received |
| dds.discovery.participants | Counter (u64) | DDS discovery participant events |
| dds.latency.write_ns | Histogram (u64) | DDS write latency in nanoseconds |
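
These names are registered on the crate's meter. As a purely illustrative sketch (not the crate's actual code), equivalent instruments could be created by hand with the opentelemetry meter API; the final builder call (.build()) matches recent opentelemetry releases and may differ in older ones.

use opentelemetry::global;
use opentelemetry::KeyValue;

// Illustrative only: hand-built equivalents of the pre-registered instruments.
let meter = global::meter("hdds");

let messages_sent = meter
    .u64_counter("dds.messages.sent")
    .with_description("Total DDS messages sent")
    .build();

let write_latency = meter
    .u64_histogram("dds.latency.write_ns")
    .with_description("DDS write latency in nanoseconds")
    .build();

// Attributes (here: the topic name) are attached per recording.
messages_sent.add(1, &[KeyValue::new("topic", "SensorData")]);
write_latency.record(1_200, &[KeyValue::new("topic", "SensorData")]);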

HddsMetrics

The HddsMetrics struct provides a convenience wrapper around the pre-registered instruments.

use hdds_telemetry_otlp::metrics::HddsMetrics;

let metrics = HddsMetrics::new();

// Record a write with latency
metrics.record_write(1_200); // 1200 ns write latency
// Increments dds.messages.sent and records dds.latency.write_ns

// Record a read
metrics.record_read();
// Increments dds.messages.received

// Record a discovery event
metrics.record_discovery_event("participant_added");
// Increments dds.discovery.participants with event_type attribute

Custom Meter

You can create HddsMetrics from an explicit Meter if needed:

use hdds_telemetry_otlp::metrics::HddsMetrics;
use opentelemetry::global;

let meter = global::meter("my-custom-meter");
let metrics = HddsMetrics::from_meter(&meter);

OtlpGuard

The init_tracing() function returns an OtlpGuard that must be held alive for the duration of the application. When dropped, it:

  1. Flushes any remaining spans
  2. Shuts down the SdkTracerProvider
  3. Shuts down the SdkMeterProvider

use hdds_telemetry_otlp::{OtlpConfig, init_tracing};

fn main() {
    // Hold the guard for the entire application lifetime
    let _guard = init_tracing(OtlpConfig::default())
        .expect("Failed to init OTLP");

    run_application();

    // Give batch exporter time to flush before shutdown
    std::thread::sleep(std::time::Duration::from_secs(2));

    // _guard drops here, triggering clean shutdown
}

Guard Lifetime

If the OtlpGuard is dropped too early, spans and metrics may be lost. Hold it in your main() function or equivalent application entry point.
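
For illustration, a sketch of the pitfall: binding the guard inside an inner scope drops it before the application is done, so later telemetry is silently discarded.

use hdds_telemetry_otlp::{OtlpConfig, init_tracing};

fn main() {
    {
        // Anti-pattern: the guard only lives until the end of this block.
        let _guard = init_tracing(OtlpConfig::default()).expect("Failed to init OTLP");
    } // guard dropped here: the export pipeline shuts down

    // Spans and metrics recorded from this point on are no longer exported.
    tracing::info!("This event will not reach the OTLP backend");
}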

Complete Example

use hdds_telemetry_otlp::{OtlpConfig, init_tracing};
use hdds_telemetry_otlp::metrics::HddsMetrics;

fn main() {
    // 1. Configure and initialize OTLP export
    let config = OtlpConfig {
        endpoint: "http://localhost:4317".to_string(),
        service_name: "hdds-example".to_string(),
        export_traces: true,
        export_metrics: true,
        batch_timeout_ms: 2000,
    };

    let _guard = init_tracing(config).expect("Failed to init OTLP tracing");

    // 2. Create metric instruments
    let metrics = HddsMetrics::new();

    // 3. Simulate DDS activity with tracing spans
    for i in 0..5 {
        {
            let _span = tracing::info_span!(
                "dds.write", topic = "SensorData", seq = i
            ).entered();
            tracing::info!("Writing sample {} to topic SensorData", i);

            let latency_ns = 10_000_000 + (i as u64 * 500_000);
            metrics.record_write(latency_ns);
        }

        {
            let _span = tracing::info_span!(
                "dds.read", topic = "SensorData", seq = i
            ).entered();
            tracing::info!("Reading sample {} from topic SensorData", i);
            metrics.record_read();
        }
    }

    // 4. Discovery event
    {
        let _span = tracing::info_span!("dds.discovery").entered();
        tracing::info!("New participant discovered");
        metrics.record_discovery_event("participant_added");
    }

    // 5. Give the batch exporter a moment to flush
    std::thread::sleep(std::time::Duration::from_secs(3));

    // 6. OtlpGuard is dropped here, triggering clean shutdown
    println!("Shutting down OTLP pipeline...");
}

Backend Setup

Jaeger (with OTLP receiver)

docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 4317:4317 \
  -p 16686:16686 \
  jaegertracing/all-in-one:latest

Access the UI at http://localhost:16686.

Grafana Tempo

# tempo.yaml
server:
  http_listen_port: 3200

distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317

OpenTelemetry Collector

# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  logging:
    loglevel: debug

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging]
    metrics:
      receivers: [otlp]
      exporters: [logging]

Error Handling

The Error enum covers initialization failures:

| Variant | Description |
|---------|-------------|
| Trace | OpenTelemetry trace subsystem error |
| Metrics | OpenTelemetry metrics subsystem error |
| ExporterBuild | OTLP exporter build error (e.g., invalid endpoint) |
| SetSubscriber | Failed to set global tracing subscriber |

Subscriber Conflict

init_tracing() calls tracing_subscriber::registry().try_init(), which will fail if a global subscriber is already set. Only call it once per process.
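
If panicking on a failed initialization is not acceptable, the error can be handled and telemetry simply skipped. A minimal sketch, assuming only what is shown above: init_tracing() returns a Result whose error type implements Debug (implied by the .expect() calls).

use hdds_telemetry_otlp::{OtlpConfig, init_tracing};

fn main() {
    // Fall back to running without telemetry instead of panicking,
    // e.g. when another global tracing subscriber is already installed.
    let _guard = match init_tracing(OtlpConfig::default()) {
        Ok(guard) => Some(guard),
        Err(e) => {
            eprintln!("OTLP telemetry disabled: {e:?}");
            None
        }
    };

    // ... application logic runs here, with or without telemetry ...
}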

Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| RUST_LOG | Controls tracing log level filter | RUST_LOG=debug |
| OTEL_EXPORTER_OTLP_ENDPOINT | Override OTLP endpoint (standard OTEL env) | http://collector:4317 |
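
Whether the crate reads OTEL_EXPORTER_OTLP_ENDPOINT on its own is not guaranteed here; if it does not, a sketch like the following applies the variable manually, falling back to the default endpoint:

use hdds_telemetry_otlp::{OtlpConfig, init_tracing};

fn main() {
    // Honor the standard OTEL endpoint variable if set, otherwise keep the default.
    let endpoint = std::env::var("OTEL_EXPORTER_OTLP_ENDPOINT")
        .unwrap_or_else(|_| "http://localhost:4317".to_string());

    let config = OtlpConfig {
        endpoint,
        ..OtlpConfig::default()
    };

    let _guard = init_tracing(config).expect("Failed to init OTLP tracing");
}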

Limitations

| Limitation | Description |
|------------|-------------|
| gRPC only | OTLP export uses tonic (gRPC); no HTTP/JSON support |
| Single init | init_tracing() can only be called once per process |
| No auto-instrumentation | HDDS spans must use tracing macros manually |
| No C FFI | OTLP telemetry is not exposed in the C API |

Next Steps