
QoS Advisor

The QoS Advisor automatically detects DDS/RTPS misconfigurations and QoS policy issues. It analyzes traffic patterns and reports problems with actionable recommendations.

Overview

Access the QoS Advisor via View → QoS Advisor or press Ctrl+Q.

The advisor runs 15 detection rules covering:

  • QoS compatibility issues between publishers and subscribers
  • Resource exhaustion warnings
  • Timing violations
  • Type consistency problems

Detection Rules

Core QoS Rules (5)

Rule | Description | Severity
ReliabilityMismatch | Publisher BEST_EFFORT but subscriber expects RELIABLE | High
DurabilityMismatch | Durability levels incompatible (e.g., VOLATILE vs TRANSIENT_LOCAL) | High
HistoryDepthMismatch | History depth too small for subscriber consumption rate | Medium
DeadlineViolation | Messages not arriving within Deadline QoS period | High
LifespanExpired | Messages expiring before delivery (Lifespan QoS) | Medium

Extended QoS Rules (10)

Rule | Description | Severity
OwnershipConflict | Multiple writers with EXCLUSIVE ownership on same topic | High
PartitionMismatch | Publisher and subscriber in different partitions | Medium
ContentFilterInefficient | Content filter passing >80% of messages (wasted CPU) | Low
LatencyBudgetExceeded | End-to-end latency exceeds LatencyBudget QoS | Medium-High
TransportPriorityInversion | Low-priority traffic blocking high-priority | High
ResourceLimitsNearMax | Approaching max_samples or history depth limits | Critical
LivelinessTimeout | Writer liveliness lease expired (potential crash) | Critical
PresentationCoherenceViolation | Coherent updates received out of order | Medium
DestinationOrderViolation | Messages delivered out of order | Medium
TypeConsistencyWarning | Potential type mismatch between publisher/subscriber | High

Rule Details

ReliabilityMismatch

Problem: A subscriber requests RELIABLE delivery, but the publisher on the same topic offers only BEST_EFFORT.

Symptoms:

  • Missing messages
  • Gaps in sequence numbers
  • Subscriber reports data loss

Detection: Compares QoS hashes in discovery data.

Fix:

// Publisher side - match subscriber's reliability
DataWriterQos qos;
qos.reliability().kind = RELIABLE_RELIABILITY_QOS;

DurabilityMismatch

Problem: Durability levels are incompatible (e.g., subscriber expects TRANSIENT_LOCAL but publisher is VOLATILE).

Symptoms:

  • Late-joining subscribers miss historical data
  • Inconsistent state across nodes

Detection: Analyzes Durability QoS in endpoint discovery.

Fix:

// Ensure publisher durability >= subscriber durability
DataWriterQos qos;
qos.durability().kind = TRANSIENT_LOCAL_DURABILITY_QOS;
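
The "publisher durability >= subscriber durability" rule in the comment above can be sketched as a plain ordering check. This is a minimal, self-contained illustration, not the advisor's implementation: the `Durability` enum and `durability_compatible` function are hypothetical names, and the ordering follows the DDS specification (VOLATILE < TRANSIENT_LOCAL < TRANSIENT < PERSISTENT).

```cpp
// Durability kinds in increasing order of strength (ordering per the DDS specification).
// The enum and function names are illustrative only, not part of any DDS API.
enum class Durability { Volatile = 0, TransientLocal = 1, Transient = 2, Persistent = 3 };

// A writer/reader pair is compatible when the kind offered by the writer
// is at least as strong as the kind requested by the reader.
bool durability_compatible(Durability offered, Durability requested) {
    return static_cast<int>(offered) >= static_cast<int>(requested);
}
```

For example, `durability_compatible(Durability::Volatile, Durability::TransientLocal)` returns false, which is the case this rule reports.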

OwnershipConflict

Problem: Multiple writers claim EXCLUSIVE ownership on the same topic/instance.

Symptoms:

  • Only the highest-strength writer's data is delivered
  • Confusing behavior during failover

Detection: Counts writers with ownership.kind = EXCLUSIVE per topic.
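
The per-topic count described above could look roughly like the following sketch. `WriterInfo` and `exclusive_conflicts` are hypothetical names introduced for illustration; the advisor's actual data model is not documented here.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical record for a discovered writer; not the advisor's real data model.
struct WriterInfo {
    std::string topic;
    bool exclusive;  // true when ownership.kind is EXCLUSIVE
};

// Returns the topics that have more than one EXCLUSIVE writer.
std::vector<std::string> exclusive_conflicts(const std::vector<WriterInfo>& writers) {
    std::map<std::string, int> exclusive_count;
    for (const auto& w : writers) {
        if (w.exclusive) {
            ++exclusive_count[w.topic];
        }
    }
    std::vector<std::string> conflicts;
    for (const auto& entry : exclusive_count) {
        if (entry.second > 1) {
            conflicts.push_back(entry.first);
        }
    }
    return conflicts;
}
```

Topics with several SHARED writers are not reported, because only EXCLUSIVE writers are counted.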

Fix:

  • Use SHARED ownership for multi-writer scenarios
  • Or assign different ownership_strength values for failover

LivelinessTimeout

Problem: A writer's liveliness lease has expired, indicating potential crash or network partition.

Symptoms:

  • Large gaps in message timestamps
  • Subscriber's on_liveliness_changed() callback triggered

Detection: Analyzes inter-message gaps vs. median arrival rate.
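
As a rough illustration of comparing inter-message gaps against the median, the sketch below flags any gap larger than a multiple of the median gap. The function name `find_large_gaps` and the `factor` parameter are assumptions made for this example; the threshold the advisor actually uses is not stated.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the indices of gaps (gap i lies between timestamps[i] and timestamps[i + 1])
// that exceed factor * median gap. Timestamps are assumed to be sorted, in seconds.
std::vector<std::size_t> find_large_gaps(const std::vector<double>& timestamps, double factor) {
    std::vector<std::size_t> flagged;
    if (timestamps.size() < 3) {
        return flagged;  // fewer than two gaps: no meaningful median
    }
    std::vector<double> gaps;
    for (std::size_t i = 1; i < timestamps.size(); ++i) {
        gaps.push_back(timestamps[i] - timestamps[i - 1]);
    }
    std::vector<double> sorted = gaps;
    std::sort(sorted.begin(), sorted.end());
    const double median = sorted[sorted.size() / 2];  // upper median for even counts
    for (std::size_t i = 0; i < gaps.size(); ++i) {
        if (gaps[i] > factor * median) {
            flagged.push_back(i);
        }
    }
    return flagged;
}
```

With timestamps 0, 1, 2, 3, 10, 11 and a factor of 3, only the gap between 3 and 10 is flagged.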

Fix:

// Configure appropriate lease duration
DataWriterQos qos;
qos.liveliness().kind = AUTOMATIC_LIVELINESS_QOS;
qos.liveliness().lease_duration = Duration_t(1, 0); // 1 second

ResourceLimitsNearMax

Problem: Resource usage is approaching the configured limits (max_samples, max_instances, history depth).

Symptoms:

  • Sample rejection
  • on_sample_rejected() callback triggered
  • Memory pressure

Detection: Tracks message counts per topic vs. estimated limits.
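
A "near the limit" test of this kind can be sketched as a simple ratio check. The function name `near_limit` and the `threshold` fraction are assumptions for illustration; the sketch treats a non-positive limit as unlimited, mirroring the DDS convention of LENGTH_UNLIMITED being -1.

```cpp
// Returns true when the current count has reached the given fraction of the limit.
// A non-positive limit is treated as "unlimited" (assumption based on LENGTH_UNLIMITED = -1).
bool near_limit(long current, long limit, double threshold) {
    if (limit <= 0) {
        return false;
    }
    return static_cast<double>(current) >= threshold * static_cast<double>(limit);
}
```

For instance, `near_limit(9500, 10000, 0.9)` returns true, while `near_limit(100, 10000, 0.9)` returns false.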

Fix:

// Increase limits, or speed up the reader so samples are consumed faster
DataReaderQos qos;
qos.resource_limits().max_samples = 10000;
qos.history().depth = 100;

TypeConsistencyWarning

Problem: Different payload sizes are detected on the same topic, suggesting a type mismatch.

Symptoms:

  • Deserialization failures
  • Corrupted data
  • CDR decode errors

Detection: Clusters payload sizes and detects multimodal distributions.
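
One very simplified way to picture the multimodal check is to build a histogram of payload sizes and count how many distinct sizes each carry a significant share of the samples. This is a crude stand-in for real clustering, with hypothetical names (`count_size_modes`, `min_share`), and it only makes sense for fixed-size types, since types with strings or sequences vary in size legitimately.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Counts the distinct payload sizes that each account for at least
// min_share (a fraction between 0 and 1) of all observed samples.
std::size_t count_size_modes(const std::vector<std::size_t>& sizes, double min_share) {
    if (sizes.empty()) {
        return 0;
    }
    std::map<std::size_t, std::size_t> histogram;
    for (std::size_t s : sizes) {
        ++histogram[s];
    }
    std::size_t modes = 0;
    for (const auto& bin : histogram) {
        const double share =
            static_cast<double>(bin.second) / static_cast<double>(sizes.size());
        if (share >= min_share) {
            ++modes;
        }
    }
    return modes;
}
```

A result of two or more would suggest that writers on the topic are publishing differently shaped payloads.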

Fix:

  • Verify all publishers use identical IDL types
  • Enable strict type checking:
TypeConsistencyEnforcementQosPolicy qos;
qos.kind = DISALLOW_TYPE_COERCION;

Severity Levels

Level | Color | Action Required
Critical | Red | Immediate attention - system may fail
High | Orange | Fix before production deployment
Medium | Yellow | Should be addressed
Low | Blue | Informational / optimization opportunity

CLI Usage

Run QoS analysis in headless mode:

# Text output
hdds-viewer --analyze capture.hddscap

# JSON output for CI/CD integration
hdds-viewer --analyze capture.hddscap --format json

# Exit code reflects highest severity
echo $? # 0=OK, 1=Low, 2=Medium, 3=High, 4=Critical

Integration with CI/CD

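# GitHub Actions example
- name: QoS Analysis
  run: |
    # The default shell runs with "bash -e", so capture the exit code
    # instead of letting any non-zero status abort the step early.
    status=0
    hdds-viewer --analyze test-capture.hddscap --format json > qos-report.json || status=$?
    # Fail the job only on High (3) or Critical (4) findings
    if [ "$status" -ge 3 ]; then
      echo "QoS issues detected!"
      exit 1
    fi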

See Also