Retry and Queueing

Sending Queue

A sending queue is a buffer that temporarily stores telemetry data before it is sent to the destination. It helps prevent data loss during network connectivity issues or server outages, and helps limit the number of network connections needed for transmission. The queue is bounded: once it holds the maximum number of batches, additional batches are dropped.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sending_queue_enabled | bool | true | Enable to buffer telemetry data temporarily before sending. |
| sending_queue_num_consumers | int | 10 | The number of consumers that dequeue batches. |
| sending_queue_queue_size | int | 5000 | Maximum number of batches kept in memory before dropping. |
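
The behavior described by these parameters can be sketched in a few lines of Python. This is an illustrative model only, not the collector's implementation: the `SendingQueue` class, its `offer`, `start`, and `drain` methods, and the `send` callable are hypothetical names introduced for the example. It assumes a bounded in-memory queue that rejects new batches when full and a fixed pool of consumer threads that dequeue and send them.

```python
import queue
import threading


class SendingQueue:
    """Illustrative bounded in-memory queue drained by consumer threads."""

    def __init__(self, send, num_consumers=10, queue_size=5000):
        self._send = send  # hypothetical callable that transmits one batch
        self._queue = queue.Queue(maxsize=queue_size)
        self._num_consumers = num_consumers
        self.dropped = 0  # batches rejected because the queue was full
        self.failed = 0   # batches whose send() raised an exception

    def offer(self, batch):
        """Enqueue a batch; return False and count a drop if the queue is full."""
        try:
            self._queue.put_nowait(batch)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def start(self):
        """Start the consumer threads that dequeue and send batches."""
        for _ in range(self._num_consumers):
            threading.Thread(target=self._consume, daemon=True).start()

    def _consume(self):
        while True:
            batch = self._queue.get()
            try:
                self._send(batch)
            except Exception:
                # A real exporter would hand the batch to its retry logic here.
                self.failed += 1
            finally:
                self._queue.task_done()

    def drain(self):
        """Block until every queued batch has been passed to send()."""
        self._queue.join()


if __name__ == "__main__":
    sent = []
    q = SendingQueue(send=sent.append, num_consumers=2, queue_size=3)
    accepted = [q.offer(i) for i in range(5)]  # consumers not started yet
    q.start()
    q.drain()
    print(accepted, q.dropped, sorted(sent))
```

In this model, a larger queue size absorbs longer outages at the cost of memory, while more consumers allow more batches to be sent concurrently.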

Persistent Queue

In addition to the sending queue, the persistent queue may be enabled. When enabled, telemetry data is persisted to disk instead of being held in memory, which provides data resiliency when the collector restarts. Persistent queuing requires the sending queue to be enabled.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| persistent_queue_enabled | bool | true | Enable to buffer telemetry data to disk instead of in memory. |
| persistent_queue_directory | string | $OIQ_OTEL_COLLECTOR_HOME/storage | The path to a directory where telemetry data will be buffered. |
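
To illustrate why buffering to disk survives a restart, here is a minimal Python sketch. It does not reflect the collector's actual storage format: the `DiskQueue` class and its `put`, `pending`, and `ack` methods are hypothetical, and the example assumes one JSON file per batch in the buffer directory.

```python
import json
import os
import tempfile
from pathlib import Path


class DiskQueue:
    """Illustrative disk-backed queue: one JSON file per buffered batch."""

    def __init__(self, directory):
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)
        # Resume numbering after any batches left over from a previous run.
        existing = sorted(self._dir.glob("*.json"))
        self._next_id = int(existing[-1].stem) + 1 if existing else 0

    def put(self, batch):
        """Write a batch to disk, renaming into place so no partial file is visible."""
        path = self._dir / f"{self._next_id:012d}.json"
        tmp = path.with_suffix(".tmp")
        tmp.write_text(json.dumps(batch))
        os.replace(tmp, path)
        self._next_id += 1
        return path

    def pending(self):
        """Return (path, batch) pairs for unsent batches, oldest first."""
        return [(p, json.loads(p.read_text())) for p in sorted(self._dir.glob("*.json"))]

    def ack(self, path):
        """Delete a batch file once the batch has been sent successfully."""
        Path(path).unlink()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as storage:
        DiskQueue(storage).put({"logs": ["a", "b"]})
        # A new instance over the same directory stands in for a restarted collector.
        restarted = DiskQueue(storage)
        print([batch for _, batch in restarted.pending()])
```

Because each batch lives in the directory until it is acknowledged, a process that starts up over the same directory finds the unsent batches still waiting.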

Retry on Failure

Retry on failure settings determine whether the exporter attempts to resend telemetry data that failed to reach the destination endpoint. When enabled, the exporter automatically retries failed transmissions, waiting the initial interval after the first failure and backing off up to the maximum interval between later attempts. Retries continue until the data is transmitted or the maximum elapsed time is reached; a maximum elapsed time of 0 removes this limit. This helps prevent data loss from temporary network connectivity issues or server outages.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| retry_on_failure_enabled | bool | true | Attempt to resend telemetry data that has failed to be transmitted to the destination. |
| retry_on_failure_initial_interval | int | 5 | Time (in seconds) to wait after the first failure before retrying. |
| retry_on_failure_max_interval | int | 30 | The upper bound (in seconds) on backoff. |
| retry_on_failure_max_elapsed_time | int | 300 | The maximum amount of time (in seconds) spent trying to send a batch, used to avoid a never-ending retry loop. When set to 0, the retries are never stopped. |
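
The interaction between the three timing parameters can be seen by computing a retry schedule. The Python sketch below is a simplified model under stated assumptions: the `backoff_waits` function is hypothetical, the growth factor of 1.5 between waits is an assumption (this page does not state one), and random jitter is left out so the result is deterministic.

```python
from itertools import islice


def backoff_waits(initial_interval=5.0, max_interval=30.0,
                  max_elapsed_time=300.0, multiplier=1.5):
    """Yield the wait (in seconds) before each retry of one failed batch.

    The multiplier is an assumed growth factor, not a documented setting.
    A max_elapsed_time of 0 means the schedule never ends.
    """
    elapsed = 0.0
    interval = float(initial_interval)
    # Stop once the next wait would push total time past the limit.
    while max_elapsed_time == 0 or elapsed + interval <= max_elapsed_time:
        yield interval
        elapsed += interval
        # Grow the wait, but never beyond the configured upper bound.
        interval = min(interval * multiplier, max_interval)


if __name__ == "__main__":
    waits = list(backoff_waits())
    print(len(waits), waits[:6], sum(waits))

    # With the limit disabled the generator is infinite, so take a slice.
    print(list(islice(backoff_waits(max_elapsed_time=0), 8)))
```

Under these assumptions and the default values, the waits grow from 5 seconds until they reach the 30 second cap, then repeat at 30 seconds until the 300 second budget is exhausted, at which point the batch is given up on.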