Retry and Queueing
Sending Queue
The sending queue is a buffer that holds telemetry data temporarily before it is sent to the destination. It reduces the risk of losing telemetry data during network connectivity issues or server outages, and it helps limit the number of network connections needed for transmission. Because the in-memory queue is bounded, batches are dropped once it is full.
Parameter | Type | Default | Description |
---|---|---|---|
sending_queue_enabled | bool | true | Buffer telemetry data temporarily before sending. |
sending_queue_num_consumers | int | 10 | The number of consumers that dequeue batches. |
sending_queue_queue_size | int | 5000 | Maximum number of batches kept in memory before dropping. |
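
The interaction between the queue size and the number of consumers can be pictured with a minimal Python sketch. It is an illustration only, not the collector's implementation: the `SendingQueue` class, its `offer` and `drain` methods, and the `send` callback are hypothetical names, and only the two default values come from the table above.

```python
import queue
import threading

# Defaults taken from the table above; every other name here is hypothetical.
NUM_CONSUMERS = 10
QUEUE_SIZE = 5000


class SendingQueue:
    """Bounded in-memory buffer drained by a fixed pool of consumer threads."""

    def __init__(self, send, num_consumers=NUM_CONSUMERS, queue_size=QUEUE_SIZE):
        self._queue = queue.Queue(maxsize=queue_size)
        self._send = send  # callable that transmits one batch
        self.dropped = 0
        self._workers = [
            threading.Thread(target=self._consume, daemon=True)
            for _ in range(num_consumers)
        ]

    def start(self):
        """Start the consumer threads (call once)."""
        for worker in self._workers:
            worker.start()

    def offer(self, batch):
        """Enqueue a batch; return False and count a drop if the queue is full."""
        try:
            self._queue.put_nowait(batch)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _consume(self):
        # Each consumer dequeues one batch at a time and hands it to send().
        while True:
            batch = self._queue.get()
            try:
                self._send(batch)
            finally:
                self._queue.task_done()

    def drain(self):
        """Block until every queued batch has been handed to send()."""
        self._queue.join()
```

In this sketch, a larger queue size tolerates longer outages at the cost of memory, while more consumers drain the queue faster by sending batches in parallel.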
Persistent Queue
A persistent queue can be enabled in addition to the sending queue. When it is enabled, queued telemetry data is written to disk, so buffered data survives a collector restart. Persistent queuing requires the sending queue to be enabled.
Parameter | Type | Default | Description |
---|---|---|---|
persistent_queue_enabled | bool | true | Enable to buffer telemetry data to disk instead of in memory. |
persistent_queue_directory | string | $OIQ_OTEL_COLLECTOR_HOME/storage | The path to a directory where telemetry data will be buffered. |
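
To show why buffering to disk survives a restart, here is a small Python sketch. It assumes a simple one-file-per-batch layout, which is not necessarily how the collector stores its data; the `DiskBuffer` class and its methods are hypothetical and exist only for illustration.

```python
import json
import os
import time
import uuid


class DiskBuffer:
    """Stores each batch as a JSON file so pending batches survive a restart.

    Hypothetical layout for illustration; not the collector's storage format.
    """

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def put(self, batch):
        """Write the batch to disk and return its file name."""
        # A timestamp prefix keeps names roughly in arrival order.
        name = f"{time.time_ns():020d}-{uuid.uuid4().hex}.json"
        final_path = os.path.join(self.directory, name)
        tmp_path = final_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(batch, f)
        # Rename after writing so a crash never leaves a half-written batch.
        os.replace(tmp_path, final_path)
        return name

    def pending(self):
        """Return the names of batches that have not been acknowledged yet."""
        return sorted(n for n in os.listdir(self.directory) if n.endswith(".json"))

    def load(self, name):
        """Read a stored batch back from disk."""
        with open(os.path.join(self.directory, name), encoding="utf-8") as f:
            return json.load(f)

    def ack(self, name):
        """Delete a batch once the destination has accepted it."""
        os.remove(os.path.join(self.directory, name))
```

A process that starts with the same directory sees every batch that was written but not yet acknowledged, which is the property the persistent queue adds over the in-memory queue.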
Retry on Failure
Retry on failure settings determine whether the exporter attempts to resend telemetry data that failed to reach the destination endpoint. When enabled, the exporter retries failed transmissions automatically: it waits the initial interval after the first failure, backs off up to the maximum interval between later attempts, and stops once the maximum elapsed time is reached (or never stops if that value is 0). This helps prevent data loss during temporary network connectivity issues or server outages.
Parameter | Type | Default | Description |
---|---|---|---|
retry_on_failure_enabled | bool | true | Attempt to resend telemetry data that has failed to be transmitted to the destination. |
retry_on_failure_initial_interval | int | 5 | Time (in seconds) to wait after the first failure before retrying. |
retry_on_failure_max_interval | int | 30 | The upper bound (in seconds) on backoff. |
retry_on_failure_max_elapsed_time | int | 300 | The maximum amount of time (in seconds) spent trying to send a batch, used to avoid a never-ending retry loop. When set to 0, retries never stop. |
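
The way the three timing parameters combine can be sketched in Python as follows. This is a simplified model under stated assumptions: the doubling `multiplier` is an assumption (the table does not specify how the interval grows), no random jitter is applied, and only wait time counts toward the elapsed budget. The `backoff_intervals` function is a hypothetical name.

```python
import itertools


def backoff_intervals(initial_interval=5.0, max_interval=30.0,
                      max_elapsed_time=300.0, multiplier=2.0):
    """Yield the wait (in seconds) before each retry.

    The defaults for the first three arguments come from the table above.
    The multiplier is an assumed growth factor, not a documented setting.
    """
    interval = initial_interval
    elapsed = 0.0
    while True:
        # A budget of 0 means "retry forever", so the check is skipped.
        if max_elapsed_time > 0 and elapsed + interval > max_elapsed_time:
            return
        yield interval
        elapsed += interval
        # Grow the wait, but never beyond the configured upper bound.
        interval = min(interval * multiplier, max_interval)


# With the defaults, the schedule is finite because of the elapsed-time budget.
default_waits = list(backoff_intervals())

# With max_elapsed_time=0 the generator never ends, so take only the first five waits.
first_five = list(itertools.islice(backoff_intervals(max_elapsed_time=0), 5))
```

Under these assumptions the wait grows from the initial interval until it reaches the maximum interval and then stays there, and the batch is given up on once the total wait would exceed the maximum elapsed time.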