Deduplicate Logs

Description

The Deduplicate Logs processor can be used to deduplicate logs over a time range and emit a single log with the count of duplicate logs.

Logs are considered duplicates if the following match:

  • Severity
  • Log Body
  • Resource Attributes
  • Log Attributes
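
As a rough illustration of the matching rule above, the sketch below shows two logs that agree on all four properties and the single log that might be emitted for them. The record layout, field values, and timestamp format are hypothetical and do not represent the processor's actual data format. The count of 2 assumes the count attribute reports the total number of matching logs seen during the interval.

```yaml
# Hypothetical layout for illustration only; not the processor's real data format.
received_within_interval:
  - severity: ERROR
    body: 'connection refused'
    resource_attributes:
      host.name: web-01
    attributes:
      component: db-client
  - severity: ERROR
    body: 'connection refused'
    resource_attributes:
      host.name: web-01
    attributes:
      component: db-client

emitted_after_interval:
  severity: ERROR
  body: 'connection refused'
  resource_attributes:
    host.name: web-01
  attributes:
    component: db-client
    log_count: 2                                      # assumed: total matching logs in the interval
    first_observed_timestamp: '2024-01-01T00:00:01Z'  # value and format are assumptions
    last_observed_timestamp: '2024-01-01T00:00:07Z'   # value and format are assumptions
```

If any one of the four properties differed between the two logs, they would not be treated as duplicates under the rule above.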

Supported Types

Metrics | Logs | Traces

Configuration Table

Parameter | Type | Default | Description
interval* | int | 10 | The interval, in seconds, over which to aggregate logs. An aggregated log is emitted after the interval passes.
log_count_attribute* | string | log_count | The name of the count attribute added to the emitted log for deduplicated logs.
timezone* | string | UTC | The timezone of the first_observed_timestamp and last_observed_timestamp log attributes on the emitted log.
exclude_fields | strings | | A list of fields to exclude from duplicate matching. Fields can be excluded from the log body or attributes. These fields will not be present in the emitted log. More details can be found in the exclude_fields Parameter section below.

*required field

exclude_fields Parameter

The exclude_fields parameter removes fields from consideration when looking for duplicate logs. Fields can be excluded from either the body or the attributes of a log, but the entire body cannot be excluded. Nested fields are specified by separating each part of the path with a period (.). If a field name itself contains a period, escape that period with a backslash (\.).

Below are a few examples and how to specify them:

  • Exclude timestamp field from the body -> body.timestamp
  • Exclude a host.name field from the log attributes -> attributes.host\.name
  • Exclude a nested ip field inside a src attribute -> attributes.src.ip
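
The three paths above could be combined in a processor's parameters as in the fragment below, which follows the name/value parameter layout used in this document's example configurations. It is a sketch of one possible combination, not a recommended set of exclusions.

```yaml
# Fragment of spec.parameters; the three paths are the examples listed above.
- name: exclude_fields
  value:
    - 'body.timestamp'         # timestamp field in the log body
    - 'attributes.host\.name'  # attribute whose name is literally host.name
    - 'attributes.src.ip'      # ip field nested under the src attribute
```

Single quotes keep the backslash literal in YAML. In a double-quoted YAML string, \. is not a valid escape sequence, so single-quoted strings are the safer choice for escaped paths.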

Example Configuration

Basic Configuration

This example sets a custom log_count_attribute and timezone while deduplicating logs on a 60-second interval.

Web Interface

[Image 1: Deduplicate Logs processor configured in the web interface]

Standalone Processor

yaml
apiVersion: bindplane.observiq.com/v1
kind: Processor
metadata:
  id: log-dedup
  name: log-dedup
spec:
  type: log_dedup
  parameters:
    - name: interval
      value: 60
    - name: log_count_attribute
      value: 'dedup_count'
    - name: timezone
      value: 'America/Los_Angeles'

Exclude Fields

This example adds exclude_fields to the configuration. More information on exclude_fields can be found in the exclude_fields Parameter section above.

Web Interface

[Image 2: Deduplicate Logs processor with exclude_fields configured in the web interface]

Standalone Processor

yaml
apiVersion: bindplane.observiq.com/v1
kind: Processor
metadata:
  id: exclude-fields
  name: exclude-fields
spec:
  type: log_dedup
  parameters:
    - name: interval
      value: 10
    - name: log_count_attribute
      value: 'log_count'
    - name: timezone
      value: 'UTC'
    - name: exclude_fields
      value:
        - 'attributes.timestamp'
        - 'body.time'
        - 'attributes.log\.file\.name'