We’re excited to announce that we’ve recently contributed Redis monitoring support to the OpenTelemetry collector. You can check it out here!
You can use this receiver with any OTel collector, including the OpenTelemetry Collector and observIQ’s distribution of the collector.
Below are the steps to get up and running quickly with observIQ’s distribution and ship Redis metrics to a popular backend, Google Cloud Operations. You can find out more on observIQ’s GitHub page: https://github.com/observIQ/observiq-otel-collector
What signals matter?
Compared with many other databases, monitoring the performance of Redis is relatively simple. It focuses on the following categories of KPIs:
- Memory Utilization
- Database Throughput
- Cache hit ratio and evicted cache data
- Number of connections
- Replication
All of the above categories can be gathered with the Redis receiver – so let’s get started.
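Redis does not report the cache hit ratio directly. It is usually derived from two counters in the output of `redis-cli INFO stats`: `keyspace_hits` and `keyspace_misses`. The following is a minimal sketch of that calculation. The function name `hit_ratio` is made up for illustration, and the sample input is invented rather than taken from a live server.

```shell
#!/bin/sh
# hit_ratio: read Redis INFO output on stdin and print hits / (hits + misses).
# The function name is illustrative; it is not part of Redis or the collector.
hit_ratio() {
  # INFO lines end in CRLF, so strip carriage returns before parsing.
  tr -d '\r' | awk -F: '
    $1 == "keyspace_hits"   { hits = $2 }
    $1 == "keyspace_misses" { misses = $2 }
    END {
      total = hits + misses
      if (total > 0) printf "%.4f\n", hits / total
      else print "n/a"   # no lookups recorded yet
    }'
}

# Example with made-up sample data instead of a live server:
printf 'keyspace_hits:900\r\nkeyspace_misses:100\r\n' | hit_ratio
```

With the sample data this prints 0.9000. Against a running server, the same function could be fed with `redis-cli INFO stats | hit_ratio`, assuming `redis-cli` is installed and can reach the instance.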
Step 1: Installing the collector
The simplest way to get started is with one of the single-line installation commands shown below, which install the observIQ distribution of the OpenTelemetry Collector. For more advanced options, you’ll find a variety of installation options for Linux, Windows, and macOS on GitHub.
Please note that the collector must be installed on the Redis system.
Windows:
msiexec /i "https://github.com/observIQ/observiq-otel-collector/releases/latest/download/observiq-otel-collector.msi" /quiet
macOS/Linux:
sudo sh -c "$(curl -fsSlL https://github.com/observiq/observiq-otel-collector/releases/latest/download/install_unix.sh)" install_unix.sh
Step 2: Setting up prerequisites and authentication credentials
In the following example, we are using Google Cloud Operations as the destination. However, OpenTelemetry offers exporters for many destinations. Check out the list of exporters here.
Setting up Google Cloud exporter prerequisites:
If the collector is running outside of Google Cloud (on-premises, AWS, etc.) or without the Cloud Monitoring scope, the Google exporter requires a service account.
Create a service account with the following roles:
- Metrics: roles/monitoring.metricWriter
- Logs: roles/logging.logWriter
Create a service account JSON key and place it on the system that is running the collector.
macOS/Linux
In this example, the key is placed at /opt/observiq-otel-collector/sa.json and its permissions are restricted to the user running the collector process.
sudo cp sa.json /opt/observiq-otel-collector/sa.json
sudo chown observiq-otel-collector: /opt/observiq-otel-collector/sa.json
sudo chmod 0400 /opt/observiq-otel-collector/sa.json
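To confirm that the key ended up with the intended mode, a check along the following lines can be used. This is a sketch that assumes GNU `stat` as found on Linux; the BSD `stat` shipped with macOS uses different flags. The function name `key_mode_ok` is hypothetical.

```shell
#!/bin/sh
# key_mode_ok: succeed only if the file's permission bits are exactly 400.
# Hypothetical helper; assumes GNU coreutils stat (-c %a prints the octal mode).
key_mode_ok() {
  [ "$(stat -c %a "$1")" = "400" ]
}

# Demonstrate on a throwaway file rather than the real key.
tmp=$(mktemp)
chmod 0400 "$tmp"
if key_mode_ok "$tmp"; then echo "mode ok"; else echo "mode too permissive"; fi
rm -f "$tmp"
```

For the key in this example, the call would be `key_mode_ok /opt/observiq-otel-collector/sa.json`.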
Set the GOOGLE_APPLICATION_CREDENTIALS environment variable by creating a systemd override. A systemd override lets you change the service configuration without modifying the service unit itself, so package upgrades continue to happen seamlessly. You can learn more about systemd units and overrides here.
Run the following command:
sudo systemctl edit observiq-otel-collector
If this is the first time an override is being created, paste the following contents into the file:
[Service]
Environment=GOOGLE_APPLICATION_CREDENTIALS=/opt/observiq-otel-collector/sa.json
If an override is already in place, simply insert the Environment parameter into the existing Service section.
Restart the collector:
sudo systemctl restart observiq-otel-collector
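After the restart, it can be useful to check that the override was picked up. `systemctl show observiq-otel-collector --property=Environment` prints the environment configured for the unit. The sketch below inspects that output; `has_credentials_env` is a hypothetical helper name, and the sample line is made up so the snippet runs without a live service.

```shell
#!/bin/sh
# has_credentials_env: read "systemctl show --property=Environment" output on
# stdin and succeed if GOOGLE_APPLICATION_CREDENTIALS is among the variables.
# Hypothetical helper name, for illustration only.
has_credentials_env() {
  grep -q 'GOOGLE_APPLICATION_CREDENTIALS='
}

# Sample line in the format systemctl prints (made-up input, no live service):
sample='Environment=GOOGLE_APPLICATION_CREDENTIALS=/opt/observiq-otel-collector/sa.json'
if printf '%s\n' "$sample" | has_credentials_env; then
  echo "override applied"
else
  echo "override missing"
fi
```

On the collector host, the real output can be piped in instead: `systemctl show observiq-otel-collector --property=Environment | has_credentials_env`.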
Windows
In this example, the key is placed at C:/observiq/collector/sa.json.
Set the GOOGLE_APPLICATION_CREDENTIALS environment variable with the setx command in Command Prompt.
Run the following command:
setx GOOGLE_APPLICATION_CREDENTIALS "C:/observiq/collector/sa.json" /m
Restart the collector service using the Services application.
Step 3: Configuring the Redis receiver
After the installation, the config file for the collector can be found at:
- C:\Program Files\observIQ OpenTelemetry Collector\config.yaml (Windows)
- /opt/observiq-otel-collector/config.yaml (Linux)
Edit the configuration file to use the following configuration, then restart the collector so the change takes effect.
receivers:
  redis:
    endpoint: "localhost:6379"
    collection_interval: 60s
processors:
  # Resourcedetection is used to add a unique (host.name)
  # to the metric resource(s), allowing users to filter
  # between multiple agent systems.
  resourcedetection:
    detectors: ["system"]
    system:
      hostname_sources: ["os"]
  # Used for Google generic_node mapping.
  resource:
    attributes:
      - key: namespace
        value: "redis"
        action: upsert
      - key: location
        value: "global"
        action: upsert
  normalizesums:
  batch:
exporters:
  googlecloud:
    retry_on_failure:
      enabled: false
    metric:
      prefix: workload.googleapis.com
    resource_mappings:
      - source_type: ""
        target_type: generic_node
        label_mappings:
          - source_key: host.name
            target_key: node_id
          - source_key: location
            target_key: location
          - source_key: namespace
            target_key: namespace
service:
  pipelines:
    metrics:
      receivers:
        - redis
      processors:
        - resourcedetection
        - resource
        - normalizesums
        - batch
      exporters:
        - googlecloud
In the example above, the configuration does the following:
- The Redis receiver collects metrics from the Redis system at the specified endpoint.
- The collection_interval parameter sets how often the metrics are fetched. The default value for this parameter is 10s. However, when exporting metrics to Google Cloud Operations, it is set to 60s, as in this example.
- The resource detection processor adds the host name to the metric resource, which distinguishes metrics received from multiple Redis systems. This helps with filtering metrics from specific Redis hosts in the monitoring tool, in this case Google Cloud Operations.
- The Google Cloud exporter applies the following mapping:
  - The target type is set to a generic node, to simplify filtering metrics from the collector in Cloud Monitoring.
  - node_id, location, and namespace are set for the metrics. Location and namespace come from the resource processor.
  - Note that the project ID is not set in the googlecloud exporter configuration. Google detects the project ID automatically.
- The normalizesums processor excludes the first data point, which has a zero value, once the configuration is in place and the collector is restarted. To learn more about this processor, check the OpenTelemetry documentation.
- The batch processor bundles the metrics from multiple receivers. We highly recommend using this processor in the configuration, especially for the benefit of the logging component of the collector. To learn more about this processor, check the documentation.
- We recommend setting retry_on_failure to false. If it is not set, failed exports are retried in a loop of five attempts.
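Before restarting the collector with this configuration, it can be worth confirming that Redis actually answers at the configured endpoint. The sketch below only builds `redis-cli` connection flags from a `host:port` string such as the receiver's `endpoint` value. `endpoint_flags` is a hypothetical helper name, and the approach assumes `redis-cli` is installed on the host.

```shell
#!/bin/sh
# endpoint_flags: split a "host:port" endpoint into redis-cli -h/-p flags.
# Hypothetical helper for illustration; it does not contact Redis itself.
endpoint_flags() {
  host=${1%:*}    # text before the last colon
  port=${1##*:}   # text after the last colon
  printf '%s %s %s %s\n' -h "$host" -p "$port"
}

endpoint_flags "localhost:6379"
```

This prints `-h localhost -p 6379`. Running `redis-cli $(endpoint_flags localhost:6379) PING` on the Redis host should then return PONG if the server is reachable and does not require authentication.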
Step 4: Viewing the metrics collected in Google Cloud Operations
If you followed the steps detailed above, you should see the following metrics exported to Metrics Explorer.
# | Metric | Description | Namespace |
---|---|---|---|
1 | redis.uptime | Number of seconds since Redis server start | custom.googleapis.com/opencensus/redis.uptime |
2 | redis.cpu.time | System CPU consumed by the Redis server in seconds since server start | custom.googleapis.com/opencensus/redis.cpu.time |
3 | redis.clients.connected | Number of client connections (excluding connections from replicas) | custom.googleapis.com/opencensus/redis.clients.connected |
4 | redis.clients.max_input_buffer | Biggest input buffer among current client connections | custom.googleapis.com/opencensus/redis.clients.max_input_buffer |
5 | redis.clients.max_output_buffer | Longest output list among current client connections | custom.googleapis.com/opencensus/redis.clients.max_output_buffer |
6 | redis.clients.blocked | Number of clients pending on a blocking call | custom.googleapis.com/opencensus/redis.clients.blocked |
7 | redis.keys.expired | Total number of key expiration events | custom.googleapis.com/opencensus/redis.keys.expired |
8 | redis.keys.evicted | Number of evicted keys due to maxmemory limit | custom.googleapis.com/opencensus/redis.keys.evicted |
9 | redis.connections.received | Total number of connections accepted by the server | custom.googleapis.com/opencensus/redis.connections.received |
10 | redis.connections.rejected | Number of connections rejected because of maxclients limit | custom.googleapis.com/opencensus/redis.connections.rejected |
11 | redis.memory.used | Total number of bytes allocated by Redis using its allocator | custom.googleapis.com/opencensus/redis.memory.used |
12 | redis.memory.peak | Peak memory consumed by Redis (in bytes) | custom.googleapis.com/opencensus/redis.memory.peak |
13 | redis.memory.rss | Number of bytes that Redis allocated as seen by the operating system | custom.googleapis.com/opencensus/redis.memory.rss |
14 | redis.memory.lua | Number of bytes used by the Lua engine | custom.googleapis.com/opencensus/redis.memory.lua |
15 | redis.memory.fragmentation_ratio | Ratio between used_memory_rss and used_memory | custom.googleapis.com/opencensus/redis.memory.fragmentation_ratio |
16 | redis.rdb.changes_since_last_save | Number of changes since the last dump | custom.googleapis.com/opencensus/redis.rdb.changes_since_last_save |
17 | redis.commands | Number of commands processed per second | custom.googleapis.com/opencensus/redis.commands |
18 | redis.commands.processed | Total number of commands processed by the server | custom.googleapis.com/opencensus/redis.commands.processed |
19 | redis.net.input | The total number of bytes read from the network | custom.googleapis.com/opencensus/redis.net.input |
20 | redis.net.output | The total number of bytes written to the network | custom.googleapis.com/opencensus/redis.net.output |
21 | redis.keyspace.hits | Number of successful lookup of keys in the main dictionary | custom.googleapis.com/opencensus/redis.keyspace.hits |
22 | redis.keyspace.misses | Number of failed lookup of keys in the main dictionary | custom.googleapis.com/opencensus/redis.keyspace.misses |
23 | redis.latest_fork | Duration of the latest fork operation in microseconds | custom.googleapis.com/opencensus/redis.latest_fork |
24 | redis.slaves.connected | Number of connected replicas | custom.googleapis.com/opencensus/redis.slaves.connected |
25 | redis.replication.backlog_first_byte_offset | The master offset of the replication backlog buffer | custom.googleapis.com/opencensus/redis.replication.backlog_first_byte_offset |
26 | redis.replication.offset | The server's current replication offset | custom.googleapis.com/opencensus/redis.replication.offset |
27 | redis.db.keys | Number of keyspace keys | custom.googleapis.com/opencensus/redis.db.keys |
28 | redis.db.expires | Number of keyspace keys with an expiration | custom.googleapis.com/opencensus/redis.db.expires |
29 | redis.db.avg_ttl | Average keyspace keys TTL | custom.googleapis.com/opencensus/redis.db.avg_ttl |
To view the metrics, follow the steps outlined below:
- In the Google Cloud Console, go to Metrics Explorer.
- Select generic node as the resource type.
- Find the metric by its namespace from the table above and filter on it to view the chart.
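As a sanity check on a charted value, one of the derived metrics can be recomputed by hand. The table describes redis.memory.fragmentation_ratio as the ratio between used_memory_rss and used_memory, both of which appear in the output of `redis-cli INFO memory`. The following is a rough sketch; `frag_ratio` is a hypothetical function name and the sample numbers are invented.

```shell
#!/bin/sh
# frag_ratio: read Redis INFO output on stdin and print
# used_memory_rss / used_memory. Illustrative helper, not part of Redis.
frag_ratio() {
  # Strip the carriage returns that terminate INFO lines, then match the
  # two field names exactly so used_memory_peak and friends are ignored.
  tr -d '\r' | awk -F: '
    $1 == "used_memory_rss" { rss = $2 }
    $1 == "used_memory"     { used = $2 }
    END {
      if (used > 0) printf "%.2f\n", rss / used
      else print "n/a"
    }'
}

# Example with made-up sample data instead of a live server:
printf 'used_memory:1000000\r\nused_memory_rss:1500000\r\n' | frag_ratio
```

With the sample data this prints 1.50. On the Redis host, `redis-cli INFO memory | frag_ratio` would give a figure to compare against the chart in Metrics Explorer.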
observIQ’s distribution is a game-changer for companies looking to implement the OpenTelemetry standards. The single-line installer and the seamlessly integrated pool of receivers, exporters, and processors make working with this collector simple. Follow this space to keep up with all our future posts and simplified configurations for various sources. For questions, requests, and suggestions, reach out to our support team at support@observIQ.com.