We are constantly working on contributing monitoring support for various sources. The latest addition is support for Cassandra monitoring using the OpenTelemetry Collector. If you are as excited as we are, take a look at the details of this support in OpenTelemetry’s repo.
The best part is that this receiver works with any OpenTelemetry collector, including the OpenTelemetry Collector and observIQ’s distribution of the collector.
In this post, we take you through the steps to set up the JMX receiver with observIQ’s distribution of the OpenTelemetry Collector and send the metrics to New Relic.
What signals matter?
Performance metrics are the most important to monitor for Cassandra. Here’s a list of signals to keep track of:
Availability of resources:
Monitoring the physical resources and their utilization is critical to Cassandra’s performance. Standard JVM metrics such as memory usage, thread count, and garbage collection are good to monitor. A drop in available compute resources degrades the performance of the Cassandra database.
Volume of client requests:
As with monitoring other databases, it is necessary to monitor the time taken to send, receive, and fulfill requests. An unforeseen spike in the volume of requests can also indicate an issue with the application or the database.
Latency:
Latency is a critical metric to monitor for Cassandra databases. Continuous monitoring helps identify performance and latency issues originating from a cluster. Read and write request latencies are monitored together to create a holistic view of the speed of execution.
Configuring the JMX metrics receiver
After the installation, the config file for the collector can be found at:
- C:\Program Files\observIQ OpenTelemetry Collector\config.yaml (Windows)
- /opt/observiq-otel-collector/config.yaml (Linux)
The first step is building the receiver’s configuration:
- We are using the JMX receiver to gather Cassandra metrics. The jar_path attribute lets you specify the path to the jar file that facilitates gathering Cassandra metrics using the JMX receiver. This file path is created automatically when observIQ’s distribution of the OpenTelemetry Collector is installed.
- Set the IP address and port for the system from which the metrics are gathered as the endpoint.
- JMX exposes several categories of metrics; this configuration is intended to scrape the Cassandra metrics and the JVM metrics. The target_system attribute specifies which categories to collect.
- Set the time interval for fetching the metrics with the collection_interval attribute. The default value for this parameter is 10s. However, if exporting metrics to New Relic, this value is set to 60s by default.
- The properties attribute allows you to set arbitrary attributes. For instance, if you are configuring multiple JMX receivers to collect metrics from many Cassandra servers, this attribute allows you to set the unique IP address of each of those endpoint systems. Please note that this is not the only use of the properties option.
receivers:
  jmx:
    jar_path: /opt/opentelemetry-java-contrib-jmx-metrics.jar
    endpoint: localhost:9000
    target_system: cassandra,jvm
    collection_interval: 60s
    properties:
      # Attribute 'endpoint' will be used for generic_node's node_id field.
      otel.resource.attributes: endpoint=localhost:9000
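The multi-server case mentioned in the properties bullet can be sketched as follows. This is a minimal illustration, not a tested configuration: it assumes the collector's usual type/name convention for declaring several receivers of the same type, and the receiver names (cassandra-a, cassandra-b) and IP addresses are hypothetical placeholders.

```yaml
receivers:
  # One JMX receiver per Cassandra server; the names after "jmx/" are hypothetical.
  jmx/cassandra-a:
    jar_path: /opt/opentelemetry-java-contrib-jmx-metrics.jar
    endpoint: 10.0.0.11:9000          # hypothetical address of the first server
    target_system: cassandra,jvm
    collection_interval: 60s
    properties:
      # Tag every metric from this receiver with its own endpoint.
      otel.resource.attributes: endpoint=10.0.0.11:9000
  jmx/cassandra-b:
    jar_path: /opt/opentelemetry-java-contrib-jmx-metrics.jar
    endpoint: 10.0.0.12:9000          # hypothetical address of the second server
    target_system: cassandra,jvm
    collection_interval: 60s
    properties:
      otel.resource.attributes: endpoint=10.0.0.12:9000
```

With a layout like this, each named receiver (jmx/cassandra-a, jmx/cassandra-b) would need to be listed under the metrics pipeline's receivers in place of the single jmx entry.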
The next step is to configure the processors:
- Use the resourcedetection processor to create an identifier value for each Cassandra instance that the metrics are scraped from.
- Add the batch processor to bundle the metrics from multiple receivers. We highly recommend using this processor in the configuration, especially for the benefit of the logging component of the collector. To learn more about this processor, check the documentation.
processors:
  resourcedetection:
    detectors: ["system"]
    system:
      hostname_sources: ["os"]
  batch:
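The empty batch entry above runs the processor with its default settings. If you want to control how metrics are bundled, a sketch like the one below may help. The send_batch_size and timeout options belong to the batch processor, but the values shown are illustrative assumptions, not recommendations from this post.

```yaml
processors:
  batch:
    # Send a batch once this many items have accumulated (illustrative value).
    send_batch_size: 1000
    # Send whatever has accumulated after this much time, even if the
    # batch is not full (illustrative value).
    timeout: 10s
```

A larger batch or longer timeout generally means fewer export requests at the cost of metrics arriving slightly later.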
Finally, we’ll set up a destination for exporting the metrics as shown below.
You can check the configuration for your preferred destination in OpenTelemetry’s documentation.
exporters:
  otlp:
    endpoint: https://otlp.nr-data.net:443
    headers:
      api-key: 00000-00000-00000
    tls:
      insecure: false
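To avoid storing the API key in the config file, one option is to read it from an environment variable. The sketch below assumes a collector version that supports the ${env:NAME} expansion syntax (some older versions use ${NAME} instead), and the variable name NEW_RELIC_API_KEY is a hypothetical choice.

```yaml
exporters:
  otlp:
    endpoint: https://otlp.nr-data.net:443
    headers:
      # NEW_RELIC_API_KEY is a hypothetical variable name; it must be set
      # in the environment the collector process runs in.
      api-key: ${env:NEW_RELIC_API_KEY}
    tls:
      insecure: false
```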
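Set up the pipeline. Every receiver, processor, and exporter named in the pipeline must also be defined in its own section of the config file.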
service:
  pipelines:
    metrics:
      receivers:
        - jmx
      processors:
        - resourcedetection
        - resourceattributetransposer
        - resource
        - batch
      exporters:
        - otlp
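The pipeline above lists the resource and resourceattributetransposer processors, but the processors section shown earlier only defines resourcedetection and batch. Both would need their own entries for the collector to accept the configuration. As a rough sketch of what a resource entry could look like, assuming the standard resource processor's attributes syntax (the key and value below are hypothetical placeholders, not the settings used in this post):

```yaml
processors:
  resource:
    attributes:
      # Hypothetical attribute; replace with whatever your destination expects.
      - key: deployment.environment
        value: production
        action: upsert
```

The resourceattributetransposer processor appears to be specific to observIQ’s distribution, so its options are not sketched here; check that distribution’s documentation, or drop the entry from the pipeline if you do not need it.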
Viewing the metrics collected
The JMX metrics gatherer scrapes the following metrics and exports them to the destination, based on the config detailed above.
Metric | Description |
---|---|
cassandra.client.request.count | The total request count. |
cassandra.client.request.error.count | The total number of requests that have returned an error. |
cassandra.client.request.range_slice.latency.50p | The latency for range slice requests at the 50th percentile. |
cassandra.client.request.range_slice.latency.99p | The latency for range slice requests at the 99th percentile. |
cassandra.client.request.range_slice.latency.max | The maximum latency for range slice requests. |
cassandra.client.request.read.latency.50p | The latency for read requests at the 50th percentile. |
cassandra.client.request.read.latency.99p | The latency for read requests at the 99th percentile. |
cassandra.client.request.read.latency.max | The maximum latency for read requests. |
cassandra.client.request.write.latency.50p | The latency for write requests at the 50th percentile. |
cassandra.client.request.write.latency.99p | The latency for write requests at the 99th percentile. |
cassandra.client.request.write.latency.max | The maximum latency for write requests. |
cassandra.compaction.tasks.completed | The total number of compaction tasks completed. |
cassandra.compaction.tasks.pending | The number of compaction tasks pending. |
cassandra.storage.load.count | The size of the on-disk data that this node manages. |
cassandra.storage.total_hints.count | The total number of hints written on this node. |
cassandra.storage.total_hints.in_progress.count | The total number of hints that are in progress. |
observIQ’s distribution is a game-changer for companies looking to implement the OpenTelemetry standards. The single-line installer and the seamlessly integrated pool of receivers, exporters, and processors make working with this collector simple. Follow this space to keep up with all our future posts and simplified configurations for various sources. For questions, requests, and suggestions, reach out to our support team at support@observIQ.com.