The Observability Blog

How to monitor MongoDB with OpenTelemetry

by Deepa Ramachandra on May 19, 2022

MongoDB is a cross-platform, document-oriented database that stores its documents in BSON, a binary-encoded JSON format. Replication makes MongoDB highly available, and sharding lets it scale horizontally. An effective monitoring solution makes it easier to identify MongoDB issues with resource availability, execution slowdowns, and scalability.

observIQ recently built and contributed a MongoDB metric receiver to the OpenTelemetry contrib repo. You can check it out here.

You can use this receiver with any OTel collector, including the OpenTelemetry Collector and observIQ’s distribution of the collector.

Below are steps to get up and running quickly with observIQ’s distribution, shipping MongoDB metrics to any popular backend. You can find out more on observIQ’s GitHub page: https://github.com/observIQ/observiq-otel-collector.

You can find OTel config examples for MongoDB and many other applications shipping to Google Cloud here.

Let’s get started!

What signals matter?

The most important MongoDB-related metrics to monitor are:

  • The status of processes and memory utilization: Monitoring MongoDB’s server processes helps identify slowness in its activity or health. Processes that become unresponsive during command execution are an example of a scenario that needs further analysis. The mongodb.collection.count metric tracks the number of collections in the MongoDB instance, which helps when assessing stability, restarts, and backup performance for those collections. The mongodb.data.size metric gives the storage space consumed by the data in your current MongoDB instance, and mongodb.memory.usage reports the amount of memory used.
  • Operations and connections metrics: When the application has performance issues, you need to determine whether they stem from the database layer. In this case, monitoring the connection and operation patterns becomes critical. Metrics such as mongodb.cache.operations and mongodb.connection.count give insight into cache operations and the number of connections. By monitoring these over time, you can establish a baseline pattern, then set thresholds and alerts against it.
  • Query optimization: For a query, the MongoDB query optimizer chooses and caches the most efficient query plan given the available indexes. The evaluation of the most efficient query plan is based on the number of “work units” (works) performed by the query execution plan when the query planner evaluates candidate plans. For instance, metrics such as mongodb.global_lock.time show the trends in lock time, which is useful when optimizing queries.

Before creating your configuration, you should have the observIQ OpenTelemetry collector installed. For installation instructions and the latest version of the collector, check our GitHub repo.

Configuring the MongoDB receiver

After the installation, the config file for the collector can be found at:

  • C:\Program Files\observIQ OpenTelemetry Collector\config.yaml (Windows)
  • /opt/observiq-otel-collector/config.yaml (Linux)

Let’s begin with the configuration for the receiver.

  • Set the endpoint to the host of the MongoDB system, that is, its IP address and port.
  • Set the collection interval to 60s. This is the requirement for all configurations that use Google Cloud Operations as the destination.
  • Disable TLS on the connection between the collector and MongoDB. Setting insecure: true suits an instance that does not have TLS enabled, such as the local one used here; it does not affect how the metrics are sent on to Google Cloud Operations.
receivers:
 mongodb:
   hosts:
     - endpoint: 127.0.0.1:27017
   collection_interval: 60s
   # disable TLS
   tls:
     insecure: true
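
The example above assumes a local MongoDB instance with no authentication and no TLS. For a secured instance, the receiver configuration might look more like the sketch below. It assumes the receiver's username and password settings and the collector's standard TLS client settings (ca_file). The hostname, CA path, and environment variable names are hypothetical, and the ${env:...} substitution syntax depends on your collector version (older releases use ${VAR}), so check the receiver documentation for the version you run.

```yaml
receivers:
  mongodb:
    hosts:
      # Hypothetical hostname; replace with your MongoDB host and port.
      - endpoint: mongo.example.internal:27017
    # Credentials are read from environment variables (hypothetical names)
    # rather than stored in the config file.
    username: ${env:MONGODB_USERNAME}
    password: ${env:MONGODB_PASSWORD}
    collection_interval: 60s
    tls:
      # Keep TLS enabled and verify the server against this CA
      # (hypothetical path).
      insecure: false
      ca_file: /etc/ssl/certs/mongodb-ca.pem
```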

Next up, the processors:

Please note that these processors are optional. You may choose to use any of the available processors documented here.

  • Use the resourcedetection processor to create a unique identifier for each MongoDB instance monitored using this configuration.
  • Use the resourceattributetransposer processor to copy the host.name resource attribute to an attribute named agent.
  • Use the normalizesums processor to normalize the initial values received for cumulative sum metrics, which gives better visualization. To learn more about this processor, check here.
  • Use the batch processor to collate the metrics from multiple receivers and send them to the exporter destination. We recommend using this processor with all receiver configurations, when applicable. To learn more about this processor, check here.
processors:
 # Resourcedetection is used to add a unique (host.name)
 # to the metric resource(s), allowing users to filter
 # between multiple agent systems.
 resourcedetection:
   detectors: ["system"]
   system:
     hostname_sources: ["os"]

 resourceattributetransposer:
   operations:
     - from: host.name
       to: agent

 normalizesums:

 batch:
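
The empty batch entry above uses the processor's defaults. If you need to control how many data points go into each export request, the batch processor also accepts explicit settings, as in this sketch. The values are illustrative assumptions, not part of the original configuration, so tune them for your destination.

```yaml
processors:
  batch:
    # Send a batch once it holds this many items.
    send_batch_size: 200
    # Never put more than this many items in a single batch
    # (must be greater than or equal to send_batch_size).
    send_batch_max_size: 200
    # Flush whatever has accumulated after this long, even if
    # the batch is not yet full.
    timeout: 5s
```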

This example exports the metrics to Google Cloud. However, you may choose to export them to any of the available destinations documented here.

exporters:
 googlecloud:
   retry_on_failure:
     enabled: false
   metric:
     prefix: workload.googleapis.com
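
As an illustration of a different destination, the sketch below sends the same metrics to an OTLP-compatible backend through the collector's otlp exporter, assuming your collector build includes it. The endpoint is hypothetical. If you use this exporter, list otlp instead of googlecloud under the pipeline's exporters.

```yaml
exporters:
  otlp:
    # Hypothetical gRPC endpoint of an OTLP-compatible backend.
    endpoint: otel-backend.example.com:4317
```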


Finally, set up the pipeline:

service:
  pipelines:
    metrics:
      receivers:
        - mongodb
      processors:
        - resourcedetection
        - resourceattributetransposer
        - normalizesums
        - batch
      exporters:
        - googlecloud

Viewing the metrics collected

The following metrics are fetched using the configuration above:

| Metric | Description |
| --- | --- |
| mongodb.cache.operations | The number of cache operations of the instance. |
| mongodb.collection.count | The number of collections. |
| mongodb.data.size | The size of the collection. Data compression does not affect this value. |
| mongodb.connection.count | The number of connections. |
| mongodb.extent.count | The number of extents. |
| mongodb.global_lock.time | The time the global lock has been held. |
| mongodb.index.count | The number of indexes. |
| mongodb.index.size | Sum of the space allocated to all indexes in the database, including free index space. |
| mongodb.memory.usage | The amount of memory used. |
| mongodb.object.count | The number of objects. |
| mongodb.operation.count | The number of operations executed. |
| mongodb.storage.size | The total amount of storage allocated to this collection. |
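
If you do not need every metric listed here, receivers in recent collector releases generally let you switch individual metrics off under a metrics key. The sketch below assumes the MongoDB receiver in your collector version supports this setting; mongodb.extent.count is chosen only as an example.

```yaml
receivers:
  mongodb:
    hosts:
      - endpoint: 127.0.0.1:27017
    collection_interval: 60s
    tls:
      insecure: true
    metrics:
      # Stop collecting a metric you do not chart or alert on.
      mongodb.extent.count:
        enabled: false
```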

To view the metrics, follow the steps outlined below:

  1. In the Google Cloud Console, go to Metrics Explorer.
  2. Select Generic Node as the resource type.
  3. Filter on the metric name from the table above, under the workload.googleapis.com prefix set in the exporter, to view its chart.

observIQ’s distribution is a game-changer for companies looking to implement the OpenTelemetry standards. The single-line installer, seamlessly integrated receivers, exporters, and processor pool make working with this collector simple. Follow this space to keep up with all our future posts and simplified configurations for various sources. For questions, requests, and suggestions, reach out to our support team at support@observIQ.com.