Technical “How-To’s”

How to Monitor SQL Server with OpenTelemetry

Keith Schmitt
Keith Schmitt
Share:

We’re excited to announce the latest OpenTelemetry receiver contribution: SQL Server monitoring. All of our logging and monitoring contributions are available through the open source OpenTelemetry collector. You can check it out here!

You can utilize this receiver in conjunction with any OTel collector: including the OpenTelemetry Collector and observIQ’s distribution of the collector.

Below are steps to get up and running quickly with observIQ’s distribution, and shipping SQL Server metrics to a popular backend: Google Cloud Ops. You can find out more on observIQ’s GitHub page: https://github.com/observIQ/observiq-otel-collector

What signals matter?

SQL Server is a Windows-based system designed to store and manage data. It’s a general-purpose system with a wide range of uses, so which signals are most important to you may vary. However, some common signals to look for are:

  • User Connections
    • Monitor the user connections, and prepare to increase load management during high volume.
  • Page Split Rates
    • Excess page splitting can cause excessive disk I/O and decrease performance
  • Lock Wait Rates
    • Monitor if resource usage is blocked by high volume – lift locks to increase access, but slow down performance

All of the above categories can be gathered with the SQL Server receiver – so let’s get started.

Before you begin

If you don’t already have an OpenTelemetry collector built with the latest SQL Server receiver installed, you’ll need to do that first. We suggest using observIQ’s distribution of the OpenTelemetry Collector that includes the SQL Server receiver (and many others) and is simple to install with our one-line installer.

Configuring the SQL Server receiver

Navigate to your OpenTelemetry configuration file. If you’re using the observIQ Distribution, you’ll find it in the following location:

  • C:\Program Files\observIQ OpenTelemetry Collector\config.yaml (Windows)

Edit the configuration file to include the SQL Server receiver as shown below:

yaml
1receivers:
2  sqlserver:
3    collection_interval: 10s

Don’t forget to also add SQL Server into your Service pipeline so it looks similar to the following. Note that your processors and exporters will likely be different.

yaml
1receivers:
2  sqlserver:
3    collection_interval: 10s
4
5exporters:
6  googlecloud:
7
8service:
9  pipelines:
10    metrics:
11      receivers: [sqlserver]
12      exporters: [googlecloud]

You can further edit the config file to include specific labels under the “processors” field, and export metrics to specific or multiple destinations under the “exporters” field.

Viewing the metrics collected

If you followed the steps detailed above, the following SQL Server metrics will now be delivered to your preferred destination.

MetricDescription
sqlserver.batch.request.rateNumber of batch requests received by SQL Server.
sqlserver.batch.request.rateNumber of SQL compilations needed.
sqlserver.batch.sql_recompilation.rateNumber of SQL recompilations needed.
sqlserver.lock.wait.rateNumber of lock requests resulting in a wait.
sqlserver.lock.wait_time.avgAverage wait time for all lock requests that had to wait.
sqlserver.page.buffer_cache.hit_ratioPages found in the buffer pool without having to read from disk.
sqlserver.page.checkpoint.flush.rateNumber of pages flushed by operations requiring dirty pages to be flushed.
sqlserver.page.lazy_write.rateNumber of lazy writes moving dirty pages to disk.
sqlserver.page.life_expectancyTime a page will stay in the buffer pool.
sqlserver.page.operation.rateNumber of physical database page operations issued.
sqlserver.page.split.rateNumber of pages split as a result of overflowing index pages.
sqlserver.transaction.rateNumber of transactions started for the database (not including XTP-only transactions).
sqlserver.transaction.write.rateNumber of transactions that wrote to the database and committed.
sqlserver.transaction_log.flush.data.rateTotal number of log bytes flushed.
sqlserver.transaction_log.flush.rateNumber of log flushes.
sqlserver.transaction_log.flush.wait.rateNumber of commits waiting for a transaction log flush.
sqlserver.transaction_log.growth.countTotal number of transaction log expansions for a database.
sqlserver.transaction_log.shrink.countTotal number of transaction log shrinks for a database.
sqlserver.transaction_log.usagePercent of transaction log space used.
sqlserver.user.connection.countNumber of users connected to the SQL Server.

observIQ’s distribution of the OpenTelemetry collector is a game-changer for companies looking to implement OpenTelemetry standards. The single line installer, seamlessly integrated receivers, exporter, and processor pool make working with this collector simple. Follow this space to keep up with all our future posts and simplified configurations for various sources. For questions, requests, and suggestions, reach out to our support team at support@observIQ.com.

Keith Schmitt
Keith Schmitt
Share:

Related posts

All posts

Get our latest content
in your inbox every week

By subscribing to our Newsletter, you agreed to our Privacy Notice

Community Engagement

Join the Community

Become a part of our thriving community, where you can connect with like-minded individuals, collaborate on projects, and grow together.

Ready to Get Started

Deploy in under 20 minutes with our one line installation script and start configuring your pipelines.

Try it now