The Observability Blog

What Is the OpenTelemetry Project and Why Is It Important?

by Joe Howell on
May 18, 2021

The OpenTelemetry project is an ambitious endeavor with the goal of bringing together various technologies to form a vendor-neutral observability platform. Within the past year, many of the biggest names in tech have added native support within their commercial products.

Formed through a merger of the OpenTracing and OpenCensus projects under the Cloud Native Computing Foundation (CNCF), OpenTelemetry (which incorporates observIQ’s log agent, Stanza) aims to make rich data collection across applications easier and more consumable.

What Is the OpenTelemetry Project?

OpenTelemetry is an ecosystem of instrumentation libraries and tools that are used to generate, collect, process, and export telemetry data for analysis. OpenTelemetry helps you better understand your software behavior and performance.

OpenTelemetry provides an open-source standard for producing telemetry from cloud-native infrastructure and applications. Using language-specific instrumentation libraries, signals can be exported directly from applications. Alternatively, the OpenTelemetry Collector can capture signals from web frameworks, RPC systems, storage clients, and other applications in use.

OpenTelemetry Examples

OpenTelemetry can be used to capture logs, metrics, and traces. This allows you to observe the state of microservices within applications. You can trace requests as they flow between microservices, capture related events, and track resource usage of shared systems.
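
To make "tracing requests as they flow between microservices" concrete, here is a minimal sketch of how trace context can travel in a request header. It uses the W3C Trace Context `traceparent` header format, which OpenTelemetry supports for propagation. The helper names (`inject`, `extract`, `new_span_id`) are illustrative and written with the Python standard library only; they are not the OpenTelemetry API.

```python
import re
import secrets

# W3C Trace Context "traceparent" header, version 00:
#   00-<32 hex trace id>-<16 hex parent span id>-<2 hex flags>
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_span_id() -> str:
    """Return a random 8-byte span ID as 16 hex characters."""
    return secrets.token_hex(8)


def inject(headers: dict, trace_id: str, span_id: str, sampled: bool = True) -> None:
    """Write the current trace context into outgoing request headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"


def extract(headers: dict):
    """Return (trace_id, parent_span_id, sampled) from headers, or None if absent or malformed."""
    match = TRACEPARENT_RE.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    return trace_id, parent_id, bool(int(flags, 16) & 1)


# Service A starts a trace and calls service B (hypothetical services).
trace_id = secrets.token_hex(16)
span_a = new_span_id()
outgoing = {}
inject(outgoing, trace_id, span_a)

# Service B reads the header and continues the same trace with its own span.
ctx = extract(outgoing)
span_b = new_span_id()
print("service B joins trace", ctx[0], "with parent span", ctx[1], "as span", span_b)
```

Because both services share the same trace ID and each span records its parent, a backend can later reassemble the full path of the request.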

OpenTelemetry is also used to identify potential constraints and to create tiered requests within applications, so that shared resources can be prioritized.

The OpenTelemetry ecosystem makes real-life tasks easier. Here are some examples:

  • Adding custom attributes to automatically track spans, thereby making it easy to query data
  • Filtering out synthetic traffic
  • Identifying long-running tasks
  • Segmenting telemetry data using resource APIs
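
The tasks in this list can be sketched over already-exported span data. The example below is a simplified illustration, not the OpenTelemetry SDK: `SpanRecord`, the `synthetic` attribute, and the `customer.tier` attribute are hypothetical names chosen for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SpanRecord:
    """Hypothetical, simplified stand-in for an exported span."""
    name: str
    duration_ms: float
    service: str  # resource-level attribute: the service that emitted the span
    attributes: dict = field(default_factory=dict)  # custom span attributes


def drop_synthetic(spans):
    """Filter out spans flagged as synthetic traffic (e.g. health checks)."""
    return [s for s in spans if not s.attributes.get("synthetic", False)]


def long_running(spans, threshold_ms):
    """Return spans whose duration exceeds the threshold."""
    return [s for s in spans if s.duration_ms > threshold_ms]


def by_service(spans):
    """Segment span names by the resource (service) that produced them."""
    groups = defaultdict(list)
    for s in spans:
        groups[s.service].append(s.name)
    return dict(groups)


spans = [
    SpanRecord("GET /cart", 42.0, "cart", {"customer.tier": "gold"}),
    SpanRecord("GET /health", 3.0, "cart", {"synthetic": True}),
    SpanRecord("POST /checkout", 1850.0, "checkout", {"customer.tier": "free"}),
]

real = drop_synthetic(spans)
print([s.name for s in long_running(real, threshold_ms=1000)])
print(by_service(real))
```

Custom attributes are what make the first two filters possible, while the resource-level `service` field drives the segmentation.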

Common Use Cases of OpenTelemetry

  1. Event-driven tracing: In an event-driven architecture, where senders and receivers are decoupled, it is often unclear how a receiver responded to a given request, especially when a high volume of requests is sent and received. Tracing every touchpoint in an event is critical in most scenarios, and particularly in high-pressure ones such as new feature releases or version upgrades. The challenge of implementing tracing begins with the highly coupled yet distributed nature of microservices-based applications, combined with REST call extensibility. The way to solve this is to standardize distributed tracing, which is what OpenTelemetry does. With OpenTelemetry, it is possible to retrace a transaction across a distributed system and study its course to improve or fix its propagation for system performance and efficiency. Capturing the runtime state of every event in an application makes troubleshooting quicker. Each touchpoint in a trace is tagged with an ID, so the SRE knows which point in the trace led to an application issue or failed request. Backends that consume OpenTelemetry data offer graphical views of a request’s trajectory. With these views, the SRE can do more than collate data and follow an event’s trajectory, drilling down to the finest details of event-driven traces. GitHub is creating custom internal libraries for OpenTelemetry to help developers add tracing to applications easily. The OpenTelemetry community is continually adding instrumentation for the most widely used libraries.
  2. Event-driven observability pipeline: It takes a myriad of tools and applications to make an application operate at the expected level of reliability and security. Log agents collate a large volume of data, which in turn broadens the possible attack surface on individual services. This leads to log agent fatigue, leaving businesses scrambling for a way to consolidate and reuse their agents and sidecars. There is also data capacity anxiety, with limited prioritization of data inflow. A refined workflow would suppress unwanted noise and aggregate only the important event data. As a best practice, Ops and SRE teams are constantly told to anticipate possible issues and ensure that data from the problem area is collected. Doing so requires a tool that can create an event-driven observability pipeline, and OpenTelemetry offers a sophisticated process for creating one.
  3. Using OpenTelemetry to manage AWS Lambdas: Lambda is AWS’s way of handling small functions in an application without invoking the whole underlying infrastructure. Although this makes for very efficient operation, Lambdas are not malleable and give users little control over them. The solution is the AWS Distro for OpenTelemetry (ADOT), in the version built specifically for Lambda. ADOT works much like a dynamic library that is available to Lambda at runtime. ADOT includes the most recent version of the OpenTelemetry Collector, which the observIQ agent aligns with architecturally.
  4. Observability for Java applications using OpenTelemetry Java: OpenTelemetry Java is a Java-specific implementation of OpenTelemetry. There are two ways to instrument Java applications for observability. The first is manual instrumentation using the OpenTelemetry API and SDK. The second is auto-instrumentation using the OpenTelemetry Java agent.
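
The "suppress unwanted noise and aggregate only the important event data" step of an observability pipeline can be sketched as two small stages. This is a standard-library illustration under stated assumptions: the event shape, the `NOISY_SEVERITIES` set, and both function names are hypothetical. In practice this kind of processing would typically be configured in the OpenTelemetry Collector rather than hand-written.

```python
from collections import Counter

# Assumption for this sketch: these severities count as noise.
NOISY_SEVERITIES = {"DEBUG", "TRACE"}


def suppress_noise(events):
    """Pipeline stage 1: drop low-value events before they travel any further."""
    for event in events:
        if event["severity"] not in NOISY_SEVERITIES:
            yield event


def aggregate(events):
    """Pipeline stage 2: count the remaining events per (service, severity)."""
    counts = Counter()
    for event in events:
        counts[(event["service"], event["severity"])] += 1
    return counts


events = [
    {"service": "checkout", "severity": "DEBUG", "body": "cache hit"},
    {"service": "checkout", "severity": "ERROR", "body": "payment declined"},
    {"service": "checkout", "severity": "ERROR", "body": "payment declined"},
    {"service": "cart", "severity": "INFO", "body": "item added"},
]

summary = aggregate(suppress_noise(events))
for (service, severity), count in sorted(summary.items()):
    print(service, severity, count)
```

Filtering before aggregating keeps the volume of stored data proportional to what matters, which addresses the data capacity concern described in the second use case.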

Importance of the OpenTelemetry Project

Modern cloud-native applications and data are distributed. This can make it difficult to compile the data you need into a single source. OpenTelemetry solves this problem by tracing and extracting data cross-platform. By standardizing the way telemetry data is collected and transmitted, it creates a common instrumentation format across various services.

Managing the performance of these complex and diverse environments has become a significant concern for development teams. It takes instrumentation for all of your frameworks and libraries, across multiple programming languages, to understand the collective behavior of all your services and applications.

Without such a standard, teams may be left with data silos or blind spots that negatively impact troubleshooting. OpenTelemetry makes the detection and resolution of problems easier. With complete interoperability, it provides a standard form of instrumentation across all of your services.

Why Use OpenTelemetry

By providing an observability standard for cloud-native applications, OpenTelemetry can significantly reduce the time spent developing and implementing mechanisms to collect application telemetry. This frees up developers to spend time enhancing features.

The Benefits of OpenTelemetry

OpenTelemetry is used by developers to examine features and find bugs. It provides several important benefits, including:

  • Changing backends without having to change instrumentation
  • Creating a single set of standards to allow you to work with more vendors, projects, and platforms
  • Simplifying telemetry data management and export
  • Installing and integrating OpenTelemetry, often by dropping in just a few lines of code
  • Avoiding lock-in to a vendor’s configuration and roadmap priorities
  • Instrumenting new technologies as they emerge, without waiting for vendor support

Broad Language Support

OpenTelemetry provides broad language support, including Java, JavaScript, Python, Go, C++, C#, Rust, and Erlang/Elixir. It also integrates with most popular libraries and frameworks, including:

  • Akka
  • ASP.NET Core
  • Django
  • Express
  • Flask
  • Gorilla/mux
  • Kafka
  • JDBC
  • Jetty
  • MySQL
  • net/http
  • PostgreSQL
  • RabbitMQ
  • Redis
  • Spring
  • WSGI

OpenTelemetry Architecture Components

OpenTelemetry is made up of a series of components, including:

  • APIs – A core component, the APIs are language-specific (Python, Java, .NET, etc.) and provide the basic pathway for adding OpenTelemetry to applications.
  • SDK – A language-specific SDK acts as the bridge that delivers data gathered through the API to the Exporter. SDKs allow for configuration, including transaction sampling or request filtering.
  • Exporters – OpenTelemetry Exporters let you configure where you want the telemetry sent. They decouple instrumentation from the backend configuration to make it easier to switch backends.
  • Collector – The OpenTelemetry Collector is optional but helps create a seamless solution. It adds flexibility in how telemetry is received and sent on to the backend. The Collector supports two deployment models: an agent residing on the application host, or a standalone process separate from the application itself.
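
The division of labor between API, SDK, and Exporter can be illustrated with a small standard-library model. This is a sketch of the architecture only, not the OpenTelemetry Python API: the class names `Tracer`, `ConsoleExporter`, and `InMemoryExporter`, the `record` method, and the `sample_ratio` parameter are all hypothetical stand-ins.

```python
import random
from typing import Protocol


class Exporter(Protocol):
    """Exporter role: decides where finished telemetry is sent."""
    def export(self, span: dict) -> None: ...


class ConsoleExporter:
    def export(self, span: dict) -> None:
        print("span:", span)


class InMemoryExporter:
    def __init__(self) -> None:
        self.spans = []

    def export(self, span: dict) -> None:
        self.spans.append(span)


class Tracer:
    """SDK role: applies sampling, then hands spans to the configured exporter."""

    def __init__(self, exporter: Exporter, sample_ratio: float = 1.0) -> None:
        self.exporter = exporter
        self.sample_ratio = sample_ratio

    def record(self, name: str, **attributes) -> None:
        # random.random() is in [0.0, 1.0), so 1.0 keeps all spans and 0.0 keeps none.
        if random.random() < self.sample_ratio:
            self.exporter.export({"name": name, "attributes": attributes})


def handle_request(tracer: Tracer) -> None:
    """Instrumented application code: it depends only on the tracer interface (API role)."""
    tracer.record("handle_request", route="/cart")


memory = InMemoryExporter()
handle_request(Tracer(memory))                    # export everything to memory
handle_request(Tracer(ConsoleExporter()))         # same instrumentation, different backend
handle_request(Tracer(memory, sample_ratio=0.0))  # sampled out, nothing is exported
```

Note that `handle_request` never changes: switching the backend or the sampling policy is purely a configuration decision, which is the decoupling the component list describes.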

What Is OpenTracing?

OpenTracing was a project undertaken by CNCF in 2016. It aimed to provide vendor-neutral ways of managing distributed tracing to help developers trace requests from start to finish.

The OpenCensus project, which Google open-sourced in 2018 based on its internal Census library, had much the same aim.

observIQ Contributes Log Agent to the OpenTelemetry Project

observIQ’s Stanza log agent is a key part of the OpenTelemetry project. observIQ’s log management platform is the first to take full advantage of the log agent technology that has been incorporated into OpenTelemetry.

observIQ accelerated the OpenTelemetry project by contributing the open-source Stanza agent as part of its effort to enable high-quality telemetry for all. Stanza is a small-footprint, high-capability log shipping agent. It uses roughly 10% of the CPU and memory of other popular log agents.

You can sign up for a free trial of observIQ to see Stanza in action as a native component.

If you’re interested in taking part in the OpenTelemetry Project, join the GitHub community to get started.