The Stanza Story

Mike Kelly

We launched the Stanza log agent just over one year ago. Stanza is the result of an uncompromising stance on log telemetry performance, processing, and reconfigurability. It took mere days for friends and colleagues in the space to raise the obvious objection: there are already so many logging agents, so why spend time on a *new* one?

We also heard from colleagues who had a different take…

“We have Fluentd, Fluentbit, Logstash, Vector, and ultimately OpenTelemetry should be future for sending log data to observability system. So why on earth do we need another open source project to solve this already solved problem? Well another was launched today, have fun :)”

Jonah Kowall, CTO of via Twitter, July 21, 2020

I get it. I have had the same reaction to other ambitious entries into crowded spaces. Many open-source projects choose to reinvent the wheel for little to no benefit, so folks are rightly skeptical when a new project pops up out of nowhere. And while I don’t agree with most of this quote, it’s right about OpenTelemetry. A few months after this tweet, observIQ and OpenTelemetry announced that Stanza was chosen as the log engine for the OpenTelemetry collector. We couldn’t be more proud of that contribution; the collaboration benefits the entire industry.

So, a year later, I want to take a moment and share how we decided to start from scratch and build a log agent in a crowded space, one that ultimately became the agent of choice for some of the largest companies with some of the most demanding performance requirements in the world.

When we started building observIQ, we launched a log analysis platform after years of data pipeline and agent development work for the largest tech companies in the world. We had recently sold a significant business unit to VMware and were transitioning to our next phase. We had spent over a decade building proprietary agents, integrations, and some of the most advanced telemetry technology in the industry. We benchmarked nearly every prominent open-source agent on the market. We even tried all the proprietary agents. Despite that, we still couldn’t solve our customers' most pressing log ingestion challenges. It always required a compromise. Do we want speed or configurability? Configurability or platform support? Platform support or installation simplicity? Every agent had its strengths, but all came with their share of weaknesses.

We started with Fluentd. Fluentd was, and is, a revolutionary agent that showed what can be accomplished with a focus on parsing, community, and a library of plugins for every log source you could imagine. Fluentd has been the measuring stick for all the agents that came after. However, the performance issues our customers experienced were well-known and a significant barrier to adoption, enough of a barrier that Fluent launched a separate project to provide a higher-performance option. Even so, we found that the high-performance options had their own set of limitations.

In early 2020, just before the entire world went into lockdown, a few of us on the team sat together in our office (soon to be abandoned) and came up with everything we wanted from a log agent. I no longer have the whiteboard photo, but I still have the notes. We left that conference room agreeing that we were tired of compromises. We wanted everything, and everything meant… well, it meant this:

  • High performance, low CPU, low memory usage
    • This was a blocker for too many of our customers with existing agents. They had massive deployments, and the CPU and memory had a material impact on their costs. We needed this solved.
  • Open source
    • No one wanted to use a closed-source agent anymore, and open-source was becoming a requirement across the industry. The team was excited to show what we could do after spending so much time building proprietary agents.
  • Simple installation with wide-ranging platform support and no dependencies
    • Installation was often overlooked, but simplicity has always been a core value at observIQ. We never felt good about the complex installation and limited platform support of other agents.
  • High throughput with multi-threaded support
    • To our surprise at the time, even high-performance agents had significant limitations that prevented high log throughput.
  • Easy to develop plugin framework (expressive, clear, and powerful) and a curated set of core integrations
    • Fluentd changed the game with their plugin framework. We wanted to take it further and simplify pipeline creation while maintaining the power, including advanced parsing and manipulation at extremely high performance.
    • We also recognized that an agent needs integrations. We didn’t want to launch our agent without a core set of high-quality integrations for all the most common log sources.
  • Alignment with modern telemetry movements
    • At the time, we saw where OpenTelemetry was headed, and it was exciting to see the collaboration of so many industry veterans. Standardizing telemetry helps everyone trying to understand their systems, and we wanted to be a part of that. Throughout development, we wanted to be sure Stanza was compatible with OpenTelemetry and that we included all the best parts of the developing industry standard.

It was an ambitious list. There was an internal debate about whether it was worth it. It would have been easier to compromise and contribute a feature update here or there. We decided to move forward and set out to build it. After a few long months of development, we launched Stanza.

It was, as intended, the core agent of our new platform at observIQ. To get a hands-on impression, you can take our platform for a spin for free.

Along the way, we continued working with the team at OpenTelemetry, and they shared our interest in a logging solution without compromise. So, in January 2021, observIQ and OpenTelemetry announced that Stanza would be the engine behind OpenTelemetry’s log parsing and analysis, contributing to the final piece in the OTel trifecta of traces, metrics, and logs.

Today, we’re continuing to innovate in telemetry and analysis with a commitment to bring our Stanza advancements to OpenTelemetry, including new functionality like automated discovery, remote agent management, and alerting at the edge. We have a lot in store for the community, and we hope you’re as excited as we are.

