Complimentary Gartner® Report! 'A CTO's Guide to Open-Source Software: Answering the Top 10 FAQs.'Read more
BindPlane OP

Gateways and BindPlane

Dylan Myers
Dylan Myers
Share:

The BindPlane Agent is a flexible tool that can be run as an agent, a gateway, or both. As an agent, the collector will be running on the same host it's collecting telemetry from, while a gateway will collect telemetry from other agents and forward the data to their final destination.

Here are a few of the reasons you might want to consider inserting Gateways into your pipelines:

  • Collection nodes do not have access to the final destination
  • Limiting credentials for the final destination to a small subset, the gateways, of systems to reduce vulnerability of exposure
  • If the gateways are on a cloud instance, ability to use instance level credentials - such as for Google Cloud destinations
  • Offloading data processing (parsing) to the gateways to prevent overloading of collection nodes
  • Apply universal parsing to data streams coming from disparate devices
  • Provide correlation to those data streams, such as trace sampling

Today, we will examine these reasons and some possible architectures for implementing gateways. We will also review how they appear in BindPlane.

Prerequisites

  • BindPlane OP
  • Several BindPlane agents used as edge collectors
  • One or more BindPlane agents used as gateway(s)
  • A final destination, such as Google Cloud Logging/Monitoring/Trace
  • (Optional) Load balancer for the gateways if using more than one

Starting Point

As a starting point, I am using the OpenTelemetry microservices demo running on GKE. In addition to this, I’ve created both a deployment and a daemonset of the BindPlane agent in the same cluster. For configuration, I have the generic OTel collector from the demo forwarding all data to my BindPlane daemonset. This is a sort of gateway in and of itself, but is only needed because I am not managing the embedded collector from the demo with BindPlane.


The daemonset configuration consists of a Kubernetes Container source, a Kubernetes Kubelet Source, and an OTLP source. The OTLP source is the endpoint for the data from the embedded generic collector.

Agent architecture without an gateway being used
Here, we're showing the architecture for our telemetry pipeline without a gateway.

And here's the configuration for BindPlane OP.

BindPlane OP configuration without a gateway
Here is what our configuration for the DaemonSet looks like in BindPlane OP.

Moving To Single Node Gateway Model

We’re going to add a BindPlane agent into the pipeline as a gateway. Here is what the final architecture will look like.

Architecture with a single gateway node.

In order to convert this setup from direct to destination to an gateway model, I start by copying the configuration.

Choose the hamburger menu and then "Duplicate current version"
Choose the hamburger menu and then "Duplicate current version"

Once I’ve created the duplicate configuration, I edit it to remove all the processors and replace all the destinations with a single OTLP destination. The processor removal isn’t required. However, I am doing it to illustrate the ability to offload such processing from the edge nodes to the gateway(s). Typically, gateway nodes are dedicated systems that are well-provisioned and do nothing else. Due to those higher resources dedicated entirely to the gateway agent, performing all processing on them is often desirable. This has the added benefit of simplifying the configurations present on the edge nodes.

BindPlane OP configuration after changing the destination to point to the gateway.
Here's the updated configuration pointing to the OTLP Gateway

Configure the OTLP Destination

Finally, the Gateway configuration with an OTLP source
Finally, the Gateway configuration with an OTLP source

In the above screenshots, we are exporting to a single gateway on the IP 10.128.15.205. This gateway is configured with an OTLP source, the destinations previously configured on the pods, and also the processors that we removed from the pods.

Edit Source: OpenTelemetry (OTLP)
Edit Source: OpenTelemetry (OTLP)

Using this model, we have successfully offloaded both credentials and processors from the edge nodes. This reduces our vulnerability of credential exposure by having them present on only a single system. It also reduces the workload on the k8s pods of our edge nodes.

For simplicity and brevity, I showed the configuration of a single node gateway in this section. However, I did not apply these configurations and start the data flow. I will show the data flow at the end of the entire blog.

Related Content: Configuration Management in BindPlane OP

Moving To Multi-Node Gateway Model

The multi-node gateway model is the same as a single node, with the exception of adding a load balancer and more nodes running the gateway configuration.

A Multi-node Gateway Model with a Load Balancer.
A Multi-node Gateway Model with a Load Balancer.

Moving to this model can be done directly from edge node or single gateway node models. Since I wanted to show both models in this blog, I am moving from the previously demonstrated gateway model.

The gateway configuration does not need any changes. It just needs to be applied to one or more additional nodes. In my case, I have a 3-node set. In front of them sits a load balancer forwarding port 4317, the GRPC OTLP port.

A single minor change does need to be made to the edge configuration. This will replace the IP of the single node, 10.128.15.205, with the ip of the load balancer, 10.128.15.208.

Related Content: A Step-by-Step Guide to Standardizing Telemetry with the BindPlane Observability Pipeline

Verifying Data Flow

Now that everything is set up and running, we can check one of our gateway nodes to validate that data is flowing.

BindPlane OP showing data flowing successfully.
BindPlane OP showing data flowing successfully.

From the above screenshot, we can see that telemetry is flowing through our pipeline. We can toggle between logs, metrics, and traces to validate we’re seeing all three signals. For a final validation, we could also check our destinations. I’ll skip that today, as it has been covered in several previous posts.

Next Steps

Now that we have shifted final destinations and data processing to a gateway set, we could add additional processing to the gateway configuration.

Any time a destination change is needed, it will only affect these few nodes. The new configuration for such a change could be rolled out very fast.

Additional data inputs could be added to this configuration for direct-to-gateway sources such as syslog, raw tcp logs and metrics, and native OTLP trace instrumented applications. This sort of change would further offload work from your edge nodes.

Architectures

Today, we’ve examined two simple architectures and discussed ways to enhance them. However, other architectures for gateway exist. Touching on these briefly, with the two we examined today at the top of the list, we have:

  • Single node gateway - ideal for small environments with a limited number of edge nodes
  • Multi-node load balanced gateway set - scalable and ideal for most enterprise environments
  • Multi-layer, multi-node gateway sets - For very large enterprise environments
    • Has a gateway set per data center, region, or other division point
    • These initial gateway sets will perform data processing, offloading
    • The destination for these gateways will be a final load balanced gateway set that performs the transmission to the final destination(s)
      • Ideally, in this large environment, the final gateway set will be distributed across multiple locations, and an intelligent load balancer will sit in front, directing traffic to the closest healthy nodes.
    • Offers the most redundancy and data safety
  • Traffic director gateway sets - For data segregation
    • This could be a multi-layer gateway
    • An initial gateway set figures out where traffic belongs and forwards that traffic
    • Each directed traffic destination can go directly to the final destination or a destination gateway set in the multi-layer style above as appropriate for the volume of traffic

There are likely other setups that I have yet to consider or think of, but these are the ones we see most frequently.

Conclusion

BindPlane provides users with a robust data management environment, and gateways are one of the most important tools in the arsenal. As seen today, there are many ways in which gateways can help protect your credentials, correlate, process, and route your data. Gateways provide much-needed flexibility to your data pipeline, especially when combined with other tools, creativity, and intelligent deployment strategies.


Dylan Myers
Dylan Myers
Share:

Related posts

All posts

Get our latest content
in your inbox every week

By subscribing to our Newsletter, you agreed to our Privacy Notice

Community Engagement

Join the Community

Become a part of our thriving community, where you can connect with like-minded individuals, collaborate on projects, and grow together.

Ready to Get Started

Deploy in under 20 minutes with our one line installation script and start configuring your pipelines.

Try it now