Multi-Project Routing For Google Cloud
When sending data such as logs, metrics, or traces to Google Cloud, it can be beneficial to split that data across multiple projects. This division may be necessary because each team has its own project, because a central project collects security audit logs, or for any other reason your organization has. BindPlane has effective tools to manage this process. In this walkthrough, we will add fields to telemetry entries, allowing us to associate each entry with a specific project and route it properly.
Prerequisites
- BindPlane OP
- At least 2 Google Cloud projects that you want to split telemetry among
- Permission to create service accounts, either using Google IAM or Workload Identity Federation
- Criteria for splitting your telemetry, based on what is within the telemetry itself
Getting Started
To get started, we first need to establish the criteria for the different backend projects. For this blog, we will be monitoring three log files on a Fedora Linux VM: /var/log/messages, /var/log/secure, and /var/log/firewalld. We will route logs from /var/log/secure to one project, audit logs from /var/log/messages to that same project, everything from /var/log/firewalld to a second project, SELinux logs from /var/log/messages to the same project as the firewalld logs, and everything else to the “default” project. The final configuration will look like this:
- Project dylan-alpha
  - Default project for everything that doesn’t get routed elsewhere
- Project dylan-beta
  - Audit-level logs:
    - /var/log/secure
    - “audit:” and “audit[\d]:” pattern matches from /var/log/messages
- Project dylan-gamma
  - Non-audit security logs:
    - /var/log/firewalld
    - “SELinux” pattern matches from /var/log/messages
These projects are already set up and ready to go. Later in the blog, we will set up the credentials to allow cross-project data sending.
Preparing Google Projects
Now that I’ve defined my criteria and have projects, I need to prepare the Google projects. The first step is to set up a service account in project dylan-alpha, as it will function as my primary project. I have decided to grant it the full permissions required for logs, metrics, and traces, even though I am currently only sending logs. You can review the credential requirements here.
Within IAM settings in your Google Cloud Console, navigate to Service Accounts.
Now, click Create Service Account. For the name of the service account, I suggest using telemetry-input.
After creating the service account, copy its generated email address for later use.
Next, click the three-dot menu and select Manage keys.
When you reach the key management screen, you should generate a JSON key file and download it.
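If you prefer the command line, the same service account and key can be created with the gcloud CLI. Here's a rough sketch using the names from this walkthrough (the key filename is just an example; pick whatever you like):

```bash
# Create the service account in the primary project
gcloud iam service-accounts create telemetry-input \
  --project=dylan-alpha \
  --display-name="telemetry-input"

# Generate and download a JSON key for it
# (telemetry-input-key.json is an example filename)
gcloud iam service-accounts keys create telemetry-input-key.json \
  --iam-account=telemetry-input@dylan-alpha.iam.gserviceaccount.com
```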
Next, switch to the dylan-beta project, navigate to the main IAM page, and click Grant Access.
This opens a sidebar where you can grant access by entering an email address. Paste the copied service account address, assign the required permissions for your telemetry type(s), and click Save.
To grant access to additional projects, simply follow the same steps taken for dylan-beta, repeating as many times as necessary.
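If you'd rather script these grants, a sketch with the gcloud CLI looks like the following. I'm assuming logs-only here with roles/logging.logWriter; add the equivalent roles for metrics and traces if you send those signals as well:

```bash
# Allow the dylan-alpha service account to write logs into dylan-beta
gcloud projects add-iam-policy-binding dylan-beta \
  --member="serviceAccount:telemetry-input@dylan-alpha.iam.gserviceaccount.com" \
  --role="roles/logging.logWriter"

# Repeat for each additional project, e.g. dylan-gamma
gcloud projects add-iam-policy-binding dylan-gamma \
  --member="serviceAccount:telemetry-input@dylan-alpha.iam.gserviceaccount.com" \
  --role="roles/logging.logWriter"
```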
Data Flowing
To prepare our data for the Google Cloud Projects, we need to create a configuration in BindPlane and deploy an agent to our system with that configuration.
To do this, click the Configurations link on the top bar in BindPlane, then click Create Configuration. On the next screen, give your configuration a name and choose the OS type you're using; for example, name it Multi-Project-Routing and choose Linux.
Clicking the Next button takes us to a screen where we can add sources. To do this, click Add Source.
As I want to monitor log files on my system, I will select 'File' from the resulting list. I will then input the path values as follows: /var/log/messages, /var/log/secure, and /var/log/firewalld.
Additionally, under the Advanced settings, I have enabled the options to create attributes for the file name (on by default) and the file path (off by default).
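Under the hood, the File source corresponds to the OpenTelemetry filelog receiver. As a rough sketch, the settings above map to something like the following (the exact configuration BindPlane generates may differ):

```yaml
receivers:
  filelog:
    # The three files being monitored
    include:
      - /var/log/messages
      - /var/log/secure
      - /var/log/firewalld
    # Attach the file name (on by default) and file path (off by default)
    # as attributes on each log record
    include_file_name: true
    include_file_path: true
```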
After you have finished configuring the source, click on the Save button followed by the Next button. This will take you to the page where you need to define your destinations. Since this blog focuses on Google Cloud, we will be selecting the Google Cloud destination.
To authenticate, we will use the JSON authentication method. Open the JSON service key file that you downloaded earlier and copy its contents into the appropriate box. Don't forget to give this destination a name and enter "dylan-alpha" in the Project ID box.
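For reference, this destination corresponds to the collector's googlecloud exporter. A minimal sketch, assuming credentials are picked up out of band (for example via the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing at the downloaded key file); BindPlane takes care of wiring in the pasted JSON key for you:

```yaml
exporters:
  googlecloud:
    # The default project; logs that later carry a gcp.project.id
    # resource attribute are routed to that project instead, which is
    # what the rest of this walkthrough sets up
    project: dylan-alpha
```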
After configuring my desired settings on the Destination, I click Save twice, resulting in the creation of my configuration and pipeline as shown in the screenshot below.
To install an agent, I first click on the Agents link on the top bar. This will take me to the Agents page where I can click on the Install Agent button. On the next page, I select Linux as my operating system and choose my preferred configuration from the drop-down menu. This generates a one-liner that I can use to install the agent.
Once the installation is complete, I need to go back to the configuration and click Start Rollout. This will deploy the configuration to the agent, and I should start receiving telemetry data.
Getting Telemetry Where It Belongs
Now that telemetry is flowing to Google from our agent, all is good. Right? Well, no. Right now, everything is flowing to the dylan-alpha project.
To fix the issue, we need to go to the configuration page and add some processors to enhance the logs with metadata for multi-project routing.
First, we need to click on the processor icon on the left side, closer to the source. We will use the Add Fields processor twice: once for routing to dylan-beta, and once for routing to dylan-gamma. Using conditionals, we can select the telemetry on which each processor operates. For the first processor, we set the conditional to: (attributes["log.file.path"] == "/var/log/secure") or (IsMatch(body, "^\\w+\\s+\\d+\\s+\\d{2}:\\d{2}:\\d{2}\\s+\\w+\\s+audit(?:\\[\\d+\\])?:.*$")). Under the Attributes section below, we add a new field named gcp.project.id and set its value to dylan-beta. For the second processor, we do the same thing with a different conditional: (IsMatch(body, ".*SELinux.*")) or (attributes["log.file.path"] == "/var/log/firewalld"), and an attribute value of dylan-gamma. The completed processors can be seen in the screenshots below.
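If you're curious what this looks like outside the UI, the behavior of these Add Fields processors can be approximated with OTTL set-with-where statements in a collector transform processor. A sketch, assuming a plain string log body (the processor name transform/add-project is just a label, and the config BindPlane actually generates may differ):

```yaml
processors:
  transform/add-project:
    log_statements:
      - context: log
        statements:
          # Route secure and audit logs to dylan-beta; a hypothetical matching
          # line would look like "Jun 12 14:03:22 fedora audit[1234]: ..."
          - set(attributes["gcp.project.id"], "dylan-beta") where attributes["log.file.path"] == "/var/log/secure" or IsMatch(body, "^\\w+\\s+\\d+\\s+\\d{2}:\\d{2}:\\d{2}\\s+\\w+\\s+audit(?:\\[\\d+\\])?:.*$")
          # Route SELinux and firewalld logs to dylan-gamma
          - set(attributes["gcp.project.id"], "dylan-gamma") where IsMatch(body, ".*SELinux.*") or attributes["log.file.path"] == "/var/log/firewalld"
```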
After saving these processors, return to the main configuration page. Then, select the right-hand processor icon closer to the destination and add a Group By Attributes processor. Set the attribute field to gcp.project.id.
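This step matters because the Google Cloud destination reads gcp.project.id from the resource, while Add Fields set it on the individual log records. Group By Attributes promotes the record attribute to a resource attribute, grouping records accordingly. A sketch of the equivalent collector processor:

```yaml
processors:
  groupbyattrs:
    # Move gcp.project.id from log records up to the resource,
    # grouping records by its value
    keys:
      - gcp.project.id
```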
This is everything that is needed to route the data to the correct destination projects. However, there’s one more step that should be taken. The “default” project should act as a safety measure for anything that is missing the metadata needed to route it to another project. Since all projects have some basic logs related to the project coming in, I use the Add Fields processor to add a new attribute called no_project with a value of true. The conditional for this processor is set to: (resource.attributes["gcp.project.id"] == nil).
This allows me to search for telemetry from this agent that doesn’t have a project intentionally set.
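Expressed in the same OTTL style as before, this fallback is a sketch like the following (transform/flag-default is again just a label):

```yaml
processors:
  transform/flag-default:
    log_statements:
      - context: log
        statements:
          # Mark anything no earlier processor claimed for another project
          - set(attributes["no_project"], true) where resource.attributes["gcp.project.id"] == nil
```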
Save these processors and click the Start Rollout button. Once the rollout is complete and enough time has elapsed for new logs to be transmitted, we can see that all three projects have the logs that belong to them.
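One way to spot-check from the command line is to query each project for recent entries with gcloud (no filter is given here, so this shows whatever arrived most recently; narrow it down as needed):

```bash
# Show the five most recent log entries from the last ten minutes
gcloud logging read --project=dylan-beta --limit=5 --freshness=10m
gcloud logging read --project=dylan-gamma --limit=5 --freshness=10m
```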
Conclusion
It is possible to perform multi-project routing for the Google Cloud Destination by using just a few simple processors to enrich the logs with a special resource attribute. You can also apply these same techniques to other processors to either enrich or reduce your data for any purpose. This method is also effective when you are using Workload Identity Federation, although the credential steps will differ. We will cover the use of WIF to authenticate in place of a service account in a future blog post once we have added official support for it.