Gazelle (Bazel): Loading other BUILD files

While I'm still a newbie regarding Bazel, one of the main caveats of the system is that it still lacks documentation on many topics, or at least I often end up digging into the source code to learn how things work, for lack of better alternatives. So I'll try to write from time to time about my findings with Bazel and its tooling. In this case, the subject is Gazelle, the official BUILD file generator.

The Context

Gazelle traverses project folders and generates rules. Both actions are done in depth-first post-order. Configuration, on the other hand, is set up from the root to the leaves: the root configuration is calculated first, then it is propagated down (inherited by child nodes) and potentially modified via Gazelle directives (configuration in the form of special comments in BUILD files).

In practice this means, among other things, that when working with Gazelle extensions you can roughly guess how far the traversal has progressed when a given BUILD file is visited, but in general you should be very careful not to rely on that history. At the same time, you can rest assured that, no matter which GenerateRules call you are in, your Config will already have passed through at least the root node's Configure step.
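
The ordering described above can be illustrated with a toy walker. This is only a sketch under my own assumptions: the `dir` type and the `walk` function are hypothetical stand-ins, not Gazelle's actual walker. It records a "Configure" event when entering a directory and a "GenerateRules" event after all of its children have been processed.

```go
package main

import "fmt"

// dir is a hypothetical stand-in for a directory in the repository tree.
type dir struct {
	rel      string // path relative to the repository root, "" for the root
	children []*dir
}

// walk records the order of events for a depth-first traversal:
// configuration on the way down, rule generation on the way back up.
func walk(d *dir, events []string) []string {
	events = append(events, fmt.Sprintf("Configure(%q)", d.rel))
	for _, child := range d.children {
		events = walk(child, events)
	}
	return append(events, fmt.Sprintf("GenerateRules(%q)", d.rel))
}

func main() {
	root := &dir{rel: "", children: []*dir{
		{rel: "folder-a"},
		{rel: "folder-b"},
		{rel: "folder-d", children: []*dir{{rel: "folder-d/subfolder-i"}}},
	}}
	for _, event := range walk(root, nil) {
		fmt.Println(event)
	}
}
```

In this toy tree, the root is configured before anything else and gets its rules generated last, while folder-a has its rules generated before folder-b has even been configured.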

The Problem

Recently, I was in a situation where I wanted to read certain rules from one BUILD file while Gazelle was processing other directories. An example:

/folder-a/BUILD
/folder-b/BUILD   <- our target
/folder-c/BUILD
/folder-d/subfolder-i/BUILD
BUILD
WORKSPACE

"I want to use the rules from folder-b/BUILD from folder-a, folder-c, folder-d/subfolder-i, and the like"

And I knew the following:

  • That specific BUILD file is in a known path, and won't move
  • That specific BUILD file is manually maintained
  • That specific BUILD file is not in the root directory

So we shouldn't rely on Gazelle's tree walker, because depending on where we want to use those rules, the file might not have been read yet. And even if you work out the traversal order and confirm that today the file is read early enough, tomorrow the Gazelle folks might implement a multi-threaded parallel traversal, and then everything breaks again.

All of this means that, even though the Config is the most common vessel for state transfer, waiting for the walker to reach that specific BUILD file and storing its rules in the Config at that point is not a viable approach.

The Solution

As we know exactly where the file resides, and we know that it will always exist, one approach that we can take is to read and parse the file ourselves. Gazelle already knows how to read BUILD files and parse them as ASTs, so we can write some code like the following:

import (
  // ...
  "log"
  "os"
  "path/filepath"

  "github.com/bazelbuild/bazel-gazelle/rule"
)

// ...

loadFolderBRules := func() {
  // `c` is the "master" `Config`, only available at a few methods
  filePath := filepath.Join(c.RepoRoot, "folder-b", "BUILD")
  fileContent, fileErr := os.ReadFile(filePath)
  if fileErr != nil {
    // Minimal handling; adapt it to your extension's needs
    log.Fatalf("reading %s: %v", filePath, fileErr)
  }

  targetData, dataErr := rule.LoadData(filePath, "", fileContent)
  if dataErr != nil {
    // Minimal handling; adapt it to your extension's needs
    log.Fatalf("parsing %s: %v", filePath, dataErr)
  }

  // For example, let's go through the rules
  // (the loop variable is named `r` to avoid shadowing the `rule` package)
  for _, r := range targetData.Rules {
    // Now we can store relevant info from the rule
    // `config` is the config passed to this specific node
    // You should also change your extension's logic to store `myProperty`
    //  and propagate it to its children
    config.myProperty[r.Name()] = r.Kind()
  }
}

In the example we can see that we obtain a nice File struct named targetData, populated with the parsed contents of the BUILD file.

Now we place the previous loadFolderBRules closure inside the Configure implementation:

loadFolderBRules := func() {
  // previous code here
}

// Do not place the `rel` check inside the `f != nil` block,
//  in case for some reason there is no root BUILD file
if f != nil {
  // ...
}

// This will happen before any `GenerateRules` call,
//  because `rel` will equal `""` when traversing the root node
if rel == "" {
  loadFolderBRules()
}

We have loaded that special BUILD file from folder-b only once, stored our desired data in the configuration, and that configuration will be automatically propagated to all the descendants. To use it, you just need to access the configuration parameter that GenerateRules will receive, and use the myProperty map.
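
The load-once-then-inherit flow can be sketched without Gazelle at all. Everything here is a stand-in of my own: `nodeConfig`, `configure`, and `kindOf` are hypothetical names that only mimic the shape of an extension's config, its Configure step, and a lookup made from GenerateRules. The rule name and kind in `main` are made up as well.

```go
package main

import "fmt"

// nodeConfig is a hypothetical stand-in for the per-directory extension config.
type nodeConfig struct {
	// myProperty maps rule names to rule kinds, as loaded from folder-b/BUILD.
	myProperty map[string]string
}

// configure mimics a Configure step. Only the root (rel == "") loads the data;
// every other directory inherits the parent's map. The map is shared instead of
// copied, which is only safe because nothing mutates it after loading.
func configure(parent *nodeConfig, rel string, load func() map[string]string) *nodeConfig {
	if rel == "" {
		return &nodeConfig{myProperty: load()}
	}
	return &nodeConfig{myProperty: parent.myProperty}
}

// kindOf mimics the lookup a GenerateRules implementation would perform.
func kindOf(c *nodeConfig, ruleName string) (string, bool) {
	kind, ok := c.myProperty[ruleName]
	return kind, ok
}

func main() {
	loads := 0
	load := func() map[string]string {
		loads++
		return map[string]string{"shared_lib": "my_library"} // made-up rule
	}

	root := configure(nil, "", load)
	folderD := configure(root, "folder-d", load)
	leaf := configure(folderD, "folder-d/subfolder-i", load)

	kind, found := kindOf(leaf, "shared_lib")
	fmt.Println(kind, found, "loads:", loads)
}
```

Sharing the map between nodes sidesteps most of the copying, at the cost of having to treat the data as read-only once the root has been configured.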

Remarks

The rel == "" trick might not be the cleanest approach, but it is the best I could come up with, as Gazelle in its current version lacks any kind of extension hook for when it begins to work.

Storing the data in the node config might also sound like a lot of unnecessary copying, but as mentioned in one of the comments, the "master config" is only available to a few methods (Configure is one of them). It would be a better destination for data that you read once and never mutate, but as of now it can't be accessed from GenerateRules, so we can't use it.
