Today, the Pants Project announced the release of Pants Build 1.0. Foursquare is a proud contributor to Pants, and we'd like to thank and congratulate our fellow contributors in the Pants community. Foursquare's developer workflow benefits greatly from Pants, especially because of Pants' caching of build artifacts, its dependency management, which enforces code hygiene, and its dependency graph, which allows easily compiling and testing all source affected by a change.
We have many extensions to Pants (generally in the form of Tasks), and the majority of them get caching for free, right out of the box. Generally, a Jenkins CI worker runs a variety of jobs (lint, codegen, compile, etc.) over each commit pushed upstream. These jobs benefit from the existing cache, so they run faster, and they also populate the cache with new results.
For developers iterating on a change, the majority of build results can be pulled down from the remote cache, so results only need to be recomputed for what the developer has changed. This reduces iteration time significantly and allows plugin developers to write more aggressive automated checks, since any given check will only need to be run incrementally. More aggressive automated checks, in turn, allow developers to focus on the aspects of programming that cannot be automated.
But the star of the caching story is Scala: Pants' Zinc/scalac wrapper emits artifacts to the remote cache as soon as each target is successfully compiled, and double-checks the cache before committing to compiling a given target. This “eager write, lazy read” strategy, coupled with randomized target ordering in very large compiles, allows several CI jobs running in parallel over similar target sets to avoid duplicating effort. We now have a CI job that compiles the entire Scala codebase for every commit pushed upstream. In practice, this job builds only a few dozen targets each time; the vast majority of the build is already cached. This job makes it easier to track down which commit broke the build, and keeps the remote cache fully populated so that developers never have to recompile code that they didn't change.
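The “eager write, lazy read” idea can be illustrated with a minimal Python sketch. This is not Pants' actual implementation: `RemoteCache`, `compile_all`, and `compile_fn` are hypothetical names, the target name stands in for a real fingerprint of the target's sources, and dependency ordering between targets is ignored for brevity.

```python
import random


class RemoteCache:
    """Hypothetical stand-in for a shared artifact cache (not a Pants API)."""

    def __init__(self):
        self._store = {}

    def has(self, key):
        return key in self._store

    def get(self, key):
        return self._store[key]

    def put(self, key, artifact):
        self._store[key] = artifact


def compile_all(targets, cache, compile_fn, rng=None):
    """Compile targets in random order and return the ones actually built.

    Assumption: the target name doubles as its cache key. A real build tool
    would key on a fingerprint of the sources and dependencies instead.
    """
    rng = rng or random.Random()
    order = list(targets)
    # Randomized ordering: parallel jobs tend to start on different targets.
    rng.shuffle(order)

    built = []
    for target in order:
        # Lazy read: consult the cache at the last moment, since another
        # job may have published this target while we were busy.
        if cache.has(target):
            continue
        artifact = compile_fn(target)
        # Eager write: publish as soon as this one target succeeds.
        cache.put(target, artifact)
        built.append(target)
    return built
```

With a cache that has already been populated by an earlier run, a second call only builds the targets that are new, which mirrors the "few dozen targets per commit" behavior described above.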
In a very large codebase, it is critical to keep internal code dependencies under control. Without automatic enforcement of dependency rules, it is almost impossible to prevent circular dependencies, hairballs, or bad cross dependencies (e.g. library code depending on test code). Pants keeps our code very granular (one build target per JVM package per directory) and allows us to enforce arbitrary graph rules with easy-to-write tasks. Buildgen ensures that our BUILD file dependencies always represent the true dependencies of the source code in the target. Developers know instantly — in the form of a build break — if they have introduced a dependency that violates one of our rules. Some examples:
- Library code may not depend on test code.
- Service A may not depend on the concrete implementation of Service B, only the interface.
- New services may not depend on known-bad “hairball” models. The models can't be killed off entirely, but they can at least be contained. (Internally, we've taken to describing code as being in a “hairball” when it is an extremely tangled mess of dependencies.)
- Common code cannot depend on non-common code.
- Open source code cannot depend on closed source code.
The configuration for these rules is simple and flexible, and the check itself is almost instantaneous. These automatic checks keep the repo sanitized so that developers don't have to.
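As a rough illustration of how such a graph rule can be checked, here is a small Python sketch. It is an assumption-laden toy, not our actual task or a Pants API: `find_violations`, the tag names, and the rule format are all hypothetical, and it only inspects direct dependency edges.

```python
def find_violations(deps, tags, forbidden):
    """Return dependency edges that break a rule.

    deps:      maps each target to the targets it depends on directly.
    tags:      maps each target to a set of labels (e.g. "library", "test").
    forbidden: set of (source_tag, dependency_tag) pairs that are not allowed.

    All three structures are hypothetical; a real check would read them
    from the build graph and from rule configuration.
    """
    violations = []
    for source, direct_deps in deps.items():
        for dep in direct_deps:
            for source_tag in tags.get(source, ()):
                for dep_tag in tags.get(dep, ()):
                    if (source_tag, dep_tag) in forbidden:
                        violations.append((source, dep, source_tag, dep_tag))
    # Sort so that the report is stable from run to run.
    return sorted(violations)


# Example graph: library code reaching into test code breaks the first rule.
deps = {
    "src/lib": ["test/helpers"],
    "src/app": ["src/lib"],
}
tags = {
    "src/lib": {"library"},
    "src/app": {"service"},
    "test/helpers": {"test"},
}
forbidden = {("library", "test"), ("opensource", "closed")}

violations = find_violations(deps, tags, forbidden)
```

A task built along these lines would fail the build whenever the returned list is non-empty, which is how a developer finds out immediately about a bad dependency.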
Likely the most frequently run Pants command at Foursquare is ./pants test-changed (and its equivalent for compilation, compile-changed), which underlies most of our convenience scripts and workflow guidelines for developers. test-changed takes advantage of the user's SCM (git in our case) and Pants' dependency graph to compile and test the targets that the user has changed locally. By default, it also compiles and tests the direct dependees of the changed targets. (If A depends on B, then we say A is a dependee of B.) test-changed can optionally run against all transitive dependees, or no dependees at all.
This workflow shows off the power of Pants: with a single command, the user can compile and test exactly what could have been broken by their change, no more and no less. The developer doesn't need to worry about, or git grep for, what other code might depend on their change, because all of the information needed to confirm that the change is safe can be inferred automatically. And of course the result is cached locally, so further invocations (without more source changes) will be quick no-ops.
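The dependee expansion behind this workflow can be sketched in a few lines of Python. This is a simplified illustration rather than Pants' own code: `invert`, `affected_targets`, and the mode names are hypothetical, and the step that maps files changed in git to build targets is assumed to have already happened.

```python
from collections import deque


def invert(deps):
    """Build a dependee map: for each target, who depends on it directly."""
    dependees = {target: set() for target in deps}
    for target, direct_deps in deps.items():
        for dep in direct_deps:
            dependees.setdefault(dep, set()).add(target)
    return dependees


def affected_targets(deps, changed, mode="direct"):
    """Return the changed targets plus the dependees selected by `mode`.

    mode is one of "none", "direct", or "transitive" (hypothetical names).
    """
    if mode not in ("none", "direct", "transitive"):
        raise ValueError("unknown mode: %r" % (mode,))

    result = set(changed)
    if mode == "none":
        return result

    dependees = invert(deps)
    if mode == "direct":
        for target in changed:
            result |= dependees.get(target, set())
        return result

    # Transitive: breadth-first walk up the dependee edges.
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependee in dependees.get(current, ()):
            if dependee not in result:
                result.add(dependee)
                queue.append(dependee)
    return result


# "app" depends on "service", which depends on "util".
deps = {"app": ["service"], "service": ["util"], "util": []}
```

With this graph, a change to `util` selects only `util` in "none" mode, adds `service` in "direct" mode, and also adds `app` in "transitive" mode.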
We maintain a loose source plugin in our repo for various custom tasks and targets. “Loose source” means that the Python sources sit directly in the repo and are run straight from code, so there is no need to reinstall anything in order for changes to take effect. You can see some of our open source examples here. It is extremely valuable to be able to quickly prototype and deploy extensions to the build tool. From trivial tasks like linters to an alternative JVM dependency resolver, Pants allows us to easily inject our own logic into the build pipeline.
— Patrick Lawson and Mateo Rodriguez (@mateornaut) plus the rest of the Foursquare #build team
P.S. We're hiring!