Rust 2021

Posted on Oct 3, 2020

This is a contribution to the Rust 2021 Roadmap Blogs request.

As a developer using Rust in a proprietary, polyglot code base, I want Rust to improve non-Cargo builds

My story comes from spending the better part of a year migrating a roughly 700,000-line code base (and several hundred upstream dependencies) to build using Bazel circa 2019. As Rust becomes more popular, a large chunk of developers will want to use it as a complement to existing code in other languages. Many of these developers will already be using other build systems like CMake and Bazel. A developer in such a situation has 2 options:

  1. Use Cargo just for the Rust specific parts, and have the other build system simply call into Cargo.
  2. Spend a lot of effort integrating rustc into their build system, solving all the problems that Cargo solves, so they can continue to use existing workflows throughout the build.

The former isn’t great because it trips up the main build system’s job server and makes it harder for the build system to track changes in inputs and outputs. The outer build system can only observe the outputs of the final crate and loses all granularity for intermediate steps, so it is forced to spawn Cargo on every build. It can also break tools that provide common workflows regardless of language (such as bazel test).

In addition, while Cargo is the officially supported tool and the gold standard for pure-Rust projects, I’d posit that it has its own deficiencies, and there is no need for Cargo to solve them when other build systems already have. These include remote caching, selective testing and remote execution. These features can be useful even for open source projects, where CI runs can share a cache to be faster and cheaper. Enabling Rust projects to easily adopt build systems like Bazel or Buck and benefit from their head start is a worthy goal.

The Rust team is generally cognizant of this and has already put a lot of work into interoperating with other build systems. Nevertheless, there are areas for improvement.

Cargo today does 2 different things:

  1. Build system - A cargo build is equivalent to running make or bazel, and Cargo decides which crates need to be rebuilt and then schedules them to optimize compilation time.
  2. Package management - It determines what dependencies are required to build a crate, and fetches them, generating a lockfile.

Improvements to building

Using rustc at a crate level is already possible from other build systems. However, some technicalities make it difficult to build arbitrary crates and to get the full cross-crate build speedups that Cargo can achieve, and I would like to see those fixed.
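To make "using rustc at a crate level" concrete, here is a minimal sketch of how a build rule might assemble a single rustc invocation. The CrateSpec struct and the rustc_args function are hypothetical names invented for this illustration; the flags themselves (--crate-name, --crate-type, --edition, --out-dir, --cfg, --extern) are standard rustc options. It is not how any particular Bazel or Buck rule is implemented.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical description of one crate, as a non-Cargo build rule might hold it.
struct CrateSpec {
    name: String,
    root: PathBuf,                // crate root, e.g. src/lib.rs
    edition: String,              // e.g. "2018"
    crate_type: String,           // e.g. "rlib" or "bin"
    deps: Vec<(String, PathBuf)>, // (extern crate name, path to its compiled .rlib)
    cfgs: Vec<String>,            // values passed via --cfg, e.g. feature="std"
}

/// Build the argument list for one direct rustc invocation.
fn rustc_args(spec: &CrateSpec, out_dir: &Path) -> Vec<String> {
    let mut args = vec![
        "--crate-name".to_string(),
        spec.name.clone(),
        "--crate-type".to_string(),
        spec.crate_type.clone(),
        "--edition".to_string(),
        spec.edition.clone(),
        "--out-dir".to_string(),
        out_dir.display().to_string(),
    ];
    // Cargo features reach the compiler as plain --cfg values.
    for cfg in &spec.cfgs {
        args.push("--cfg".to_string());
        args.push(cfg.clone());
    }
    // Each direct dependency is named explicitly, so the build system
    // knows every input of this step before running it.
    for (name, rlib) in &spec.deps {
        args.push("--extern".to_string());
        args.push(format!("{}={}", name, rlib.display()));
    }
    // The crate root source file goes last.
    args.push(spec.root.display().to_string());
    args
}
```

The resulting vector could be handed to std::process::Command::new("rustc").args(...). A real rule would also need -L dependency=... search paths so that rustc can locate transitive dependencies, which this sketch leaves out.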

Move away from build scripts

This is possibly the biggest challenge for those attempting to use the crate ecosystem outside of Cargo. I’m talking about build.rs scripts and their ability to do anything. The most common one I hit is from the proc-macro2 crate, which uses a feature that is always emitted from its build script and which has to be worked around. I’d like to see dependencies become declarative. I’d like to see a move away from patterns like rerun-if-changed, which are impossible to express in build systems like Bazel, where the dependencies must be known prior to even the first build (except in limited C/C++ cases). When scripts try to find and link against native libraries, there should always be an easy way to short-circuit all those checks and simply link against a known library. The documentation and conventions should encourage crate authors to support such workflows.
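As an illustration of the short-circuit idea, here is a minimal sketch of the decision a build script could make. The function name link_directives, the library name foo and the FOO_LIB_DIR environment variable are all hypothetical; only the cargo:rustc-link-search and cargo:rustc-link-lib directives are real Cargo build script outputs.

```rust
/// Decide which link directives a build script should print.
/// `override_dir` stands for the value of a hypothetical `FOO_LIB_DIR`
/// environment variable set by the outer build system.
fn link_directives(lib_name: &str, override_dir: Option<&str>) -> Vec<String> {
    match override_dir {
        // The outer build system already knows where the library lives:
        // skip all probing and emit fixed link instructions.
        Some(dir) => vec![
            format!("cargo:rustc-link-search=native={}", dir),
            format!("cargo:rustc-link-lib={}", lib_name),
        ],
        // No override: a real script would probe the system here
        // (pkg-config or similar); this sketch relies on the default search path.
        None => vec![format!("cargo:rustc-link-lib={}", lib_name)],
    }
}
```

A build.rs would read the variable with std::env::var("FOO_LIB_DIR").ok(), pass it in, and print each returned line. Because the override path performs no discovery, a tool like Bazel can supply the same library to the link step without running the script at all.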

Document implicit cargo<->rustc contracts

Cargo and rustc currently rely on some conventions that are not well specified or documented. Alternate build implementations have to go read the source code or infer behavior from limited -C help output. This should be better specified. For example, it is not clear exactly what external inputs affect incremental compilation. Pipelined compilation is another area which works with Cargo, but can’t be implemented in language-agnostic build systems in the current model. It would be nice to see an interface better suited to these systems.

Improvements to package management

A combination of cargo vendor and cargo metadata has generally solved the package management problem. It is easy for external tools to run these and use the information to generate relevant build configuration. My requests here are only around specification and documentation.
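A rough sketch of the "generate build configuration" step follows. It assumes an earlier step has already parsed the cargo metadata output into the hypothetical Package struct below, and that crates were vendored under a vendor/ directory. The emitted text is loosely modeled on a rules_rust rust_library target and is not claimed to match what cargo-raze actually produces.

```rust
/// Hypothetical, trimmed-down view of one package, assumed to be filled in
/// from `cargo metadata` output by an earlier step.
struct Package {
    name: String,
    edition: String,
    deps: Vec<String>, // names of direct dependencies
}

/// Render one Bazel-style target for a vendored crate.
fn render_rule(pkg: &Package) -> String {
    let mut out = String::new();
    out.push_str("rust_library(\n");
    out.push_str(&format!("    name = \"{}\",\n", pkg.name));
    // Assumption: every .rs file under the crate directory is a source file.
    out.push_str("    srcs = glob([\"**/*.rs\"]),\n");
    out.push_str(&format!("    edition = \"{}\",\n", pkg.edition));
    out.push_str("    deps = [\n");
    for dep in &pkg.deps {
        // Assumption: each vendored crate lives in its own package under //vendor.
        out.push_str(&format!("        \"//vendor/{}\",\n", dep));
    }
    out.push_str("    ],\n)\n");
    out
}
```

This ignores features, build scripts, multiple versions of the same crate, and crate names containing hyphens, which is where most of the real work in such generators lies.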

It would be good to have documentation and guides about using these tools with common build systems like CMake or Bazel, including links to existing community tooling like cargo-raze and rules_rust so that newcomers to Rust are aware of their options.

I would like to see the exact semantics of dependency resolution specified so that the same resolution mechanism can be implemented as an API, either in Rust or in the language a build tool is written in, and these can be run independently of Cargo as long as the toml files are present.
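To show what a small piece of such a specification might look like as code, here is a sketch of one resolution rule: a caret-style requirement such as 1.2.3 accepts any version that is at least 1.2.3 and keeps the left-most non-zero component unchanged, and the highest matching candidate is chosen. The Version type and the parse, caret_matches and resolve functions are hypothetical. The sketch ignores pre-release versions, features, lockfiles and unification across the dependency graph, so it is an approximation and not Cargo's actual algorithm.

```rust
/// A plain major.minor.patch version; pre-release and build metadata are ignored.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Parse exactly three dot-separated numeric components.
fn parse(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let v = Version {
        major: parts.next()??,
        minor: parts.next()??,
        patch: parts.next()??,
    };
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(v)
}

/// Caret compatibility: the candidate must be at least `req` and must not
/// change the left-most non-zero component of `req`.
fn caret_matches(req: Version, cand: Version) -> bool {
    if cand < req {
        return false;
    }
    if req.major > 0 {
        cand.major == req.major
    } else if req.minor > 0 {
        cand.major == 0 && cand.minor == req.minor
    } else {
        cand.major == 0 && cand.minor == 0 && cand.patch == req.patch
    }
}

/// Pick the highest candidate that is compatible with `req`, if any.
fn resolve(req: Version, candidates: &[Version]) -> Option<Version> {
    candidates
        .iter()
        .copied()
        .filter(|c| caret_matches(req, *c))
        .max()
}
```

Even this toy version shows why a written specification matters: each branch in caret_matches is a rule that an independent implementation would otherwise have to reverse-engineer.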