Rust doesn’t have namespaces in its package management system. It’s often viewed as a bug. But it’s not a bug, it’s a feature! While there are negative aspects of a flat package registry, there are also real benefits: stability, continuity, and unity (it discourages forks and fragmented identity). Proposals that seek to add namespacing without addressing the positive aspects they remove probably won’t be accepted.
I have noticed the benefits of the current system seem to only get mentioned in passing as objections to proposals and never outlined anywhere. This is an attempt to fix that by summarizing points raised across the various proposals I’ve read. While I don’t represent the crates.io team (I’m not even on the team) I hope to accurately represent trade-offs being considered.
Aspects of Registry Structure
How identity works in package management has far-reaching consequences. Most of the namespace proposals I’ve seen have been motivated by trying to address squatting and/or tweaking the current system of package identity. However, the structure of the crates.io registry affects more than just those areas. But we’ll start with the basics of identity and work from there.
How do you refer to a package? A crate has at least four identities I can think of:
- The name on crates.io - there is exactly one crate per name
- The name used in Cargo.toml - there is exactly one crate per name
- The default name used in code - there can be more than one crate per name, though that is rare in practice
- The actual name used in code - this can be controlled through Cargo.toml or extern crate statements, but renaming isn’t required
The first two both come from package.name in the Cargo.toml of the crate being published. The third can be overridden via lib.name in the package being published. The last is user-controlled. Usually, all of these names match, with the caveat that dashes are underscores in code (and crates.io doesn’t allow two crates with identical crates.io names after normalizing dashes to underscores).
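The uniqueness rule is easy to picture as code. Here is a minimal sketch, assuming a toy in-memory registry: `FlatRegistry`, `normalize`, and `publish` are names I made up for illustration, not anything from the real crates.io codebase, and the only normalization modeled is the dash/underscore one described above.

```rust
use std::collections::HashMap;

/// Treat dashes and underscores as the same character, as described above.
fn normalize(name: &str) -> String {
    name.replace('-', "_")
}

/// Hypothetical stand-in for a flat registry (not crates.io's real code).
#[derive(Default)]
struct FlatRegistry {
    // normalized name -> name as it was originally published
    names: HashMap<String, String>,
}

impl FlatRegistry {
    /// Claim a name. On a collision, return the name that already holds
    /// the normalized slot.
    fn publish(&mut self, name: &str) -> Result<(), String> {
        let key = normalize(name);
        if let Some(existing) = self.names.get(&key) {
            return Err(existing.clone());
        }
        self.names.insert(key, name.to_string());
        Ok(())
    }
}
```

In this sketch, publishing `my-crate` and then `my_crate` fails on the second call, because both names normalize to the same key. That is the sense in which each name maps to exactly one crate.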
Arguably, self-explanatory identities have a leg up on other identities from a discoverability perspective. E.g. argparse probably seems more reputable at first glance than clap if you’re going off name alone.
A flat registry makes identity management (naming a crate) harder. You either have to pick a GUID (haha, please don’t) or some memorable (but probably mostly or completely unrelated) identity. I see this as the main driving force for proposals seeking to add namespaces or otherwise address squatting.
Currently, identity is continuous - a crate’s identity is immutable and that has real benefits. If you want to change the identity system at all you’ve got to ensure that identities don’t change out from under you. This is a strike against any namespace system that allows namespace ownership to unexpectedly change. Discontinuous identity has a couple of issues.
First, if a crate’s name can change, that’s bad for users. They have to go figure out the new name of the package if they want to update.
Second, if an identity’s crate can change (a consequence of the previous point if identities are reusable), then you’ve introduced a security vulnerability. Updating to a new package version with different content under different ownership is a real security risk. Doubly so if you don’t ban new minor versions on the last major version after an unintentional ownership change. Should people audit their crate? Yes! But the fewer foot-guns we have the better.
In addition to preventing security issues, proposals need to encourage transitions over transactions: gradual moves over all-or-nothing moves. This could be seen more as compatibility than continuity. It drives things like Rust editions and the need for namespace proposals to be backward compatible.
A core tenet of Rust is stability. The obvious definition is that code that compiled yesterday should compile today (even with a new compiler).
A less obvious definition is that adding new dependencies shouldn’t stop you from compiling already working code. This is a major motivation for the orphan rule (though the orphan rule is more nuanced than that). This is a strike against schemes that encourage multiple distinct crates to have the same default name in code. I don’t think any proposal that encourages this could be approved. It also suggests that we ought to ban new instances of a crate’s default code name deviating from its package/Cargo.toml name.
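To make the concern concrete, here is a rough sketch of a check for dependencies that share a default code name. `Dep`, `code_name`, and `code_name_collisions` are hypothetical names, and the model is deliberately simplified. It is not how Cargo resolves names, just an illustration of the clash.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified dependency: a package name plus an optional
/// lib.name override.
struct Dep<'a> {
    package: &'a str,
    lib_name: Option<&'a str>,
}

impl Dep<'_> {
    /// Default name in code: lib.name if set, otherwise the package name,
    /// with dashes turned into underscores.
    fn code_name(&self) -> String {
        self.lib_name.unwrap_or(self.package).replace('-', "_")
    }
}

/// Return every default code name claimed by more than one package,
/// along with the packages that claim it.
fn code_name_collisions(deps: &[Dep]) -> Vec<(String, Vec<String>)> {
    let mut by_code: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for dep in deps {
        by_code
            .entry(dep.code_name())
            .or_default()
            .push(dep.package.to_string());
    }
    by_code
        .into_iter()
        .filter(|(_, packages)| packages.len() > 1)
        .collect()
}
```

If a made-up package `acme-http` set its lib.name to `http`, then depending on both it and `http` would show up here as a collision on the code name `http`. Adding a dependency has just created an ambiguity in code that worked before.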
In addition to code stability, crates.io should be stable too. It should be able to isolate itself from outside services. It currently depends on GitHub, but it doesn’t have to. This is a strike against any system that weds namespace identity to any externally managed system.
The current identity system encourages squatting. I would define squatting as reserving a crate without actually using it. This is a natural outcome in the Rust ecosystem for a couple of reasons:
- Crates are easy to publish, so it’s easy to reserve a crate by publishing an empty crate
- We have de-facto namespaces using prefixes - serde-* is one example.
- There can only ever be one version of a package name. There is only one http crate for example. So package names are a scarce resource.
There’s a lot of squatting on crates.io. I don’t have any hard numbers though.
We don’t have any structured support for squatting either, which makes it hard to separate bad actors from good actors. I think separating them out would require manual intervention, and the crates.io team is small and doesn’t have a lot of time to put towards that.
So what’s a bad actor? I consider someone who squats a bunch of crates to make a point or to prevent their names from being used to be a bad actor. Crates.io has a policy against using automation to claim ownership of crates, but I haven’t heard much about it being enforced. Again, time is an issue. And this would probably extend to namespace ownership as well.
I do think there are legitimate uses. Reserving a set of extension crates for a project you’re working on is one example. My other example is reserving a crate you genuinely intend to work on (this one is a little more debatable).
It’s about the Trade-offs
With this system in mind, it’s hard to come up with a namespace proposal that doesn’t hurt the current system in some way.
I’d say the biggest tension is in code-level identities. They become either overlapping, by chopping the namespace portion off, or much longer, by including the namespace portion. The previous points show why overlapping is generally considered a non-starter. And for longer identities, you need to come up with some new way to reference namespaces in code that doesn’t break things.
More to the point though, from what I’ve seen, most people proposing identity schemes propose it so that the ecosystem can support the overlapping crate identities (e.g. multiple http crates). And that’s exactly what the current system avoids doing.
Here’s an on-point quote from CAD97 summing up (hopefully fairly accurately) what namespaces mean to the parties involved:
To the crates team, it seems to primarily mean that a project can put multiple packages together under an umbrella, such that you know the packages are for-sure by the project.
To the people who feel most slighted by the crates team’s approach here, namespaces primarily mean the ability to publish a crate with a desired name even if there’s already a package published that provides a crate with that name, by putting the package into a different namespace such that the names do not clash.
If the latter party asks about “namespaces” and means the latter, and the team answers and means the former, you can see where the miscommunication enters, especially since the crates team has now communicated here the position that generally, the package.name == lib.name falsehood should not be made more false; i.e., the latter group’s goal is an explicitly non-desired property from the crates team.
Where to Now?
There have been other proposals for namespaces that are less about overlapping identity and more about curation and grouping related crates together. There are multiple proposals there, each with their trade-offs. Expect a follow-up post discussing some of those.
Appendix A: Highlights
There’s a nice set of a dozen or so comments I saved as I reviewed past discussions:
Carol10Cents (of the crates.io team):
sgrif (on the crates.io team):
withoutboats (on the crates.io team):
ag_dubs (on the crates.io team):
pietroalbini (on the crates.io team):
Random meeting notes:
Appendix B: Previous Discussions
There have been several attempts at this:
- https://internals.rust-lang.org/t/pre-rfc-idea-cratespaces-crates-as-namespace-take-2-or-3/11320 (This was mine from last year after reading through the other proposals; my take on things has shifted a little since then)
And the subjects of namespacing and squatting have been broached in major threads about crates.io policies:
2020-09-10 23:10 +0000