| feature | simple-package-paths |
|---|---|
| start-date | 2022-09-02 |
| author | Silvan Mosberger (@infinisil) |
| co-authors | Robert Hensing (@roberth) |
| pre-RFC reviewers | Thomas Bereknyei (@tomberek), John Ericson (@Ericson2314), Alex Ameen (@aakropotkin) |
| shepherd-team | @phaer @06kellyjac @aakropotkin @piegamesde |
| shepherd-leader | - |
| related-issues | https://github.com/NixOS/nixpkgs/pull/237439, https://github.com/NixOS/nixpkgs/pull/211832 |
Summary
Auto-generate trivial top-level attribute definitions in Nixpkgs' pkgs/top-level/all-packages.nix from a directory structure that matches the attribute name.
This makes it much easier to contribute new packages, since there's no more guessing needed as to where the package should go, both in the ad-hoc directory categories and in all-packages.nix.
Motivation
- It is not obvious to package contributors where to add files or which ones to edit. These are very common questions:
  - Which directory should my package definition go in?
  - What are all the categories and do they matter?
  - What if the package has multiple matching categories?
  - Why can't I build my package after adding the package file?
  - Where in all-packages.nix should my package go?
- Figuring out where an attribute is defined is a bit tricky:
  - First one has to find the definition of it in all-packages.nix to see what file it refers to
    - On GitHub this is even more problematic, as the all-packages.nix file is too big to be displayed by GitHub
  - Then go to that file's definition, which takes quite some time for navigation (unless you have a plugin that can jump to it directly)
  - It also slows down or even locks up editors due to the file size
  - nix edit -f . package-attr works, though that's not yet stable (it relies on the nix-command feature being enabled) and doesn't work with packages that don't set meta.position correctly.
- all-packages.nix frequently causes merge conflicts. It's a point of contention for all new packages
Detailed design
This RFC consists of two parts, each of which is implemented with a PR to Nixpkgs. These PRs should be done after a release to maximize the testing period and minimize merge conflicts.
PR 1: The directory structure
This part establishes the new directory structure in Nixpkgs. This directory structure is internal to Nixpkgs and not exposed as a public interface. This directory structure must be documented in the Nixpkgs manual. This PR will be backported to the stable release in order to ensure that backports of new packages work.
File structure
Create the initially-empty pkgs/by-name directory in Nixpkgs, and migrate the hello package into it.
Check the following using CI:
- pkgs/by-name must only contain subdirectories of the form ${shard}/${name}, called package directories.
- The names of package directories must be unique when lowercased.
- name is a string only consisting of the ASCII characters a-z, A-Z, 0-9, - or _.
- shard is the lowercased first two letters of name, expressed in Nix: shard = toLower (substring 0 2 name).
- Each package directory must contain a package.nix file and may contain arbitrary other files.
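The naming rules above can be sketched in Nix. This is a minimal illustration under stated assumptions, not the actual CI implementation: shardOf follows the shard definition from the list, while isValidName, packagePathOf and lowercaseCollisions are hypothetical helper names, and <nixpkgs/lib> is assumed to resolve via NIX_PATH.

```nix
let
  # Assumption: <nixpkgs/lib> is available; only library functions are used.
  lib = import <nixpkgs/lib>;

  # Shard as defined above: the lowercased first two characters of the name.
  shardOf = name: lib.toLower (builtins.substring 0 2 name);

  # A valid name consists only of a-z, A-Z, 0-9, "-" or "_".
  isValidName = name: builtins.match "[a-zA-Z0-9_-]+" name != null;

  # Hypothetical helper: where package.nix for a given name is expected.
  packagePathOf = name: "pkgs/by-name/${shardOf name}/${name}/package.nix";

  # Hypothetical helper: groups of names that collide once lowercased.
  lowercaseCollisions = names:
    lib.filterAttrs (_: group: builtins.length group > 1)
      (lib.groupBy lib.toLower names);
in
{
  helloPath = packagePathOf "hello";
  chowShard = shardOf "CHOWTapeModel";
  dottedNameIsValid = isValidName "foo.bar";
  collisions = lowercaseCollisions [ "Foo" "foo" "bar" ];
}
```

Evaluating this with nix-instantiate --eval --strict shows that hello maps to the he shard, that a name containing a dot is rejected, and that Foo and foo are reported as a collision.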
Semantics
Introduce code to automatically define pkgs.${name} for each package directory as a value equivalent to
pkgs.callPackage pkgs/by-name/${shard}/${name}/package.nix { }
Optionally there may also be an overriding definition of pkgs.${name} in pkgs/top-level/all-packages.nix equivalent to
pkgs.callPackage pkgs/by-name/${shard}/${name}/package.nix args
with an arbitrary args.
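A minimal sketch of how the automatic definitions could be generated as an overlay is shown here. It is an illustration only, not necessarily how Nixpkgs implements it: baseDir is an assumed parameter holding a path such as ./pkgs/by-name, and packagesInShard and packageFiles are hypothetical names.

```nix
# Hypothetical overlay constructor; `baseDir` is a path such as ./pkgs/by-name.
baseDir:
let
  inherit (builtins) attrNames concatMap listToAttrs mapAttrs readDir;

  # All { name, value } pairs for the package directories inside one shard.
  packagesInShard = shard:
    map (name: {
      inherit name;
      value = baseDir + "/${shard}/${name}/package.nix";
    }) (attrNames (readDir (baseDir + "/${shard}")));

  # Attribute set mapping each package name to its package.nix path.
  packageFiles =
    listToAttrs (concatMap packagesInShard (attrNames (readDir baseDir)));
in
# The overlay: every discovered name becomes a callPackage call with empty
# arguments. How an overriding definition in all-packages.nix takes
# precedence is outside the scope of this sketch.
final: prev: mapAttrs (name: file: final.callPackage file { }) packageFiles
```

The sketch relies on the CI rule that pkgs/by-name contains nothing but ${shard}/${name} directories, which is why it can treat every readDir entry as a shard or a package directory without filtering.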
Check the following using CI for each package directory:
- pkgs.${name} is defined as above, either automatically or with some args in pkgs/top-level/all-packages.nix.
- pkgs.${name} is a derivation.
- The package.nix file evaluated from pkgs.${name} must not access files outside its package directory.
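The second check in this list can be approximated with a short Nix expression. This is a sketch under assumptions: names is a hypothetical list of package directory names that a real check would derive from pkgs/by-name, and <nixpkgs> is assumed to point at the checkout being tested. It does not cover the other two checks.

```nix
# `names` is a hypothetical input, e.g. the directory names found in pkgs/by-name.
{ pkgs ? import <nixpkgs> { }, names ? [ "hello" ] }:
let
  inherit (pkgs) lib;

  # A name passes if pkgs.${name} exists and is a derivation.
  isOk = name: builtins.hasAttr name pkgs && lib.isDerivation pkgs.${name};

  failing = builtins.filter (name: !(isOk name)) names;
in
if failing == [ ]
then "all package directories define derivations"
else throw "not derivations: ${builtins.concatStringsSep ", " failing}"
```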
PR 2: Automated migration
Automatically migrate to the new directory structure for all satisfying definitions pkgs.${name}, meaning derivations defined as above using callPackage.
However automatic migration is only possible if:
- Files don't need to be changed, only moved, with the exception of pkgs/top-level/all-packages.nix
- The Nixpkgs package evaluation result does not change
All satisfying definitions that can't be automatically migrated due to the above restrictions will be added to a CI exclusion list. CI is added to ensure that all satisfying definitions except the CI exclusion list must be using the new directory structure. This means that the new directory structure becomes mandatory for new satisfying definitions after this PR. The CI exclusion list should be removed eventually once the non-automatically-migratable satisfying definitions have been manually migrated. Only in very limited circumstances is it allowed to add new entries to the CI exclusion list.
Non-automatic updates may also be done to ensure further correctness, such as
- GitHub's CODEOWNERS
- Update scripts like this
- The Nixpkgs manual like here
This PR will cause merge conflicts with all existing PRs that modify moved files, however they can trivially be rebased using git rebase && git push -f.
Because of this, merging of this PR should be widely announced with a pinned issue on the Nixpkgs issue tracker and a Discourse post.
Additionally this PR can benefit from being merged after a release due to the decreased PR count, leading to fewer conflicts.
Examples
To add a new package pkgs.foobar to Nixpkgs, one only needs to create the file pkgs/by-name/fo/foobar/package.nix.
No need to find an appropriate category nor to modify all-packages.nix anymore.
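For illustration, such a package.nix could look as follows. The package foobar, its version, URL and metadata are hypothetical placeholders; only the file location and the requirement that the file contains a function suitable for callPackage come from this RFC.

```nix
# pkgs/by-name/fo/foobar/package.nix (hypothetical package)
{ lib, stdenv, fetchurl }:

stdenv.mkDerivation (finalAttrs: {
  pname = "foobar";
  version = "1.0.0";

  src = fetchurl {
    # Placeholder URL and hash, to be replaced with real values.
    url = "https://example.org/foobar-${finalAttrs.version}.tar.gz";
    hash = lib.fakeHash;
  };

  meta = {
    description = "Hypothetical example package";
    license = lib.licenses.mit;
  };
})
```

With this single file in place, nix-build -A foobar from the Nixpkgs root would be expected to pick up the package.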
With some packages, the pkgs/by-name directory may look like this:
pkgs
└── by-name
├── _0
│ ├── _0verkill
│ └── _0x
┊
├── ch
│ ├── ChowPhaser
│ ├── CHOWTapeModel
│ ├── chroma
│ ┊
┊
├── t
│ └── t
┊
Interactions
Shard distribution
The sharded structure leads to a distribution as follows:
- There's 17305 total non-alias top-level attribute names in Nixpkgs revision 6948ef4deff7
- These are split into 726 shards
- The top three shards are:
  - "li": 1092 values, coming from the common lib prefix
  - "op": 260 values
  - "co": 252 values
- There's only a single directory with over 1 000 entries, which is notably GitHub's display limit, so this means only 92 attributes would be harder to see on GitHub
These stats are also similar for other package sets, should this directory structure be adopted for them in the future.
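A distribution like the one above could be computed along these lines. This is a sketch only: the names list is a small illustrative sample (libfoo and libbar are made up), whereas the figures in this RFC were computed over all non-alias top-level attribute names, and <nixpkgs/lib> is assumed to be available.

```nix
let
  lib = import <nixpkgs/lib>;

  shardOf = name: lib.toLower (builtins.substring 0 2 name);

  # Illustrative sample; a real run would use the top-level attribute names.
  names = [ "libfoo" "libbar" "hello" "ChowPhaser" "chroma" "t" ];

  # Number of names per shard.
  shardSizes = lib.mapAttrs (_: builtins.length) (lib.groupBy shardOf names);
in
{
  inherit shardSizes;
  shardCount = builtins.length (builtins.attrNames shardSizes);
  # Shards that would exceed GitHub's display limit of 1 000 entries.
  oversized = lib.filterAttrs (_: size: size > 1000) shardSizes;
}
```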
Migration size
Due to the limitations of the new directory structure, only a limited set of top-level attributes can be automatically migrated:
- No attributes that aren't derivations like pkgs.fetchFromGitHub or pkgs.python3Packages
- No attributes defined using non-pkgs.callPackage functions like pkgs.python3Packages.callPackage or pkgs.haskellPackages.callPackage. In the future we might consider having a separate namespace for such definitions.
Concretely this can be computed to be 81.2% (14036) attributes out of the 17280 total non-alias top-level Nixpkgs attributes in revision 6948ef4deff7.
And the initial automatic migration will be a bit more limited due to the additional constraints:
- No attributes that share common files with other attributes like pkgs.readline
- No attributes that reference files from other packages like pkgs.gettext

These attributes will need to be moved to the new directory structure manually with some arguably-needed refactoring to improve reusability of common files.
Package locations
nix edit and search.nixos.org will automatically point to the new location without problems, since they rely on meta.position to get the file to edit, which still works.
Git and NixOS release
- Backporting changes to moved files won't be problematic
- git blame locally and on GitHub is unaffected, since it follows file moves properly.
callPackage with nix-build --expr
A commonly recommended way of building current package directories in Nixpkgs is to use nix-build --expr 'with import <nixpkgs> {}; callPackage pkgs/applications/misc/hello {}'.
Since the path changes and package.nix is now used, this becomes nix-build --expr 'with import <nixpkgs> {}; callPackage pkgs/by-name/he/hello/package.nix {}', which is harder for users.
However, calling a path like this is an anti-pattern anyway, because it doesn't use the correct Nixpkgs version and it doesn't use the correct argument overrides.
The correct way of doing it was to add the package to all-packages.nix and then call nix-build -A hello.
This nix-build --expr workaround is partially motivated by the difficulty of knowing the mapping from attributes to package paths, which is what this RFC improves upon.
By teaching users that pkgs/by-name/<shard>/<name> corresponds to nix-build -A <name>, the need for such nix-build --expr workarounds should disappear.
Manual removal of custom arguments
While this RFC allows passing custom arguments, doing so means that all-packages.nix will have to be maintained for that package.
In specific cases where attributes of custom arguments are of the form name = value and name isn't a package attribute, they can be avoided without breaking the API.
To do so, ensure that the function in the called file has value as an argument and set the default of the name argument to value.
This notably doesn't work when name is already a package attribute or when such a package is added later, because then the default is never used and instead overridden.
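The technique can be illustrated with a small self-contained expression. Everything named here is hypothetical: fakePkgs stands in for pkgs, libfoo_2 plays the role of value, and fooLib plays the role of name, which must not be a package attribute. lib.callPackageWith is used to build a callPackage over the stand-in set, and <nixpkgs/lib> is assumed to be available.

```nix
let
  lib = import <nixpkgs/lib>;

  # Stand-in for pkgs with a single hypothetical attribute.
  fakePkgs = { libfoo_2 = "libfoo-2.0"; };
  callPackage = lib.callPackageWith fakePkgs;

  # Before: the custom argument `fooLib = libfoo_2` has to be passed
  # explicitly, which corresponds to an entry in all-packages.nix.
  before = callPackage
    ({ fooLib }: { uses = fooLib; })
    { fooLib = fakePkgs.libfoo_2; };

  # After: the value is a regular argument and also the default of `fooLib`,
  # so the empty `{ }` of the automatic definition is sufficient.
  after = callPackage
    ({ libfoo_2, fooLib ? libfoo_2 }: { uses = fooLib; })
    { };
in
before.uses == after.uses
```

Both forms end up using the same value, and fooLib remains an overridable argument, so callers that override it are not affected. If fakePkgs later gained a fooLib attribute, callPackage would pass that attribute instead of the default, which is the limitation described above.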
Package variants
Sometimes there's a need to create a variant of a package with different callPackage arguments. This can be achieved using .override as follows:
{
graphviz_nox = graphviz.override { withXorg = false; };
}
However this can cause problems with an overlay that tries to make the variant the default as follows:
self: super: {
# Oops, infinite recursion!
graphviz = self.graphviz_nox;
}
Because of this, there's the pattern of duplicating the callPackage call with the custom arguments as such:
{
graphviz_nox = callPackage ../tools/graphics/graphviz { withXorg = false; };
}
The semantics of how package directories are checked by CI do allow the definition of package variants from package directories:
{
graphviz_nox = callPackage ../by-name/gr/graphviz/package.nix { withXorg = false; };
}
Drawbacks
- This directory structure can only be used for top-level packages using callPackage, so not for e.g. python3Packages.requests or a package defined using haskellPackages.callPackage
- It's not possible anymore to be a GitHub code owner of category directories.
- The existing categorization of packages gets lost. Counter-arguments:
  - It was never that useful to begin with.
  - The categorization was always incomplete, because packages defined in the language package sets often don't get their own categorized file path.
  - It was an inconvenient user interface, requiring a checkout or browsing through GitHub
  - Many packages fit multiple categories, leading to multiple locations to search through instead of one
  - There's other better ways of discovering similar packages, e.g. Repology
- This breaks builtins.unsafeGetAttrPos "hello" pkgs. Counter-arguments:
  - We have to draw a line as to what constitutes the public interface of Nixpkgs. We have decided that making attribute position information part of that is not productive. For context, this information is already accepted to be unreliable at the language level, noting the unsafe part of the name.
  - Support for this could be added to Nix (make builtins.readDir propagate file as a position)
Alternatives
An alternative to the pkgs/by-name location
Context: this directory contains the shards, which contain the package directories. We could move the shards to a different location.
Alternatives:
- Use by-name in the root directory instead
  - (+) This is future proof in case we want to make the directory structure more general purpose
  - (-) We don't yet know if we want that, so this is out of scope for now
- Use pkgs instead, so that the ${shard}'s are siblings to the other current directories in pkgs such as top-level, with the intention that the other directories would be hopefully removed at some point, then only leaving the shards in pkgs
  - (+) If we remove the other directories at some point, only the ${shard}'s will be left in pkgs
  - (-) This leads to ambiguities between the directories from the new directory structure and the other directories, requiring special handling in the code and CI, leading to complexities.
  - (-) This makes it hard to pick out the few non-shard directories in directory listings since they will be interleaved with the ~700 shards.
  - (-) This would be harder to document and explain to people, since one always has to disregard all non-sharded directories, with no obvious justification
  - (-) Currently we cannot apply this directory structure to all definitions in pkgs, in particular nested packages like pythonPackages.*, non-callPackage'd definitions like copyDesktopItems and non-derivations like fetchFromGitHub. Depending on how we want to handle those, it might make more sense to keep pkgs/by-name or to use pkgs directly once all legacy paths are migrated away to another top-level directory, we don't yet know. pkgs/by-name will be easier to migrate to pkgs than the other way around though.
  - (-) Causes poor auto-completion for the existing directories
- A variation of the above that improves on this is altering the shards to be prefixed with _ so that they're always ordered together and not interleaved with non-shards. Non-shards would still be at the bottom of file listings though, but at least together. It shares the same other problems however.
- pkgs/unit: This was the name initially used by the RFC until by-name was proposed and favored.
  - (+) It's not associated with any pre-existing assumptions about what it means, which should cause people unfamiliar with this directory structure to read the documentation.
  - (-) This is however also a disadvantage, the name doesn't inform people anything about what it does
  - (-) Systemd also has the term "unit", which could be confused with this
  - (+) It makes sense to view package directories as units, because they are discrete entities distinct from other entities of the same type
  - (+) We envision that in the future we could extend the directory structure to not just include a package definition for each directory, but also other parts such as NixOS modules, library components, tests, etc. In this case unit would fit even better and could be described as "A collection of standardized files related to the same software component"
- Various other proposals: pkgs/auto, pkgs/pkg, pkgs/mod, pkgs/component, pkgs/part, pkgs/comp, pkgs/app, pkgs/simple, pkgs/default, pkgs/shards, pkgs/top, pkgs/main
  - (-) Generally all of these names have some pre-existing assumptions about them, causing potential confusion when used for this concept
  - pkgs/default: Could be interpreted to be some Nix-builtin magic that defaults to that folder. Could also be interpreted as "this is where the default packages go", which then raises the question "which packages are part of the default ones?"
  - pkgs/shards: The sharding is a self-evident implementation detail, it shouldn't be repeated
  - pkgs/simple: Implies that there's a complicated way to declare packages, which there currently is, but it's something we should get away from. If we migrate everything, simple wouldn't mean anything anymore.
  - pkgs/top: Easily confusable with pkgs/top-level, though top would make sense otherwise if we eventually moved all top-level packages to there.
    - We could consider moving pkgs/top-level to another location then, e.g. pkgs/package-sets.
  - pkgs/main: "If these are the main packages, where do the others go? What even is a main package?". Also could be confused with an entry-point
- packages/${shard}
  - (+) Provides a clean starting point without having to be close to the legacy structure
  - (-) This would be very confusing to newcomers because there's now both a pkgs and a packages directory in the Nixpkgs root, both spelled the same but very different contents.
- pkgs/_
  - (+) Very short, fast to type (though that can depend on the keyboard layout)
  - (+) Avoids naming discussions, because there is no name
  - (-) Naming things is hard, but we shouldn't avoid the problem by giving it no name, which is arguably the worst name
  - (-) Looks hacky and internal
  - (+) Looks temporary, intention to move to pkgs itself once everything is sharded
    - (-) It shouldn't be temporary. While we do hope to migrate all packages to some sharded form at some point, this may never happen, or the direction is completely changed, and this may take years to form.
Alternate shard structure
Context: The structure is pkgs/by-name/${shard}/${name} with shard being the lowercased two-letter prefix of name.
Alternatives:
- A flat directory, where pkgs.hello would be in pkgs/by-name/hello.
  - (+) Simpler for the user and code.
  - (-) The GitHub web interface only renders the first 1 000 entries when browsing directories, which would make most packages inaccessible in this way.
    - (+) This feature is not used often.
    - (-) A poll showed that about 41% of people rely on this feature every week.
  - (-) Bad because it makes git and file listings slower.
- Use three-letter or four-letter prefixes.
  - (-) Also leads to directories containing more than 1 000 entries, see above.
- Use multi-level structure, e.g. a two-level two-letter prefix structure where hello is in pkgs/by-name/he/ll/hello
  - (+) This would allow a virtually unlimited number of packages without performance problems
  - (-) It's hard to understand, type and implement, needs a special case for packages with few characters
    - E.g. x could go in pkgs/by-name/x-/--/x
  - (-) There's not enough packages even in Nixpkgs that a two-level 4-letter structure would make sense. Most of the structure would only be filled by a couple entries.
  - (-) Even Git only uses 2-letter prefixes for its objects hex hashes
- Use two-letter prefixes split into two directories, like pkgs/by-name/h/e/hello
  - (+) Allows easy traversal by clicking on GitHub file listings, shard directories being limited to under 40 children
  - (-) Requires special-casing single-letter attribute names
    - (+) There's currently only 6 such cases, which could be handled on a one-off basis
  - (-) Makes auto-completion worse, having to tab-complete once more
  - (-) Makes it harder to create shards: if a shard doesn't exist yet, it has to be created with either one or two mkdir's, or a mkdir -p
- Use a dynamic structure where directories are rebalanced when they have too many entries. E.g. pkgs.foobar could be in pkgs/by-name/f/foobar initially. But when there's more than 1 000 packages starting with f, all packages starting with f are distributed under 2-letter prefixes, moving foobar to pkgs/by-name/fo/foobar.
  - (-) The structure depends not only on the name of the package then, making it harder to find packages again and figure out where they should go
  - (-) Complex to implement
Alternate package.nix filename
Context: The only file that has to exist in package directories is package.nix, it must contain a function suitable for callPackage.
Alternatives:
- default.nix
  - (+) default.nix is already a convention most people are used to.
  - (-) We don't benefit from the usual default.nix benefits:
    - Removing the need to specify the file name in expressions, but this does not apply because this file will be imported automatically by the code that replaces definitions from all-packages.nix.
      - (+) But there's still some support for all-packages.nix for custom arguments, which requires people to type out the name
        - (-) This is hopefully only temporary, in the future we should fully get rid of all-packages.nix
    - Removing the need to specify the file name on the command line, but this does not apply because a package function must be imported into an expression before it can be used, making nix build -f pkgs/by-name/he/hello equally broken regardless of file name.
  - (-) Not using default.nix frees up default.nix for an expression that is actually buildable, e.g. (import ../.. {}).hello, although we don't yet have a use case for this that isn't covered by nix-build ../.. -A <attrname>.
  - (-) Using default.nix would tempt users to invoke nix-build ., which wouldn't work, and making package functions auto-callable is a known anti-pattern.
- pkg-fun[c].nix
  - (+) Makes a potential transition to a non-function form of packages in the future easier.
  - (-) There's no problem with introducing versioning later with different filenames.
  - (-) We don't even know if we actually want to have a non-function form of packages.
  - (-) Abbreviations are a bit jarring.
Filepath backwards-compatibility
Context: The migration moves files around without providing any backwards compatibility for those moved paths.
Alternative:
- Have a backwards-compatibility layer for moved paths, such as a symlink pointing from the old to the new location, or for Nix files even a builtins.trace "deprecated" (import ../new/path).
  - (-) It would give precedent to file paths being a stable API interface, which definitely shouldn't be the case (bar some exceptions).
  - (-) Leads to worse merge conflicts as the transition is happening, since Git would have to resolve a merge conflict between a symlink and a changed file.
Don't allow custom arguments
Context: It's possible to override the default { } argument to callPackage by manually specifying the full definition in all-packages.nix.
The alternative is to not allow that, requiring that pkgs.${name} corresponds directly to callPackage pkgs/by-name/${shard}/${name}/package.nix { }.
- (-) It's harder to explain to beginners whether their package can use the new directory structure or not
- (+) The direct correspondence ensures that the package directory contains all information about the package, which is very intuitive
- (-) We're not at the point where we can have that though, custom arguments don't have a good replacement yet
- (-) If a package previously didn't need custom arguments, it would be moved to the new directory structure. But when the need for a custom argument arises, it then requires moving it out of the new directory structure and into the freeform structure of pkgs/ again.
- (+) It's easier to relax restrictions than to impose new ones
Reference check
Context: There's a requirement to check that package directories can't access paths outside themselves.
Alternatives:
- Don't have this requirement
  - (-) Doesn't discourage the use of file paths as an API.
  - (-) Makes further migrations to different file structures harder.
- Make the requirement also apply the other way around: Files outside the package directory cannot access files inside it, with package.nix being the only exception, and only for the one attribute in all-packages.nix
  - (-) Enforcing this requires a global view of Nixpkgs, which is nasty to implement
  - (-) Package variants would not be possible to define
Allow callPackage arguments to be specified in args.nix
Context: Custom callPackage arguments have to be added to all-packages.nix
Alternative: Expand the auto-calling logic according to: Package directories are automatically discovered and transformed to a definition of the form
# If args.nix doesn't exist
pkgs.${name} = pkgs.callPackage ${packageDir}/package.nix { };
# If args.nix does exist
pkgs.${name} = pkgs.callPackage ${packageDir}/package.nix (import ${packageDir}/args.nix pkgs);
- (+) It makes another class of packages uniform, by picking a solution with restricted expressive power.
- (-) It does not solve the contributor experience problem of having too many rules. args.nix is another pattern that contributors need to learn how to use, as we have seen that it is not immediately obvious to everyone how it works.
  - (+) A CI check can mitigate the possible lack of uniformity, and we see a simple implementation strategy for it.
- (-) Complicates the directory structure with an optional file
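The auto-calling logic of this alternative could be sketched as a small function. The function itself and the names argsFile and args are hypothetical; args.nix is assumed to be a function from pkgs to an attribute set, matching the definition above.

```nix
# Hypothetical helper: `pkgs` is the package set, `packageDir` is a path to
# one package directory, e.g. ./pkgs/by-name/he/hello.
pkgs: packageDir:
let
  argsFile = packageDir + "/args.nix";

  # Fall back to empty arguments when the package directory has no args.nix.
  args =
    if builtins.pathExists argsFile
    then import argsFile pkgs
    else { };
in
pkgs.callPackage (packageDir + "/package.nix") args
```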
Unresolved questions
Future work
All of these questions are in scope to be addressed in future discussions in the Nixpkgs Architecture Team:
- Expose an API to get access to the package functions directly, without calling them
- Add a meta tagging or categorization system to packages as a replacement for the package categories. Maybe meta.tags with search.nixos.org integration. Maybe https://repology.org/ integration. See also https://github.com/NixOS/rfcs/pull/146.
- Making the filetree more human-friendly by grouping files together by "topic" rather than technical delineations. For instance, having a package definition, changelog, package-specific config generator and perhaps even NixOS module in one directory makes work on the package in a broad sense easier.
- This RFC only addresses the top-level attribute namespace, aka packages in pkgs.<name>, it doesn't do anything about package sets like pkgs.python3Packages.<name>, pkgs.haskell.packages.ghc942.<name>, which may or may not also benefit from a similar auto-calling
- Improve the semantics of callPackage and/or apply a better solution, such as a module-like solution
- Potentially establish an updateScript standard to avoid problems like, relates to Flakes too
- What to do with different versions, e.g. wlroots = wlroots_0_14? This goes into version resolution, a different problem to fix
- What to do about e.g. python3Packages.callPackage? This goes into overrides, a different problem to fix
- What about aliases like jami-daemon = jami.jami-daemon?
- What about recurseIntoAttrs? Not single packages, package sets, another problem