Should be pretty self-explanatory. We didn't really have unit tests
for the filesystem source accessor. Now we do and this will be immensely
useful for implementing a unix-only smarter accessor that doesn't suffer
from TOCTOU on symlinks.
Error messages now include suggestions like:
error: unknown compression method 'bzip'
Did you mean one of bzip2, gzip, lzip, grzip or lrzip?
Also a bit of progress on making the compression code use less stringly
typed compression type, which is good because it's easy to confuse
which strings are accepted where (e.g. Content-Encoding should be able
to accept x-gzip, but it shouldn't be exposed in NAR decompression and
so on). An enum cleanly separates the concerns of parsing strings / handling
libarchive write/read filters.
Good to explicitly declare things to not accidentally do twice the work by
preventing that kind of misuse.
This is essentially just cppcoreguidelines-special-member-functions lint
in clang-tidy.
Fetching gcc-15.2.0.tar.gz I get a warning about UTF8 archive names. This
now mentions problematic pathnames.
warning: getting archive member 'gcc-15.2.0/gcc/testsuite/go.test/test/fixedbugs/issue27836.dir/Äfoo.go': Pathname can't be converted from UTF-8 to current locale.
warning: getting archive member 'gcc-15.2.0/gcc/testsuite/go.test/test/fixedbugs/issue27836.dir/Ämain.go': Pathname can't be converted from UTF-8 to current locale.
Also apparently libarchive depends on locale (yikes). Fixing reproducibility issues
that stem from this is a separate issue. At least having the warning actually mention
the pathname should be useful enough even though it's not actionable.
At least using the default locale yields something sane:
builtins.readDir "${gcc}/gcc/testsuite/go.test/test/fixedbugs/issue27836.dir"
{
"Äfoo.go" = "regular";
"Ämain.go" = "regular";
}
The fact that we were introducing a conversion from the output of `nix
path-info` into the input of `builtins.fetchTree` was the deciding
factor. We want scripting outputs into inputs like that to be easy.
Since JSON strings and objects are trivially distinguishable, we still
have the option of introducing the JSON format as an alternative input
scheme in the future, should we want to. (The output format would still
be SRI in that case, presumably.)
This was lost after 2.32 while making the accessor lazy. We can restore the support
for it pretty easily. Also this is significant optimization for nix nar cat.
E.g. with a NAR of a linux repo this speeds up by ~3x:
Benchmark 1: nix nar cat /tmp/linux.nar README
Time (mean ± σ): 737.2 ms ± 5.6 ms [User: 298.1 ms, System: 435.7 ms]
Range (min … max): 728.6 ms … 746.9 ms 10 runs
Benchmark 2: build/src/nix/nix nar cat /tmp/linux.nar README
Time (mean ± σ): 253.5 ms ± 2.9 ms [User: 56.4 ms, System: 196.3 ms]
Range (min … max): 248.1 ms … 258.7 ms 12 runs
The whole NarAccessor -> listing -> lazy NarAccessor is very weird. Source
can now be seek-ed over when supported, so we can support it pretty easily.
Alternatively we could also make it single-pass very easily with a custom
FileSystemObjectSink. It will get removed in a follow-up commit anyway.
We have the machinery to make a more informative error, telling the
user what format was actually encountered, and not just that it is not
the format that was requested.
Fix#14532.
As discussed on the call today:
1. We'll stick with `format = "base16"` and `hash = "<hash>"`, not do
`base16 = "<hash>"`, in order to be forward compatible with
supporting more versioning formats.
The motivation we discussed for someday *possibly* doing this is
making it easier to write very slap-dash lang2nix tools that create
(not consume) derivations with dynamic derivations.
2. We will remove support for non-base16 (and make that the default, not
base64) in `Hash`, so this is strictly forward contingency, *not*
yet something we support. (And also not something we have concrete
plans to start supporting.)
The interrupting code is no longer relevant. Since
054be50257 logging no longer checks for interrupts
and in general logging should be noexcept.
Co-authored-by: Alois Wohlschlager <alois1@gmx-topmail.de>
Cherry-picked-from: https://gerrit.lix.systems/c/lix/+/1097
Instead we can just seek back in the file - duh. Also this makes use
of the anonymous temp file facility, since that is much safer (no need
window where the we don't have an open file descriptor for it).
Implements a safe no symlink following primitive operation for opening file descriptors.
This is unix-only for the time being, since windows doesn't really suffer from symlink
races, since they are admin-only.
Tested with enosys --syscall openat2 as well.