This commit replaces the AWS C++ SDK with a lighter curl-based approach
for S3 binary cache operations.
- Removed dependency on the heavy aws-cpp-sdk-s3 and aws-cpp-sdk-transfer
- Added lightweight aws-crt-cpp for credential resolution only
- Leverages curl's native AWS SigV4 authentication (requires curl >= 7.75.0)
- S3BinaryCacheStore now delegates to HttpBinaryCacheStore
- Function s3ToHttpsUrl converts ParsedS3URL to ParsedURL
- Multipart uploads are no longer supported (may be reimplemented later)
- Build now requires curl >= 7.75.0 for AWS SigV4 support
Fixes: #13084, #12671, #11748, #12403, #5947
Instead of specifying env variables all the time
we can instead embed the __asan_default_options symbol
in all executables / shared objects. This reduces code
duplication.
Add a new S3BinaryCacheStore implementation that inherits from
HttpBinaryCacheStore.
The implementation is activated with NIX_WITH_CURL_S3, keeping the
existing NIX_WITH_S3_SUPPORT (AWS SDK) implementation unchanged.
Introduce a new build option 'curl-s3-store' for the curl-based S3
implementation, separate from the existing AWS SDK-based 's3-store'.
The two options are mutually exclusive to avoid conflicts.
Users can enable the new implementation with:
-Dcurl-s3-store=enabled -Ds3-store=disabled
Realisations are conceptually key-value pairs, mapping `DrvOutputs` (the
key) to information about that derivation output.
This separate the value type, which will be useful in maps, etc., where
we don't want to denormalize by including the key twice.
This matches similar changes for existing types:
| keyed | unkeyed |
|--------------------|------------------------|
| `ValidPathInfo` | `UnkeyedValidPathInfo` |
| `KeyedBuildResult` | `BuildResult` |
| `Realisation` | `UnkeyedRealisation` |
Move ParsedS3URL from s3.cc/.hh into dedicated s3-url.cc/.hh files.
This separates URL parsing utilities (which are protocol-agnostic) from
the AWS SDK-specific S3Helper implementation, making the code cleaner
and enabling reuse by future curl-based S3 implementation.
Fewer macros is better!
Introduce a new `JsonChacterizationTest` mixin class to help with this.
Also, avoid some needless copies with `GetParam`.
Part of my effort shoring up the JSON formats with #13570.
These stragglers have been accidentally left out when implementing the StoreConfig::getReference.
Also HttpBinaryCacheStore::getReference now returns the actual store parameters, not the cacheUri
parameters.
Old code is now just used for `nix build` --- there is no CLI breaking
change.
Test the new format, too.
The new format is not currently used, but will be used going forward,
for example in the C API.
Progress on #13570
This brings them in line with the other tests, and furthers my goals of
separating unit test data from code.
Doing this cleanup as part of my #13570 effort, but strictly-speaking,
this is separate as these data types' JSON never contained and store
paths or store dirs, just simple output name strings.
Enables builds with ASAN to catch memory corruption
bugs faster and in CI. This is an incredibly valuable
instrument that must be used as much as possible.
Somewhat based on jade's work from Lix, though there's a lot that
we have to do differently:
19ae87e5ce
Co-authored-by: Jade Lovelace <lix@jade.fyi>
See #13570 for details --- the idea is that included the store dir in
store paths makes systematic JSON parting with e.g. Serde, Aeson,
nlohmann, or similiar harder.
After talking to Eelco, we are changing the `Derivation` format right
away because not only is `nix derivation` technically experimental, we think it is
also less widely used in practice than, say, `nix path-info`.
Progress on #13570
This implements a special back-compat shim to specifically allow
unbracketed IPv6 addresses in store references. This is something
that is relied upon in the wild and the old parsing logic accepted
both ways (brackets were optional). This patch restores this behavior.
As always, we didn't have any tests for this.
Addresses #13937.
This is relied upon (specifically the `local` store) by existing
tooling [1] and we broke this in 3e7879e6df (which
was first released in 2.31).
To lessen the scope of the breakage we should not normalize "auto" references
and explicitly specified references like "local" or "daemon". It also makes
sense to canonicalize local://,daemon:// to be more compatible with prior
behavior.
[1]: 05e1b3cba2/lib/NOM/Builds.hs (L60-L64)
Looking at perf:
0.21 │ push %rbp
0.99 │ mov %rsp,%rbp
│ push %r15
0.25 │ push %r14
│ push %r13
0.49 │ push %r12
0.66 │ push %rbx
1.23 │ lea -0x10000(%rsp),%r11
0.23 │ 15: sub $0x1000,%rsp
1.01 │ orq $0x0,(%rsp)
59.12 │ cmp %r11,%rsp
0.27 │ ↑ jne 15
Seems like 64K is too much to have on the stack for each invocation, considering
that only a minuscule number of allocations are actually larger than 4K.
There's actually no good reason this function should use so much stack space. Or
use small_string at all. Everything can be done in small chunks that don't require
any memory allocations and use up 2K bytes on the stack.
This patch also adds a microbenchmark for tracking the unparsing performance. Here
are the results for this change:
(Before)
BM_UnparseRealDerivationFile/hello 7275 ns 7247 ns 96093 bytes_per_second=232.136Mi/s
BM_UnparseRealDerivationFile/firefox 40538 ns 40376 ns 17327 bytes_per_second=378.534Mi/s
(After)
BM_UnparseRealDerivationFile/hello 3228 ns 3218 ns 215671 bytes_per_second=522.775Mi/s
BM_UnparseRealDerivationFile/firefox 39724 ns 39584 ns 17617 bytes_per_second=386.101Mi/s
This translates into nice evaluation performance improvements (compared to 18c3d2348f):
Benchmark 1: GC_INITIAL_HEAP_SIZE=8G old-nix/bin/nix-instantiate ../nixpkgs -A nixosTests.gnome --readonly-mode
Time (mean ± σ): 3.111 s ± 0.021 s [User: 2.513 s, System: 0.580 s]
Range (min … max): 3.083 s … 3.143 s 10 runs
Benchmark 2: GC_INITIAL_HEAP_SIZE=8G result/bin/nix-instantiate ../nixpkgs -A nixosTests.gnome --readonly-mode
Time (mean ± σ): 3.037 s ± 0.038 s [User: 2.461 s, System: 0.558 s]
Range (min … max): 2.960 s … 3.086 s 10 runs
See the new extensive doxygen in `url.hh`.
This fixes fetching gitlab: flakes.
Paths are now stored as a std::vector of individual path
segments, which can themselves contain path separators '/' (%2F).
This is necessary to make the Gitlab's /projects/ API work.
Co-authored-by: John Ericson <John.Ericson@Obsidian.Systems>
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
This systematizes the way our s3:// URLs are parsed in filetransfer.cc.
Yoinked out and refactored out of [1].
[1]: https://github.com/NixOS/nix/pull/13752
Co-authored-by: Bernardo Meurer Costa <beme@anthropic.com>
Compilers in nixpkgs have caught up and major distros
should also have recent enough compilers. It would be
nice to have newer features like more full featured
ranges and deducing this.
The problem with old code was that it used getUri for both the `diskCache`
as well as logging. This is really bad because it mixes the textual human
readable representation with the caching.
Also using getUri for the cache key is really problematic for the S3 store,
since it doesn't include the `endpoint` in the cache key, so it's totally broken.
This starts separating the logging / cache concerns by introducing a
`getHumanReadableURI` that should only be used for logging. The caching
logic now instead uses `getReference().render(/*withParams=*/false)` exclusively.
This would need to be fixed in follow-ups, because that's really fragile and
broken for some store types (but it was already broken before).
Rather than having store implementations return a free-form URI string,
have them return a `StoreReference`. This reflects that fact that this
method is supposed to invert `resolveStoreConfig`, which goes from a
`StoreReference` to some `StoreConfig` concrete derived class (based on
the registry).
`StoreConfig::getUri` is kept only as a convenience for the common case
that we want to immediately render the `StoreReference`.
A few tests were changed to use `local://` not `local`, since
`StoreReference` does not encode the `local` and `daemon` shorthands
(and instead desugars them to `local://` and `unix://` right away). I
think that is fine. `local` and `daemon` still work as input.
Previously `getUri` didn't include store query parameters,
`ssh-ng` didn't include any information at all and the local
store didn't have the path:
```
$ nix store info --store "local?root=/tmp/aaa&require-sigs=false"
Store URL: local
Version: 2.31.0
Trusted: 1
$ nix store info --store "ssh-ng://localhost?remote-program=nix-daemon"
Store URL: ssh-ng://
Version: 2.31.0
Trusted: 1
$ nix store info --store "ssh://localhost?remote-program=nix-store"
Store URL: ssh://localhost
```
This commit changes this to:
```
$ nix store info --store "local?root=/tmp/aaa&require-sigs=false"
Store URL: local?require-sigs=false&root=/tmp/aaa
Version: 2.31.0
Trusted: 1
$ nix store info --store "ssh-ng://localhost?remote-program=nix-daemon"
Store URL: ssh-ng://localhost?remote-program=nix-daemon
Version: 2.31.0
Trusted: 1
$ nix store info --store "ssh://localhost?remote-program=nix-store"
Store URL: ssh://localhost?remote-program=nix-store
```
The implicit dependency on refLength (which is the StorePath::HashLen)
is not good. Also the companion tests and benchmarks are already in libstore-tests.