S3 buckets support object versioning to prevent unexpected changes,
but Nix previously lacked the ability to fetch specific versions of
S3 objects. This adds support for a `versionId` query parameter in S3
URLs, enabling users to pin to specific object versions:
```
s3://bucket/key?region=us-east-1&versionId=abc123
```
A few changes had cropped up with `_NIX_TEST_ACCEPT=1`:
1. Blake hashing test JSON had a different indentation
2. Store URI had improper non-quoted spaces
(1) was is just fixed, as we trust nlohmann JSON to parse JSON
correctly, regardless of whitespace.
For (2), the existing URL was made a read-only test, since we very much
wish to continue parsing such invalid URLs directly. And then the
original read/write test was updated to properly percent-encode the
space, as the normal form should be.
I realized that we can actually do this thing, even though it is not
what nlohmann expects at all, because the extra parameter has a default
argument so nlohmann doesn't need to care. Sneaky!
Realisations are conceptually key-value pairs, mapping `DrvOutputs` (the
key) to information about that derivation output.
This separate the value type, which will be useful in maps, etc., where
we don't want to denormalize by including the key twice.
This matches similar changes for existing types:
| keyed | unkeyed |
|--------------------|------------------------|
| `ValidPathInfo` | `UnkeyedValidPathInfo` |
| `KeyedBuildResult` | `BuildResult` |
| `Realisation` | `UnkeyedRealisation` |
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
Move S3 URL parsing, store configuration, and public bucket support
outside of NIX_WITH_S3_SUPPORT guards. Only AWS credential resolution
remains gated, allowing builds with withAWS = false to:
- Parse s3:// URLs
- Register S3 store types
- Access public S3 buckets (via HTTPS conversion)
- Use S3-compatible services without authentication
The setupForS3() function now always performs URL conversion, with
authentication code conditionally compiled based on NIX_WITH_S3_SUPPORT.
The aws-creds.cc file (only code using AWS CRT SDK) is now conditionally
compiled by meson.
This commit replaces the AWS C++ SDK with a lighter curl-based approach
for S3 binary cache operations.
- Removed dependency on the heavy aws-cpp-sdk-s3 and aws-cpp-sdk-transfer
- Added lightweight aws-crt-cpp for credential resolution only
- Leverages curl's native AWS SigV4 authentication (requires curl >= 7.75.0)
- S3BinaryCacheStore now delegates to HttpBinaryCacheStore
- Function s3ToHttpsUrl converts ParsedS3URL to ParsedURL
- Multipart uploads are no longer supported (may be reimplemented later)
- Build now requires curl >= 7.75.0 for AWS SigV4 support
Fixes: #13084, #12671, #11748, #12403, #5947
Instead of specifying env variables all the time
we can instead embed the __asan_default_options symbol
in all executables / shared objects. This reduces code
duplication.
Add a new S3BinaryCacheStore implementation that inherits from
HttpBinaryCacheStore.
The implementation is activated with NIX_WITH_CURL_S3, keeping the
existing NIX_WITH_S3_SUPPORT (AWS SDK) implementation unchanged.
Introduce a new build option 'curl-s3-store' for the curl-based S3
implementation, separate from the existing AWS SDK-based 's3-store'.
The two options are mutually exclusive to avoid conflicts.
Users can enable the new implementation with:
-Dcurl-s3-store=enabled -Ds3-store=disabled
Realisations are conceptually key-value pairs, mapping `DrvOutputs` (the
key) to information about that derivation output.
This separate the value type, which will be useful in maps, etc., where
we don't want to denormalize by including the key twice.
This matches similar changes for existing types:
| keyed | unkeyed |
|--------------------|------------------------|
| `ValidPathInfo` | `UnkeyedValidPathInfo` |
| `KeyedBuildResult` | `BuildResult` |
| `Realisation` | `UnkeyedRealisation` |
Move ParsedS3URL from s3.cc/.hh into dedicated s3-url.cc/.hh files.
This separates URL parsing utilities (which are protocol-agnostic) from
the AWS SDK-specific S3Helper implementation, making the code cleaner
and enabling reuse by future curl-based S3 implementation.
Fewer macros is better!
Introduce a new `JsonChacterizationTest` mixin class to help with this.
Also, avoid some needless copies with `GetParam`.
Part of my effort shoring up the JSON formats with #13570.
These stragglers have been accidentally left out when implementing the StoreConfig::getReference.
Also HttpBinaryCacheStore::getReference now returns the actual store parameters, not the cacheUri
parameters.
Old code is now just used for `nix build` --- there is no CLI breaking
change.
Test the new format, too.
The new format is not currently used, but will be used going forward,
for example in the C API.
Progress on #13570
This brings them in line with the other tests, and furthers my goals of
separating unit test data from code.
Doing this cleanup as part of my #13570 effort, but strictly-speaking,
this is separate as these data types' JSON never contained and store
paths or store dirs, just simple output name strings.
Enables builds with ASAN to catch memory corruption
bugs faster and in CI. This is an incredibly valuable
instrument that must be used as much as possible.
Somewhat based on jade's work from Lix, though there's a lot that
we have to do differently:
19ae87e5ce
Co-authored-by: Jade Lovelace <lix@jade.fyi>
See #13570 for details --- the idea is that included the store dir in
store paths makes systematic JSON parting with e.g. Serde, Aeson,
nlohmann, or similiar harder.
After talking to Eelco, we are changing the `Derivation` format right
away because not only is `nix derivation` technically experimental, we think it is
also less widely used in practice than, say, `nix path-info`.
Progress on #13570
This implements a special back-compat shim to specifically allow
unbracketed IPv6 addresses in store references. This is something
that is relied upon in the wild and the old parsing logic accepted
both ways (brackets were optional). This patch restores this behavior.
As always, we didn't have any tests for this.
Addresses #13937.
This is relied upon (specifically the `local` store) by existing
tooling [1] and we broke this in 3e7879e6df (which
was first released in 2.31).
To lessen the scope of the breakage we should not normalize "auto" references
and explicitly specified references like "local" or "daemon". It also makes
sense to canonicalize local://,daemon:// to be more compatible with prior
behavior.
[1]: 05e1b3cba2/lib/NOM/Builds.hs (L60-L64)
Looking at perf:
0.21 │ push %rbp
0.99 │ mov %rsp,%rbp
│ push %r15
0.25 │ push %r14
│ push %r13
0.49 │ push %r12
0.66 │ push %rbx
1.23 │ lea -0x10000(%rsp),%r11
0.23 │ 15: sub $0x1000,%rsp
1.01 │ orq $0x0,(%rsp)
59.12 │ cmp %r11,%rsp
0.27 │ ↑ jne 15
Seems like 64K is too much to have on the stack for each invocation, considering
that only a minuscule number of allocations are actually larger than 4K.
There's actually no good reason this function should use so much stack space. Or
use small_string at all. Everything can be done in small chunks that don't require
any memory allocations and use up 2K bytes on the stack.
This patch also adds a microbenchmark for tracking the unparsing performance. Here
are the results for this change:
(Before)
BM_UnparseRealDerivationFile/hello 7275 ns 7247 ns 96093 bytes_per_second=232.136Mi/s
BM_UnparseRealDerivationFile/firefox 40538 ns 40376 ns 17327 bytes_per_second=378.534Mi/s
(After)
BM_UnparseRealDerivationFile/hello 3228 ns 3218 ns 215671 bytes_per_second=522.775Mi/s
BM_UnparseRealDerivationFile/firefox 39724 ns 39584 ns 17617 bytes_per_second=386.101Mi/s
This translates into nice evaluation performance improvements (compared to 18c3d2348f):
Benchmark 1: GC_INITIAL_HEAP_SIZE=8G old-nix/bin/nix-instantiate ../nixpkgs -A nixosTests.gnome --readonly-mode
Time (mean ± σ): 3.111 s ± 0.021 s [User: 2.513 s, System: 0.580 s]
Range (min … max): 3.083 s … 3.143 s 10 runs
Benchmark 2: GC_INITIAL_HEAP_SIZE=8G result/bin/nix-instantiate ../nixpkgs -A nixosTests.gnome --readonly-mode
Time (mean ± σ): 3.037 s ± 0.038 s [User: 2.461 s, System: 0.558 s]
Range (min … max): 2.960 s … 3.086 s 10 runs
See the new extensive doxygen in `url.hh`.
This fixes fetching gitlab: flakes.
Paths are now stored as a std::vector of individual path
segments, which can themselves contain path separators '/' (%2F).
This is necessary to make the Gitlab's /projects/ API work.
Co-authored-by: John Ericson <John.Ericson@Obsidian.Systems>
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>