The URL should not be normalized before handing it off to cURL, because
builtin fetchers like fetchTarball/fetchurl are expected to work with
arbitrary URLs, that might not be RFC3986 compliant. For those cases
Nix should not normalize URLs, though validation is fine. ParseURL and
cURL are supposed to match the set of acceptable URLs, since they implement
the same RFC.
Looking at perf:
0.21 │ push %rbp
0.99 │ mov %rsp,%rbp
│ push %r15
0.25 │ push %r14
│ push %r13
0.49 │ push %r12
0.66 │ push %rbx
1.23 │ lea -0x10000(%rsp),%r11
0.23 │ 15: sub $0x1000,%rsp
1.01 │ orq $0x0,(%rsp)
59.12 │ cmp %r11,%rsp
0.27 │ ↑ jne 15
Seems like 64K is too much to have on the stack for each invocation, considering
that only a minuscule number of allocations are actually larger than 4K.
There's actually no good reason this function should use so much stack space. Or
use small_string at all. Everything can be done in small chunks that don't require
any memory allocations and use up 2K bytes on the stack.
This patch also adds a microbenchmark for tracking the unparsing performance. Here
are the results for this change:
(Before)
BM_UnparseRealDerivationFile/hello 7275 ns 7247 ns 96093 bytes_per_second=232.136Mi/s
BM_UnparseRealDerivationFile/firefox 40538 ns 40376 ns 17327 bytes_per_second=378.534Mi/s
(After)
BM_UnparseRealDerivationFile/hello 3228 ns 3218 ns 215671 bytes_per_second=522.775Mi/s
BM_UnparseRealDerivationFile/firefox 39724 ns 39584 ns 17617 bytes_per_second=386.101Mi/s
This translates into nice evaluation performance improvements (compared to 18c3d2348f):
Benchmark 1: GC_INITIAL_HEAP_SIZE=8G old-nix/bin/nix-instantiate ../nixpkgs -A nixosTests.gnome --readonly-mode
Time (mean ± σ): 3.111 s ± 0.021 s [User: 2.513 s, System: 0.580 s]
Range (min … max): 3.083 s … 3.143 s 10 runs
Benchmark 2: GC_INITIAL_HEAP_SIZE=8G result/bin/nix-instantiate ../nixpkgs -A nixosTests.gnome --readonly-mode
Time (mean ± σ): 3.037 s ± 0.038 s [User: 2.461 s, System: 0.558 s]
Range (min … max): 2.960 s … 3.086 s 10 runs
Now we have better separation of the core logic --- an integral part of
the store layer spec even --- from the goal mechanism and other
minutiae.
Co-authored-by: Jeremy Kolb <kjeremy@gmail.com>
See the new extensive doxygen in `url.hh`.
This fixes fetching gitlab: flakes.
Paths are now stored as a std::vector of individual path
segments, which can themselves contain path separators '/' (%2F).
This is necessary to make the Gitlab's /projects/ API work.
Co-authored-by: John Ericson <John.Ericson@Obsidian.Systems>
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
Instead of passing them around separately, or doing finicky logic in a
try-catch block to recover them, just make `BuildError` always contain a
status, and make it the thrower's responsibility to set it. This is much
more simple and explicit.
Once that change is done, split the `done` functions of `DerivationGoal`
and `DerivationBuildingGoal` into separate success and failure
functions, which ends up being easier to understand and hardly any
duplication.
Also, change the handling of failures in resolved cases to use
`BuildResult::DependencyFailed` and a new message. This is because the
underlying derivation will also get its message printed --- which is
good, because in general the resolved derivation is not unique. One dyn
drv test had to be updated, but CA (and dyn drv) is experimental, so I
do not mind.
Finally, delete `SubstError` because it is unused.
The commit says it was added for CA testing --- manual I assume, since
there is no use of this in the test suite. I don't think we need it any
more, and I am not sure whether it was ever supposed to have made it to
`master` either.
This reverts commit 2eec2f765a.
We currently just use this during the build of a derivation, but there is no
reason we wouldn't want to use it elsewhere, e.g. to check the outputs
of someone else's build after the fact.
Moreover, I like pulling things out of `DerivationBuilder` that are
simple and don't need access to all that state. While
`DerivationBuilder` is unix-only, this refactor also make the code more
portable "for free".
The header is private, at Eelco's request.
This allows us to replace some very hacky and not correct string
concatentation in `HttpBinaryCacheStore`. It will especially be useful
with #13752, when today's hacks started to cause problems in practice,
not just theory.
Also make `fixGitURL` returned a `ParsedURL`.
It is suppposed to be "post build" not "during the build" after all. Its
location now matches that for the hook case (see elsewhere in
`DerivationdBuildingGoal`).
It was in a try-catch before, and now it isn't, but I believe that it is
impossible for it to throw `BuildError`, which is sufficient for this
code motion to be correct.
This is a nicer separation of concerns --- `DerivationBuilder` just
mounts the extra paths you tell it too, and the outside world is
responsible for making sure those extra paths make sense.
Since the closure only depends on global settings, and not
per-derivation information, we also have the option of moving this up
further and caching it across all local builds. (I only just realized
this after having done this refactor. I am not doing that change at this
time, however.)
Now, `DerivationBuilder` only concerns itself with `finalEnv` and
`extraFiles`, in straightforward unconditional code. All the fancy
desugaring logic is consolidated in `DerivationBuildingGoal`.
We should better share the pulled-out logic with `nix-shell`/`nix
develop`, which would fill in some missing features, arguably fixing
bugs.
I think this is a better separation of concerns. `DerivationBuilder`
doesn't need to to the final, query-heavy details about how these things
are constructed. It just operates on the level of "simple, stupid" files
and environment variables.
As much as I prefer rewriting the parsed rather than unparsed JSON for
elegance, this gets in the way of the separation of concerns that I am
trying to do.
As a practical matter, any rewriting that this did will also be done by
the second round of rewriting that remains below, so removing this code
should have no effect.
This is needed to rearrange include order, but I also think it is a good
thing anyways, as we seek to reduce the use of global settings variables
over time.
This avoids problems with older versions of Nix that don't put the
caches in WAL mode. That's generally not a problem, until you do something like
nix build --print-out-paths ... | cachix
which deadlocks because cachix tries to switch the caches to truncate
mode, which requires exclusive access. But the first process cannot
make progress because the cachix process isn't reading from the pipe.
With "truncate" mode, if we try to write to the database while another
process has an active write transaction, we'll block until the other
transaction finishes. This is a problem for the evaluation cache in
particular, since it uses long-running transactions.
WAL mode does not have this issue: it just returns "busy" right away,
so Nix will print
error (ignored): SQLite database '/home/eelco/.cache/nix/eval-cache-v5/...' is busy
and stop trying to write to the evaluation cache. (This was the
intended/original behaviour, see AttrDb::doSQLite().)
This systematizes the way our s3:// URLs are parsed in filetransfer.cc.
Yoinked out and refactored out of [1].
[1]: https://github.com/NixOS/nix/pull/13752
Co-authored-by: Bernardo Meurer Costa <beme@anthropic.com>
Compilers in nixpkgs have caught up and major distros
should also have recent enough compilers. It would be
nice to have newer features like more full featured
ranges and deducing this.
This caused RemoteStore::queryPathInfoUncached() to mark the
connection as invalid (see
RemoteStore::ConnectionHandle::~ConnectionHandle()), causing it to
disconnect and reconnect after every lookup of an invalid path. This
caused huge slowdowns in conjunction with
19f89eb684 and lazy-trees.