It is better to avoid null termination for performance and memory
safety, wherever possible.
These are good cleanups extracted from the Pascal String work that we
can land by themselves first, shrinking the diff in that PR.
Co-Authored-By: Aspen Smith <root@gws.fyi>
Co-Authored-By: Sergei Zimmerman <sergei@zimmerman.foo>
Refactor `ExprConcatStrings::eval` by inlining two only-called-once
closures into the call-site, so that the code is easier to reason about
locally (especially since the variables that were closed over were
mutated all over the place within this function).
Also use curly braces with each branch for consistency in the the
resulting code.
This is a pure refactor, but also arguably causes us to depend less on
the optimizer; now, we don't have to make sure that this closure is
inlined.
3a3c062982 introduced a buffer overflow for the
case when there are more than 65535 formal arguments. It is a perfectly reasonable
limitation, but we *must* not crash, corrupt memory or otherwise crash the process.
Add a test for the graceful behavior and switch to using an explicit uninitialized_copy_n
to further guard against buffer overflows.
We were getting this flex lexer warning during build:
```
../src/libexpr/lexer.l:333: warning, -s option given but default rule can be matched
```
The lexer uses `%option nodefault` but the `PATH_START` state only had
rules for specific patterns (`PATH_SEG` and `HPATH_START`) without a
catch-all rule to handle unexpected input.
Added a catch-all rule with `unreachable()`. This code path should never
be reached in normal operation since `PATH_START` is only entered after
matching `PATH_SEG` or `HPATH_START`, and we immediately rewind to
re-parse those same patterns. The catch-all exists solely to satisfy
flex's `%option nodefault` requirement.
With #14314, in some places in the parser we started using C++ objects
directly rather than pointers. In those places lines like `$$ = $1` now
imply a copy when we don't need one. This commit changes those to `$$ =
std::move($1)` to avoid those copies.
Instead of iterating over the newly built bindings we can
do a cheaper set_intersection to count duplicates or fall back
to a per-element binary search over the "base" bindings.
This speeds up `hello` evaluation by around 10ms (0.196s -> 0.187s) and
`nixos.closures.ec2.x86_64-linux` by 140ms (2.744s -> 2.609s).
This addresses a somewhat steep performance regression from 82315c3807
that reduced memory requirements of attribute set merges. With this patch
we get back around to 2.31 level of eval performance while keeping the memory
usage optimization.
Also document the optimization a bit more.
The previous message was vague about what "deprecated" meant and why
unlocked inputs with NAR hashes "may not be reproducible". It also
used "verifiable" which was confusing.
The new message makes it clear that the NAR hash provides verification
(is checked by NAR hash) and explicitly states the failure modes:
garbage collection and sharing.
This forces the code to go through proper abstractions instead of the raw filesystem
API.
This issue is evident from this reproducer:
nix eval --expr 'builtins.fetchurl { url = "https://example.com"; sha256 = ""; }' --json --eval-store "dummy://?read-only=false"
error:
… while calling the 'fetchurl' builtin
at «string»:1:1:
1| builtins.fetchurl { url = "https://example.com"; sha256 = ""; }
| ^
error: opening file '/nix/store/r4f87yrl98f2m6v9z8ai2rbg4qwlcakq-example.com': No such file or directory
Firstly, this is now available on darwin where the default in llvm 19.
Secondly, this leads to very weird segfaults when building with newer nixpkgs for some reason.
(It's UB after all).
This appears when building with the following:
mesonComponentOverrides = finalAttrs: prevAttrs: {
mesonBuildType = "debugoptimized";
dontStrip = true;
doCheck = false;
separateDebugInfo = false;
preConfigure = (prevAttrs.preConfigure or "") + ''
case "$mesonBuildType" in
release|minsize|debugoptimized) appendToVar mesonFlags "-Db_lto=true" ;;
*) appendToVar mesonFlags "-Db_lto=false" ;;
esac
'';
};
And with the following nixpkgs input:
nix build ".#nix-cli" -L --override-input nixpkgs "https://releases.nixos.org/nixos/unstable/nixos-25.11pre870157.7df7ff7d8e00/nixexprs.tar.xz"
Stacktrace:
#0 0x00000000006afdc0 in ?? ()
#1 0x00007ffff71cebb6 in _Unwind_ForcedUnwind_Phase2 () from /nix/store/41ym1jm1b7j3rhglk82gwg9jml26z1km-gcc-14.3.0-lib/lib/libgcc_s.so.1
#2 0x00007ffff71cf5b5 in _Unwind_Resume () from /nix/store/41ym1jm1b7j3rhglk82gwg9jml26z1km-gcc-14.3.0-lib/lib/libgcc_s.so.1
#3 0x00007ffff7eac7d8 in std::basic_ios<char, std::char_traits<char> >::~basic_ios (this=<optimized out>, this=<optimized out>)
at /nix/store/82kmz7r96navanrc2fgckh2bamiqrgsw-gcc-14.3.0/include/c++/14.3.0/bits/basic_ios.h:286
#4 std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >::basic_ostringstream (this=<optimized out>, this=<optimized out>)
at /nix/store/82kmz7r96navanrc2fgckh2bamiqrgsw-gcc-14.3.0/include/c++/14.3.0/sstream:806
#5 nix::SimpleLogger::logEI (this=<optimized out>, ei=...) at ../logging.cc:121
#6 0x00007ffff7515794 in nix::Logger::logEI (this=0x675450, lvl=nix::lvlError, ei=...) at /nix/store/bkshji3nnxmrmgwa4n2kaxadajkwvn65-nix-util-2.32.0pre-dev/include/nix/util/logging.hh:144
#7 nix::handleExceptions (programName=..., fun=...) at ../shared.cc:336
#8 0x000000000047b76b in main (argc=<optimized out>, argv=<optimized out>) at /nix/store/82kmz7r96navanrc2fgckh2bamiqrgsw-gcc-14.3.0/include/c++/14.3.0/bits/new_allocator.h:88