Git URI can also support scp style links similar to git itself.
This change augments the function fixGitURL to better handle the scp
style urls through a minimal parser rather than regex which has been
found to be brittle.
* Support for IPV6 added
* New test cases added for fixGitURL
* Clearer documentation on purpose and goal of function
* More `std::string_view` for performance
* Update URL tests
Fixes#5958
Mostly undoes revert 4757487110599bbe9a287ead75741bba5436d52f
Adapted from commit 04ad66af5f
I (@Ericson2314) messed up. We were supposed to test the status quo
before landing any new chnages, and also there is one change that is not
quite right (relative paths).
I am reverting for now, and then backporting the test suite to the old
situation.
This reverts commit 04ad66af5f.
Git URI can also support scp style links similar to git itself.
This change augments the function fixGitURL to better handle the scp
style urls through a minimal parser rather than regex which has been
found to be brittle.
* Support for IPV6 added
* New test cases added for fixGitURL
* Clearer documentation on purpose and goal of function
* More `std::string_view` for performance
* A few more URL tests
Fixes#5958
Old versions of nix happily accepted a lot of weird flake references,
which we didn't have tests for, so this was accidentally broken in
c436b7a32a.
This patch restores previous behavior and adds a plethora of tests
to ensure we don't break this in the future.
These test cases are aligned with how 2.18/2.28 parsed flake references.
See the new extensive doxygen in `url.hh`.
This fixes fetching gitlab: flakes.
Paths are now stored as a std::vector of individual path
segments, which can themselves contain path separators '/' (%2F).
This is necessary to make the Gitlab's /projects/ API work.
Co-authored-by: John Ericson <John.Ericson@Obsidian.Systems>
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
This allows us to replace some very hacky and not correct string
concatentation in `HttpBinaryCacheStore`. It will especially be useful
with #13752, when today's hacks started to cause problems in practice,
not just theory.
Also make `fixGitURL` returned a `ParsedURL`.
I need this for some `ParseURL` improvements, but I figure this is
better to send as its own PR.
I changed the tests willy-nilly to sometimes use
`std::list<std::string_view>` instead of `Strings` (which is
`std::list<std::string>`).
Co-Authored-By: Sergei Zimmerman <sergei@zimmerman.foo>
Compilers in nixpkgs have caught up and major distros
should also have recent enough compilers. It would be
nice to have newer features like more full featured
ranges and deducing this.
Turns out we didn't have tests for some of the important behavior introduced
for flake reference fragments and url queries [1]. This is rather important
and is relied upon by existing tooling. This fixes up these exact cases before
handing off the URL to the Boost.URL parser.
To the best of my knowledge this implements the same behavior as prior regex-based
parser did [2]:
> fragmentRegex = "(?:" + pcharRegex + "|[/? \"^])*";
> queryRegex = "(?:" + pcharRegex + "|[/? \"])*";
[1]: 9c0a09f09f
[2]: https://github.com/NixOS/nix/blob/2.30.2/src/libutil/include/nix/util/url-parts.hh
The implicit dependency on refLength (which is the StorePath::HashLen)
is not good. Also the companion tests and benchmarks are already in libstore-tests.
This patch allows users to specify the connection port
in the store URLS like so:
```
nix store info --store "ssh-ng://localhost:22" --json
```
Previously this failed with: `error: failed to start SSH connection to 'localhost:22'`,
because the code did not distinguish the port from the hostname. This
patch remedies that problem by introducing a ParsedURL::Authority type
for working with parsed authority components of URIs.
Now that the URL parsing code is less ad-hoc we can
add more long-awaited fixes for specifying SSH connection
ports in store URIs.
Builds upon the work from bd1d2d1041.
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
Co-authored-by: John Ericson <John.Ericson@Obsidian.Systems>
Make it separate from Hash, since other things can be base-encoded too.
This isn't really needed for Nix, but it makes the code easier to read
e.g. for someone reimplementing this stuff in a different language. (Of
course, Base16/Base64 should be gotten off-the-shelf, but now the hash
code, which is more bespoke, is less cluttered with the parts that would
be from some library.)
Many reimplementations of "Nix32" and our hash type already exist, so
this cleanup is coming years too late, but I say better late than never
/ it is always good to nudge the code in the direction of being a
"living spec".
Co-authored-by: Sergei Zimmerman <sergei@zimmerman.foo>
SHA-256 is Git's next hash algorithm. The world is still basically stuck
on SHA-1 with git, but shouldn't be. We can at least do our part to get
ready.
On the C++ implementation side, only a little bit of generalization was
needed, and that was fairly straight-forward. The tests (unit and
system) were actually bigger, and care was taken to make sure they were
all cover both algorithms equally.
Boost.URL is a significantly more RFC-compliant parser
than what libutil currently has a bundle of incomprehensible
regexes.
One aspect of this change is that RFC4007 ZoneId IPv6 literals
are represented in URIs according to RFC6874 [1].
Previously they were represented naively like so: [fe80::818c:da4d:8975:415c\%enp0s25].
This is not entirely correct, because the percent itself has to be pct-encoded:
> "%" is always treated as
an escape character in a URI, so, according to the established URI
syntax [RFC3986] any occurrences of literal "%" symbols in a URI MUST
be percent-encoded and represented in the form "%25". Thus, the
scoped address fe80::a%en1 would appear in a URI as
http://[fe80::a%25en1].
[1]: https://datatracker.ietf.org/doc/html/rfc6874
Co-authored-by: Jörg Thalheim <joerg@thalheim.io>
The myriad of hand-rolled URL parsing and validation code
is a constant source of problems. Regexes are not a great way
of writing parsers and there's a history of getting them wrong.
Boost.URL is a good library we can outsource most of the heavy
lifting to.
* It is tough to contribute to a project that doesn't use a formatter,
* It is extra hard to contribute to a project which has configured the formatter, but ignores it for some files
* Code formatting makes it harder to hide obscure / weird bugs by accident or on purpose,
Let's rip the bandaid off?
Note that PRs currently in flight should be able to be merged relatively easily by applying `clang-format` to their tip prior to merge.
Unlike std::sort and std::stable_sort, this implementation
does not lead to out-of-bounds memory reads or other undefined
behavior when the predicate is not strict weak ordering.
This makes it possible to use this function in libexpr for
builtins.sort, where an incorrectly implemented comparator
in the user nix code currently can crash and burn the evaluator
by invoking C++ UB.
Before the change `nix` was stripping warning flags
reported by `gcc-14` too eagerly:
$ nix build -f. texinfo4
error: builder for '/nix/store/i9948l91s3df44ip5jlpp6imbrcs646x-texinfo-4.13a.drv' failed with exit code 2;
last 25 log lines:
> 1495 | info_tag (mbi_iterator_t iter, int handle, size_t *plen)
> | ~~~~~~~~^~~~
> window.c:1887:39: error: passing argument 4 of 'printed_representation' from incompatible pointer type []
> 1887 | &replen);
> | ^~~~~~~
> | |
> | int *
After the change the compiler flag remains:
$ ~/patched.nix build -f. texinfo4
error: builder for '/nix/store/i9948l91s3df44ip5jlpp6imbrcs646x-texinfo-4.13a.drv' failed with exit code 2;
last 25 log lines:
> 1495 | info_tag (mbi_iterator_t iter, int handle, size_t *plen)
> | ~~~~~~~~^~~~
> window.c:1887:39: error: passing argument 4 of 'printed_representation' from incompatible pointer type [-Wincompatible-pointer-types]
> 1887 | &replen);
> | ^~~~~~~
> | |
> | int *
Note the difference in flag rendering around the warning.
https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda has a
good sumamry of why it happens. Befomre the change `nix` was handling
just one form or URL separator:
$ printf '\e]8;;http://example.com\e\\This is a link\e]8;;\e\\\n'
Now it also handled another for (used by gcc-14`):
printf '\e]8;;http://example.com\aThis is a link\e]8;;\a\n'
While at it fixed accumulation of trailing escape `\e\\` symbol.
For example, instead of doing
#include "nix/store-config.hh"
#include "nix/derived-path.hh"
Now do
#include "nix/store/config.hh"
#include "nix/store/derived-path.hh"
This was originally planned in the issue, and also recent requested by
Eelco.
Most of the change is purely mechanical. There is just one small
additional issue. See how, in the example above, we took this
opportunity to also turn `<comp>-config.hh` into `<comp>/config.hh`.
Well, there was already a `nix/util/config.{cc,hh}`. Even though there
is not a public configuration header for libutil (which also would be
called `nix/util/config.{cc,hh}`) that's still confusing, To avoid any
such confusion, we renamed that to `nix/util/configuration.{cc,hh}`.
Finally, note that the libflake headers already did this, so we didn't
need to do anything to them. We wouldn't want to mistakenly get
`nix/flake/flake/flake.hh`!
Progress on #7876
There are two big changes:
1. Public and private config is now separated. Configuration variables
that are only used internally do not go in a header which is
installed.
(Additionally, libutil has a unix-specific private config header,
which should only be used in unix-specific code. This keeps things a
bit more organized, in a purely private implementation-internal way.)
2. Secondly, there is no more `-include`. There are very few config
items that need to be publically exposed, so now it is feasible to
just make the headers that need them just including the (public)
configuration header.
And there are also a few more small cleanups on top of those:
- The configuration files have better names.
- The few CPP variables that remain exposed in the public headers are
now also renamed to always start with `NIX_`. This ensures they should
not conflict with variables defined elsewhere.
- We now always use `#if` and not `#ifdef`/`#ifndef` for our
configuration variables, which helps avoid bugs by requiring that
variables must be defined in all cases.
The short answer for why we need to do this is so we can consistently do
`#include "nix/..."`. Without this change, there are ways to still make
that work, but they are hacky, and they have downsides such as making it
harder to make sure headers from the wrong Nix library (e..g.
`libnixexpr` headers in `libnixutil`) aren't being used.
The C API alraedy used `nix_api_*`, so its headers are *not* put in
subdirectories accordingly.
Progress on #7876
We resisted doing this for a while because it would be annoying to not
have the header source file pairs close by / easy to change file
path/name from one to the other. But I am ameliorating that with
symlinks in the next commit.
- Some headers were completely redundant and have been removed.
- Other headers have been turned private.
- Unnecessary meson.build code has been removed.
- libutil-tests now has a private config header, where previously
it had none. This removes the need to expose a package version
macro publicly.
On https://github.com/NixOS/nix/issues/8946, we faced a surprising
behaviour wrt. exception when using pthread_cancel. In a nutshell when
a thread is inside a catch block and it's getting pthread_cancel by
another one, then the original exception is bubbled up and crashes the
process.
We now poll on the notification pipe from the thread and exit when the
main thread closes its end. This solution does not exhibit surprising
behaviour wrt. exceptions.
Co-authored-by: Mic92 <joerg@thalheim.io>
Fixes https://github.com/NixOS/nix/issues/8946
See also Lix https://gerrit.lix.systems/c/lix/+/1605 which is very
similar by coincidence. Pulled a comment from that.