* lib/strings: optimise hasInfix function
* lib/strings: optimise hasInfix further using regex
* rstudio: call hasInfix with a string
* lib/strings: remove let from hasInfix
Co-authored-by: pennae <82953136+pennae@users.noreply.github.com>
Adds some functions related to string similarity:
- lib.strings.commonPrefixLength
- lib.strings.commonSuffixLength
- lib.strings.levenshtein
- lib.strings.levenshteinAtMost
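For readers unfamiliar with these measures, here is a minimal Python sketch of
what the four functions compute. It is an illustration only, not the Nix
implementation: the snake_case names are hypothetical, and the argument order
of levenshtein_at_most (bound first) is an assumption.

```python
def common_prefix_length(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def common_suffix_length(a: str, b: str) -> int:
    """Number of trailing characters that a and b share."""
    return common_prefix_length(a[::-1], b[::-1])


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute), keeping one DP row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[len(b)]


def levenshtein_at_most(k: int, a: str, b: str) -> bool:
    """True if the edit distance between a and b is at most k."""
    # The length difference is a lower bound on the distance,
    # so this case can be rejected without running the full DP.
    if abs(len(a) - len(b)) > k:
        return False
    return levenshtein(a, b) <= k
```

A bounded check such as levenshtein_at_most is useful for "did you mean"
suggestions, where only small distances matter.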
The current implementation of the concatStringsSep fallback references
concatStrings, which is just a partial application of concatStringsSep,
forming a circular dependency. This will almost never be encountered
(assuming the user does not explicitly trigger it), because:
1. the or operator short-circuits under both lazy and strict
evaluation
2. it can only occur in Nix versions prior to 1.10,
which are not compatible with various nix operations as of 2.3.15
It still matters, however, if scopedImport is used or the builtins
have been overwritten. lib.foldl' is used instead of builtins.foldl'
because the foldl' primop was introduced in the same release as concatStringsSep.
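The idea of a fallback that does not call back into itself can be sketched in
Python as follows. This is only an illustration of the shape of the fix, not
the Nix code: intersperse and concat_strings_sep are hypothetical Python names,
and functools.reduce stands in for a left fold.

```python
from functools import reduce
from typing import List


def intersperse(separator: str, items: List[str]) -> List[str]:
    """Place separator between every pair of adjacent items."""
    out: List[str] = []
    for i, item in enumerate(items):
        if i > 0:
            out.append(separator)
        out.append(item)
    return out


def concat_strings_sep(separator: str, items: List[str]) -> str:
    """Join items with separator using only a left fold.

    Nothing here refers to a helper that is itself defined in terms of
    concat_strings_sep, so there is no cycle.
    """
    return reduce(lambda acc, s: acc + s, intersperse(separator, items), "")
```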
When a list is passed to isStorePath this is most likely a mistake and
it is therefore better to just return false. There is one case where
this theoretically makes sense (if a list contains a single element for
which isStorePath elem holds), but since that case is probably seldom
intentional either, returning false may save someone from debugging
unclear evaluation errors.
Nix can perform static scope checking, but whenever code is inside
a `with` expression, the analysis breaks down, because it can't
know statically what's in the attribute set whose attributes were
brought into scope. In those cases, Nix has to assume that
everything works out.
Except it doesn't. Removing `with` from lib/ revealed an undefined
variable in an error message.
If that doesn't convince you that we're better off without `with`,
I can tell you that this PR results in a 3% evaluation performance
improvement because Nix can look up local variables by index.
This adds up with applications like the module system.
Furthermore, removing `with` makes the binding site of each
variable obvious, which helps with comprehension.
> NOTE: This function is not performant and should be avoided.
It's not used at all in-tree now, so we can remove it completely after
any remaining users are given notice.
In 87a19e9048 I merged staging-next into master using the GitHub GUI, as intended.
In ac241fb7a5 I merged master into staging-next for the next staging cycle, but accidentally pushed it to master.
Thinking this might cause trouble, I reverted it in 0be87c7979. That was wrong, however, as it "removed" master.
This reverts commit 0be87c7979.
I merged master into staging-next but accidentally pushed it to master.
This should get us back to 87a19e9048.
This reverts commit ac241fb7a5, reversing
changes made to 76a439239e.
Updates documentation comments with extra information for nixdoc[1]
compatibility.
Some documentation strings have additionally been reworded for
clarity.
"Faux types" are added where applicable, but some functions do things
that are not trivially representable in the type notation used so they
were ignored for this purpose.
[1]: https://github.com/tazjin/nixdoc
- moved function into strings.nix
- renamed function from makePerl5Lib
- removed duplicate entries in the resulting value
- rewrote the function from scratch after learning a few things (much cleaner now)
toPath has confusing semantics and is never necessary; it can always
either just be omitted or replaced by pre-concatenating `/.`. It has
been marked as "!!! obsolete?" for more than 10 years in a C++
comment; hopefully removing it will let us properly deprecate and,
eventually, remove it.
This does break the API of being able to import any lib file and get
its libs; however, I'm not sure people did this.
I made this while exploring being able to swap out docFn with a stub
in #2305, to avoid functor performance problems. I don't know if that
is going to move forward (or whether it is even a problem), but after
doing all this work I figured I'd put it up anyway :)
Two notable advantages to this approach:
1. when a lib inherits another lib's functions, they don't
automatically get put into the scope of lib
2. when a lib implements new obscure functions, they don't
automatically get put into the scope of lib
Using the test script (later in this commit) I got the following diff
on the API:
+ diff master fixed-lib
11764a11765,11766
> .types.defaultFunctor
> .types.defaultTypeMerge
11774a11777,11778
> .types.isOptionType
> .types.isType
11781a11786
> .types.mkOptionType
11788a11794
> .types.setType
11795a11802
> .types.types
This means that this commit _adds_ to the API; however, I can't find a
way to fix these last remaining discrepancies. At least none are
_removed_.
Test script (run with nix-repl in the PATH):
#!/bin/sh
set -eux
repl() {
  suff=${1:-}
  echo "(import ./lib)$suff" \
    | nix-repl 2>&1
}
attrs_to_check() {
  repl "${1:-}" \
    | tr ';' $'\n' \
    | grep "\.\.\." \
    | cut -d' ' -f2 \
    | sed -e "s/^/${1:-}./" \
    | sort
}
summ() {
  repl "${1:-}" \
    | tr ' ' $'\n' \
    | sort \
    | uniq
}
deep_summ() {
  suff="${1:-}"
  depth="${2:-4}"
  depth=$((depth - 1))
  summ "$suff"
  for attr in $(attrs_to_check "$suff" | grep -v "types.types"); do
    if [ $depth -eq 0 ]; then
      summ "$attr" | sed -e "s/^/$attr./"
    else
      deep_summ "$attr" "$depth" | sed -e "s/^/$attr./"
    fi
  done
}
(
  cd nixpkgs
  #git add .
  #git commit -m "Auto-commit, sorry" || true
  git checkout fixed-lib
  deep_summ > ../fixed-lib
  git checkout master
  deep_summ > ../master
)
if diff master fixed-lib; then
  echo "SHALLOW MATCH!"
fi
(
  cd nixpkgs
  git checkout fixed-lib
  repl .types
)
* lib: introduce imap0, imap1
For historical reasons, imap starts counting at 1, which is not
consistent with the rest of the lib.
So for now we split imap into imap0, which starts counting at 0, and
imap1, which starts counting at 1. imap itself is marked as deprecated.
See c71e2d4235 (commitcomment-21873221)
* replace uses of lib.imap
* lib: move imap to deprecated.nix
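To make the difference between the two variants concrete, here is a small
Python sketch of the intended semantics. It is an illustration only, not the
Nix implementation; the callback takes the index first and the element second,
mirroring how imap is used.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def imap0(f: Callable[[int, A], B], xs: List[A]) -> List[B]:
    """Map with index, counting from 0."""
    return [f(i, x) for i, x in enumerate(xs)]


def imap1(f: Callable[[int, A], B], xs: List[A]) -> List[B]:
    """Map with index, counting from 1 (the historical imap behaviour)."""
    return [f(i, x) for i, x in enumerate(xs, start=1)]


if __name__ == "__main__":
    label = lambda i, v: f"{v}-{i}"
    print(imap0(label, ["a", "b"]))  # ['a-0', 'b-1']
    print(imap1(label, ["a", "b"]))  # ['a-1', 'b-2']
```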
Quoting various characters that the shell *may* interpret specially is a
very fragile thing to do.
I've used something more robust all over the place in various Nix
expressions I've written just because I didn't trust escapeShellArg.
Here is a proof of concept showing that I was indeed right in
distrusting escapeShellArg:
with import <nixpkgs> {};
let
  payload = runCommand "payload" {} ''
    # \x00 is not allowed for Nix strings, so let's begin at 1
    for i in $(seq 1 255); do
      echo -en "\\x$(printf %02x $i)"
    done > "$out"
  '';
  escapers = with lib; {
    current = escapeShellArg;
    better = arg: let
      backslashEscapes = stringToCharacters "\"\\ ';$`()|<>\r\t*[]&!~#";
      search = backslashEscapes ++ [ "\n" ];
      replace = map (c: "\\${c}") backslashEscapes ++ [ "'\n'" ];
    in replaceStrings search replace (toString arg);
    best = arg: "'${replaceStrings ["'"] ["'\\''"] (toString arg)}'";
  };
  testWith = escaper: let
    escaped = escaper (builtins.readFile payload);
  in runCommand "test" {} ''
    if ! r="$(bash -c ${escapers.best "echo -nE ${escaped}"} 2> /dev/null)"
    then
      echo bash eval error > "$out"
      exit 0
    fi
    if echo -n "$r" | cmp -s "${payload}"; then
      echo success > "$out"
    else
      echo failed > "$out"
    fi
  '';
in runCommand "results" {} ''
  echo "Test results:"
  ${lib.concatStrings (lib.mapAttrsToList (name: impl: ''
    echo " ${name}: $(< "${testWith impl}")"
  '') escapers)}
  exit 1
''
The resulting output is the following:
Test results:
best: success
better: success
current: bash eval error
I did the "better" implementation just to illustrate that the method of
quoting only "harmful" characters results in madness in terms of
implementation and performance.
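The "best" approach is small enough to restate outside Nix. The following
Python sketch mirrors it: wrap the whole argument in single quotes and replace
each embedded single quote with '\''. The function names are hypothetical, and
the round-trip helper uses Python's shlex POSIX-mode parser as a stand-in for a
real shell, which is an assumption rather than a proof.

```python
import shlex


def escape_shell_arg(arg: object) -> str:
    """Quote arg for a POSIX shell.

    Everything inside single quotes is literal, so the only character
    needing care is the single quote itself: close the quoted string,
    emit an escaped quote, then reopen it.
    """
    return "'" + str(arg).replace("'", "'\\''") + "'"


def round_trips(s: str) -> bool:
    """Check that a POSIX-style parser recovers s as exactly one word."""
    return shlex.split(escape_shell_arg(s)) == [s]


if __name__ == "__main__":
    # Same payload idea as the Nix test: every code point from 1 to 255.
    payload = "".join(chr(i) for i in range(1, 256))
    print(escape_shell_arg("it's"))
    print(round_trips(payload))
```

Because no per-character table is involved, there is no list of "harmful"
characters that can turn out to be incomplete.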
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Cc: @edolstra, @zimbatm
For example, this allows writing
nix.package = /nix/store/786mlvhd17xvcp2r4jmmay6jj4wj6b7f-nix-1.10pre4206_896428c;
Also, document types.package in the manual.