Hello, a couple years ago I participated in a contest to count word frequencies and generate a sorted histogram. There's a cool post about it featuring a video discussing the tricks used by some participants. https://easyperf.net/blog/2022/05/28/Performance-analysis-an...
Some other participants said that they measured zero difference in runtime between pshufb+eq and eqx3+orx2, but I think your problem has more classes of whitespace. For the histogram problem, how to hash all the words in a chunk of the input matters far more than how to obtain the bitmasks of word-start or word-end positions.
For the particular case of the 5 delimiters '\n', '.', '?', '!', and ';', it just so happens that you can do this with a single shuffle instruction, which replaces the explicit lookup table.
You can do this whenever `c & 0x0F` is distinct for every character in the set you're looking for.
This is a really neat technique, well explained at your link.
Now that I understand it, I'd describe it as: For each byte, based on its bottom 4 bits, map it to either the unique "target" value that you're looking for that has those bottom 4 bits, or if there is no such target value, to any value that is different from what it is right now. Then simply check whether each resulting byte is equal to its corresponding original byte!
Not sure if the above will help people understand it, but after you understand it, I think you'll agree with the above description :)
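To make the description above concrete, here is a minimal scalar sketch in Python of what the shuffle-and-compare does per byte. It is not the contest code or the linked article's implementation: `build_table`, `shuffle_lookup`, and `delimiter_mask` are hypothetical names, the filler `i ^ 1` is just one choice of value that can never match, and a real version would run pshufb plus a byte compare over 16 or 32 lanes at once.

```python
DELIMS = b"\n.?!;"  # the five delimiters from the comment above


def build_table(targets):
    """Build the 16-entry table that a pshufb-style shuffle indexes by low nibble."""
    table = [None] * 16
    for t in targets:
        lo = t & 0x0F
        if table[lo] is not None and table[lo] != t:
            # Two targets share a low nibble, so one shuffle cannot tell them apart.
            raise ValueError("low nibbles collide; the trick does not apply")
        table[lo] = t
    # Empty slots get a filler whose low nibble differs from the slot index,
    # so no byte that selects this slot can ever compare equal to it.
    return [(i ^ 1) if v is None else v for i, v in enumerate(table)]


def shuffle_lookup(table, b):
    """Model one lane of pshufb: high bit set gives 0, else table[low nibble]."""
    return 0 if b & 0x80 else table[b & 0x0F]


def delimiter_mask(data, table):
    """True wherever the looked-up byte equals the original byte."""
    return [shuffle_lookup(table, b) == b for b in data]
```

With `table = build_table(DELIMS)`, calling `delimiter_mask(b"Hi. Ok?\nNo;yes!", table)` sets the mask at offsets 2, 6, 7, 10, and 14. `build_table(b"!1")` raises `ValueError`, because '!' (0x21) and '1' (0x31) share the low nibble 1.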
Apart from Daniel Stenberg's frequent complaints about AI slop, he also writes [1]:
> A new breed of AI-powered high quality code analyzers, primarily ZeroPath and Aisle Research, started pouring in bug reports to us with potential defects. We have fixed several hundred bugs as a direct result of those reports – so far.
> On POSIX platforms, platlib directories will be created if needed when creating virtual environments, instead of using lib64 -> lib symlink. This means purelib and platlib of virtual environments no longer share the same lib directory on platforms where sys.platlibdir is not equal to lib.
Sigh. Why can't they just be the same in virtual environments? Who cares about lib64 in a venv? Just another useless search path.
The compiler doesn't know the implementation of strlen, it only has its header. At runtime it might be different than at compile time (e.g. LD_PRELOAD=...). For this to be optimized you need link time optimization.
Sadness. Tons of functions from the standard library are special-cased by the compiler. The compiler can elide malloc calls if it can prove it doesn't need them, even though strictly speaking malloc has side effects by changing the heap state. Just not useful side effects.
memcpy will get transformed and inlined for small copies all the time.
What happens when you change random functions in your C compiler? The C standard library and compiler are not independent. Together they make up a C implementation, whose behaviour is described by the C standard.
Yes, though it's worth stating that it's a little more nuanced than that, since (for historical, path-dependent reasons) the compiler and libc are often independent projects (and libc often includes a bunch of other stuff beyond what the standard/compiler need).
This is the case, for example, on macOS, FreeBSD, and Linux.
You are right, it depends on whether you write C (as defined by the standard) or a specific dialect from your vendor (which everybody does in practice). In the latter case, you need to know the rules of the compiler. But to allow optimization, the two are usually similar: the compiler assumes that library functions behave the way they do in the implementation it is tailored to.
> Cute username, BTW.
Thanks, I was too lazy to think of a real name, so this is the timestamp of when I created the account.
The most common reason is to do optimizations such as replacing strlen("hello") with 5 or open-coding strlen (or, more commonly, memcpy or memcmp). If you're linking with a non-conformant strlen (or memcpy or whatever) the usual thing that happens is that you get standards-compliant behavior when the compiler optimizes away the call, but you get the non-conformant behavior you presumably wanted when the compiler compiles a call to your non-conformant function.
But the orthodox answer to such questions is that demons fly out of your nose.
It does. The meaning of certain functions is prescribed by the C standard, and the compiler is allowed to expect them to have certain implementations. It can replace them with intrinsics or even remove them entirely. It is of course different for a freestanding implementation.
If you enjoy these blogs, you would definitely like the Julia REPL as well. I used to play with this a lot to discover compiler things.
For example:
  $ julia
  julia> function f(n)
             total = 0
             for x in 1:n
                 total += x
             end
             return total
         end
  julia> @code_native f(10)
  ...
      sub   x9, x0, #2
      mul   x10, x8, x9
      umulh x8, x8, x9
      extr  x8, x8, x10, #1
      add   x8, x8, x0, lsl #1
      sub   x0, x8, #1
      ret
  ...
It shows this with nice colors right in the REPL.
In the example above, you see that LLVM figured out the arithmetic series and replaced the loop with a simple multiplication.
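The closed form can be checked by hand. Below is a small Python sketch: `loop_sum` mirrors the Julia `f`, and `closed_form` is the textbook n(n+1)/2. `asm_shape` is one reading of the visible instructions only, under the assumption that x0 holds n and x8 holds n - 1 when they run (the setup lines are elided in the listing) and that n >= 1. Treat it as a guess, not a disassembly.

```python
def loop_sum(n):
    """Same loop as the Julia f(n): add up 1..n one term at a time."""
    total = 0
    for x in range(1, n + 1):
        total += x
    return total


def closed_form(n):
    """Arithmetic series: 1 + 2 + ... + n = n(n+1)/2 (0 for an empty range)."""
    return n * (n + 1) // 2 if n > 0 else 0


def asm_shape(n):
    """ASSUMPTION: x0 = n and x8 = n - 1 on entry, n >= 1.

    sub         -> x9 = n - 2
    mul/umulh   -> wide product (n - 1) * (n - 2)
    extr ... #1 -> that product shifted right by one bit
    add lsl #1  -> plus 2n
    sub #1      -> minus 1
    """
    return (((n - 1) * (n - 2)) >> 1) + (n << 1) - 1
```

Expanding (n - 1)(n - 2)/2 + 2n - 1 gives (n^2 + n)/2, so under that assumption the three functions agree for every n >= 1.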
This and add_v3 in the OP fall into the general class of Scalar Evolution (SCEV) optimizations. LLVM, for example, is able to handle almost all Brainfuck loops in practice (add_v3 indeed corresponds to the Brainfuck loop `[->+<]`), and its SCEV implementation is truly massive: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Anal...
The metaschemas are useful but not strict enough. They don't set `additionalProperties: false`, which is great if you wanna extend the schema with your own properties, but it won't catch simple typos.
For example, the following issues pass under the metaschema.
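As a hypothetical illustration of the kind of typo that slips through (these are not the original poster's examples), the sketch below flags keys that are not in a known-keyword set. `KNOWN_KEYWORDS` is a deliberately partial list of JSON Schema 2020-12 keywords, and `unknown_keywords` is a made-up helper, not part of any JSON Schema library.

```python
# Partial on purpose: enough 2020-12 keywords for the example, not the full set.
KNOWN_KEYWORDS = {
    "$schema", "$id", "$ref", "$defs", "type", "enum", "const",
    "properties", "patternProperties", "additionalProperties",
    "required", "items", "title", "description", "default",
}
SUBSCHEMA_MAPS = ("properties", "patternProperties", "$defs")  # name -> schema
SUBSCHEMAS = ("items", "additionalProperties")                 # single schema


def unknown_keywords(schema, path="#"):
    """Yield (path, key) for every key that is not a known keyword."""
    if not isinstance(schema, dict):
        return  # boolean schemas such as `false` carry no keywords
    for key, value in schema.items():
        if key not in KNOWN_KEYWORDS:
            yield path, key
        elif key in SUBSCHEMA_MAPS and isinstance(value, dict):
            for name, sub in value.items():
                yield from unknown_keywords(sub, f"{path}/{key}/{name}")
        elif key in SUBSCHEMAS:
            yield from unknown_keywords(value, f"{path}/{key}")


# Two typos that a permissive metaschema would treat as unknown keywords.
EXAMPLE = {
    "type": "object",
    "properties": {"name": {"type": "string", "desciption": "typo"}},
    "requried": ["name"],
    "additionalProperties": False,
}
```

`list(unknown_keywords(EXAMPLE))` reports `desciption` under `#/properties/name` and `requried` at the root. A strict metaschema would catch the same thing by rejecting keys it does not define.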
This is intentional because unknown keywords are permitted with JSON Schema 2020-12 and prior. We are changing this with the upcoming version, which means we'll be updating the meta-schema to enforce it as well.
[1] https://github.com/spack/spack/blob/develop/lib/spack/spack/...