Hacker News | stabbles's comments

Nice, you don't see clingo mentioned often. We use it in the Spack package manager for resolving dependencies [1]

[1] https://github.com/spack/spack/blob/develop/lib/spack/spack/...



Hello, a couple years ago I participated in a contest to count word frequencies and generate a sorted histogram. There's a cool post about it featuring a video discussing the tricks used by some participants. https://easyperf.net/blog/2022/05/28/Performance-analysis-an...

Some other participants said that they measured 0 difference in runtime between pshufb+eq and eqx3+orx2, but I think your problem has more classes of whitespace, and for the histogram problem, considerations about how to hash all the words in a chunk of the input dominate considerations about how to obtain the bitmasks of word-start or word-end positions.


Awesome! The slides with roofline analysis are great! https://docs.google.com/presentation/d/16M90It8nOK-Oiy7j9Kw2...

If this is on a single core then the "6GB/s" guy is disproven not just in theory but also in practice.

For the particular case of the 5 delimiters '\n', '.', '?', '!', and ';', it just so happens that you can do this with a single shuffle instruction, replacing the explicit lookup table.

You can do this whenever `c & 0x0F` is unique for the set of characters you're looking for.

See https://stoppels.ch/2022/11/30/io-is-no-longer-the-bottlenec... for details.


This is a really neat technique, well explained at your link.

Now that I understand it, I'd describe it as: For each byte, based on its bottom 4 bits, map it to either the unique "target" value that you're looking for that has those bottom 4 bits, or if there is no such target value, to any value that is different from what it is right now. Then simply check whether each resulting byte is equal to its corresponding original byte!

Not sure if the above will help people understand it, but after you understand it, I think you'll agree with the above description :)


Hey! Author of the blog here.

This is pretty cool~ Thanks for suggesting this, I will read this in detail and add it to the next (0.5.0) release of memchunk.


Why does your title not have any context?

Note your compiler might turn that _mm256_set_epi64x into a load from memory, so there might still be memory accesses you don't expect.

Apart from Daniel Stenberg's frequent complaints about AI slop, he also writes [1]

> A new breed of AI-powered high quality code analyzers, primarily ZeroPath and Aisle Research, started pouring in bug reports to us with potential defects. We have fixed several hundred bugs as a direct result of those reports – so far.

[1] https://daniel.haxx.se/blog/2025/12/23/a-curl-2025-review/



So? Those are automated analysis tools and by "slop" he seems to refer to careless reports crafted using AI, solely for collecting bounties:

https://gist.github.com/bagder/07f7581f6e3d78ef37dfbfc81fd1d...


> On POSIX platforms, platlib directories will be created if needed when creating virtual environments, instead of using lib64 -> lib symlink. This means purelib and platlib of virtual environments no longer share the same lib directory on platforms where sys.platlibdir is not equal to lib.

Sigh. Why can't they just be the same in virtual environments? Who cares about lib64 in a venv? Just another useless search path.


A C compiler is easier to bootstrap than a Rust compiler.


The compiler doesn't know the implementation of strlen, it only has its header. At runtime it might be different from what it was at compile time (e.g. LD_PRELOAD=...). For this to be optimized you need link-time optimization.


Both clang and gcc do optimize it though - https://godbolt.org/z/cGG9dq756. You need -fno-builtin or similar to get them to not.


No, the compiler may assume that the behavior of standard library functions is standards-conformant.


> No, the compiler may assume that the behavior of standard library functions is standards-conformant.

Why?

What happens if it isn't?


Sadness. Tons of functions from the standard library are special-cased by the compiler. The compiler can elide malloc calls if it can prove it doesn't need them, even though strictly speaking malloc has side effects by changing the heap state. Just not useful side effects.

memcpy will get transformed and inlined for small copies all the time.


What happens when you change random functions in your C standard library? The C standard library and the compiler are not independent: together they make up a C implementation, whose behaviour is described by the C standard.


Yes, though it's worth stating that it's a little more nuanced than that, since (for historical, path-dependent reasons) the compiler and libc are often independent projects (and libc often includes a bunch of other stuff beyond what the standard/compiler need).

This is the case, for example, on macOS, FreeBSD, and Linux.

Cute username, BTW.


You are right, it depends on whether you write C (as defined by the standard) or a specific dialect from your vendor (everybody does in practice). In the latter case, you need to know the rules of the compiler. But to allow optimization, these are often similar, so the compiler assumes the functions have the behaviour of the implementation that the compiler is tailored to.

> Cute username, BTW.

Thanks, I was too lazy to think of a real name, so this is the timestamp at which I created the account.


> What happens if it isn't?

§6.4.2.1: "If the program defines a reserved identifier [...] the behavior is undefined."


The most common reason is to do optimizations such as replacing strlen("hello") with 5 or open-coding strlen (or, more commonly, memcpy or memcmp). If you're linking with a non-conformant strlen (or memcpy or whatever) the usual thing that happens is that you get standards-compliant behavior when the compiler optimizes away the call, but you get the non-conformant behavior you presumably wanted when the compiler compiles a call to your non-conformant function.

But the orthodox answer to such questions is that demons fly out of your nose.


Because that's what it means to compile a specific dialect of a specific programming language?

If you want a dialect where they aren't allowed to assume that, you would have to make your own.


It does. The meanings of certain functions are prescribed by the C standard, and the compiler is allowed to expect them to have certain implementations. It can replace them with intrinsics or even remove them entirely. It is of course different for a freestanding implementation.


Hmmm, really? Switching compiler seems sufficient: https://godbolt.org/z/xnevov5d7

BTW, the case of it not optimizing was MSVC targeting Windows (which doesn't support LD_PRELOAD, but maybe has something similar?).


If you enjoy these blogs, you would definitely like the Julia REPL as well. I used to play with it a lot to discover compiler things.

For example:

    $ julia
    julia> function f(n)
             total = 0
             for x in 1:n
               total += x
             end
             return total
           end
    julia> @code_native f(10)
        ...
        sub    x9, x0, #2
        mul    x10, x8, x9
        umulh    x8, x8, x9
        extr    x8, x8, x10, #1
        add    x8, x8, x0, lsl #1
        sub    x0, x8, #1
        ret
        ...
it shows this with nice colors right in the REPL.

In the example above, you see that LLVM figured out the arithmetic series and replaced the loop with a simple multiplication.


This and add_v3 in the OP fall into the general class of scalar evolution (SCEV) optimizations. LLVM, for example, is able to handle almost all Brainfuck loops in practice (add_v3 indeed corresponds to the Brainfuck loop `[->+<]`), and its SCEV implementation is truly massive: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Anal...



Another nice thing in Julia is that if you don't want the optimizer to delete something, you can just ask it nicely :)

    julia> function f(n)
             total = 0
             for x in 1:n
               Base.donotdelete(x)
               total += x
             end
             return total
           end
will keep the loop


Question is whether the space is competitive enough. Other SOTA models are made by companies whose business model already is selling ads.


The metaschemas are useful but not strict enough. They don't set `additionalProperties: false`, which is great if you wanna extend the schema with your own properties, but it won't catch simple typos.

For example, the following schemas pass under the metaschema despite the mistakes.

    {"foo": {"bar": { ... }}}  # wrong
    {"foo": {"type": "object", "properties": {"bar": { ... }}}} # correct

    {"additional_properties": false} # wrong
    {"additionalProperties": false} # correct


This is intentional because unknown keywords are permitted with JSON Schema 2020-12 and prior. We are changing this with the upcoming version, which means we'll be updating the meta-schema to enforce it as well.


Will the new meta-schema still have any mechanism for things like the special keywords that editor UIs often need?


Excellent!

