Hacker News | stabbles's comments

Nice, you don't see clingo mentioned often. We use it in the Spack package manager for resolving dependencies [1]

[1] https://github.com/spack/spack/blob/develop/lib/spack/spack/...



Hello, a couple years ago I participated in a contest to count word frequencies and generate a sorted histogram. There's a cool post about it featuring a video discussing the tricks used by some participants. https://easyperf.net/blog/2022/05/28/Performance-analysis-an...

Some other participants said that they measured 0 difference in runtime between pshufb+eq and eqx3+orx2, but I think your problem has more classes of whitespace, and for the histogram problem, considerations about how to hash all the words in a chunk of the input dominate considerations about how to obtain the bitmasks of word-start or word-end positions.


Awesome! The slides with roofline analysis are great! https://docs.google.com/presentation/d/16M90It8nOK-Oiy7j9Kw2...

If this is on a single core then the "6GB/s" guy is disproven not just in theory but also in practice.

For the particular case of the 5 delimiters '\n', '.', '?', '!', and ';', it just so happens that you can do this with a single shuffle instruction, replacing the explicit lookup table.

You can do this whenever `c & 0x0F` is unique for the set of characters you're looking for.

See https://stoppels.ch/2022/11/30/io-is-no-longer-the-bottlenec... for details.


This is a really neat technique, well explained at your link.

Now that I understand it, I'd describe it as: For each byte, based on its bottom 4 bits, map it to either the unique "target" value that you're looking for that has those bottom 4 bits, or if there is no such target value, to any value that is different from what it is right now. Then simply check whether each resulting byte is equal to its corresponding original byte!

Not sure if the above will help people understand it, but after you understand it, I think you'll agree with the above description :)


Hey! Author of the blog here.

This is pretty cool~ Thanks for suggesting this, I will read this in detail and add it to the next (0.5.0) release of memchunk.


Why does your title not have any context?

Note your compiler might turn that _mm256_set_epi64x into a load from memory, so there might still be memory accesses you don't expect.

Apart from Daniel Stenberg's frequent complaints about AI slop, he also writes [1]

> A new breed of AI-powered high quality code analyzers, primarily ZeroPath and Aisle Research, started pouring in bug reports to us with potential defects. We have fixed several hundred bugs as a direct result of those reports – so far.

[1] https://daniel.haxx.se/blog/2025/12/23/a-curl-2025-review/



So? Those are automated analysis tools and by "slop" he seems to refer to careless reports crafted using AI, solely for collecting bounties:

https://gist.github.com/bagder/07f7581f6e3d78ef37dfbfc81fd1d...


> On POSIX platforms, platlib directories will be created if needed when creating virtual environments, instead of using lib64 -> lib symlink. This means purelib and platlib of virtual environments no longer share the same lib directory on platforms where sys.platlibdir is not equal to lib.

Sigh. Why can't they just be the same in virtual environments? Who cares about lib64 in a venv? Just another useless search path.


A C compiler is easier to bootstrap than a Rust compiler.


The compiler doesn't know the implementation of strlen, it only has its header. At runtime it might be different from what it was at compile time (e.g. LD_PRELOAD=...). For this to be optimized you need link-time optimization.


Both clang and gcc do optimize it though - https://godbolt.org/z/cGG9dq756. You need -fno-builtin or similar to get them to not.


No, the compiler may assume that the behavior of standard library functions is standards-conformant.


> No, the compiler may assume that the behavior of standard library functions is standards-conformant.

Why?

What happens if it isn't?


Sadness. Tons of functions from the standard library are special-cased by the compiler. The compiler can elide malloc calls if it can prove it doesn't need them, even though strictly speaking malloc has side effects by changing the heap state. Just not useful side effects.

memcpy will get transformed and inlined for small copies all the time.


What happens when you change random functions in your C standard library? The C standard library and the compiler are not independent: together they make up a C implementation, whose behaviour is described by the C standard.


Yes, though it's worth stating that it's a little more nuanced than that, since (for historical, path-dependent reasons) the compiler and libc are often independent projects (and libc often includes a bunch of other stuff beyond what the standard/compiler need).

This is the case, for example, on macOS, FreeBSD, and Linux.

Cute username, BTW.


You are right, it depends on whether you write C (as defined by the standard) or a specific dialect from your vendor (everybody does in practice). In the latter case, you need to know the rules of the compiler. But to allow optimization, these are often similar, so the compiler assumes the functions have the behaviour of the implementation that the compiler is tailored to.

> Cute username, BTW.

Thanks, I was too lazy to think of a real name, so this is the timestamp at which I created the account.


> What happens if it isn't?

§6.4.2.1: "If the program defines a reserved identifier [...] the behavior is undefined."


The most common reason is to do optimizations such as replacing strlen("hello") with 5 or open-coding strlen (or, more commonly, memcpy or memcmp). If you're linking with a non-conformant strlen (or memcpy or whatever) the usual thing that happens is that you get standards-compliant behavior when the compiler optimizes away the call, but you get the non-conformant behavior you presumably wanted when the compiler compiles a call to your non-conformant function.

But the orthodox answer to such questions is that demons fly out of your nose.


Because that's what it means to compile a specific dialect of a specific programming language?

If you want a dialect where they aren't allowed to assume that, you would have to make your own.


It does. The meanings of certain functions are prescribed by the C standard, and the compiler is allowed to expect them to have certain implementations. It can replace them with intrinsics or even remove them entirely. It is of course different for a freestanding implementation.


Hmmm, really? Switching compiler seems sufficient: https://godbolt.org/z/xnevov5d7

BTW, the case of it not optimizing was MSVC targeting Windows (which doesn't support LD_PRELOAD, but maybe has something similar?).


If you enjoy these blogs, you would definitely like the Julia REPL as well. I used to play with it a lot to discover compiler things.

For example:

    $ julia
    julia> function f(n)
             total = 0
             for x in 1:n
               total += x
             end
             return total
           end
    julia> @code_native f(10)
        ...
        sub    x9, x0, #2
        mul    x10, x8, x9
        umulh    x8, x8, x9
        extr    x8, x8, x10, #1
        add    x8, x8, x0, lsl #1
        sub    x0, x8, #1
        ret
        ...
it shows this with nice colors right in the REPL.

In the example above, you see that LLVM figured out the arithmetic series and replaced the loop with a simple multiplication.


This and add_v3 in the OP fall into the general class of scalar evolution (SCEV) optimizations. LLVM, for example, is able to handle almost all Brainfuck loops in practice (add_v3 indeed corresponds to the Brainfuck loop `[->+<]`), and its SCEV implementation is truly massive: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Anal...



Another nice thing in Julia is that if you don't want the optimizer to delete something, you can just ask it nicely :)

    julia> function f(n)
             total = 0
             for x in 1:n
               Base.donotdelete(x)
               total += x
             end
             return total
           end
will keep the loop


Question is whether the space is competitive enough. Other SOTA models are made by companies whose business model already is selling ads.


The metaschemas are useful but not strict enough. They don't set `additionalProperties: false`, which is great if you wanna extend the schema with your own properties, but it won't catch simple typos.

For example, the following schemas pass under the metaschema despite the mistakes.

    {"foo": {"bar": { ... }}}  # wrong
    {"foo": {"type": "object", "properties": {"bar": { ... }}}} # correct

    {"additional_properties": false} # wrong
    {"additionalProperties": false} # correct


This is intentional because unknown keywords are permitted with JSON Schema 2020-12 and prior. We are changing this with the upcoming version, which means we'll be updating the meta-schema to enforce it as well.


Will the new meta-schema still have any mechanism for things like the special keywords that editor UIs often need?


Excellent!

