A container class that needs cooperation from the contained items, usually with special data fields. For example, a doubly linked list where the forward and back pointers are regular member variables of the contained items. Intrusive containers can avoid memory allocations (which can be a correctness issue in a kernel) and go well with C's lack of built-in container classes. They are somewhat common in C and very rare in C++ and Rust.
At least for a double linked list you can probably get pretty far in terms of performance in the non-intrusive case, if your compiler unboxes the contained item into your nodes? Or are there benefits left in intrusive data structures that this doesn't capture?
Storing the data in nodes doesn't work if the given structure may need to be in multiple linked lists, which iirc was a concern for the kernel?
And generally I'd imagine it's quite a weird form for data structures for which being in a linked list isn't a core aspect (no clue what specifically the kernel uses, but I could imagine situations where where objects aren't in any linked list for 99% of time, but must be able to be chained in even if there are 0 bytes of free RAM ("Error: cannot free memory because memory is full" is probably not a thing you'd ever want to see)).
> Storing the data in nodes doesn't work if the given structure may need to be in multiple linked lists, which iirc was a concern for the kernel?
That's a great point! A language almost like C plus a smart enough compiler could do the unboxing, but this technique doesn't work for multiple structures.
> And generally I'd imagine it's quite a weird form for data structures for which being in a linked list isn't a core aspect (no clue what specifically the kernel uses, but I could imagine situations where where objects aren't in any linked list for 99% of time, but must be able to be chained in even if there are 0 bytes of free RAM ("Error: cannot free memory because memory is full" is probably not a thing you'd ever want to see)).
I think you are right in practice, though in principle you could pre-allocate the necessary memory when you create the items? When you have the intrusive links, you pay for their allocation already anyway, too. In terms of total storage space, you wouldn't pay for more.
(And you don't have to formally alloc them via a call to kmalloc or so. In the sense that you don't need to find a space for them: you just need to make sure that the system keeps a big enough buffer of contiguous-enough space somewhere. Similar to how many filesystems allow you to reserve space for root, but that doesn't mean any particular block is reserved for root up-front.)
But as I thought, that's about in-principle memory usage. A language like C makes the alternative of intrusive data structures much simpler.
> you just need to make sure that the system keeps a big enough buffer of contiguous-enough space somewhere. Similar to how many filesystems allow you to reserve space for root
Said file system reserving can fill up (I've experienced that :) ). Could perhaps engineer some super-equation for a global reserved memory size that can actually guarantee being sufficient; but, unless you mark allocations as needing reserved space vs not and compute based on that (which is back to overhead), that'll necessarily waste a good amount of memory.
And such a global equation of course can result in global failure if just one subsystem miscalculates its required reserved amount, or the calculations improperly account for worst-case non-contiguousness or interleaved different-size-linked-list-alloc-requests or whatever. (never mind that you now need to alloc when adding elements to linked lists, an action that would otherwise be 1-4 inlined stores)
Probably (?) possible to handle, but way way way more easy to break or get wrong (with very-difficult-to-debug consequences) than just some pointer fields.
Oh and also of course, if you store list data in a separate allocation from the main object, you lose the ability to do O(1) (and extremely-cheap at that) "remove self from linked list just given my own pointer", which is probably quite common (and you can't even O(1) with an arraylist instead of a linkedlist).
The main thing is that he object can be a member of various structures. It can be in big general queue and in priority queue for example. Once you find it and deal with it you can remove it from both without needing to search for it.
Same story for games where it can be in the list of all objects and in the list of objects that might be attacked. Once it's killed you can remove it from all lists without searching for it in every single one.
Haskell's GHC partially does it. LLVM can do it in principle, if your frontend gives enough information. Some JVMs can partially do some of it.
The above is about the optimiser figuring out whether to box or unbox by itself.
If you are willing to give the compiler a hand: Rust can do it just fine and it's the default when you define data structures. If you need boxing, you need to explicitly ask for it, eg via https://doc.rust-lang.org/std/boxed/struct.Box.html