I think the model I described is more precise than madvise. madvise would usually be called on large ranges of pages, which is why it has `MADV_RANDOM`, `MADV_SEQUENTIAL`, etc.: you're not specifying which memory/pages are about to be accessed, but the likely access pattern.
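For concreteness, a minimal sketch of that interface (the file name is invented, error handling omitted): the hint covers a whole mapped range and describes a pattern, not specific pages to fetch.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_RDONLY);   /* hypothetical file */
    struct stat st;
    fstat(fd, &st);

    void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);

    /* One call covering the whole mapping: tells the kernel the likely
       access pattern (so it can ramp readahead up or disable it), not
       which pages will be touched next. */
    madvise(p, st.st_size, MADV_SEQUENTIAL);

    /* ... walk the mapping from start to finish ... */

    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```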

If you're just using mmap to read a file from start to finish, then the `hint_read` mechanism is indeed pointless, since multiple `hint_read` calls would do the same thing as a single `madvise(..., MADV_SEQUENTIAL)` call.

The point of `hint_read`, and indeed of io_uring or `readv`, is that the program knows exactly which parts of the file it wants to read first, so it would be best if those were read concurrently, preferably using a single system call or page fault (i.e., one switch to kernel space).
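As a hedged sketch of "many reads, one kernel entry" using liburing (the file name, buffer sizes, and offsets are all invented): several reads at different offsets are queued in userspace and then submitted with a single system call.

```c
#include <fcntl.h>
#include <liburing.h>
#include <unistd.h>

#define NREADS 3

int main(void) {
    struct io_uring ring;
    io_uring_queue_init(8, &ring, 0);

    int fd = open("data.bin", O_RDONLY);
    static char bufs[NREADS][4096];
    off_t offs[NREADS] = { 0, 1 << 20, 8 << 20 };  /* arbitrary spots */

    /* Queue all three reads in userspace... */
    for (int i = 0; i < NREADS; i++) {
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, bufs[i], sizeof bufs[i], offs[i]);
    }
    /* ...then enter the kernel once; the reads proceed concurrently. */
    io_uring_submit(&ring);

    for (int i = 0; i < NREADS; i++) {
        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);
        io_uring_cqe_seen(&ring, cqe);
    }

    io_uring_queue_exit(&ring);
    close(fd);
    return 0;
}
```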

I would expect the `hint_read` function to push to a queue in thread-local storage, so it shouldn't need a switch to kernel space. User/kernel space switches are slow: on the order of a couple of tens of millions per second at best. This is why the vDSO exists, and why libc buffers writes behind `fwrite`/`println`/etc.: function calls within userspace can happen at rates of billions per second.
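A minimal sketch of what I mean; everything here is invented (no such API exists). Hints accumulate in a thread-local ring, and only a full ring pays for a kernel transition:

```c
#include <stddef.h>

struct read_hint { size_t offset; size_t length; };

#define RING_SIZE 256
static _Thread_local struct read_hint ring[RING_SIZE];
static _Thread_local unsigned head;

/* Stand-in for the (hypothetical) kernel interface that would consume
   the queued hints; a real design might instead have the kernel poll
   the ring directly, like io_uring's submission queue. */
static void hint_flush(struct read_hint *hints, unsigned count) {
    (void)hints;
    (void)count;
}

/* Pure userspace on the fast path: a store and an increment. */
static inline void hint_read(size_t offset, size_t length) {
    ring[head++] = (struct read_hint){ offset, length };
    if (head == RING_SIZE) {   /* only a full ring costs a syscall */
        hint_flush(ring, head);
        head = 0;
    }
}
```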



You can do fine-grained madvise via io_uring, which indeed uses a queue. But at that point, why use mmap at all? Just do async reads via io_uring.
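A sketch of that, assuming liburing and a kernel recent enough to support `IORING_OP_MADVISE`; the mapping, offsets, and helper name are invented. Several page-granular `MADV_WILLNEED` hints are batched behind one submit:

```c
#include <liburing.h>
#include <sys/mman.h>

/* Batch page-granular hints for several regions of an existing
   mapping `base`; one io_uring_submit covers all of them. */
void hint_regions(struct io_uring *ring, char *base,
                  const size_t *offs, const size_t *lens, int n) {
    for (int i = 0; i < n; i++) {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
        io_uring_prep_madvise(sqe, base + offs[i], lens[i],
                              MADV_WILLNEED);
    }
    io_uring_submit(ring);   /* one kernel entry for all the hints */
}
```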


The entire point I was trying to make at the beginning of the thread is that mmap gives you memory pages in the page cache that the OS can drop under memory pressure. io_uring is close on the performance and fine-grained access pattern fronts, but it's not as good on system-wide cooperative behavior with memory, and it has a higher cost: either you're still copying from the page cache into a user buffer (a non-trivial performance impact versus the read itself) and thrashing your CPU caches, or you're doing direct I/O and implementing a page cache manually (which risks duplicating page data inefficiently in userspace if the same file is accessed by multiple processes).


Right, so zero-copy I/O, but still with the ability to share the page cache across processes and to let the kernel drop caches under high memory pressure. One issue is that, under pressure, a process might not actually manage to read a page and would keep retrying and failing (with an LRU replacement policy this is unlikely and probably self-limiting, but still...).


To take advantage of zero-copy I/O, which I believe has become much more important since the shift from spinning rust to Flash, I think applications often need to adopt a file format that's amenable to zero-copy access. Examples include Arrow (but not compressed Feather), HDF5, FlatBuffers, Avro, and SBE. A lot of file formats developed during the spinning-rust eon require full parsing before the data in them can be used, which is fine for a 1KB file but suboptimal for a 1GB file.
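A hedged illustration of the difference (the record layout and file name are invented; real formats like Arrow or SBE define their own fixed layouts): with a zero-copy-friendly layout you can read a field straight out of the mapping, faulting in only the pages you touch, with no parse or copy step.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct record {        /* hypothetical fixed-width row */
    uint64_t id;
    double   value;
};

int main(void) {
    int fd = open("table.bin", O_RDONLY);
    struct stat st;
    fstat(fd, &st);

    const struct record *rows =
        mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    size_t n = st.st_size / sizeof(struct record);

    /* Touching one row faults in only its page; nothing is
       deserialized or copied into a separate user buffer. */
    if (n > 1000)
        printf("%llu -> %f\n",
               (unsigned long long)rows[1000].id, rows[1000].value);

    munmap((void *)rows, st.st_size);
    close(fd);
    return 0;
}
```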
