MADV_FREE is also extremely expensive. CPU vendors have finally simplified TLB shootdowns in recent CPUs: both AMD and Intel now have instructions to broadcast TLB flushes in hardware, which gets rid of one of the worst sources of performance degradation in threaded multicore applications (oh, the pain of IPIs mixed with TLB flushing!). However, walking page tables and freeing pages is still very expensive.
Hardware reference counting of memory allocations would be very interesting. It would be shockingly simple to implement compared to many other features hardware already has to tackle.
Freeing pages under memory pressure is quite expensive (though it's not clear there's any other choice to make), but if the pages are never actually freed, it should be cheap, AIUI.
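For anyone who hasn't used it, here is a rough sketch of the lazy-free pattern being discussed, assuming Linux with anonymous private mappings. The helper name `lazy_free_demo` is made up for illustration, and the fallback to `MADV_DONTNEED` is my own addition for kernels or sandboxes where `MADV_FREE` is unavailable. It shows the usage only and says nothing about the cost.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: map a buffer, dirty it, mark it lazily freeable,
 * then reuse it. Returns 0 on success, -1 on failure. */
int lazy_free_demo(size_t len)
{
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memset(buf, 0xAB, len);            /* fault the pages in and dirty them */

    int rc = -1;
#ifdef MADV_FREE
    /* Lazy free: the kernel may reclaim these pages later, typically
     * only when it is under memory pressure. */
    rc = madvise(buf, len, MADV_FREE);
#endif
    if (rc != 0)
        /* Assumed fallback: eager discard when MADV_FREE is unsupported. */
        rc = madvise(buf, len, MADV_DONTNEED);
    if (rc != 0) {
        munmap(buf, len);
        return -1;
    }

    /* Until we write again, a read may see the old bytes or zeros.
     * Writing to a page cancels its pending free. */
    memset(buf, 0xCD, len);
    int ok = (buf[0] == 0xCD && buf[len - 1] == 0xCD);

    munmap(buf, len);
    return ok ? 0 : -1;
}
```

Calling `lazy_free_demo(1 << 20)` exercises the whole cycle on a 1 MiB buffer. If the kernel never comes under pressure, the second `memset` simply reuses the pages that were already mapped.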