
doing these sorts of benchmarks is actually quite tricky. you must clear the page cache before each attempt, e.g. by allocating and touching more than 1x physical ram so that cached file pages get evicted.

moreover, mmap by default loads pages lazily, whereas mmap with MAP_POPULATE prefaults them up front. in the former case, reporting average operation times is not valid because the access time distribution is not gaussian (each page takes a one-time big hit at first touch). with MAP_POPULATE (linux only), there is a long loading delay when mmap is first called, but average access times afterwards will be very low. when pages are released is determined by the operating system's page cache eviction policy.
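
a rough sketch of how one might measure this, assuming linux and python 3.10+ (where the mmap module exposes MAP_POPULATE). the helper names touch_times and summarize are made up for illustration. it times one read per page and reports percentiles and the max instead of a mean, since the first-touch hits skew the average. note that a freshly written file is still in the page cache, so without clearing the cache first the lazy case only shows minor faults, not disk reads.

```python
import mmap
import os
import tempfile
import time

PAGE = mmap.PAGESIZE


def touch_times(path, populate):
    """Map `path` read-only, then time one read per page, in order."""
    flags = mmap.MAP_SHARED
    if populate:
        # MAP_POPULATE is linux only; fall back to a lazy mapping
        # when this python build does not expose it.
        flags |= getattr(mmap, "MAP_POPULATE", 0)
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        t0 = time.perf_counter()
        m = mmap.mmap(f.fileno(), size, flags=flags, prot=mmap.PROT_READ)
        map_time = time.perf_counter() - t0
        times = []
        for off in range(0, size, PAGE):
            t = time.perf_counter()
            m[off]  # first touch of this page
            times.append(time.perf_counter() - t)
        m.close()
    return map_time, times


def summarize(times):
    """Percentiles and max, rather than a mean that hides the tail."""
    s = sorted(times)

    def pick(q):
        return s[min(len(s) - 1, int(q * len(s)))]

    return {"p50": pick(0.50), "p99": pick(0.99), "max": s[-1]}


if __name__ == "__main__":
    # hypothetical 8 MiB scratch file, just for the demo
    fd, path = tempfile.mkstemp()
    os.write(fd, b"\0" * (8 * 1024 * 1024))
    os.close(fd)
    try:
        for populate in (False, True):
            map_time, times = touch_times(path, populate)
            print("populate" if populate else "lazy", map_time, summarize(times))
    finally:
        os.unlink(path)
```

with MAP_POPULATE the cost should show up in the mmap call time rather than in the per-page numbers, though the exact figures depend on the machine and on what is already cached.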

the data structure on top is best chosen based on desired runtime characteristics. if it's all going in ram, go ahead and use a standard randomized hash table. if it's too big to fit in ram, designing a structure that is aware of lru-style page eviction semantics may make sense (i.e., a hash table or other layout that preserves locality for things that are expected to be accessed close together in time).
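
to make the locality point concrete, here is a toy model, not a real table. the page size, record size, and table capacity are assumed values, and scattered_slot / clustered_slot are hypothetical placement functions. it just counts how many distinct pages a batch of temporally adjacent keys would touch under each layout.

```python
PAGE_SIZE = 4096     # assumed page size in bytes
RECORD_SIZE = 64     # assumed fixed record size in bytes
N_SLOTS = 1 << 16    # assumed table capacity, in records
PER_PAGE = PAGE_SIZE // RECORD_SIZE


def scattered_slot(key):
    # multiplicative hash: neighbouring keys land far apart
    return (key * 2654435761) % (1 << 32) % N_SLOTS


def clustered_slot(key):
    # identity placement: neighbouring keys share a page
    return key % N_SLOTS


def pages_touched(keys, slot_of):
    """Number of distinct pages needed to serve `keys`."""
    return len({slot_of(k) // PER_PAGE for k in keys})


if __name__ == "__main__":
    batch = range(PER_PAGE)  # keys accessed close together in time
    print("scattered:", pages_touched(batch, scattered_slot))
    print("clustered:", pages_touched(batch, clustered_slot))
```

under these assumptions the clustered layout serves the whole batch from a single page, while the randomized one spreads it over many pages, each of which has to be resident (or faulted in) to answer the same queries.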



> you must clear the page cache

On Linux there is a /proc/sys/vm/drop_caches pseudo-file that does this. Look how great Linux is compared to other OSes.
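
A minimal sketch of using it from a benchmark harness, assuming the process runs as root. The function name drop_caches and its path parameter are illustrative, not part of any library. Writing 1 drops the page cache, 2 drops dentries and inodes, and 3 drops both. Only clean pages are discarded, so the sketch calls sync first.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_caches(mode=3, path=DROP_CACHES):
    """Flush dirty pages, then ask the kernel to drop clean caches.

    mode 1 = page cache, 2 = dentries and inodes, 3 = both.
    Writing to the real pseudo-file requires root.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2, or 3")
    os.sync()  # dirty pages cannot be dropped, so write them back first
    with open(path, "w") as f:
        f.write(f"{mode}\n")


if __name__ == "__main__":
    drop_caches()  # raises PermissionError unless run as root
```

Calling drop_caches() before each benchmark attempt gives every run a cold page cache without having to allocate more than physical RAM.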


that's super cool! live and learn. even better would be the capability to drop caches from a supplied point in the filesystem hierarchy.


People would run it from cron to "free memory", believe it or not.




