The performance difference was larger than I had expected. But this is good.
I introduced a performance regression while fixing a recent crash [1]. The crash was caused by a particular case of reentrancy that only showed up once I implemented virtual tables that query other tables (e.g. Go calls sqlite3_step to execute a query, which calls back into Go because it's a query on a virtual table, which calls sqlite3_step again to scan another table).
The fix [2] was to stop reusing some objects that I had been allocating once per connection. To mitigate the regression, I added some (very naive) caching [3].
TLDR: my caching is just not good enough. Simply caching more will go a long way (already confirmed by doubling the cache size), but now that I have a good benchmark, I'll do better.
I expect that fixing this mishap will cut the numbers for CPU-bound tests in half.
W.r.t. the benchmark results:
wazero's current compiler is somewhat naive, which may explain the large performance delta in CPU-bound tests. A new compiler is in the works [1].
OTOH, it's interesting that in the (IO-bound?) large test I'm doing better than modernc. I wonder why.
I'll dig deeper into the results.
[1]: https://github.com/tetratelabs/wazero/pull/1869