From reading the TIL, it doesn't look as if Simon used an LLM for much of what he did; only for the initial suggestion to check the archive, and for the web tool that makes his process reproducible. Also, if you read the transcript of his chat with Claude Code, the prompt really does the heavy lifting.
Sure, the LLM fills in all the boilerplate and produces an easy-to-use, reproducible tool with loads of documentation, and credit to it for that. But isn't it more accurate to say that Simon is absurdly efficient, with or without an LLM? :)
Nothing smart with HTTP range requests yet. I have https://lite.datasette.io, which runs the full Python server app in the browser via WebAssembly and Pyodide, but it still works by fetching the entire SQLite file at once.
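For anyone wondering what the range-request approach would look like, here is a rough sketch in Python. It is not how Datasette Lite works (that fetches the whole file); it only shows asking a server for the first 100 bytes of a SQLite file and reading the page size and page count from the header. The function names are made up for illustration, and it assumes the server honours `Range` headers.

```python
import struct
import urllib.request

SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of every SQLite 3 file


def range_header(start, length):
    """Build a Range header for `length` bytes starting at `start` (end is inclusive)."""
    return {"Range": f"bytes={start}-{start + length - 1}"}


def fetch_range(url, start, length):
    """Hypothetical helper: fetch only a slice of a remote file."""
    req = urllib.request.Request(url, headers=range_header(start, length))
    with urllib.request.urlopen(req) as resp:
        # 206 Partial Content means the range was honoured;
        # a plain 200 means the server is sending the whole file.
        if resp.status != 206:
            raise RuntimeError("server did not honour the Range header")
        return resp.read()


def parse_sqlite_header(buf):
    """Return (page_size, page_count) from the 100-byte SQLite header."""
    if len(buf) < 100 or buf[:16] != SQLITE_MAGIC:
        raise ValueError("not a SQLite 3 database header")
    (page_size,) = struct.unpack(">H", buf[16:18])
    if page_size == 1:  # the value 1 encodes a 65536-byte page
        page_size = 65536
    (page_count,) = struct.unpack(">I", buf[28:32])
    return page_size, page_count
```

With the page size known, page N (1-based) lives at byte offset `(N - 1) * page_size`, so in principle each page a query touches could be fetched with its own `fetch_range` call instead of downloading the whole database.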
https://github.com/simonw/sqlite-s3vfs
This comment was helpful in figuring out how to get a full Git clone out of the Software Heritage archive: https://news.ycombinator.com/item?id=37516523#37517378
Here's a TIL I wrote up of the process: https://til.simonwillison.net/github/software-archive-recove...