A bit more advanced than this post, but for calculating Jacobians and Hessians, the Julia folks have done some cool work recently building on classical automatic differentiation research: https://iclr-blogposts.github.io/2025/blog/sparse-autodiff/
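The core trick in that line of work is to detect the Jacobian's sparsity pattern and then group ("color") structurally orthogonal columns so one forward pass recovers several columns at once. Here's a toy sketch of just that compression/decompression step — the banded function, the hand-picked coloring, and the finite-difference JVP are all stand-ins for the automated sparsity detection and forward-mode AD the packages in that post actually use:

    # Banded example: output i depends only on x[i] and x[i+1],
    # so J[i, i] = 2x[i], J[i, i+1] = 2x[i+1], everything else is zero.
    f(x) = x[1:end-1] .^ 2 .+ x[2:end] .^ 2

    # Jacobian-vector product; a forward-mode AD tool would compute this exactly,
    # the central difference is just a dependency-free stand-in.
    jvp(f, x, v; h = 1e-6) = (f(x .+ h .* v) .- f(x .- h .* v)) ./ (2h)

    n = 6
    x = rand(n)
    m = n - 1

    # With this band structure the odd columns never share a nonzero row, and
    # neither do the even ones: two "colors" (two JVPs) instead of n passes.
    colors = [isodd(j) ? 1 : 2 for j in 1:n]
    J = zeros(m, n)
    for c in (1, 2)
        seed = Float64.(colors .== c)   # sum of the basis vectors in this color
        compressed = jvp(f, x, seed)    # several Jacobian columns in one pass
        for j in findall(==(c), colors), i in intersect([j - 1, j], 1:m)
            J[i, j] = compressed[i]     # decompress using the known sparsity
        end
    end
    J                                   # ≈ the full (sparse) Jacobian of f at x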
Have you tried using Enzyme (https://enzyme.mit.edu/)? It operates on LLVM IR, so it's available in any language that lowers to LLVM (e.g., Julia, where I've used it for surface gradients), and it produces highly optimized AD code. Pretty cool stuff.
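In case it's useful to anyone, the Julia-side usage looks roughly like this (from memory, so the exact API may differ between Enzyme.jl versions):

    using Enzyme

    # A small scalar loss; Enzyme differentiates the LLVM IR Julia generates for it.
    loss(x) = sum(abs2, x)

    x  = [1.0, 2.0, 3.0]
    dx = zero(x)              # shadow buffer; the gradient is accumulated in place

    # Reverse mode: `Active` marks the scalar return as differentiated,
    # `Duplicated` pairs the input with its gradient buffer.
    autodiff(Reverse, loss, Active, Duplicated(x, dx))

    dx                        # ≈ 2 .* x

A big part of why the generated code is fast is that the differentiation runs on IR that has already been optimized, rather than on the source program.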
Yeah, I've used it (cool project indeed!), albeit mostly just in a project that others in the autodiff community and I maintain, which benchmarks many different autodiff tools against each other: https://github.com/gradbench/gradbench