Yes of course. These are calculators - they are meant to reliably calculate things.
I think the difference is that building 60 interactive calculators manually would force you to do a lot of manual testing. If someone built that many interactive calculators by hand, I would imagine a lot of attention had gone into each one. Why would they spend so much time on something and not test it?
Would you be asking the same question if it were written without AI? How can any software always work with all edge cases?