There's actually more than one, though Kaitai is probably the most mature of them.
Various hex editors have their own formats. 010 Editor has C-style binary templates, and ImHex has its own pattern language. Okteta has Okteta Structure Definitions, which can be declared in XML or in JS.
Kaitai Struct is the most complete system that generates code for multiple programming languages and isn't tied to a hex editor, or to anything else for that matter. That said, I think there's still a ton of room for improvement and innovation. Kaitai has a lot of useful tooling, but as it stands today it falls a bit short: code generation isn't at the same level of support for every language (most targets are fairly limited), and I think serialization is still mostly experimental. On top of that, there's probably a lot that could still be done to make it more expressive and powerful.
An adjacent or complementary field is the description of data in transit. Wireshark dissectors come to mind. I think it'd be quite useful to unify these fields.
I had been trying to make a Kaitai to Wireshark dissector compiler in my third-party Kaitai implementation[1]. However, the Wireshark emitter is still basically useless for now. It only supports basic structs with basic attrs.
I mainly started a third-party Kaitai implementation to experiment a bit with supporting new features in Go, and also just to have a native Go implementation for convenience, since I'm still not very good at Scala. However, once there's a worked-out approach for emitting to Wireshark, it should be purely mechanical to graft a Wireshark emitter onto the upstream Kaitai Struct compiler, too.
In addition to the dedicated languages, there's a Python library called "construct" that's been around for a long time. It uses a declarative style that makes it surprisingly easy to write binary parsers and emitters.
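For a feel of what "declarative" means here, a rough stdlib-only sketch. To be clear, this is not construct's actual API; the HEADER field list and the parse/build helpers are made up for illustration. The point is just that one description of the layout drives both directions:

```python
import struct

# Hypothetical header layout: (field name, struct format code), big-endian.
HEADER = [
    ("magic", "4s"),   # 4 raw bytes
    ("version", "H"),  # unsigned 16-bit
    ("length", "I"),   # unsigned 32-bit
]
FMT = ">" + "".join(code for _, code in HEADER)


def parse(data):
    """Decode bytes into a dict using the declared field list."""
    values = struct.unpack_from(FMT, data)
    return dict(zip((name for name, _ in HEADER), values))


def build(fields):
    """Encode a dict back into bytes using the same field list."""
    return struct.pack(FMT, *(fields[name] for name, _ in HEADER))
```

Real libraries add the hard parts on top of this: conditionals, repeats, lengths that depend on earlier fields, and so on.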
Completely different problem, completely different solution.
Protobuf and its ilk (ASN.1, Cap’n Proto, etc.) have you describe a tree structure, then map that to bytes according to their own sensibilities. Kaitai and its ilk (Wireshark might be a more familiar member of the group) have you describe a bunch of data structures as well as somebody else’s pretty much arbitrary ideas as to how they are to map to bytes, then deal with the results.
You can’t use a Protobuf implementation to get EXIF data out of JPEGs, but then you can’t get format evolution guarantees out of Kaitai either.
(I hear ASN.1 can somewhat cross the gap using ECN, but as far as I can tell literally nobody uses that in public.)
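To make the JPEG point concrete, here's roughly what "somebody else's ideas about bytes" looks like when written by hand. This is a minimal sketch, assuming a well-formed file: find_exif is a name I made up, and it ignores fill bytes and other corner cases a real parser has to handle. The layout itself (0xFFD8 start marker, then segments with a 2-byte marker and a big-endian length that counts its own two bytes, Exif living in an APP1 segment) is dictated by the format, not by you:

```python
import struct


def find_exif(jpeg):
    """Return the raw Exif payload from a JPEG byte string, or None."""
    if jpeg[:2] != b"\xff\xd8":  # SOI marker must come first
        raise ValueError("not a JPEG")
    pos = 2
    while pos + 4 <= len(jpeg):
        marker, length = struct.unpack_from(">HH", jpeg, pos)
        if marker >> 8 != 0xFF:
            raise ValueError("lost sync at segment marker")
        if marker == 0xFFDA:  # SOS: compressed image data follows
            return None
        # The length field counts its own 2 bytes but not the marker.
        body = jpeg[pos + 4 : pos + 2 + length]
        if marker == 0xFFE1 and body.startswith(b"Exif\x00\x00"):
            return body[6:]  # TIFF-structured Exif data
        pos += 2 + length
    return None
```

None of this is something a schema-first tool can express, because the schema isn't yours to define.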