It's fundamental in the sense that it's harder: there's less information per token. But we know it's not impossible, because models do get nesting right up to some depth; it's just a question of where the boundary is today. And if different models crap out at different points, then there's a gradient there and future models can do better.
In token terms it's more like the fingers problem than the strawberry problem. ")" is a single token, but the model gets confused by several repeats of the same thing.
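If you wanted to actually find where that boundary sits for a given model, a rough sketch might look like the following. Everything here is an assumption for illustration: `complete` is a hypothetical callable standing in for whatever model you're testing, the `(f (f (f x` prompt shape is made up, and `capped_model` is a toy that just loses count past a fixed depth, not a claim about how any real model behaves.

```python
from typing import Callable, Optional


def closers_needed(prefix: str) -> int:
    """Return how many ')' are needed to balance the given prefix."""
    depth = 0
    for ch in prefix:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("prefix closes more than it opens")
    return depth


def make_prompt(depth: int) -> str:
    """Build an unclosed nested expression, e.g. depth 3 -> '(f (f (f x'."""
    return "(f " * depth + "x"


def first_failing_depth(
    complete: Callable[[str], str], max_depth: int = 64
) -> Optional[int]:
    """Return the smallest depth at which `complete` emits the wrong
    run of ')' characters, or None if it never fails up to max_depth."""
    for depth in range(1, max_depth + 1):
        prompt = make_prompt(depth)
        expected = ")" * closers_needed(prompt)
        if complete(prompt) != expected:
            return depth
    return None


def capped_model(prompt: str, cap: int = 5) -> str:
    """Toy stand-in (hypothetical): counts correctly up to `cap`, then stalls."""
    return ")" * min(closers_needed(prompt), cap)


if __name__ == "__main__":
    print(first_failing_depth(capped_model))
```

Running the same probe against different models would give you the different crapping-out points directly, which is the gradient I mean.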