In theory, yes. In practice, the shit data you are working with (descriptions that are one or two words, or the same word with a ref ID) really benefits from an agent that a) understands who you are and what you are likely spending money on, and b) has access to tool calls to dig deeper into what `01-03 PAYPAL TX REFL6RHB6O` actually is by cross-referencing an export of PayPal transactions.
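To make the cross-referencing idea concrete, here is a rough sketch of the kind of lookup tool an agent could call. Everything specific is an assumption on my part: the `Reference`, `Name`, and `Amount` column names, the idea that the text after `REF` in the bank description matches an ID in the PayPal export, and the sample row itself, which is made up. The function names are hypothetical too.

```python
import csv
import io
import re

# Assumed description shape: "<MM-DD> PAYPAL TX REF<id>". Real statements may differ.
REF_PATTERN = re.compile(r"PAYPAL TX REF(?P<ref>[A-Z0-9]+)")


def load_paypal_export(csv_text):
    """Index export rows by reference ID (column names are hypothetical)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Reference"]: row for row in reader}


def resolve_description(description, paypal_index):
    """Return the matching export row, or None if nothing can be resolved."""
    match = REF_PATTERN.search(description)
    if match is None:
        return None
    return paypal_index.get(match.group("ref"))


# Made-up sample export, for illustration only.
SAMPLE_EXPORT = """Reference,Name,Amount
L6RHB6O,Example Bookshop,-23.50
"""

index = load_paypal_export(SAMPLE_EXPORT)
row = resolve_description("01-03 PAYPAL TX REFL6RHB6O", index)
print(row["Name"] if row else "unresolved")
```

With a real export you would swap in whatever columns it actually has. The point is only that the opaque bank line becomes a named merchant once the ref ID is joined against the other data source.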
I think the smarter play is having an agent take the first crack at it and build up a high-confidence regex rule set. From there, it handles whatever doesn't match and does periodic spot checks to maintain the rule set.
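A minimal sketch of that split, assuming a rule is just a regex plus a category and a confidence score. The `Rule` shape, the 0.9 threshold, the merchant patterns, and `agent_stub` are all hypothetical placeholders, not a real agent API.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: re.Pattern
    category: str
    confidence: float  # hypothetical 0..1 score assigned when the rule was proposed


def categorize(description, rules, fallback, min_confidence=0.9):
    """Use the first high-confidence rule that matches, else defer to the fallback."""
    for rule in rules:
        if rule.confidence >= min_confidence and rule.pattern.search(description):
            return rule.category, "rule"
    return fallback(description), "agent"


def agent_stub(description):
    # Placeholder for the real agent call; it only flags the row for review.
    return "needs_review"


# Illustrative rules only; a real set would be built up by the agent over time.
RULES = [
    Rule(re.compile(r"\bNETFLIX\b", re.IGNORECASE), "subscriptions", 0.99),
    Rule(re.compile(r"\bSHELL\b", re.IGNORECASE), "fuel", 0.6),
]

print(categorize("NETFLIX.COM 1234", RULES, agent_stub))
print(categorize("01-03 PAYPAL TX REFL6RHB6O", RULES, agent_stub))
```

Returning the source ("rule" or "agent") alongside the category makes the spot checks easier: you can sample rows that were decided by a rule and have the agent re-check them, then lower the confidence of any rule that keeps getting it wrong.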