Ah, I remember reading this paper! Essentially, they created synthetic data by solving search problems with A*. Trained on this data, transformers unsurprisingly learned to solve those problems. They then improved the synthetic data by solving each problem many times with A* and keeping only the solution with the fewest search steps. Transformers trained on that learned to be competitive with the improved search heuristic too!
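If I recall the setup correctly, the bootstrapping step looks roughly like this (a minimal sketch of my understanding; `solve_with_astar`, the seed-based tie-breaking, and the attempt count are my assumptions, not the paper's actual code):

```python
def generate_training_trace(problem, solve_with_astar, attempts=32):
    """Keep the shortest of many A* runs on the same problem.

    solve_with_astar is a stand-in for a nondeterministic A* (e.g. random
    tie-breaking among nodes with equal f-cost) that returns its full
    search trace as a list of steps.
    """
    best = None
    for seed in range(attempts):
        trace = solve_with_astar(problem, seed=seed)
        if best is None or len(trace) < len(best):
            best = trace  # fewest search steps wins
    return best  # later serialized into a token sequence for the transformer
```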
Pretty remarkable stuff. Given the obsession over chat bots, folk often miss how revolutionary transformer sequence modeling is to...well, any sequence application that isn't a chat bot. Looking solely at speeding up scientific simulations by ~10x, it's a watershed moment for humanity. When you include the vast space of other applications, we're in for one wild ride, y'all.
SAT appears to be rather impervious to statistical methods because it is very non-linear and non-local (besides, of course, being NP-complete). Flip one single literal anywhere and poof!, your formula might go from SAT to UNSAT, or vice versa, or you might enable a ton of simplifications, etc. (a toy illustration below). So it might be very hard to train a network on such a riotous state space.
Additionally, in practice SAT solvers are also supposed to provide witnesses (i.e., a satisfying assignment or an unsatisfiable core) to back their verdicts, and that may be an even taller order for ML.
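To make the "flip one literal" point concrete, here's a throwaway brute-force check on a two-variable formula (purely illustrative; real solvers obviously don't enumerate assignments):

```python
from itertools import product

def is_sat(clauses, n_vars):
    """Brute-force check. Clauses use DIMACS-style signed ints:
    1 means x1, -2 means NOT x2."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

# (x1 v x2) & (x1 v ~x2) & (~x1 v x2) & (x1 v ~x2): satisfiable (x1 = x2 = True)
f_sat = [[1, 2], [1, -2], [-1, 2], [1, -2]]
# Flip one literal (x1 -> ~x1) in the last clause and it becomes UNSAT:
f_unsat = [[1, 2], [1, -2], [-1, 2], [-1, -2]]
print(is_sat(f_sat, 2), is_sat(f_unsat, 2))  # True False
```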
Since all of these combinatorial optimization problems are, in full generality, as difficult as possible in the complexity-theoretic sense, there is little prospect of AI/ML helping to solve these problems in general. (?)
But people hope that some subclass of "real-life" instances of one of these optimization problems has enough structure that AI/ML may be able to learn useful heuristics for it. (This is not my field. This is just my take-away from reading some AI/ML-for-CO papers.)
Seems kind of clear to me we’re driving towards as little statefulness as possible: compressed symbolic logic that’s able to compute as many states as possible, with less resource use on RAM and storage as available chip cycles continue to grow.
Human biology is a min/max engine, evolved over the years toward “enough” food, shelter… and we see daily that there is no static state to reality. Our social behavior of trying to preserve state is some cognitively dissonant “War is Peace” insanity.
Math is a compressed form of communication. “Add two apples to the set of all apples” is literally more glyphs than the same statement in usual math syntax. Paper and written literacy were hard to come by back in the day, but verbal literacy was cheap and ubiquitous.
The economy is built on physical statistical measures. Such a framework for keeping TP and food on shelves is the minimal state needed to keep people from rioting. The economy has nothing to do with historical story and political memes at this point.
We could look further into computing in other substrates, like mycelia: biochemically shaping structures to represent certain states.
Computing is a concept not coupled to contemporary ignorant business machines. We made a social monolith around computing by propping up big tech.
What are those other applications? So now that everyone has SoTA warehouse scheduling in their pockets, no one can be expected to do anything or go anywhere inefficiently.
I want to see an optimal (2P) speedrun with Sonic the Hedgehog and Tails, as a fun proof. There are two players working together, and it has more degrees of freedom than Sokoban.