Well, aren't Beam Search and other searches also used and more sophisticated tha...

techbruv · on Feb 14, 2023

ChatGPT and other LLMs, for that matter, are most definitely not using beam search or greedy sampling.

Greedy sampling (always picking the single most likely next token) is prone to repetition and in general gives pretty subpar results that make no sense.

While beam search is better than greedy sampling, it's too expensive (beam search with a beam width of 4 costs roughly 4x as much per generated token) and performs worse than other methods.

In practice, you probably just wanna sample from the distribution directly after applying something like top-p: https://arxiv.org/pdf/1904.09751.pdf
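
For anyone who wants to see what that looks like, here's a rough sketch of top-p (nucleus) sampling on a single next-token distribution. This is not how any particular model implements it; the function names (`top_p_filter`, `sample_top_p`) and the toy probabilities are made up for illustration, and a real implementation would work on logits from the model rather than a hand-written list.

```python
import random

def top_p_filter(probs, p=0.9):
    """Keep the smallest set of most-likely tokens whose total
    probability reaches p, zero out the rest, and renormalize."""
    # Token ids sorted by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered

def sample_top_p(probs, p=0.9, rng=random):
    """Draw one token id from the top-p filtered distribution."""
    weights = top_p_filter(probs, p)
    return rng.choices(range(len(probs)), weights=weights, k=1)[0]

# Toy next-token distribution over a 4-token vocabulary (hypothetical).
toy_probs = [0.5, 0.3, 0.15, 0.05]
token = sample_top_p(toy_probs, p=0.9)
```

Greedy would always return token 0 here. With top-p the low-probability tail (token 3) is cut off, but the choice among the remaining tokens is still random.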

anothernewdude · on Feb 14, 2023

They barely use beam search. It requires keeping multiple candidate generations running at once, and so is expensive.
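
To make the cost point concrete, here's a minimal beam search sketch that counts how many times the model gets called. Everything here is a stand-in: `toy_model` is a hypothetical fixed distribution over 3 tokens, not a real LLM, and `beam_search` is a bare-bones version without length normalization or end-of-sequence handling.

```python
import math

def beam_search(next_log_probs, steps, width):
    """Plain beam search. Returns the final beams and the number of
    model calls made (one call per live beam per step)."""
    beams = [((), 0.0)]  # (token sequence, cumulative log-probability)
    calls = 0
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            calls += 1  # one forward pass for this beam
            for tok, lp in enumerate(next_log_probs(seq)):
                candidates.append((seq + (tok,), score + lp))
        # Keep only the `width` highest-scoring continuations.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:width]
    return beams, calls

def toy_model(seq):
    # Hypothetical model: same distribution regardless of the prefix.
    return [math.log(0.6), math.log(0.3), math.log(0.1)]

greedy_beams, greedy_calls = beam_search(toy_model, steps=5, width=1)
wide_beams, wide_calls = beam_search(toy_model, steps=5, width=4)
```

With width 1 this is just greedy decoding, one call per step. With width 4 the call count approaches 4 per step once the beam fills up, which is where the "about 4x" figure comes from.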