
Well, aren't Beam Search and other searches also used and more sophisticated than greedy selection?


ChatGPT and other LLMs for that matter are most definitely not using beam search or greedy sampling.

Greedy sampling (always picking the single most likely next token) is prone to repetition and in general gives pretty subpar, often degenerate output.

While beam search is better than greedy sampling, it's too expensive (a beam width of 4 means roughly 4x the compute per generated token) and, for open-ended generation, it tends to perform worse than sampling-based methods.
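To make the cost point concrete, here's a minimal sketch that counts "forward passes" for greedy decoding versus beam search. Everything in it is made up for illustration: `toy_log_probs` is a hypothetical stand-in for a model call over a 3-token vocabulary, and real implementations batch the beams, so the extra cost shows up as compute and memory rather than extra calls.

```python
import math

def toy_log_probs(prefix):
    """Hypothetical stand-in for one LLM forward pass (3-token vocabulary)."""
    toy_log_probs.calls += 1  # count how often the "model" is evaluated
    last = prefix[-1] if prefix else 0
    # Made-up next-token distributions, indexed by the previous token.
    probs = [[0.5, 0.4, 0.1], [0.1, 0.1, 0.8], [0.3, 0.3, 0.4]][last]
    return [math.log(q) for q in probs]

toy_log_probs.calls = 0

def beam_search(steps, width):
    """Keep the `width` best partial sequences; width=1 is greedy decoding."""
    beams = [((), 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            # One model evaluation per live beam per step.
            for tok, lp in enumerate(toy_log_probs(seq)):
                candidates.append((seq + (tok,), score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:width]
    return beams[0]

for width in (1, 4):
    toy_log_probs.calls = 0
    beam_search(steps=5, width=width)
    print(f"width={width}: {toy_log_probs.calls} model evaluations")
```

Once the beam has filled up, each step costs `width` evaluations instead of one, which is where the "roughly 4x" for a width of 4 comes from.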

In practice, you probably just wanna sample from the distribution directly after applying something like top-p: https://arxiv.org/pdf/1904.09751.pdf
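For anyone who hasn't seen it, here's a rough sketch of what top-p (nucleus) sampling does for a single step. It's a plain-Python illustration, not how any particular library implements it; the function name `top_p_sample` and the example logits are my own.

```python
import math
import random

def top_p_sample(logits, p=0.9, rng=random):
    """Sample one token id from the smallest set of tokens whose
    cumulative probability reaches p (illustrative sketch only)."""
    # Softmax, shifted by the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token ids sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Grow the nucleus until its cumulative probability reaches p.
    nucleus, cum = [], 0.0
    for i in order:
        nucleus.append(i)
        cum += probs[i]
        if cum >= p:
            break

    # random.choices normalizes the weights, so this samples from the
    # renormalized distribution over the nucleus.
    weights = [probs[i] for i in nucleus]
    return rng.choices(nucleus, weights=weights, k=1)[0]

# Made-up logits for a 4-token vocabulary.
print(top_p_sample([2.0, 1.0, 0.0, -5.0], p=0.9))
```

The low-probability tail gets cut off entirely, so you keep the diversity of sampling without occasionally picking a token the model considers nearly impossible.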


They barely use beam search. It requires keeping several candidate continuations alive in parallel at every step of the generation, and so is expensive.



