Hacker News

As much as ChatGPT says I’m basically a genius for asking it for a good vegan cake recipe, I don’t think that gives it any data it doesn’t already have, or makes it any better. Also, at this point the massive increases in data and computing power seem to bring ever-decreasing improvements (and sometimes just decline), so it seems we are simply hitting the limit of what this kind of architecture can achieve, no matter what you throw at it.


ChatGPT chat logs contain a massive amount of data teased out of people’s brains. But much of it is lore, biases, misconceptions, memes. There are nuggets of gold in there, but it’s not at all clear if there’s a good way to extract them, and until then chat logs will make things worse, not better.

I’m thinking they’ll eventually figure out who is a source of good data for a given domain, maybe.

Even if that is solved, models are terrible at the long tail.


When I say models will plateau I don't mean there will be no progress. I mean progress will slow down since we'll be scraping the bottom of the barrel for training data. We might never quite run out but once we've sampled every novel, web site, scientific paper, chat log, broadcast transcript, and so on, we've exhausted the rich sources for easy gains.


Chat logs don’t run out. We may run out of novelty in those logs, at which point we may have run out of human knowledge.

Or not - there is still knowledge in people’s heads that is not bleeding into AI chat.

One implication here is that chats will morph to elicit more conversation to keep mining that mine, which may lead to the need to enrage users to keep engagement.


The necessity of higher quality data from vetted experts is why Mercor just raised at a $10B valuation.


I’m afraid I don’t share your optimism. I think we are more or less seeing the limitations of the transformer architecture.



