
> Now imagine you aren’t flipping coins. Imagine you are all running a model on a competition test set. Instead of wondering if your coin is magic, you instead are hoping that your model is the best one, about to earn you $25,000.

> Of course, you can’t submit more than one model. That would be cheating. One of the models could perform well, the equivalent of getting 8 heads with a fair coin, just by chance.

> Good thing there is a rule against submitting multiple models, or any one of the other 99 participants and their 99 models could win, just by being lucky.
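The coin analogy in the quoted excerpt can be made concrete. Assuming 10 flips per coin (the excerpt only mentions "8 heads", so the flip count is my assumption), here is what "getting 8 heads just by chance" looks like for one fair coin and for a field of 100 participants:

```python
from math import comb

FLIPS = 10          # assumed flips per coin; the excerpt only says "8 heads"
HEADS = 8
PARTICIPANTS = 100

# P(at least 8 heads in 10 fair flips)
p_one = sum(comb(FLIPS, k) for k in range(HEADS, FLIPS + 1)) / 2**FLIPS

# P(at least one of 100 independent participants manages it)
p_any = 1 - (1 - p_one) ** PARTICIPANTS

print(f"single fair coin: {p_one:.4f}")   # ~0.0547
print(f"any of 100 coins: {p_any:.4f}")   # ~0.9964
```

So an individual "magic-looking" run is rare, but across 100 participants it is nearly guaranteed that someone sees one, which is the excerpt's point.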

I wonder what the author must think of poker tournaments. Even assuming there is luck involved, unless all of the models are equally bad (which would be surprising), teams that produce better models should win much more than their fair share, where "fair share" is 1/N and N is the total number of submissions.

But let's say the author is correct that ML competitions are mostly luck. That is a testable hypothesis: in particular, we would expect little to no correlation between the credentials of the competitors and their ranking in the competition. Is that actually the case? Do unknown individuals who have just started doing machine learning win on their first Kaggle submission? If the author's hypothesis is correct, one would expect that to happen fairly often, and one would expect even highly expert competitors to win approximately (only) their fair share.
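The "fair share" intuition can be sketched with a toy simulation. This is a hypothetical model, not anything from the article: each team has a fixed skill level, each competition adds independent noise, and we count how often the most-skilled team actually wins. Under pure luck the win rate collapses to 1/N; once skill spread exceeds the noise, it climbs far above it.

```python
import random

random.seed(0)

def win_rate_of_best(n_teams=100, skill_spread=1.0, noise=1.0, trials=5000):
    """Fraction of simulated competitions won by the single most-skilled team."""
    skills = sorted(random.gauss(0, skill_spread) for _ in range(n_teams))
    best = n_teams - 1  # index of the top-skill team (skills are sorted ascending)
    wins = 0
    for _ in range(trials):
        scores = [s + random.gauss(0, noise) for s in skills]
        if max(range(n_teams), key=scores.__getitem__) == best:
            wins += 1
    return wins / trials

# Pure luck (zero skill spread): the "best" team wins ~1/N of the time.
print(win_rate_of_best(skill_spread=0.0))
# Skill dominating noise: the best team wins far more than its fair share.
print(win_rate_of_best(skill_spread=3.0))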



I don't think the author is saying that the best team only wins 1/N of the time, where N is the total number of participants. Far from it.

What they're saying, as far as I understand it, is that the best k teams (where k << N) each win roughly 1/k of the time.
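That reading can also be checked with a toy simulation (my construction, not the article's): suppose the top k teams are essentially tied in skill and everyone else sits a few noise standard deviations below. The winner then almost always comes from the top k, split roughly evenly among them, which is the "best k each win ~1/k" picture.

```python
import random
from collections import Counter

random.seed(1)

def winner_distribution(n=100, k=5, gap=3.0, noise=1.0, trials=20000):
    """Top-k teams share one skill level; the rest sit `gap` below it.
    Returns (share of wins going to the top k, per-team win rates within the top k)."""
    skills = [gap] * k + [0.0] * (n - k)
    wins = Counter()
    for _ in range(trials):
        scores = [s + random.gauss(0, noise) for s in skills]
        wins[max(range(n), key=scores.__getitem__)] += 1
    top_share = sum(wins[i] for i in range(k)) / trials
    return top_share, [wins[i] / trials for i in range(k)]

share, split = winner_distribution()
print(share)  # the winner almost always comes from the top k
print(split)  # ...with each of those k teams winning roughly 1/k of those wins
```

So "mostly luck" and "the best teams dominate" are compatible: luck decides *which* of the near-tied top teams takes the prize, not whether a random entrant does.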


While there is variance in how my models do against a test set, it's highly unlikely my rank-10 model is going to dethrone a #1-ranked model, nor is a rank-1000 model going to beat mine. It's possible, but only if I overfit the public leaderboard, and good ML practices help prevent that (for example: if your cross-validation says a change improves a model but the public leaderboard says it does worse, be inclined to trust your cross-validation).
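The reason to trust cross-validation over the public leaderboard is sample size. A rough sketch (the split sizes are made up for illustration): estimating a fixed true accuracy from a small public split is much noisier than averaging 5 folds over the full training data.

```python
import random
import statistics

random.seed(2)

def observed_accuracy(true_acc, n):
    """Accuracy measured on n examples when the model's true accuracy is true_acc."""
    return sum(random.random() < true_acc for _ in range(n)) / n

TRUE_ACC = 0.70
REPS = 300
# Hypothetical sizes: a 1,000-row public leaderboard split vs. 5 CV folds of 2,000 rows.
lb_scores = [observed_accuracy(TRUE_ACC, 1000) for _ in range(REPS)]
cv_scores = [statistics.mean(observed_accuracy(TRUE_ACC, 2000) for _ in range(5))
             for _ in range(REPS)]

print(statistics.stdev(lb_scores))  # noisy estimate from the small public split
print(statistics.stdev(cv_scores))  # tighter estimate from CV -> trust this one
```

A public-leaderboard gain smaller than the first number is plausibly just noise, which is exactly how competitors end up overfitting the leaderboard.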



