Hacker News

That's an obvious exaggeration. The competition is already using lower-precision weights, some of which are floating-point formats and some of which aren't.

And they still use full-size floats for training.



That means their paper is actually behind SOTA, which is concerned with training natively in fp4, without falling back to full precision [0] for QAT.

[0] "full precision" in ML usually means 16-bit floats like bfloat16


I wouldn't say "worse". It's focusing on inference cost and leaving training at the default for now.
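For readers unfamiliar with QAT: the usual trick is "fake quantization" — the optimizer keeps high-precision master weights, but the forward pass sees weights rounded to the low-precision grid. A minimal sketch, assuming the fp4 E2M1 value set from the OCP MX spec (all names here are illustrative, not from any particular paper):

```python
import numpy as np

# Representable magnitudes in fp4 E2M1 (per the OCP MX spec);
# the full grid is these values with both signs.
FP4_MAGNITUDES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.concatenate([-FP4_MAGNITUDES[:0:-1], FP4_MAGNITUDES])

def fake_quant_fp4(w, scale=1.0):
    """Round each weight to the nearest representable fp4 value.

    In QAT the high-precision master weights are what the optimizer
    updates; only this rounded copy is used in the forward pass, and
    gradients flow through as if rounding were identity (the
    straight-through estimator).
    """
    scaled = np.asarray(w, dtype=np.float64) / scale
    # Broadcast against the grid and pick the nearest entry per weight.
    idx = np.abs(scaled[..., None] - FP4_GRID).argmin(axis=-1)
    return FP4_GRID[idx] * scale

print(fake_quant_fp4([0.9, -2.4, 7.0]))  # each value snaps to the fp4 grid
```

Training "natively in fp4" means doing away with those high-precision master copies entirely, which is what makes it harder than this QAT-style setup.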





