
Copyright violation isn't just when you can output 100% exact copies of books. And don't forget, they also violated copyright internally billions of times during training. If any of us had been caught making copies of corporate-owned content for AI training use five years ago, we'd be in for zillion-dollar lawsuits that would make any grandma who downloaded a song from Napster blush.


There is a very good argument to be made that training AI is fair use, as it is both transformative and does not compete with the original work. This has yet to be tested in court.


If you copy your CD as a backup with no intent to resell it, no one would waste time suing you.


Because they wouldn't catch me. But if they did, especially if they caught me making a copy of every CD at the CD store as a backup, especially if they caught me making a copy of every bootleg CD I could get my hands on (as a backup), I'd be in big trouble.

Did you know a lot of LLM training data is scraped from illegal pirate libraries such as Anna's Archive?



