
"Only Google is allowed to scrape the web."

If I'm not mistaken, the plaintiffs in the US v. Google antitrust litigation in the US District Court for the District of Columbia tried to argue that website operators are biased toward allowing Google to crawl and against allowing other search engines to do the same

The Court rejected this argument because the plaintiffs did not present any evidence to support it

For someone who does not follow the web's history, how would one produce direct evidence that the bias exists?



> For someone who does not follow the web's history, how would one produce direct evidence that the bias exists?

Take a bunch of websites, fetch their robots.txt files and check how many allow Googlebot but not other crawlers?
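
A rough sketch of that check in Python, using the standard library's urllib.robotparser. The "ExampleBot" agent, the helper names, and the two sample robots.txt bodies are made up for illustration; real input would be robots.txt files you have already downloaded. It only tests the root path, so it misses sites that allow other crawlers on "/" but block them elsewhere.

```python
from urllib.robotparser import RobotFileParser


def can_crawl_root(robots_txt: str, agent: str) -> bool:
    """Return True if this robots.txt lets `agent` fetch the root path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "/")


def favors_googlebot(robots_txt: str, other_agent: str = "ExampleBot") -> bool:
    """True if Googlebot may crawl "/" but a generic crawler may not.

    "ExampleBot" is a hypothetical stand-in for any crawler that is not
    named in the file and therefore falls under the "User-agent: *" rules.
    """
    return can_crawl_root(robots_txt, "Googlebot") and not can_crawl_root(
        robots_txt, other_agent
    )


def tally(robots_by_host: dict) -> tuple:
    """Count hosts whose robots.txt favors Googlebot, out of all hosts given."""
    favored = sum(1 for text in robots_by_host.values() if favors_googlebot(text))
    return favored, len(robots_by_host)


# Made-up sample data: one site that whitelists Googlebot, one open to everyone.
SAMPLE = {
    "a.example": "User-agent: Googlebot\nAllow: /\n\nUser-agent: *\nDisallow: /\n",
    "b.example": "User-agent: *\nDisallow:\n",
}

if __name__ == "__main__":
    favored, total = tally(SAMPLE)
    print(f"{favored} of {total} sample hosts allow Googlebot but block a generic crawler")
```

Even a large count from something like this would only show what the robots.txt files say. It would not catch blocking done at the server or CDN level, which never shows up in robots.txt.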


Common Crawl provides gzipped robots.txt collections



