
There are many patches of almost-identical sites.

Some of them are due to many people using the same theme.

Some of them are expired or parked domains, which I reckon should be detected and excluded.





Yeah, those clusters are interesting. They stand out, so they were the first thing I zoomed in on; then I realized they're all just stock resume sites. You quickly realize the clusters are something to avoid. Turns out to be an effective visualization method.

The thing I find interesting is where the grouping is robust to colour variations: one of the bigger groups is around 25% from left, 20% from bottom, all one theme but in a wide variety of colours.

Yeah, I wonder why parked domains are included. Are there not at least 1 million actual websites?

>Some of them are due to many people using the same theme.

Teeming masses of sites using what probably seems to their authors like a fresh, unconventional look but ends up being Yet Another.


I doubt anyone selecting a popular theme is confused by the fact that it’s popular. I use the default Mediawiki theme for mine, for instance.


