Hacker News

I have to say I don't really get the website either. If the author is against scrapers, why not serve massive dummy content so that it bloats their storage? Why all this linking? Maybe it's used to build (fake) PageRank credibility, and sometimes a link to one of the content-farm pages is referenced on other pages, so those get boosted?
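A minimal sketch of the "massive dummy content" idea (my own illustration, not anything from the site under discussion): generate filler on the fly so it costs the server almost nothing to produce, while a scraper that downloads it has to store all of it.

```python
# Hypothetical sketch: stream generated dummy bytes to a scraper.
# Nothing is stored server-side; each chunk is produced in memory on demand.
def dummy_content(chunk_size=1024, chunks=1024):
    """Yield pseudo-text chunks; total output is chunks * chunk_size bytes."""
    filler = (b"lorem ipsum dolor sit amet " * 64)[:chunk_size]
    for _ in range(chunks):
        yield filler
```

Wired into a response handler, this serves a mebibyte (or far more) from a one-kilobyte buffer.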


Presumably he would be paying for egress of those massive files?
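One known workaround for the egress objection is serving highly compressible data: a run of zeros gzips at roughly 1000:1, so the server pays for kilobytes on the wire while a crawler that inflates the response ends up storing megabytes. A hedged sketch using Python's standard library (function name is my own):

```python
import gzip
import io

def compressed_decoy(uncompressed_mib=10):
    """Gzip a long run of zeros: kilobytes of egress, megabytes once inflated.
    Served with Content-Encoding: gzip, the client decompresses it automatically."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        gz.write(b"\0" * (uncompressed_mib * 1024 * 1024))
    return buf.getvalue()
```

Whether a given crawler actually honors gzip encoding (or caps response sizes) is another question, so this is a sketch, not a guaranteed countermeasure.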


So render it client-side and hope the crawler understands JavaScript?

Maybe run your own training in JavaScript, too, and use OpenAI's crawlers' compute for it.



