December 11, 2012

Is following robots.txt illegal? How about nofollow links?

Can a /robots.txt be used in a court of law?

There is no law stating that /robots.txt must be obeyed, nor does it constitute a binding contract between site owner and user, but having a /robots.txt can be relevant in legal cases.
Obviously, IANAL, and if you need legal advice, obtain professional services from a qualified lawyer.
Some high-profile cases involving /robots.txt:
And you'll find lots of discussion over at Groklaw.


Can a robots.txt file be enforced legally?


Neil Mansilla, Builder

I believe you're asking whether a Web site owner has any chance of winning a case against a party whose Web crawler downloaded data automatically despite a robots.txt entry requesting that the crawler not download, follow, or index that data. Correct?

As Jascha Wanger mentioned, there are no laws that enforce the Robot Exclusion Standard. There are many well-known measures a Web site owner can take to restrict crawler access, such as .htaccess basic authorization/authentication [1] and CAPTCHA [2]. The only argument the Web site owner might make is trespassing [3], and without any actual safeguards in place to keep unwanted users or crawlers out, that argument is weak: an everyday user with a standard Web browser and no special expertise could accomplish the same task of downloading that Web site data.
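To illustrate the difference between an advisory robots.txt entry and an actual safeguard, here is a minimal sketch of the kind of check that HTTP basic authentication performs on the server side. It is not how .htaccess itself is implemented; the function name `is_authorized` and the credentials are hypothetical, chosen only for illustration.

```python
import base64
import hmac

# Hypothetical credentials, for illustration only.
EXPECTED_USER = "alice"
EXPECTED_PASSWORD = "s3cret"


def is_authorized(authorization_header):
    """Return True only if the header carries the expected Basic credentials."""
    prefix = "Basic "
    if not authorization_header or not authorization_header.startswith(prefix):
        return False
    try:
        # The value after "Basic " is base64("user:password").
        decoded = base64.b64decode(
            authorization_header[len(prefix):], validate=True
        ).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and invalid UTF-8.
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # Constant-time comparison avoids leaking information through timing.
    user_ok = hmac.compare_digest(user.encode(), EXPECTED_USER.encode())
    pass_ok = hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())
    return user_ok and pass_ok
```

A request without valid credentials is refused no matter what the client chooses to do, whereas a robots.txt entry only works if the crawler decides to honor it.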

There is case law to review [4] around crawling. But probably the most important thing to consider is any legal contract or agreement between the parties. For instance, if a Web site owner has an agreement with a third party that includes provisions about obeying robots.txt, the Web site owner is on firmer legal ground if the third party does not comply with those provisions.

3.  http://www.freelegaladvicehelp.c...


Jascha Wanger, Managing Partner @ Loomic Labs

There are no actual laws on the books stating that people must obey a robots.txt file. But I have heard of sites threatening legal action over, for example, a harsh crawler that does not obey the crawl delay, since they can then prove that the crawler adversely impacted their site's performance. This can then give way to the argument that it caused lost revenue and other damages. I'm not sure how well that would hold up in court, but in the cases I have seen, the threat of legal action sufficed. The robots.txt file would simply be valuable evidence, since it can be shown to be a web standard.
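For crawler authors who want to stay on the safe side, here is a rough sketch of how a crawler might check both the Disallow rules and the crawl delay before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` user agent, the example.com URLs, and the helper name `check_access` are all made up for illustration. Note that Crawl-delay is a widely used extension rather than part of the original Robots Exclusion Standard, so not every crawler or parser honors it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def check_access(robots_txt, user_agent, url):
    """Return (allowed, crawl_delay) for this user agent and URL."""
    parser = RobotFileParser()
    # parse() accepts the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch(user_agent, url)
    # crawl_delay() returns None when no Crawl-delay applies to the agent.
    delay = parser.crawl_delay(user_agent)
    return allowed, delay


allowed, delay = check_access(
    ROBOTS_TXT, "ExampleBot", "http://example.com/private/data.html"
)
print(allowed, delay)
```

With the sample file above, the `/private/` URL is reported as disallowed and the delay as 10 seconds. A well-behaved crawler would skip that URL and wait at least that long between requests to the site.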
