Now News Corp Has Started Blocking Some Sites, No Excuse Not To Block Google
UK News Corp newspaper The Times has started using the robots.txt standard to block some sites from indexing its content, demonstrating that the paper knows how to do it and that there’s no longer any excuse for not blocking Google.
Some time last week, the robots.txt file for timesonline.co.uk was amended to block a number of sites, among them the UK-based news aggregator NewsNow and (weirdly) the Alexa archiver.
The singling out of NewsNow seems to be primarily motivated by a dispute about commercial use, with a Times spokesman telling PaidContent UK:
“NewsNow has been using Times Online content as part of its paid-for, commercial as well as free services. They have continued to do so despite our direct requests for them to stop. As a result, we have taken the decision to disallow their indexing of our content. News International makes a significant investment in journalism and we believe that it is entirely appropriate for us to ask that our rights are respected. NewsNow has acknowledged that they require our permission to use our content and, in the absence of our permission, has ceased to do so.”
Of note is the mention in the robots.txt file that “spidering” (that is, crawling the site for indexing) is not allowed under the paper’s terms and conditions:
#Spidering is not allowed by our terms and conditions
#Authorised spidering is subject to permission
#For authorisation please contact us – see
At the time of writing, Times content was still being indexed by Google, and the paper has made no attempt to block Google from spidering the site, despite that being as easy as adding two lines to the file (or, simpler still, implementing a block on all crawlers). The inclusion of the T&C clause might be the start of laying the groundwork for a legal challenge; after all, why not block Google up front if you don’t want your site indexed, unless you want to bring about an eventual showdown?
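For the curious, here is a rough sketch of what those two lines could look like, checked with the robots.txt parser in Python’s standard library. This is not taken from the actual Times file: the two rule sets, the `is_allowed` helper, the `ExampleBot` crawler name and the article path are all made up for illustration. The only real-world token assumed is `Googlebot`, the user-agent name Google’s crawler answers to.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, NOT copied from the Times file:
# the two lines that would tell Google's crawler to stay out.
BLOCK_GOOGLE = """\
User-agent: Googlebot
Disallow: /
"""

# The simpler variant: a block on every crawler that honours the standard.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Illustrative URL; the path is invented for this example.
URL = "http://www.timesonline.co.uk/example-article"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" stands in for any other crawler; it is not a real bot name.
    for label, rules in (("block Google", BLOCK_GOOGLE), ("block all", BLOCK_ALL)):
        for agent in ("Googlebot", "ExampleBot"):
            print(label, agent, is_allowed(rules, agent, URL))
```

Bear in mind that robots.txt is a request rather than an enforcement mechanism: it only keeps out crawlers that choose to honour it, which the major search engines say they do.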
The banning of some sites via robots.txt by The Times proves that, despite the bizarre and unfounded allegations of stealing made by News Corp execs and their supporters, they know how easy it is to get their content out of Google. There are no longer any excuses: if they were serious about the alleged “content theft” by Google, they would block Google today, not tomorrow or next month, or in three months’ time. Failing to do so just demonstrates again that this is a game where they’re trying to milk money out of Google by talking big, while failing to act on their alleged convictions.