Wildcards in robots.txt
Words by Daniel Aleksandersen on 2007-04-20
A list of search engine crawlers that support wildcards and the Allow: extension in robots.txt configuration files.
The example below will block /public_information/file.html?hidden_value=something, but will allow crawling of other pages in the /public_information/ directory.
# Wildcard example blocking ?hidden_value in
# the /public_information/ directory.
User-Agent: *
Disallow: /public_information/*?hidden_value
The following engines support the above:
- Googlebot (Google)
- msnbot (Live Search)
- Slurp (Yahoo! Search)
- teoma (Ask!)
All the major search engines' crawlers do indeed support wildcards! All of the above support the Allow: extension as well.
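To make the matching behaviour concrete, here is a rough Python sketch of how a crawler might interpret a wildcard Disallow rule. It is an illustration only, not any search engine's actual implementation: the function names `rule_to_regex` and `is_blocked` are made up for this example, and it assumes that `*` matches any run of characters and that a rule only needs to match from the start of the path. Allow: rules and their precedence are not modelled here.

```python
import re


def rule_to_regex(rule_path):
    """Translate a robots.txt path containing * wildcards into a regex.

    Hypothetical helper: each literal piece is escaped so that characters
    such as "?" are matched literally, and each "*" becomes ".*".
    """
    pieces = [re.escape(piece) for piece in rule_path.split("*")]
    return re.compile(".*".join(pieces))


def is_blocked(url_path, disallow_rules):
    """Return True if any Disallow rule matches the start of url_path."""
    # re.match anchors at the beginning only, so rules act as prefixes.
    return any(rule_to_regex(rule).match(url_path) for rule in disallow_rules)


rules = ["/public_information/*?hidden_value"]

# The URL with the hidden_value query string is blocked.
print(is_blocked("/public_information/file.html?hidden_value=something", rules))

# Other pages in the same directory remain crawlable.
print(is_blocked("/public_information/file.html", rules))
```

The first call prints True and the second prints False, mirroring the example rule above. A real crawler may differ in details such as case handling and URL encoding, so treat this as a way to reason about a rule rather than a guarantee of how a given engine behaves.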
