[pmwiki-users] slurp is broken

christian.ridderstrom at gmail.com
Wed Jul 19 14:34:44 CDT 2006


On Wed, 19 Jul 2006, Patrick R. Michaud wrote:

> On Wed, Jul 19, 2006 at 11:36:53AM -0500, JB wrote:
>> PM,
>>
>> Can I please get a copy of your robots.txt file?
>
> Also, for any who are interested, here's the relevant
> sections of my root .htaccess file, which denies certain
> user agents at the webserver level instead of waiting
> for PmWiki to do it:
>
>    # HTTrack and MSIECrawler are just plain annoying
>    RewriteEngine On
>    RewriteCond %{HTTP_USER_AGENT} HTTrack [OR]
>    RewriteCond %{HTTP_USER_AGENT} MSIECrawler
>    RewriteRule ^wiki/ - [F,L]
>
>    # block ?action= requests for these spiders
>    RewriteCond %{QUERY_STRING} action=[^rb]
>    RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
>    RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
>    RewriteCond %{HTTP_USER_AGENT} msnbot [OR]
>    RewriteCond %{HTTP_USER_AGENT} Teoma [OR]
>    RewriteCond %{HTTP_USER_AGENT} ia_archive
>    RewriteRule .* - [F,L]
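
If I read the second block right, the action=[^rb] pattern blocks most 
?action= requests from those spiders while still letting through actions 
that begin with 'r' or 'b' (presumably action=rss and action=browse). A 
quick way to check rules like these is to request a page with a spoofed 
user agent and look for a 403. For example (using example.com/wiki/ as a 
stand-in for the real site):

   # should be denied outright (HTTrack matches the first block)
   curl -I -A "HTTrack" http://example.com/wiki/Main/HomePage

   # should be denied (spider plus action=edit)
   curl -I -A "Googlebot" "http://example.com/wiki/Main/HomePage?action=edit"

   # should get through (action=rss starts with 'r')
   curl -I -A "Googlebot" "http://example.com/wiki/Main/HomePage?action=rss"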

The obvious solution: add this to one of the PmWiki documentation pages, 
perhaps one about administrative tasks or one related to robots.txt?
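
For the robots.txt side of it, a minimal sketch might look like the 
following (not Pm's actual file, just an illustration reusing the user 
agent names from the conditions above; note that the * wildcard in 
Disallow is a non-standard extension honored by Googlebot and Slurp, and 
that this is blunter than the rewrite rules since it can't exempt 
action=rss or action=browse):

   User-agent: Slurp
   Disallow: /*?action=

   User-agent: Googlebot
   Disallow: /*?action=

The other spiders could be listed the same way, to the extent they 
support the wildcard extension. Agents like HTTrack and MSIECrawler tend 
to ignore robots.txt altogether, which is exactly why the 
webserver-level rules above exist.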

/Christian

-- 
Christian Ridderström, +46-8-768 39 44               http://www.md.kth.se/~chr

