[pmwiki-users] Google local site search

Patrick R. Michaud pmichaud at pobox.com
Mon Dec 26 18:48:20 CST 2005


On Tue, Dec 27, 2005 at 12:11:59AM +0000, Hans wrote:
> Monday, December 26, 2005, 8:48:15 PM, Patrick wrote:
> >     (:input form action="http://www.google.com/custom" method=get:)
> >     (:input hidden sitesearch pmwiki.org:)
> >     (:input text q:)
> >     (:input submit value="Search via Google":)
> >     (:input end:)
> 
> > See, for example, http://www.pmwiki.org/wiki/Test/GoogleSearch .
> 
> This is great.
> Testing it, I notice quite a number of Google results that have
> undesirable parameters, like action=source or setcolor=blue.
> How can Google be prevented from indexing pages with such parameters,
> cookie settings, etc.? Will Google's robots blindly follow any link on
> the page, or can certain links be excluded to give better Google results?

Newer versions of PmWiki (since 2.1.beta8) automatically return
"403 Forbidden" errors to robots for any action other than
?action=browse, ?action=rss, or ?action=dc.  However, over the
years Google has built up a fairly large cache of PmWiki pages,
including URLs with those unwanted parameters, so such URLs will
likely continue to appear in Google's search results until they
are somehow expired from Google's database.
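
For sites that want a different set of robot-permitted actions, the
list can probably be adjusted in local/config.php.  The following is
only a sketch: it assumes the list is held in a $RobotActions array,
as in scripts/robots.php, so check that script in your version before
relying on the name.  The commented-out 'print' entry is a
hypothetical addition for illustration.

```php
<?php
# local/config.php (sketch; the $RobotActions name is an assumption
# taken from scripts/robots.php and should be verified locally)

# Actions listed here are served to robots; a robot requesting any
# other ?action= value receives "403 Forbidden".
$RobotActions = array('browse' => 1, 'rss' => 1, 'dc' => 1);

# Hypothetical example: also let robots fetch the print view.
# $RobotActions['print'] = 1;
```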

In addition, if $EnableRobotCloakActions is set, then any ?action=
parameters are removed from page links when viewed by a robot,
so that those robots won't blindly follow links to unimportant
pages.  At the moment $EnableRobotCloakActions is disabled by default, 
because some admins may feel that presenting robots with such
modified views of a page might cause their sites to be negatively
rated by search engines.  (I've seen opinions on both sides of the
issue here.)
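
Turning the option on should be a one-line setting.  A minimal
sketch, assuming the usual local/config.php customization file:

```php
<?php
# local/config.php
# Remove ?action= parameters from page links when the page is
# being viewed by a robot (disabled by default).
$EnableRobotCloakActions = 1;
```

Admins who worry about search engines penalizing robot-specific
views can simply leave the variable unset.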

Pm