[pmwiki-users] Google local site search

H. Fox haganfox at users.sourceforge.net
Thu Dec 29 12:52:48 CST 2005


On 12/29/05, Patrick R. Michaud <pmichaud at pobox.com> wrote:
> On Wed, Dec 28, 2005 at 03:24:27PM -0700, H. Fox wrote:
> > On 12/28/05, Patrick R. Michaud <pmichaud at pobox.com> wrote:
> > > [...]the approach
> > > I'm using isn't modifying <a ...> tags, it's modifying the url that
> > > appears in the tag.  Specifically, it's removing any unpermitted
> > > "?action=" that appears after "$ScriptUrl".
> >
> > So, for example, a page's Edit and History links become self-referring
> > links, correct?
>
> Yes.

Then it seems like it will mislead the search index.

Every page will be indexed with a link to itself that says "Edit", for
instance.  Nearly every page on pmwiki.org will have eight extra
self-referring links that lack rel='nofollow' attributes?

(I still don't understand why the default skin doesn't use nofollow
attributes in the PageAction links.)

> > If so, I think I'd rather leave the action and include the
> > rel='nofollow' attribute.
> >
> > Both the Light Skin and Lean Skin include the rel='nofollow'
> > attributes, so now there might be unintentional 'nofollow' links to
> > normal wiki pages -- possibly resulting in page-rank punishment.
>
> Google is quite clear that rel='nofollow' doesn't punish a page,
> it simply causes the link to not be counted at all for page
> ranking purposes.

I see.  So the fact that there's a nofollow link to a page doesn't
affect its rank negatively at all, and that's pretty much guaranteed.

In that case, why cloak for googlebot?  Why not keep the ?action=
parameters intact and use the rel='nofollow' attribute for bots that
understand it?

> And even if the Skins include rel='nofollow' in the templates,
> what about markup...?
>
>     [[OtherPage]]
>     [[OtherPage?action=edit]]
>     [[OtherPage?action=dc]]

It's a separate issue.  The bottom two should get rel='nofollow'
attributes by default.
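
To make that concrete, here is a rough sketch of the rule I have in
mind, written in Python purely for illustration (it is not PmWiki
code).  The function name render_link and the HTML it produces are
made up for the example; the only rule it encodes is "a link target
that carries an action parameter gets rel='nofollow'".

```python
from html import escape
from urllib.parse import urlsplit, parse_qs


def render_link(href, text):
    """Return an <a> tag for href (hypothetical helper, not a PmWiki function).

    Links whose query string contains an "action" parameter get
    rel='nofollow'; plain page links are left alone.
    """
    query = parse_qs(urlsplit(href).query)
    rel = " rel='nofollow'" if "action" in query else ""
    return "<a href='{}'{}>{}</a>".format(
        escape(href, quote=True), rel, escape(text)
    )


# The three markup targets from the message above.
for target in ("OtherPage", "OtherPage?action=edit", "OtherPage?action=dc"):
    print(render_link(target, target))
```

With this rule only the second and third links pick up the attribute;
the plain [[OtherPage]] link is emitted unchanged.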

> > > > I'd expect most search engines to honor "nofollow" by not following the
> > > > link anyway. [...]
> > >
> > > In reality, for the first six months after rel="nofollow" was
> > > introduced, Google followed the links anyway (but didn't weight them).
> > > Google didn't stop following the links until sometime in July 2005.
> > > So, despite the reasons you give, Google chose to follow the
> > > rel="nofollow" links for quite some time after rel="nofollow" was
> > > introduced.
> >
> > Since future PmWiki versions will not be run prior to July 2005, what's
> > important is the fact that they're not following the links now, right?
> > Maybe I'm missing something...
>
> My point was that for at least six months Google thought it was
> worth the extra code and expense to follow rel="nofollow" links
> but not count the link.  So, if Google chose this path, it's entirely
> likely that other search engines will go ahead and follow rel="nofollow"
> links.
>
> In fact, this is *really* the point:  Many spiders/robots won't
> look for rel="nofollow" at all!  The robots exclusion protocol
> doesn't say anything about rel="nofollow" in links, so robots
> that are following the protocol are still going to follow such
> links.  I've looked at several of the available search engine
> packages (htDig, swish-e, mnoGoSearch, WebGlimpse), and as far
> as I can tell *none* of those honor rel="nofollow" in links.
>
> So, the advantage of cloaking "?action=" is that it will work
> even for robots that don't understand rel="nofollow".

Then it should be used for those, and not for robots that are known to
understand rel='nofollow'.
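
Roughly what I'm suggesting, as a Python sketch for illustration only
(not PmWiki code).  Everything named here is an assumption: the
user-agent substrings, the set of permitted actions, and the helper
names are invented for the example, and which robots really honor
rel='nofollow' would have to be checked one by one.  The URL stripping
is also simplified: it drops everything from "?action=" onward.

```python
import re

# Assumed, illustrative lists; not taken from PmWiki or any robot's docs.
NOFOLLOW_AWARE = ("googlebot",)
ROBOT_HINTS = ("bot", "crawler", "spider", "htdig", "swish-e")
PERMITTED_ACTIONS = {"browse"}  # hypothetical set of permitted actions


def strip_unpermitted_action(url):
    """Cloak a URL by cutting it off at an unpermitted ?action=..."""
    match = re.search(r"\?action=(\w+)", url)
    if match and match.group(1) not in PERMITTED_ACTIONS:
        return url[:match.start()]
    return url


def action_link(url, text, user_agent):
    """Pick a link strategy based on the requesting user agent."""
    agent = user_agent.lower()
    is_robot = any(hint in agent for hint in ROBOT_HINTS)
    understands_nofollow = any(name in agent for name in NOFOLLOW_AWARE)
    if is_robot and not understands_nofollow:
        # Robot that ignores nofollow: cloak the action instead.
        return "<a href='%s'>%s</a>" % (strip_unpermitted_action(url), text)
    # Browsers and nofollow-aware robots see the real link.
    return "<a href='%s' rel='nofollow'>%s</a>" % (url, text)
```

So a robot on the nofollow-aware list sees the real Edit link with
rel='nofollow', while something like htDig gets the cloaked,
self-referring URL.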

I'm not entirely convinced it "works" for robots that understand
rel='nofollow', since you're showing the robot a lot of misleading
links (i.e. "transformed pages") for no reason.

Hagan



