[Pmwiki-users] reducing wikispam to size

Patrick R. Michaud pmichaud
Tue Dec 21 09:41:51 CST 2004


This morning I had a revelation for another tactic to control and block
wikispam, which will be available in the next release.

Wikispammers like open wiki sites because they give them a place to
post links to external sites, in an attempt to increase the Google
pageranks of those sites and the traffic coming to them.

PmWiki 2.0 already comes with the 'urlapprove.php' script, which allows
a site to maintain an "approved urls" list.  Urls that appear in a wiki
page and are not in the approved list are not converted to links, and
are given an "approve links" button that makes it easy for someone with
appropriate privileges to add the sites on the page to the approved list.
As a result, wikispammers don't get any benefit from Google and the
like from posting their links to the page.
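
As a rough illustration of the approved-urls idea (this is a Python sketch,
not the code in urlapprove.php), the rendering step might look like the
following. The names render_links and approved_hosts are hypothetical, and
keying approval on the hostname is an assumption made for the example.

```python
import html
import re
from urllib.parse import urlparse

# Simplified URL pattern for the sketch; real wiki markup is more involved.
URL_RE = re.compile(r"https?://[^\s<>\"']+")

def host_of(url):
    """Return the lowercased hostname of url, or '' if it cannot be parsed."""
    try:
        return (urlparse(url).hostname or "").lower()
    except ValueError:
        return ""

def render_links(text, approved_hosts):
    """Turn approved URLs into links; leave unapproved URLs as plain text."""
    def repl(match):
        url = match.group(0)
        if host_of(url) in approved_hosts:
            safe = html.escape(url, quote=True)
            return '<a href="{0}">{0}</a>'.format(safe)
        return url  # no link, so no pagerank benefit for the spammer
    return URL_RE.sub(repl, text)
```

Since an unapproved url never becomes an anchor, search engines have
nothing to follow.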

But this doesn't stop them from posting the links.  Perhaps they don't
realize the spam doesn't help their pagerank -- more likely they just
don't care.  So, we still end up with pages with lots of links to
external sites that have to be cleaned up.

No more.  My revelation this morning was that since spammers tend to
post large quantities of unapproved links, all we have to do is block
any post that contains more than some small number of unapproved links.
The key is to count just the *unapproved links* resulting from a post.
The page can contain as many approved links as desired, and we always
look at the total number of unapproved links within a page (i.e., a spammer
can't circumvent the restriction by submitting lots of small posts).
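
The check described above could be sketched roughly as follows, in Python
rather than PmWiki's own code. The function names, the hostname-based
approval test, and the default limit of 5 are all assumptions made for
illustration; the only point taken from the description is that the count
is made over the whole page text as it would exist after the post.

```python
import re
from urllib.parse import urlparse

# Simplified URL pattern for the sketch.
URL_RE = re.compile(r"https?://[^\s<>\"']+")

def host_of(url):
    """Return the lowercased hostname of url, or '' if it cannot be parsed."""
    try:
        return (urlparse(url).hostname or "").lower()
    except ValueError:
        return ""

def count_unapproved(page_text, approved_hosts):
    """Count URLs in the full page text whose host is not approved."""
    return sum(1 for url in URL_RE.findall(page_text)
               if host_of(url) not in approved_hosts)

def allow_post(new_page_text, approved_hosts, limit=5):
    """Accept the post only if the resulting page stays within the limit.

    new_page_text is the entire page after the edit, not just the diff,
    so many small posts still add up to the same total.
    """
    return count_unapproved(new_page_text, approved_hosts) <= limit
```

Approved links never count against the limit, so a page full of approved
links can still be edited freely.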

Since authors typically only add a small number of links to external 
pages before those links are approved, this limitation doesn't affect
the vast majority of authors' edits.  It only affects authors who 
might need to post large numbers of unapproved links into a page before
they can be approved (say, in a page that contains a directory of
links to external sites).  But even this is solvable -- my solution on
pmwiki.org has been to allow such pages to use unapproved urls; yes,
spammers can attack those pages, but the damage they can cause is
also limited to those pages.
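
One way to picture the per-page exemption is the small Python sketch below.
It is only an illustration, not the pmwiki.org configuration: the function
post_allowed, the page name "Links.Directory", and the default limit are
hypothetical.

```python
# Hypothetical set of pages that are allowed to hold unapproved urls.
EXEMPT_PAGES = frozenset({"Links.Directory"})

def post_allowed(page_name, unapproved_count, limit=5,
                 exempt_pages=EXEMPT_PAGES):
    """Apply the unapproved-link limit everywhere except exempt pages."""
    if page_name in exempt_pages:
        return True  # damage is possible here, but confined to this page
    return unapproved_count <= limit
```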

One downside is that this may drive spammers to instead start posting
small changes to lots of different pages, thus making the overall
cleanup job a bit more difficult.  But we can probably find ways to
treat this problem if/when it occurs.

The limit-unapproved-link-posts feature is currently enabled on 
pmwiki.org -- feel free to see what happens if you attempt to add a 
lot of unapproved links into a page (but not the Main.WikiSandbox, 
as it's one of those pages that is currently configured to allow 
unapproved links).

I'll be watching pmwiki.org to see if it reduces the spam we receive,
and I'll release the code for this feature in beta12.

Pm


