Wiki-Spammers (was: [Pmwiki-users] pmwiki.org - version 1 - vandalised)

Ciaran ciaranj
Thu Jan 13 14:49:42 CST 2005


> Instead of, or in addition to, a blacklist one could use Bayesian
> networks[1] that could learn which texts should be allowed to be
> saved and which should be rejected.
> 
In principle I quite like the idea of a Bayesian network for
classifying wiki pages as good or bad. However, the amount of data
required to train one is generally quite large, i.e. the wiki site
would have to have quite a high growth rate to generate sufficient
training data. Perhaps more importantly, every wiki-page post would
initially need to be manually classified by (a) trusted individual(s),
which doesn't seem much better than requiring every wiki page to be
authorised by someone who is trusted. Plus you'd need a set of 'bad
posts', which may be relatively tricky to get hold of, as most people
are pretty good at cleaning up their sites <g>. But this of course is
just my opinion, and I'm quite cheerful to be told I'm speaking
rubbish ;)
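
For concreteness, here's a minimal sketch of the sort of classifier
being talked about. It's a naive Bayes text classifier (the technique
most email spam filters actually use) rather than a full Bayesian
network, written in Python; the 'good'/'bad' labels and the toy
training examples below are purely illustrative:

from collections import Counter
import math

def tokenize(text):
    # Crude word features; a real filter would do something smarter.
    return text.lower().split()

class NaiveBayes:
    """Minimal two-class (good/bad) multinomial naive Bayes."""

    def __init__(self):
        self.words = {"good": Counter(), "bad": Counter()}
        self.docs = {"good": 0, "bad": 0}

    def train(self, text, label):
        # Each manually classified page becomes one training example.
        self.docs[label] += 1
        self.words[label].update(tokenize(text))

    def classify(self, text):
        # Assumes both classes have at least one training example.
        total_docs = sum(self.docs.values())
        vocab = set(self.words["good"]) | set(self.words["bad"])
        scores = {}
        for label in ("good", "bad"):
            # Log prior: how often this class was seen in training.
            score = math.log(self.docs[label] / total_docs)
            total_words = sum(self.words[label].values())
            for word in tokenize(text):
                # Laplace smoothing keeps unseen words from zeroing
                # out a class entirely.
                count = self.words[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

nb = NaiveBayes()
nb.train("fixed a typo on the recipes page", "good")
nb.train("cheap pills casino http://spam.example", "bad")
print(nb.classify("buy cheap pills now"))   # -> 'bad'

Note that the classifier is only as good as the hand-labelled examples
fed to train(), which is exactly the manual-classification burden
described above.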
- Ciaran


