[pmwiki-devel] mySQL as a DataStore for large pmwiki with lots of blogs

Patrick R. Michaud pmichaud at pobox.com
Mon Feb 26 15:33:42 CST 2007


On Mon, Feb 26, 2007 at 10:21:20PM +0100, Siegfried Hirsch wrote:
> Hello,
> 
> I have a question regarding pmwiki and using mySQL as a data store. I
> have seen some small hints in the cookbook and pmwiki docs, that this
> could be done.
> ...
> And we think the users will create about 2-4 blog postings
> (wiki-pages) every week for the documentation efforts. So we will have
> about 40000 pages within one year.
> 
> The problem, that I see is with the pagelist and this whole lot a
> category links. 

PmWiki's handling of categories is already fairly quick -- it
indexes the categories (and other pages) so that it can produce
pagelists much more quickly than by scanning all of the records.

Pagelists are slow only when PmWiki has to scan a lot of pages
in order to produce the list of pages.  When there's an index
already in place, such as trails or the .pageindex, then 
(:pagelist:) is actually quite fast.
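
As a rough illustration of why an index helps (a toy in-memory model,
not PmWiki's actual code or its .pageindex format), compare scanning
every page with looking up a prebuilt category index.  The page names
and the helper functions here are hypothetical:

```python
# Illustrative only: a toy model, not PmWiki's .pageindex format.
from collections import defaultdict

# Hypothetical pages: name -> text with simplified [[!Category]] markup.
pages = {
    "Blog.Post1": "Notes on setup [[!Install]]",
    "Blog.Post2": "Upgrade log [[!Install]] [[!Upgrade]]",
    "Blog.Post3": "Unrelated entry",
}

def scan_for_category(pages, category):
    """Slow path: read every page to find the ones in a category."""
    marker = "[[!%s]]" % category
    return sorted(name for name, text in pages.items() if marker in text)

def build_index(pages):
    """One pass over all pages that records category -> page names."""
    index = defaultdict(set)
    for name, text in pages.items():
        # Everything between "[[!" and the next "]]" is a category name.
        for part in text.split("[[!")[1:]:
            index[part.split("]]", 1)[0]].add(name)
    return index

# Built once (and updated on page save); lookups no longer touch page text.
index = build_index(pages)
install_pages = sorted(index["Install"])
```

With the index in place, the cost of a category pagelist depends on the
size of that category rather than on the total number of pages.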

> There is also the need to provide RSS-Feeds for every blog (of course)
> and for every category or mixture of categories.

This works naturally if blogs are stored as trails (which is how
I'm implementing my blog feature).

> My questions:
> - do you think, mysql is needed or is there another way ?
> - i think pagelist is the bottle neck of this system - is this right ?
> - how could I speed up pmwiki otherwise ?

I don't think 40,000 pages is too big for PmWiki to handle, as long
as the indexes are kept up to date.  Since each individual category
and blog is fairly small, the site is only large in the aggregate,
and PmWiki should be able to handle that fairly well.

At some point I plan to come up with a more streamlined way to
build indexes even while keeping PmWiki's per-file pagestore...
unfortunately PHP doesn't provide a generic "dbm" style capability
that can be reliably counted on everywhere, so I'll have to have
an interface layer in the middle.  That's part of the improvements
I have listed in the PmWiki.RoadMap.

Lastly, yes, it is possible to use MySQL to store pages and
indexes... but I think that indexing should be orthogonal to
pagestores -- e.g., put indexing in MySQL tables while keeping
some pages as file-based pages.  I know it seems counter-intuitive
to have indexing separate from the page storage, but I think it
preserves the greatest flexibility in the long run.
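
A minimal sketch of that separation, under stated assumptions: pages
stay in per-page files while the category index lives in SQL tables.
SQLite stands in for MySQL so the example is self-contained, and the
class names, table name, and methods are all hypothetical (this is
not PmWiki's PageStore API):

```python
import os
import sqlite3
import tempfile

class FilePageStore:
    """Hypothetical per-file page store: one file per page."""
    def __init__(self, directory):
        self.directory = directory

    def write(self, name, text):
        with open(os.path.join(self.directory, name), "w", encoding="utf-8") as f:
            f.write(text)

    def read(self, name):
        with open(os.path.join(self.directory, name), encoding="utf-8") as f:
            return f.read()

class SqlCategoryIndex:
    """Hypothetical index in SQL tables, independent of where pages live."""
    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS links ("
            "page TEXT, category TEXT, PRIMARY KEY (page, category))"
        )

    def update(self, page, categories):
        # Replace all index rows for this page in one transaction.
        with self.conn:
            self.conn.execute("DELETE FROM links WHERE page = ?", (page,))
            self.conn.executemany(
                "INSERT INTO links (page, category) VALUES (?, ?)",
                [(page, c) for c in set(categories)],
            )

    def pages_in(self, category):
        rows = self.conn.execute(
            "SELECT page FROM links WHERE category = ? ORDER BY page",
            (category,),
        )
        return [row[0] for row in rows]

# The store and the index are wired together only by the caller.
store = FilePageStore(tempfile.mkdtemp())
index = SqlCategoryIndex(sqlite3.connect(":memory:"))
store.write("Blog.Post1", "Setup notes")
index.update("Blog.Post1", ["Install"])
store.write("Blog.Post2", "Upgrade log")
index.update("Blog.Post2", ["Install", "Upgrade"])
```

Because neither class knows about the other, the index could serve
pages from several different stores, and either side could be swapped
out without touching the other.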

And yes, it would be possible to implement a PageStore class
that uses MySQL, but since pagelist and page storage are largely
orthogonal already, I don't think having a MySQL-based page storage
will buy much until (:pagelist:) is redesigned to take advantage
of it.

Pm


