[pmwiki-users] Trouble with .pageindex when too much _new_ data to index (+ sqlite)

Patrick R. Michaud pmichaud at pobox.com
Wed Jan 28 20:11:28 CST 2015


On Thu, Jan 29, 2015 at 12:45:44AM +0100, ABClf wrote:
> As of now, I say : after I import a big amount of new data (in sqlite
> recipe), pmwiki fails to start indexing and something runs out of memory
> before it starts indexing. 

You say it runs out of memory, but the error messages you're posting say
that it's running out of time.  They aren't exactly the same, so which is it?

> Thereafter it's dead. Please note PmWiki is
> still working fine for printing out pages, for linking, etc., no matter the
> quantity of pages.
> Fails happened several times to me this evening and the only working way
> was to limit the amount of new imported data. (do not import 60 mo of new
> data, but import 6 mo and do it 10 times).

When you say "import 60 mo of new data", is that a single page with 60 mo,
or is it a large number of pages that collectively contain 60 mo?

> What if PmWiki has suddenly 10k new pages to index ? It may take time to
> list all of these new pages, and to get ready to index them in the next
> step. In my simulation case, new pages are very short in size, yet they are
> quite a lot (90000 quotes = 90000 pages ;)). How would I debug a bottleneck
> if ever in the very early task of indexation ?

If PmWiki has 10K pages to index, it indexes as many as it can in $PageIndexTime
seconds (for short pages this should be at least one or two thousand pages), and
then the next time it runs it will have fewer pages to index.
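
To illustrate the idea of indexing within a time budget, here is a minimal
sketch.  It is not PmWiki's actual code: index_with_budget() and the $indexOne
callback are hypothetical names used only for this example, and the real
indexer does more work per page.

```php
<?php
// Hypothetical sketch of time-boxed indexing (not PmWiki's implementation).
// $pending       list of page names still waiting to be indexed
// $indexOne      stand-in callback that indexes a single page
// $budgetSeconds how long this run may spend, playing the role of $PageIndexTime
function index_with_budget(array $pending, callable $indexOne, float $budgetSeconds): array {
    $deadline = microtime(true) + $budgetSeconds;
    $done = 0;
    foreach ($pending as $pagename) {
        // Stop as soon as the budget is used up; the rest waits for the next run.
        if (microtime(true) >= $deadline) break;
        $indexOne($pagename);
        $done++;
    }
    // Whatever was not reached is returned so a later request can continue.
    return array_slice($pending, $done);
}
```

Each call works through as much of the list as the budget allows and hands back
the remainder, so repeated runs shrink the backlog until nothing is left.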

> Tested just now :
> I alter .pageindex, .flock and .lastmod 's names in wiki.d, so it's like
> there is nothing else the 100 mo sqlite database, and I run a search in the
> browser.
> PmWiki has to index a sql database made of 100000 new pages ;) I see my
> hard drive led running fast, I feel temperature going higher, (really, some
> processing is happening), and finally error message is printed out on
> browser screen : Fatal error: Maximum execution time of 30 seconds exceeded
> in D:\xampp3\htdocs\abclf\scripts\xlpage-utf-8.php on line 75

This isn't an out of memory message, it's an "out of time" message.  What is
likely happening is that there's a very large page that PmWiki is unable to
index before PHP's 30 second timeout takes place.  Is there a very large page
in your database?

If that's not the case, try increasing the execution time limit to 60 seconds.
Add the following to local/config.php:

    set_time_limit(60);

> Checking in wiki.d : a new 0 byte .flock file is has been created. No new
> .pageindex

Is there a ".pageindex,new" file created?  If not, then PmWiki must be terminating
long before page indexing is starting -- perhaps when doing the search itself.
One of the first things that the indexing function does is to create the 
".pageindex,new" file.  So if that file isn't being created, then the problem 
is occurring *before* indexing ever gets started, not during.
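
If it's easier than looking at the directory by hand, a small standalone
script along these lines can report which of the index files exist.  This is
only a sketch: index_file_status() is a made-up helper name, and it assumes
the files live in the default wiki.d directory, so adjust the path if your
work directory is elsewhere.

```php
<?php
// Hypothetical helper: report the size of each index-related file in $workDir,
// or null when the file does not exist.
function index_file_status(string $workDir): array {
    $status = [];
    foreach (['.pageindex', '.pageindex,new', '.flock'] as $name) {
        $path = "$workDir/$name";
        $status[$name] = file_exists($path) ? filesize($path) : null;
    }
    return $status;
}

// Assumption: the script sits next to pmwiki.php and wiki.d is the work directory.
foreach (index_file_status('wiki.d') as $name => $size) {
    echo $name, ': ', ($size === null ? 'missing' : "$size bytes"), "\n";
}
```

Run it right after a failed search: if ".pageindex,new" is reported missing,
the timeout happened before indexing began.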

> I rerun a search. Hard drive rerun, and message error : Fatal error:
> Maximum execution time of 30 seconds exceeded in
> D:\xampp3\htdocs\abclf\pmwiki.php on line 2015
> [...]
> Last rerun, and last error message is : Fatal error: Maximum execution time
> of 30 seconds exceeded in D:\xampp3\htdocs\abclf\cookbook\sqlite.php on
> line 403

These are functions called when reading pages, so again, it could be that PmWiki
is timing out when doing the initial search itself, as opposed to when it's 
building the index (which comes afterwards).

What search are you doing?

Instead of doing a text search, which will take PmWiki a very long time
to complete if there's no index already built (and is probably why things are
timing out), try doing a link= search instead and see if that starts building
the index.  For example, do the following searches:

   link=Word.RecentChanges
   link=Quote.RecentChanges

PmWiki attempts to build an index whenever it encounters reverse-link requests,
but it won't try to do the full text search (which is probably slowing things
down).

Lastly, you may be able to force PageIndexUpdates without doing a search at all.
Create a local/Site.Site.php script with the following:

   <?php

   global $PageIndexTime, $WikiDir, $FarmD;
   include_once("$FarmD/scripts/stdconfig.php");

   set_time_limit(120);
   $PageIndexTime = 60;
   PageIndexQueueUpdate($WikiDir->ls());

Then browse to the page Site.Site .  You'll get the standard Site.Site page
text, but PmWiki should start indexing pages, and will spend up to a minute
doing it (six times longer than normal).  If that's not enough to completely
rebuild the index, reload that page a couple of times and it should be done.

Let us know if any of the above works.

Pm


