Thank you, Hans, for this updated version of your recipe (and for your recent advice about markup=on).<div>The new templated header works fine.</div><div><br></div><div>Gilles.</div><div><font color="#222222"><span style="border-collapse:collapse"><br>
</span></font></div><div><div class="gmail_quote">2009/9/15 Hans <span dir="ltr">&lt;<a href="mailto:design5@softflow.co.uk" target="_blank">design5@softflow.co.uk</a>&gt;</span><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>Friday, September 11, 2009, 1:09:42 PM, Hans wrote:<br>
<br>
> For TextExtract I cannot just use PmWiki's search engine,<br>
> because we need to extract text. But thanks to your suggestion I was<br>
> inspired to look at the handling of search terms again, and will<br>
> incorporate the way PmWiki's search handles search terms, so we can<br>
> have input like<br>
> 'abc xyz' => output with 'abc' AND 'xyz' in the page;<br>
> '"abc def" xyz' => output with 'abc def' AND 'xyz' in the page;<br>
> 'abc -xyz' => output with 'abc' but NOT 'xyz' in the page;<br>
> 'abc|xyz' => output with 'abc' OR 'xyz' in the page;<br>
<br>
</div>Now available in the latest release.<br>
<div><a href="http://www.pmwiki.org/wiki/Cookbook/TextExtract" target="_blank">http://www.pmwiki.org/wiki/Cookbook/TextExtract</a><br>
<br>
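A rough sketch of the term syntax quoted above, written in Python for illustration and not taken from the recipe's source. The names parse_terms and matches are hypothetical, and case-insensitive substring matching is an assumption:<br>
<br>
```python
import shlex


def parse_terms(query):
    """Split a query into required groups and excluded terms.

    Each required group is a list of alternatives ('abc|xyz' gives two).
    """
    required, excluded = [], []
    # shlex.split keeps a double-quoted phrase such as "abc def" as one token.
    for token in shlex.split(query):
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            alternatives = [alt.lower() for alt in token.split("|") if alt]
            if alternatives:
                required.append(alternatives)
    return required, excluded


def matches(text, query):
    """True if text satisfies every required group and no excluded term."""
    required, excluded = parse_terms(query)
    text = text.lower()  # assumption: matching ignores case
    if any(term in text for term in excluded):
        return False
    return all(any(alt in text for alt in group) for group in required)
```
<br>
With this sketch, matches(page_text, '"abc def" xyz') is true only when the page contains both the phrase and the single term. A query with an unbalanced quote makes shlex.split raise ValueError, so real input would need extra handling.<br>
<br>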
</div>I also added some template variables for use in the parameters<br>
header=, footer= and phead=,<br>
for instance a header with a custom title and the search time:<br>
header="%rfloat%{$$time}%%'''Listing'''"<br>
<br>
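To illustrate how such a variable might be filled in, here is a minimal Python sketch. It is not the recipe's code: expand_template is a hypothetical helper, and the value used for time is made up.<br>
<br>
```python
import re


def expand_template(template, values):
    """Replace each {$$name} with values[name]; unknown names stay as they are."""
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))

    return re.sub(r"\{\$\$(\w+)\}", substitute, template)


header = "%rfloat%{$$time}%%'''Listing'''"
# Assumption: the search time is supplied as a ready-made string.
expanded = expand_template(header, {"time": "0.42s"})
```
<br>
The wiki styles (%rfloat% ... %%) and the bold markup are left untouched by the sketch; only the {$$time} placeholder is replaced.<br>
<br>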
I split regular expression search from standard search, to allow<br>
easier term input, and added a checkbox for regular expression search<br>
to the search form.<br>
I added a checkbox for 'Match whole words' for whole word searches.<br>
<br>
A note on efficiency:<br>
TextExtract with its built-in pagelist function runs faster than<br>
PmWiki's pagelist or the MakePageList() function, mainly because<br>
PmWiki's pagelist process opens every page to check whether the user is<br>
authorised to see it, so that no non-authorised pages (for instance<br>
read-protected pages) are output. Opening all these files<br>
can be quite time consuming.<br>
TextExtract, on the other hand, constructs a pagelist that includes even<br>
read-protected pages; authorisations are not checked at this stage of<br>
the process. Only later, when each page on the source list is opened,<br>
is authorisation checked, before text lines are extracted and<br>
processed. So far fewer pages need to be opened, which makes for<br>
a faster process. That is the main reason I did not use MakePageList()<br>
as a source pagelist generator.<br>
<br>
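The difference in page opens can be sketched in Python with a toy page store. This is only an illustration under the assumption that a list built with authorisation checks must open each candidate once for the check and again for extraction. PageStore and both functions are hypothetical and are not PmWiki or TextExtract code.<br>
<br>
```python
class PageStore:
    """Toy page store that counts how often a page is opened."""

    def __init__(self, pages):
        # pages maps a page name to (readable_by_user, text)
        self.pages = pages
        self.opened = 0

    def open(self, name):
        self.opened += 1
        return self.pages[name]


def extract_with_eager_auth(store, names, term):
    # Pass 1: open every candidate just to check authorisation.
    allowed = [n for n in names if store.open(n)[0]]
    # Pass 2: open the authorised pages again to extract matching lines.
    return [(n, line) for n in allowed
            for line in store.open(n)[1].splitlines() if term in line]


def extract_with_lazy_auth(store, names, term):
    hits = []
    for n in names:
        readable, text = store.open(n)  # single open per page
        if not readable:
            continue  # read-protected: nothing is extracted
        hits.extend((n, line) for line in text.splitlines() if term in line)
    return hits
```
<br>
Both functions return the same lines and never return text from a read-protected page; the lazy variant simply opens each page once.<br>
<br>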
It is still possible to use the PmWiki searchbox with the<br>
fmt=#extract option, which uses PmWiki's pagelist functions<br>
and TextExtract's formatting functions. This is useful if you need to pass<br>
pagelist parameters that TextExtract does not understand.<br>
<div><div></div><div><br>
~Hans<br>
<br>
<br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>---------------------------------------<br>| A | de la langue française<br>| B | <a href="http://www.languefrancaise.net/" target="_blank">http://www.languefrancaise.net/</a><br>
| C | <a href="mailto:languefrancaise@gmail.com" target="_blank">languefrancaise@gmail.com</a><br>---------------------------------------<br>
</div>