[pmwiki-users] TextExtract (Search recipe) update

Hans design5 at softflow.co.uk
Wed Sep 9 09:04:19 CDT 2009


Wednesday, September 9, 2009, 2:43:49 PM, ABClf wrote:

> One word to say that what would be nice too (for my own point of
> view), is to get a KWIC format ; then one can use textextract to
> produce lexical analysis inside pmwiki.

I am not familiar with KWIC.
Is that the same as adding a word-count boundary around each found
term? Like "show term within 25 words before and after".
The number of words could be specified by an option parameter.
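
To make the idea concrete, here is a rough sketch of such a word window.
It is written in Python purely for illustration and is not TextExtract
code; the function name kwic and its words parameter are made up for
this example.

```python
import re

def kwic(text, term, words=25):
    """Return (left, match, right) for each occurrence of `term`,
    with up to `words` words of context on either side.
    Hypothetical helper for illustration only."""
    tokens = text.split()
    needle = term.lower()
    hits = []
    for i, tok in enumerate(tokens):
        # Strip surrounding punctuation so "wiki," still matches "wiki".
        if re.sub(r"^\W+|\W+$", "", tok).lower() == needle:
            left = tokens[max(0, i - words):i]
            right = tokens[i + 1:i + 1 + words]
            hits.append((" ".join(left), tok, " ".join(right)))
    return hits
```

Calling kwic(page_text, "term", words=25) would then give one
(before, term, after) triple per match, which could be laid out in
columns with the term in the middle.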

And should such boundaries go beyond paragraphs, or stop at
paragraph divisions? I think it would look better if they stop;
otherwise a result may show just a few words from the preceding
paragraph, which may make no sense at all. If we specify unit=para
and, say, words=20, we would automatically stay within paragraphs
and just restrict the output a bit more.
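
A rough sketch of that paragraph-bounded variant, again in Python for
illustration only and not the recipe's code. The function name is made
up, and treating a blank line as the paragraph division is an
assumption.

```python
import re

def kwic_by_paragraph(text, term, words=20):
    """Return context snippets for `term` that never cross a paragraph.
    Hypothetical helper; paragraphs are assumed to be separated by
    blank lines."""
    needle = term.lower()
    hits = []
    for para in re.split(r"\n\s*\n", text):
        tokens = para.split()
        for i, tok in enumerate(tokens):
            # Ignore common punctuation attached to the word.
            if tok.strip(".,;:!?\"'()").lower() == needle:
                start = max(0, i - words)
                # The window is cut from this paragraph's tokens only,
                # so it stops at the paragraph division by construction.
                hits.append(" ".join(tokens[start:i + words + 1]))
    return hits
```

Because each paragraph is split into words separately, a match near
the start of a paragraph simply gets a shorter left-hand context
instead of picking up stray words from the paragraph before.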

And what about sentence boundaries? It would be extremely hard to
determine any of those. I don't think I'd like to go that way.


  ~Hans



