[pmwiki-users] Search for terms with ss and ß
Petko Yotov
5ko at 5ko.fr
Wed Feb 8 05:01:04 PST 2023
On 08/02/2023 13:32, Dominique Faure wrote:
> In order to minimize the references sources, what would be the best
> way to use that in ISO8859MakePageNamePatterns cookbook recipe instead
> of relying on another large set of regexp replacement?
>
> Something like below?
> -----
> function cb_unaccent($m) { return UnaccentUTF8($m[1]); }
>
> # standard patterns from pmwiki.php
> SDV($PageNameChars, '-[:alnum:]');
> SDV($MakePageNamePatterns, array(
> "/'/" => '', # strip single-quotes
> "/[^$PageNameChars]+/" => ' ', # convert everything else to space
> '/((^|[^-\\w])\\w)/' => 'cb_toupper', # CamelCase
> '/ /' => '', # drop spaces
> '/(.*)/' => 'cb_unaccent'));
> -----
So this is about having the page names like "ChampsElysees" without
diacritics while having links like "[[Champs Élysées]]".
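The above may fail to capitalize words that start with an accented
character, or may simply discard that character, because cb_unaccent
runs last, after the other patterns have already processed the name.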
I see in scripts/xlpage-utf-8.php that $MakePageNamePatterns removes any
text after a ?question mark or a #hash - I have no idea when this may
happen but I suppose at some point it was needed.
I would probably do something like this:
include_once("scripts/xlpage-utf-8.php");
$MakePageNamePatterns =
array_merge(['/^(.*)$/'=>'cb_unaccent'], $MakePageNamePatterns);
This should call cb_unaccent() first and remove all accents, and from
then on it should be business as usual.
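To illustrate why the order matters, here is a rough standalone sketch,
not PmWiki code. It assumes the patterns are applied one after another
in array order, with function names used as callbacks. All demo_*
names are hypothetical, and demo_unaccent() is only a tiny stand-in for
UnaccentUTF8() that handles the two letters in this example.

```php
<?php
// Hypothetical stand-in for UnaccentUTF8(): maps only two characters.
function demo_unaccent($s) {
  return strtr($s, ['É' => 'E', 'é' => 'e']);
}
function demo_unaccent_cb($m) { return demo_unaccent($m[1]); }
function demo_toupper($m) { return strtoupper($m[1]); }

// Simplified assumption about how the patterns are walked: each one is
// applied in array order; callable names go through preg_replace_callback.
function demo_apply($patterns, $name) {
  foreach ($patterns as $pat => $rep) {
    $name = is_callable($rep)
      ? preg_replace_callback($pat, $rep, $name)
      : preg_replace($pat, $rep, $name);
  }
  return $name;
}

// The standard patterns quoted above, with $PageNameChars inlined.
$standard = [
  "/'/" => '',                             // strip single-quotes
  "/[^-[:alnum:]]+/" => ' ',               // convert everything else to space
  '/((^|[^-\\w])\\w)/' => 'demo_toupper',  // CamelCase
  '/ /' => '',                             // drop spaces
];

// Put the unaccent step first, so the later patterns only see plain letters.
$first = array_merge(['/^(.*)$/' => 'demo_unaccent_cb'], $standard);

echo demo_apply($first, 'Champs Élysées'), "\n";  // ChampsElysees
```

Because array_merge() keeps string keys and puts the entries of its
first argument first, the unaccent pattern ends up at the front of the
list and everything after it works on unaccented text.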
Petko