[pmwiki-users] Search for terms with ss and ß

Hans Bracker design at softflow.uk
Sun Feb 5 13:07:16 PST 2023


> Someone searching for "champs elysees" will also find the correct "Champs Élysées".

Okay, I see it works that way with your recipe. But that is because "Champs Élysées" is saved in the pageindex as the words "champs" and "elysees", so subsequent searches for "champs" and/or "elysees" will return the page that has "Champs Élysées" in its text.

But then with TextExtract I am stuck, because it looks through the actual text, line by line, and cannot find "Élysées", only "Champs". And there is no way TextExtract can construct "Élysées" from "elysees".
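
To illustrate the mismatch, here is a rough sketch. It is not the real pageindex or TextExtract code: fold() is a hypothetical stand-in for whatever folding the recipe applies, and the sample line is made up.

```php
<?php
// Illustration only: not the real pageindex or TextExtract code.
// fold() is a hypothetical stand-in for the folding the recipe applies.
function fold(string $s): string {
    // "\xc3\x89" is É and "\xc3\xa9" is é in UTF-8.
    $s = str_replace(["\xc3\x89", "\xc3\xa9"], ['E', 'e'], $s);
    return strtolower($s);
}

$line = "Avenue des Champs \xc3\x89lys\xc3\xa9es";

// Index side: the folded words are what gets stored and searched.
$indexWords = explode(' ', fold($line));
$inIndex = in_array('elysees', $indexWords, true);

// Text side: a plain search through the unfolded line.
$champsInText  = stripos($line, 'champs') !== false;
$elyseesInText = stripos($line, 'elysees') !== false;

var_dump($inIndex, $champsInText, $elyseesInText);
```

The first two values come out true and the last one false: the folded index knows "elysees", but the raw line only contains "Élysées".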

Nor can I safely assume that a German term with 'ss' is really a term with ß substituted.
But substituting 'ss' for ß is fine.
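
A small sketch of why only one direction is safe. foldEszett() is a made-up helper name for illustration, not something from the recipe.

```php
<?php
// Hypothetical helper: fold ß to "ss". "\xc3\x9f" is ß in UTF-8.
function foldEszett(string $s): string {
    return str_replace("\xc3\x9f", 'ss', $s);
}

$masse = "Masse";        // "mass", correctly spelled with ss
$masze = "Ma\xc3\x9fe";  // "Maße", "measurements", spelled with ß

// ß -> ss always has exactly one result.
$sameAfterFolding = foldEszett($masze) === $masse;
var_dump($sameAfterFolding);
```

Both words end up as "Masse", so going back from 'ss' there is no way to tell whether the original had ß or not.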

> Your folding should probably be adapted to the language you actually use. I have added 2 lines to the UnaccentUTF8() function in the cookbook, uncomment them to enable the folding ü->ue that is suitable for German.

Yes, that works, thank you! What does this line do, and what is the connection to German umlauts?
 $str = preg_replace("/\xcc\x88/", 'e', $str);
I see that \xcc\x88 stands for the two-dots character, like the dots above the o in ö (umlaut).
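
My understanding so far, as a sketch: the bytes 0xCC 0x88 are the UTF-8 encoding of U+0308, the combining diaeresis. The example assumes the string has already been decomposed, so that ü is stored as "u" plus that combining mark; I have not checked how UnaccentUTF8() prepares the string before this line runs.

```php
<?php
// Assumption: the text is already decomposed, so "ü" is stored as
// "u" followed by U+0308 (bytes 0xCC 0x88 in UTF-8).
$str = "Mu\xcc\x88nchen";
$str = preg_replace("/\xcc\x88/", 'e', $str);
echo $str, "\n";  // Muenchen

// A precomposed "ü" (U+00FC, bytes 0xC3 0xBC) has no separate mark to match,
// so the same replacement leaves it untouched.
$pre = "M\xc3\xbcnchen";
$unchanged = preg_replace("/\xcc\x88/", 'e', $pre) === $pre;
var_dump($unchanged);  // bool(true)
```

So replacing the two dots with 'e' turns a decomposed ü into "ue", and likewise ä into "ae" and ö into "oe".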

thanks,
Hans



