[pmwiki-users] UTF-8 support and mbstrings

Athan ssb at in.gr
Wed Jun 7 09:48:03 CDT 2006


"Patrick R. Michaud" <pmichaud at pobox.com> wrote in message 
news:20060607142037.GC31893 at host.pmichaud.com...
> Yes, something like that could work; I've been trying to avoid the
> cost of converting the page text to lowercase in the first place.
> Pagelists are slow enough already without adding even more conversion
> overhead.

Yes, I know what you mean; even the compiled mbstring functions slow down 
pagelists.

> Also, in order for term exclusion to work, we need to fix line 230.
> So, my suggestion would be
>
>      if ($searchterms) {
>        $text = utf8tolower($pn."\n".@$page['targets']."\n".@$page['text']);
>        if ($exclp && preg_match($exclp, $text)) continue;
>        foreach($inclp as $i)
>          if (!preg_match($i, $text))
>            { if ($inclx) $xlist[] = $pn; continue 2; }
>      }
>
> and also to make sure that $exclp and $inclp are converted to
> lowercase when they're initialized.

That's a much better mod, thanks.
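
For the pattern side, something along these lines might do. It is only a rough sketch: fold() and build_patterns() are names made up for illustration (they are not in pagelist.php), and mb_strtolower() stands in for utf8tolower() on the assumption that the mbstring extension is available.

```php
<?php
// Sketch only: fold() and build_patterns() are hypothetical helpers,
// not PmWiki functions. Assumes the mbstring extension is loaded.

// Lowercase a string with multibyte awareness on UTF-8 sites;
// leave it untouched otherwise.
function fold($s, $utf8) {
  return $utf8 ? mb_strtolower($s, 'UTF-8') : $s;
}

// Turn search terms into regex patterns, lowercased once at
// initialization so they can be matched against folded page text.
function build_patterns($terms, $utf8) {
  $out = array();
  foreach ($terms as $t) {
    // Non-UTF-8 sites keep relying on the byte-wise /i modifier.
    $out[] = '/' . preg_quote(fold($t, $utf8), '/') . '/' . ($utf8 ? '' : 'i');
  }
  return $out;
}

$inclp = build_patterns(array('ΝΕΡΌ'), true);
$text  = fold("Main.HomePage\n\nΝερό και ψωμί", true);
$found = (bool)preg_match($inclp[0], $text);
```

Because the patterns and the page text both go through the same fold(), the UTF-8 path needs no /i modifier at all.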

> But all of this needs to be on a switch of some sort; we don't want to
> be calling utf8tolower() for sites that aren't running utf8, nor do
> we want to give them the execution overhead of unnecessary calls to
> convert things to lowercase.

Yeah you're right.
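
The switch could be as simple as a flag checked once per page. A minimal sketch, assuming a made-up $EnableUtf8Fold setting (not an existing PmWiki variable) and using mb_strtolower() in place of utf8tolower():

```php
<?php
// Sketch only: $EnableUtf8Fold and search_text() are hypothetical names.
$EnableUtf8Fold = true;

// Build the text that the search patterns are matched against.
function search_text($pn, $page, $fold) {
  $targets = isset($page['targets']) ? $page['targets'] : '';
  $body    = isset($page['text']) ? $page['text'] : '';
  $text    = $pn . "\n" . $targets . "\n" . $body;
  // Only pay for the multibyte conversion when the site asked for it.
  return $fold ? mb_strtolower($text, 'UTF-8') : $text;
}

$text = search_text('Main.HomePage', array('text' => 'ΝΕΡΌ'), $EnableUtf8Fold);
```

Sites that leave the flag off get the page text back unchanged, so they see no extra conversion cost.
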
Personally, I could use a single-byte encoding (8859-7 for Greek) instead 
of utf8; it worked great with pmwiki when I tried it. But I think that utf8 
is the "right thing". Unfortunately Zend did nothing to support Unicode in 
PHP 5; now they say v6 will fix that. Who knows...

Athan 