[pmwiki-users] UTF-8 support and mbstrings
Patrick R. Michaud
pmichaud at pobox.com
Tue Jun 6 09:39:00 CDT 2006
On Tue, Jun 06, 2006 at 02:55:43PM +0300, Athan wrote:
> Any hope to see such a version of pmwiki ?
> Current version works fine with single byte chars but lacks case-insensitive
> search when using non-Latin UTF-8 strings.
> So, why not an mbstring version? Most hosts support php with mbstrings
> compiled in. Besides that, it is very easy to have it disabled when mbstring
> functions are not available.
PmWiki uses preg_match for its (case-insensitive) text search -- this is
faster than calling the string or mbstring functions. Unfortunately,
there isn't an mbstring version of preg_match available.
(Yes, there's an mb_eregi function that does pattern matching, but
unfortunately it uses a somewhat different syntax from the pcre-based
pattern matching functions.)
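To make the syntax difference concrete, here is a rough sketch (not PmWiki
code) of one search written for each engine. The sample text, the search
term, and the variable names are made up for illustration, and it assumes
mbstring (with its regex functions) is loaded and that all text is UTF-8:

```php
<?php
// Sketch only: one case-insensitive search written for each engine.
// Assumes mbstring (with its regex functions) is loaded and all text is UTF-8.
mb_regex_encoding('UTF-8');

$text = 'Η Αθήνα είναι πόλη';   // made-up page text
$term = 'αθήνα';                // made-up search term, no regex metacharacters

// PCRE: the pattern string carries its own delimiters and the "i" modifier.
$pcre_hit = preg_match('/' . preg_quote($term, '/') . '/i', $text) === 1;

// mbstring: a bare pattern with no delimiters; the trailing "i" in the
// function name is what makes the match case-insensitive.
$mb_hit = (bool) mb_eregi($term, $text);

var_dump($pcre_hit, $mb_hit);
```

With default locale settings the preg_match call should miss the capitalized
Greek word while mb_eregi should find it. Either way, the two pattern strings
are not interchangeable, so the pattern-building code has to change as well.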
So, in the case of search it's not a simple matter of replacing
functions with mbstring equivalents -- it requires reworking the
entire algorithm to be able to use mb_eregi, or avoiding the pattern-match
searches altogether.
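As a rough illustration of the second option (avoiding pattern matching
altogether), a plain substring test could look something like the sketch
below. The helper name text_contains is hypothetical and not part of PmWiki;
it assumes UTF-8 text and falls back to byte-wise matching when mbstring
isn't loaded:

```php
<?php
// Hypothetical helper, not part of PmWiki: case-insensitive substring test
// that uses mbstring when it is loaded and falls back to byte-wise matching.
function text_contains($text, $term) {
    if ($term === '') {
        return true;    // an empty search term matches everything
    }
    if (function_exists('mb_strtolower')) {
        // Lowercase both sides as UTF-8, then do an ordinary byte search.
        // A byte search is safe here because a valid UTF-8 sequence can
        // only match at a character boundary.
        $text = mb_strtolower($text, 'UTF-8');
        $term = mb_strtolower($term, 'UTF-8');
        return strpos($text, $term) !== false;
    }
    // Fallback without mbstring: only ASCII letters are case-folded.
    return stripos($text, $term) !== false;
}

var_dump(text_contains('PmWiki text search', 'WIKI'));
```

Note that lowercasing the whole page text on every search is extra work
compared with a single preg_match call, which is the speed trade-off
mentioned above.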
However, I'm looking to modularize the pagelist functions anyway, so
perhaps text search can be placed into its own module. Then it would
be much easier to have an mbstring version of text search.
Votes are being recorded at http://www.pmwiki.org/wiki/PITS/00682. :-)
Pm