[pmwiki-users] UTF-8 as core default encoding (was: Headers are not sending charset !)

Athan foxx at freemail.gr
Mon Mar 12 10:12:20 CDT 2007


"Patrick R. Michaud" <pmichaud at pobox.com> wrote in message 
news:20070312140329.GD29823 at host.pmichaud.com...
> Many case-insensitive functions cease to be case-insensitive for utf-8;
> in particular, the '/i' flag to preg_match and preg_replace patterns
> doesn't seem to work for non-ASCII letters.

Correct, but mbstring and many other UTF-8 wrapper libraries exist and work.
Many widely used applications (e.g. Drupal, Dokuwiki, Typo, TypoLight, CMSMS)
already make use of such libraries.
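
To illustrate the difference between byte-wise and Unicode-aware case folding, here is a minimal sketch in Python rather than PHP. It is only an analogy for the '/i' behaviour described above, not PmWiki code. The function names and the Greek sample strings are assumptions chosen for the example.

```python
import re

def bytewise_search(query, text):
    """Case-insensitive search on raw UTF-8 bytes.

    With a bytes pattern, re.IGNORECASE folds ASCII letters only,
    so non-ASCII letters must match byte for byte.
    """
    return re.search(re.escape(query.encode("utf-8")),
                     text.encode("utf-8"),
                     re.IGNORECASE)

def unicode_search(query, text):
    """Case-insensitive search on decoded text.

    With a str pattern, re.IGNORECASE applies Unicode case matching.
    """
    return re.search(re.escape(query), text, re.IGNORECASE)

# Hypothetical page title and search term (Greek, upper vs. lower case).
page_text = "ΑΘΗΝΑ"
query = "αθηνα"

bytes_hit = bytewise_search(query, page_text)
unicode_hit = unicode_search(query, page_text)

print("byte-wise match:", bytes_hit is not None)
print("unicode-aware match:", unicode_hit is not None)
```

The byte-wise search fails because the upper- and lower-case Greek letters are encoded as different byte sequences. The second search succeeds because the matcher works on characters instead of bytes, which is roughly what a multibyte-aware wrapper library provides.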

> Another limitation is that some locales (e.g., date and time strings
> returned by PHP's strftime() function) expect to be displayed using an
> iso-8859-1 character set, and thus won't work properly if utf-8 is chosen.

Not if you call setlocale() with an xx_XX.UTF-8 locale. I haven't noticed any
problems using such locales with strftime().
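
As a rough sketch of the same idea outside PHP, the Python snippet below switches to a named locale, formats a month name with strftime, and then restores the previous setting. The helper name `month_name_in` and the locale `el_GR.UTF-8` are assumptions for illustration. Whether a given locale is installed depends on the system, so the helper returns None when it is missing.

```python
import locale
import time

def month_name_in(locale_name, month=3):
    """Return the full month name under locale_name, or None if unavailable."""
    previous = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        locale.setlocale(locale.LC_ALL, locale_name)
    except locale.Error:
        return None  # locale is not installed on this system
    try:
        # 1 March 2007 was a Thursday (tm_wday=3) and day 60 of the year.
        stamp = (2007, month, 1, 0, 0, 0, 3, 60, 0)
        return time.strftime("%B", stamp)
    finally:
        locale.setlocale(locale.LC_ALL, previous)  # always restore

print(month_name_in("C"))
print(month_name_in("el_GR.UTF-8"))  # None if the locale is not installed
```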

> Still another problem is dealing with non-ASCII characters in filenames;

UTF-8 filenames are a different (sad) story. IMHO, Latin filenames wouldn't
be a problem if PmWiki were able to handle UTF-8 content flawlessly.

> Rather than default to utf-8 and thus hit _every_ site with the workaround
> performance penalty even when utf-8 isn't being used, PmWiki defaults to
> iso-8859-1 (where PHP works most efficiently) and lets those sites that
> need or want utf-8 encoding do a simple include to get utf-8 to work.

I'd agree with that, IF utf-8 worked. Gaps such as the lack of
case-insensitive search are very annoying for a wiki.

> This isn't to say that PmWiki will never switch to using a utf-8 encoding
> by default...

Nice to hear that!

Regards,
Athan 




