[pmwiki-users] Problem with utf & i18n (French accented characters)

Patrick R. Michaud pmichaud at pobox.com
Sat Jun 9 10:08:19 CDT 2007


On Fri, Jun 08, 2007 at 04:42:51PM -0400, Donald Z. Osborn wrote:
>    I am setting up a wiki (farm) that needs to be in two working languages
>    (English & French) and accommodate texts in some West African languages
>    that use extended Latin scripts - so it should accommodate UTF-8.
> 
>    The early set-up is okay now except that I encountered an odd display
>    issue with the accented French characters in the interface: Basically,
>    although the output is in utf-8 and my browsers are set to utf-8 I am
>    getting the black diamond in Firefox 2 and empty square in MSIE7 for the
>    accented characters. Switch to iso-8859-1 and everything appears normal.
>    This is not what I expected.

For a variety of reasons, the PmWikiFr.* pages (including PmWikiFr.XLPage)
have been built using iso-8859-1 encoding instead of utf-8.  So, their
accented characters will not display correctly inside a utf-8 encoded page.
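
To illustrate the symptom (a minimal sketch in Python, not PmWiki code;
the sample word is just an example): an accented letter stored as a
single iso-8859-1 byte is not a valid utf-8 sequence, so a utf-8
decoder substitutes the replacement character U+FFFD, which is what
browsers draw as a black diamond or empty square.

```python
# The French word "été" as it would be stored in an iso-8859-1 page.
latin1_bytes = "été".encode("iso-8859-1")  # b'\xe9t\xe9'

# A browser told the page is utf-8 decodes those same bytes as utf-8.
# 0xE9 announces a multi-byte sequence, but the bytes that follow are
# not valid continuation bytes, so each 0xE9 becomes U+FFFD.
as_seen_in_utf8_page = latin1_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the bytes were written in works fine,
# which matches "switch to iso-8859-1 and everything appears normal".
as_seen_in_latin1_page = latin1_bytes.decode("iso-8859-1")

print(repr(as_seen_in_utf8_page))
print(as_seen_in_latin1_page)
```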

Thus far PmWiki doesn't have the capability to automatically translate
between character encodings, because many PHP installations don't
provide the necessary translation functions.  In recent versions
of PmWiki 2.2.0-beta I've started storing the character encoding
identification as part of the page so that PmWiki can eventually
do this sort of translation, but we're not quite there yet.
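
As a rough sketch of why storing the encoding with the page helps
(Python, purely illustrative): once each page records its own charset,
converting it to the site's charset is a decode followed by an encode.
The dict layout, the `transcode_page` name, and the iso-8859-1 default
below are assumptions made for this example, not PmWiki's actual page
format or API.

```python
def transcode_page(page: dict, target_charset: str = "utf-8") -> dict:
    """Return a copy of `page` with its text re-encoded in target_charset.

    `page` is a hypothetical record: {"charset": str, "text": bytes}.
    """
    # Assumed default for pages saved before the charset was recorded.
    source_charset = page.get("charset", "iso-8859-1")
    if source_charset.lower() == target_charset.lower():
        return dict(page)  # already in the requested encoding
    text = page["text"].decode(source_charset)
    return {**page, "charset": target_charset,
            "text": text.encode(target_charset)}


french_page = {"charset": "iso-8859-1", "text": b"Fran\xe7ais"}
converted = transcode_page(french_page)
print(converted["charset"], converted["text"])
```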

Since my machine _does_ have the necessary translation functions, it
might be possible for me to come up with utf-8 versions of the
PmWikiFr.* and other pages, and publish them simultaneously with the
iso-8859-1 versions.  But managing all of that -- a separate set of
encodings for each language's translations, and trying to explain
to admins when to use each -- is likely to be a real headache.

I'm very much open to suggestions on this topic.

It would be very cool if we could find a good way to seamlessly
convert existing iso-8859-1 and other sites to using utf-8
(with the option to remain iso-8859-1 for those that want it).
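
One possible approach for pages that carry no encoding information,
sketched in Python under stated assumptions: treat text that already
decodes as strict utf-8 as utf-8, and otherwise assume iso-8859-1.
The `to_utf8` helper is hypothetical, and the heuristic is not
foolproof, since a few iso-8859-1 byte sequences also happen to be
valid utf-8.

```python
def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return `raw` as utf-8 bytes, guessing the source encoding."""
    try:
        raw.decode("utf-8")  # strict: raises on invalid sequences
        return raw           # already valid utf-8 (or plain ASCII)
    except UnicodeDecodeError:
        # Assumption: anything that is not valid utf-8 is `fallback`.
        return raw.decode(fallback).encode("utf-8")


print(to_utf8(b"\xe9t\xe9"))          # iso-8859-1 input gets converted
print(to_utf8(b"\xc3\xa9t\xc3\xa9"))  # utf-8 input passes through
```

A site that wants to remain iso-8859-1 would simply skip the
conversion step.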

Pm


