[pmwiki-users] i18n and iso-8859-13
akabaila at pcug.org.au
Thu Mar 31 04:06:24 CST 2005
On Thursday 31 March 2005 04:00, pmwiki-users-request at pmichaud.com wrote:
> Message: 5
> Date: Wed, 30 Mar 2005 11:12:06 -0600
> From: "Patrick R. Michaud" <pmichaud at pobox.com>
> Subject: Re: [pmwiki-users] i18n and iso-8859-13
> To: Algis Kabaila <akabaila at pcug.org.au>
> Cc: pmwiki-users at pmichaud.com
> Message-ID: <20050330171206.GA25408 at pmichaud.com>
> Content-Type: text/plain; charset=us-ascii
> The rest of the answers...
> On Wed, Mar 30, 2005 at 05:59:20PM +1000, Algis Kabaila wrote:
> > 2. Implementation.
> > After downloading the Pm's i18n.tgz and expanding it, I "doctored" the
> > iso-8859-9 implementation, renaming it xlpage-iso-8859-16.php. The
> > content of this "doctored" file is as follows: [...]
> > $HTTPHeaders = "Content-type: text/html; charset=iso-8859-13;";
> > [...]
> > This enables me to write with Lithuanian diacriticals - and read it with
> > the correct glyphs.
> > 3. Questions.
> > 3.1 Considering that my knowledge of php is very close to zilch, is the
> > procedure "safe"?
> Yes, quite safe, until/unless you want to have Lithuanian diacriticals
> in pagenames. At that point things may not work properly and we'll need
> extra stuff to handle pagename conversions. And there may not be any
> good guarantees about the system being able to handle WikiWord links
> with Lithuanian characters in them. (See the xlpage-iso-8859-2.php
> page for some of the things we have to do to get iso-8859-2 names and
> links to work correctly.)
> > 3.2 I notice that there are some pages waiting for translation into
> > Lithuanian: PmWikiLt.PmWikiLt, PmWikiLt.XLPage. If no one else is doing
> > it, I would like to start the translation.
> I already mentioned that the easy way to do this is to edit the
> PmWikiLt.XLPage on pmwiki.org -- however, a very important question at
> the outset is whether to use utf-8 or iso-8859-1 (or even windows-1257)
> for the character set, as controlled by the 'xlpage-i18n' phrase.
> Right now I have it set to use utf-8, because utf-8 is generally a
> reasonable and forward-looking default, but someone more versed with
> Lithuanian browser standards would have to tell us the best choice.
> For the time being I'm adding xlpage-iso-8859-13.php to the i18n.tgz
> distribution, in case it's a better choice than utf-8. We can then add any
> character-specific handling that may be needed to that file.
> I've also looked at supporting windows-1257 encodings in the past, but
> since I don't know the languages it's hard for me to know if I'm doing
> things correctly or not. If someone wants to be a tester/evaluator for
> that charset then I'll be glad to work on it as well.
First and foremost - a big thank you to Patrick for his two replies and to
Joachim for his welcome contribution.
Whilst I know Lithuanian fluently, I am prepared to do an about-turn of 180
degrees. I tried the PmWiki site for Lithuanian (with a small translation). As
I understand it, it runs with utf-8, and it was able to display all the
Lithuanian diacriticals correctly. I had been convinced that it would not
work because, as far as I know, my keyboard issues a one-byte code for each
keypress (KDE keyboard setup, with the Lithuanian diacriticals in the top
numeric row of the keyboard). So there has to be a mapping performed
somewhere and, I would think, a flag that signals the type of mapping.
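
As a rough illustration of the mapping question above: a character only becomes bytes once it is encoded under some charset, and for a web page the declared charset (such as the one in the Content-type header) is usually what plays the role of the "flag". The sketch below is in Python rather than PmWiki's PHP and only shows the general idea; it makes no claim about how PmWiki or KDE handle it internally.

```python
# The same Lithuanian text encoded under two different charsets.
text = "ąčęėįšųūž"

single_byte = text.encode("iso8859_13")  # one byte per character
multi_byte = text.encode("utf-8")        # two bytes per character for these letters

print(len(text), len(single_byte), len(multi_byte))

# Decoding with the wrong charset shows why the declared charset matters:
# the bytes are unchanged, only their interpretation differs.
garbled = multi_byte.decode("iso8859_13", errors="replace")
print(garbled != text)
```

The text survives a round trip only when the encoding and decoding sides agree on the charset.
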
If that sounds confused, it is not surprising - I am confused. I quite like
the idea of utf-8 and I would give it a "fair go", but I am not at all
clear how the mapping is done and I am not inclined to treat it as a "black
box".
I would counsel ignoring the windows-1257 "standard". For 99% of practical
purposes it is the same as iso-8859-13 (also known as the Baltic Rim). All
the characters of the three Baltic languages - Lithuanian, Latvian and
Estonian (and the latter is, as far as I can tell, the same as Finnish) have
the same location (numerical code in the 128..255 range) in iso-8859-13 and
windows-1257. If iso-8859-13 is supported then, for the practical purpose of
reading text, windows-1257 is also supported.
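
The claim about matching code positions can be checked with a short script. This is a sketch in Python, assuming its standard codec names `iso8859_13` and `cp1257` for the two charsets. It lists every byte in 128..255 that the two codecs interpret differently, and checks that the Lithuanian letters encode to identical bytes in both. Some differences are to be expected outside the letters, for instance in the 128..159 range, where windows-1257 places printable characters and iso-8859-13 has control codes.

```python
# Compare how the two codecs interpret each byte in the upper half (128..255).
def decode_byte(value, codec):
    # errors="replace" yields U+FFFD for bytes the codec leaves undefined
    return bytes([value]).decode(codec, errors="replace")

differences = [
    (hex(b), decode_byte(b, "iso8859_13"), decode_byte(b, "cp1257"))
    for b in range(128, 256)
    if decode_byte(b, "iso8859_13") != decode_byte(b, "cp1257")
]

lithuanian = "ĄČĘĖĮŠŲŪŽąčęėįšųūž"
same_bytes = lithuanian.encode("iso8859_13") == lithuanian.encode("cp1257")

print(same_bytes)
for row in differences:
    print(row)
```
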
I must apologise for my slowness in thanking you both, but we do have time
differences that throw things out of gear. Also, tomorrow my ISP is
changing its "backbone" supplier, so we are likely to have some service
interruptions. Let me just repeat - a big thank you - and I am going back to
the "drawing board".
I am familiar with the basics of utf-8, but any clue as to how and where
characters are mapped from the 8-bit representation to the two-byte
representation would be very interesting and helpful. I would very much
like to try utf-8 on a home installation of PmWiki.
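
For reference, the 8-bit to two-byte mapping can be sketched in two steps: the single byte is looked up in the charset's table to get a Unicode code point, and the code point's bits are then packed into the UTF-8 pattern 110xxxxx 10xxxxxx. The Python below is only an illustration of that arithmetic. The helper name `utf8_two_byte` is made up for this example, and in practice the conversion is normally done by the browser or a library routine rather than by hand.

```python
def utf8_two_byte(code_point):
    """Encode a code point in the range U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("outside the two-byte UTF-8 range")
    lead = 0b11000000 | (code_point >> 6)         # 110xxxxx: top 5 bits
    trail = 0b10000000 | (code_point & 0b111111)  # 10xxxxxx: low 6 bits
    return bytes([lead, trail])

# Step 1: look the 8-bit code up in the charset table to get a code point.
code_point = ord(b"\xe0".decode("iso8859_13"))  # byte 0xE0 is "ą" in iso-8859-13

# Step 2: pack the code point's bits into the two-byte UTF-8 pattern.
encoded = utf8_two_byte(code_point)

print(hex(code_point), encoded.hex())
assert encoded == "ą".encode("utf-8")
```
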