[pmwiki-users] i18n and iso-8859-13

Algis Kabaila akabaila at pcug.org.au
Sat Apr 2 04:01:17 CST 2005


On Friday 01 April 2005 04:00, pmwiki-users-request at pmichaud.com wrote:
> Message: 3
> Date: Thu, 31 Mar 2005 09:09:56 -0600
> From: "Patrick R. Michaud" <pmichaud at pobox.com>
> Subject: Re: [pmwiki-users] i18n and iso-8859-13
> To: Algis Kabaila <akabaila at pcug.org.au>
> Cc: pmwiki-users at pmichaud.com
> Message-ID: <20050331150956.GB25984 at pmichaud.com>
> Content-Type: text/plain; charset=us-ascii
> 
> On Thu, Mar 31, 2005 at 08:06:24PM +1000, Algis Kabaila wrote:
> > 
> > First and foremost - a big thank you to Patrick for his two replies and to 
> > Joachim for his welcome contribution.
> 
> You're welcome!
> 
(small snip...)
> > I know Lithuanian fluently, and I tried PmWiki site for Lithuanian
> > (with a small translation).  As I understand it, it runs with utf-8.
> > It was able to display all "Lithuanian" diacriticals correctly.  I was
> > convinced that it would not work, as, as far as I know, my keyboard
> > issues one byte code for each keypress (KDE keyboard setup, with
> > Lithuanian diacriticals in the top numeric row of keyboard).  So there
> > has to be a mapping performed and, I would think, there is a flag that
> > signals the type of mapping.

 
> Actually, I think the mapping is performed by the browser application 
> (or perhaps one of the OS libraries it uses).  Any keypresses are 
> automatically converted into the character encoding required by the 
> page being displayed.  By the time PmWiki receives the form submission 
> the user's browser has already done the work of converting characters
> to utf-8 (or windows-1257, or iso-8859-*, or whatever the page has 
> specified as the encoding).
> 
> Pm
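
To illustrate the point about one keypress versus the bytes that are 
actually submitted, here is a minimal Python sketch.  The sample letter and 
the codec names are chosen only for illustration; it shows that the same 
character turns into a different byte sequence depending on the encoding the 
page asks for.

```python
# Illustration only: one Lithuanian letter, encoded the two ways discussed here.
letter = "ą"

as_iso = letter.encode("iso8859_13")  # single-byte Baltic encoding
as_utf8 = letter.encode("utf-8")      # variable-length Unicode encoding

# The character is the same; only the bytes sent to the server differ.
print(len(as_iso), as_iso)
print(len(as_utf8), as_utf8)
```

So a single keypress can legitimately end up as one byte or as several, 
depending on the charset in force for the page.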
 

Thank you for the answers.  I suspect that, at least in part, mapping is 
accomplished by the web server, when it is invoked.  I base this opinion (and 
it is only an opinion, not knowledge) on the following observation:

I have a sample of Lithuanian text in my home page 
(http://www.pcug.org.au/~akabaila) on a separate HTML page lituanus.html.  I 
recently edited it at home, specifying iso-8859-13.  I used SuSE 9.2 (KDE 
3.3), Konqueror and Kate for editing and testing.  It all went fine - I could 
see the correct glyphs in their correct places.  It confirms your suggestion 
that the browser does the mapping.

Before uploading, I thought it would be worthwhile to look at the page on my 
home "server", which runs Apache 2.0.49, as a web page.  (At present my 
Apache is still "out of the box", without any re-configuration at all.)  All 
glyphs were wrong - I think they were from the iso-8859-1 space.  That 
suggests that the web server does at least influence the mapping.  In 
desperation, regardless of the bad glyphs, I decided to upload lituanus.html 
to my ISP (TIP), which runs Apache 1.xx, probably expertly configured by the 
web gurus.  And, lo and behold, all the glyphs were correctly displayed again 
(the same browser, the same lituanus.html, a different Apache web server).  
My home OS was unchanged.  The ISP (TIP) runs on Linux, a different distro 
from my home network.
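
The "wrong glyphs" symptom can be reproduced without any web server.  The 
following Python sketch is only an illustration (the sample string is made 
up, not taken from lituanus.html): it decodes the same iso-8859-13 bytes 
twice, once with the intended charset and once as iso-8859-1.

```python
# Hypothetical sample text; any Lithuanian letters with diacriticals will do.
sample = "ąčęėįšųūž"

# The bytes as they would sit in a file saved as iso-8859-13.
raw = sample.encode("iso8859_13")

as_intended = raw.decode("iso8859_13")  # reader assumes the right charset
as_latin1 = raw.decode("iso8859_1")     # reader assumes iso-8859-1 instead

print(as_intended)
print(as_latin1)
```

The bytes never change; only the assumption about their encoding does, which 
is enough to turn every diacritical into a Western European letter.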

Again, that suggests to me that the web server makes a big contribution to the 
mapping of characters.  The header of the lituanus.html web page looks as  
follows:

  *********
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <meta content="text/html; charset=ISO-8859-13" http-equiv="content-type">
  <title>lituanus.html</title>
  
</head>
  *********

I tried the same procedure using the utf-8 mapping and named the file 
utf8lituanus.html.  Actually, I changed only one line of the header in this 
new file, compared with lituanus.html, as follows:

  *********
<meta content="text/html; charset=utf-8" http-equiv="content-type">
  *********

After eliminating some glitches (and having concluded that Konqueror 
"prefers" utf-8), I tested it again the same way as lituanus.html with 
iso-8859-13 and got exactly the same results, which led me to conclude that 
the correct mapping was dependent on the configuration of Apache.  

The corollary - I should go back to the drawing board, learn about Apache 
configuration and fix it up.
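
One possible explanation, not verified against either of the two servers: 
HTML 4.01 gives a charset named in the HTTP Content-Type header priority 
over a charset named in a meta tag, so a server that adds its own charset 
would override the page.  The Python sketch below only models that 
precedence rule; the function names and the sample header values are 
hypothetical.

```python
import re

def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not content_type:
        return None
    match = re.search(r'charset\s*=\s*"?([^\s";]+)', content_type, re.I)
    return match.group(1).lower() if match else None

def charset_from_meta(html_bytes):
    """Return the charset declared in a <meta> tag near the top, or None."""
    head = html_bytes[:1024].decode("ascii", errors="replace")
    match = re.search(r'<meta[^>]*charset\s*=\s*["\']?([^\s"\';>]+)', head, re.I)
    return match.group(1).lower() if match else None

def effective_charset(content_type, html_bytes):
    """HTTP header wins if it names a charset; otherwise fall back to <meta>."""
    return charset_from_header(content_type) or charset_from_meta(html_bytes)

page = b'<meta content="text/html; charset=ISO-8859-13" http-equiv="content-type">'

# Server names its own charset: the meta tag is ignored.
print(effective_charset("text/html; charset=ISO-8859-1", page))
# Server names no charset: the meta tag decides.
print(effective_charset("text/html", page))
```

If this is what is happening, the same file would display differently on two 
servers purely because of the header each one sends.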

It all sounds logical and reasonable so far.  Now for the "unreasonable" bit: 
I configured the home PmWiki for Lithuanian characters by including the 
following line in the .../local/config.php file:

include_once('scripts/xlpage-iso-8859-13.php');

xlpage-iso-8859-13.php is the same as xlpage-iso-8859-9.php with 9 
replaced by 13.

Well, in spite of my Apache not being able to display lituanus.html 
correctly, PmWiki, running on the same unconfigured Apache, displays 
Lithuanian glyphs correctly.  I am happy about it, but why is it so?  That is 
the real question that I cannot answer and that "blows out of the water" my 
tentative conclusions.  

Also, I looked with Konqueror at the code that PmWiki produces on the 
pmwiki.org site, but cannot see any "charset=xxxx" specification.  Where is 
it?  Is it in CSS and, if so, how can I access it?
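
One place worth checking, besides the page source, is the HTTP response 
header, since a charset can be declared there without appearing anywhere in 
the HTML.  The Python sketch below is only an illustration: the function 
names are hypothetical, and whether a given site actually sends a charset 
this way is something to test rather than assume.

```python
from email.message import Message
from urllib.request import urlopen

def charset_of(content_type_value):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type_value
    return msg.get_content_charset()  # lower-cased name, or None if absent

def served_charset(url):
    """Fetch a page and report the charset named in its HTTP header, if any."""
    with urlopen(url) as response:
        return response.headers.get_content_charset()

# Offline check of the parsing step with a made-up header value.
print(charset_of("text/html; charset=ISO-8859-13"))
```

Calling served_charset() with the address of a wiki page and then with the 
address of a static page on the same server would show whether the two are 
being sent with different charsets.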

I also tried the same kludge configuration for utf-8 on my HAN-PmWiki 
(HAN=Home Area Network) as I used for PmWiki on HAN with iso-8859-13, but 
that "did not work out" - it showed wrong glyphs. 

I am somewhat "at sea" about how to find the "proper", recommended procedure 
for installing utf-8 and Lithuanian on PmWiki.  I would like to try utf-8 - 
perhaps it would work as well as iso-8859-13.  The latter mysteriously works 
OK.

I would appreciate any pointers on:

1. How to install utf-8 with another language (for me, Lithuanian) into 
PmWiki;

2. How to configure Apache 2.xx with iso-8859-13 and utf-8, and where and how 
to learn more about the Apache web server, preferably without having to read 
1000 pages of docs that cover other aspects of the web server.

I am sorry to load you with such a long text, but I did attempt to 
document what I was trying to do and what my tentative conclusions were and 
why.

I do appreciate the patience of all correspondents on this mailing list and 
admire Patrick M. (Pm) for his considerate, patient responses as well as the 
great work he continues to do - you are a great guy, Patrick!

Al.
-- 
Algis Kabaila
http://www.pcug.org.au/~akabaila



