[pmwiki-users] Greek Utf-8

Seth Cherney sethcherney at yahoo.com
Tue Feb 13 10:08:09 CST 2007


--- "Patrick R. Michaud" <pmichaud at pobox.com> wrote:

> On Tue, Feb 13, 2007 at 03:01:34AM -0800, Seth Cherney wrote:
> > OK, I thought the utf-8 script was only for using utf-8 in pagenames.  I posted to
> > http://www.pmwiki.org/wiki/UTF8/GreekDiacritics and everything checks out.
> > 
> > So, the problem with saxon was that the StaticPageGenerator generates pages in iso-8859-1, not
> > utf-8.  Thus, I had to add a declaration to the top of the file:  <?xml version="1.0"
> > encoding="iso-8859-1"?> (left the rest of decl intact - this is a w3c standard, but broken in
> > IE for complex layouts) and all is clear.
> > 
> > I am using Win XP, Firefox 3 Alpha 2. apache server.  I doubt this is browser dependent, but
> > server based.  I can add this declaration, whatever.  
> > 
> > **BUT, will a linux server still output in this encoding?**
> 
> The server generally has little to do with the encoding -- it just outputs
> bytes.  As long as the output HTML contains the utf-8 declaration, the browser
> will interpret the contents as utf-8.


It just does not work.

The page, even if declared as utf-8, is still *not truly* encoded in utf-8.  saxon will still raise
an error (browsers couldn't care less; they don't work at the same low level of processing, as far
as I can tell).

Unless I have a header such as:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html 
    PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
    "xhtml1-transitional.dtd">  \\I keep the dtds on my hdd for indexing purposes
<html>
<head>
  <title>Xrisma.org - Source Texts in Catholic Theology</title>
  <meta http-equiv='Content-Style-Type' content='text/css' />
  <meta http-equiv='Content-Type' content='text/html' charset='utf-8' />
  <link rel='stylesheet' href='$SkinDirUrl/xrisma.css' type='text/css' />
  <link rel='shortcut icon' href='$SkinDirUrl/favicon.ico'/>
  <base target="content"/>
  <!--HTMLHeader-->
</head>

the output is full of byte errors (i.e. the server is outputting iso-8859-1 bytes, not utf-8).

Delete that first line and you are toast.  That is fine, except that it is a hack.  I will leave
it like this, but this is just an FYI for those on Windows boxes at least.
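
For anyone who wants to check what a generated page really contains, here is a minimal Python
sketch (not part of PmWiki or saxon; the function names and the sample bytes are made up for
illustration).  It tests whether a byte string is valid utf-8 and, assuming the bytes really are
iso-8859-1, re-encodes them as utf-8:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Read the bytes as ISO-8859-1 text and re-encode that text as UTF-8.

    Assumption: the input really is ISO-8859-1. Every byte sequence decodes
    as ISO-8859-1, so this cannot detect a wrong guess.
    """
    return data.decode("iso-8859-1").encode("utf-8")


# Hypothetical sample: "cafe" with an e-acute stored as the single byte 0xE9.
sample = b"caf\xe9"
print(is_valid_utf8(sample))                   # False
print(latin1_to_utf8(sample))                  # b'caf\xc3\xa9'
print(is_valid_utf8(latin1_to_utf8(sample)))   # True
```

If the check fails on a page that declares utf-8, the declaration and the actual bytes disagree,
which would match the kind of error a strict XML parser reports.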

PS:  Any tips on converting between true utf-8 and the &#nnn; sequence in ROSpatterns and/or in
markups?  I will write a verbose version if necessary, unless someone knows a command or shorthand
for it.
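
Outside of PmWiki, the conversion itself can be sketched in a few lines of Python using only the
standard library.  This is an illustration of the round trip, not a PmWiki markup rule; the
function names are hypothetical:

```python
import html


def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal &#nnn; reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_char_refs(text: str) -> str:
    """Turn character references back into real characters.

    Note: html.unescape also expands named entities such as &amp;.
    """
    return html.unescape(text)


greek = "\u03b1\u03b2"  # Greek small letters alpha and beta
refs = to_char_refs(greek)
print(refs)                           # &#945;&#946;
print(from_char_refs(refs) == greek)  # True
```

The result of from_char_refs is a Python string; calling .encode("utf-8") on it gives the true
utf-8 bytes.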

> 
> In short, it should work equally well under Linux.
> 
> Pm
> 
