[pmwiki-users] Upgrade to 2.2.35 : problem with some page using apostrophe
Petko Yotov
5ko at 5ko.fr
Sun Nov 13 02:40:14 CST 2011
On Sunday 13 November 2011 01:32:40, Petko Yotov wrote :
> There are indeed problems with some characters such as typographical
> apostrophes and dashes, and yes, they are different from normal
> apostrophes.
...
> For some reason, the browsers don't treat these characters the same way as
> PHP does. The PHP iconv() function, like the `iconv` system program,
> appear unable to convert these characters so that the browsers display
> them correctly.
I should also mention the utf8_encode() function.
These characters appear to be non-standard, or more precisely from a different
standard.
The code points 128-159 (0x80-0x9F) are not defined in the ISO-8859-1 charset,
they are defined in the Windows-1252 charset:
https://en.wikipedia.org/wiki/ISO-8859-1
https://en.wikipedia.org/wiki/Windows-1252 (the special characters are
in the cells with thick green borders)
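To make the difference concrete, here is a small sketch in Python (not PmWiki code, just an illustration). It decodes the same bytes once as ISO-8859-1 and once as Windows-1252. Note that Python's "latin-1" codec maps 0x80-0x9F to the invisible C1 control code points rather than rejecting them; the sample bytes are my own choice.

```python
import unicodedata

# Bytes as typically produced by "smart quotes": curly single and double
# quotes, an en dash and an em dash, all in the 0x80-0x9F range.
raw = bytes([0x91, 0x92, 0x93, 0x94, 0x96, 0x97])

# Read strictly as ISO-8859-1, each byte becomes the code point with the
# same number, i.e. U+0091, U+0092, ... which are control characters.
as_latin1 = raw.decode("latin-1")

# Read as Windows-1252, the same bytes become printable punctuation.
as_cp1252 = raw.decode("cp1252")

# Unicode general category of every character in the ISO-8859-1 reading.
latin1_categories = {unicodedata.category(ch) for ch in as_latin1}

print([hex(ord(ch)) for ch in as_latin1])
print(as_cp1252)
print(latin1_categories)
```

The category "Cc" means "control character", which is why nothing useful is displayed when the text is taken literally as ISO-8859-1.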
From Wikipedia:
It is very common to mislabel Windows-1252 text with the charset label
ISO-8859-1. A common result was that all the quotes and apostrophes
(produced by "smart quotes" in Microsoft software) were replaced with
question marks or boxes on non-Windows operating systems, making text
difficult to read. Most modern web browsers and e-mail clients treat the
MIME charset ISO-8859-1 as Windows-1252 in order to accommodate such
mislabeling. This is now standard behavior in the draft HTML 5
specification, which requires that documents advertised as ISO-8859-1
actually be parsed with the Windows-1252 encoding.
So, the PHP conversion functions actually follow the standard, but the text
sent by the browsers is not completely standard.
In order to convert these characters, maybe our automatic conversion from
ISO-8859-1 to UTF-8 should do the same: consider the page text as
Windows-1252. Indeed, if the text contains characters at these code points,
these characters can only be Windows-1252-encoded.
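A minimal sketch of that idea, again in Python for illustration only (the actual PmWiki conversion is PHP and may end up different). The function name `legacy_to_utf8` and the error-handler name `latin1_fallback` are hypothetical. The five bytes that Windows-1252 leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D) make Python's "cp1252" codec raise an error, so this sketch assumes it is acceptable to fall back to the ISO-8859-1 reading for those bytes only.

```python
import codecs


def _latin1_fallback(err):
    """Decode bytes that Windows-1252 leaves unassigned as ISO-8859-1."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # Return the replacement text and the position where decoding resumes.
    return bad.decode("latin-1"), err.end


# Hypothetical handler name, registered once at import time.
codecs.register_error("latin1_fallback", _latin1_fallback)


def legacy_to_utf8(raw: bytes) -> bytes:
    """Convert text labelled ISO-8859-1 to UTF-8, reading it as Windows-1252."""
    text = raw.decode("cp1252", errors="latin1_fallback")
    return text.encode("utf-8")


# A typographical apostrophe (0x92) survives the conversion as U+2019.
print(legacy_to_utf8(b"l\x92homme").decode("utf-8"))
```

Bytes outside 0x80-0x9F are identical in both charsets, so ordinary ISO-8859-1 pages convert exactly as before; only the "smart" punctuation changes.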
Petko