[pmwiki-users] special characters revealing anchors
Patrick R. Michaud
pmichaud at pobox.com
Tue Oct 28 10:03:42 CDT 2008
On Tue, Oct 28, 2008 at 09:42:38AM +0100, Jean-Fabrice [gmail] wrote:
> 2008/10/28 adam overton <a at plus1plus1plus.org>:
> > i recently discovered some broken/visible anchors on a user's page.
> > his use of a special character at the beginning of the anchor seems
> > to be the culprit, as it causes the anchor to become visible (special
> > characters within a word don't seem to cause problems). here is an
> > example;
> > [[#àdroite]]
> afaik, pmwiki respects w3c standards and recommendations while this
> syntax ([[#àdroite]]) does not.
> Take a look at http://www.w3.org/TR/REC-html40/types.html#type-name :
> ID and NAME tokens must begin with a letter ([A-Za-z]) and may be
> followed by any number of letters, digits ([0-9]), hyphens ("-"),
> underscores ("_"), colons (":"), and periods (".").
This is correct -- PmWiki follows the W3C standards here: an anchor
name must begin with a letter (A-Za-z) and may contain only letters,
digits, hyphens, underscores, colons, and periods. If the name contains
anything else, PmWiki does not recognize the [[#...]] as an anchor, so
the markup is left visible on the page.
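
As a rough illustration of the HTML 4 rule quoted above (a sketch in
Python, not PmWiki's actual markup rule; the names NAME_TOKEN and
is_valid_anchor are made up for this example):

```python
import re

# HTML 4 ID/NAME token: one ASCII letter, then any number of
# ASCII letters, digits, "-", "_", ":" or ".".
NAME_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9\-_:.]*")

def is_valid_anchor(name: str) -> bool:
    """Return True if name is a valid HTML 4 ID/NAME token (hypothetical helper)."""
    return NAME_TOKEN.fullmatch(name) is not None

# Print each candidate name alongside whether it passes the rule.
for candidate in ["adroite", "àdroite", "sec-1.2", "1st"]:
    print(candidate, is_valid_anchor(candidate))
```

Under this rule "àdroite" is rejected because its first character is
not an ASCII letter, which matches the case reported above.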
There has been some discussion of getting PmWiki to automatically fold
non-ASCII characters into the ASCII set, so that [[#àdroite]] would
generate "adroite" in the anchor tag, and thus be valid HTML. But
this would undoubtedly confuse people because a URL ending with
...#àdroite would not find the anchor.
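
The folding idea and the mismatch it creates can be sketched as
follows. This is only an assumption about how such folding might work
(Unicode decomposition, then dropping non-ASCII code points), not an
existing PmWiki feature; fold_to_ascii is a hypothetical name.

```python
import unicodedata

def fold_to_ascii(name: str) -> str:
    """Strip accents by decomposing characters and dropping non-ASCII code points.

    Characters with no ASCII decomposition are simply dropped.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")

written = "àdroite"               # what the author typed in [[#...]]
anchor = fold_to_ascii(written)   # what would end up in the anchor tag

print(anchor)             # adroite
print(anchor == written)  # False: a link to #àdroite would miss the anchor
```

The generated anchor is valid HTML, but the fragment a user would
naturally type in a link no longer matches it.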
It's also possible to redefine the anchor rule so that it
recognizes non-ASCII characters in anchors and uses them in the
output; this of course results in invalid HTML if used (at least
according to the current spec).
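
For comparison, a relaxed pattern that accepts non-ASCII letters might
look like the sketch below. It is written in Python for illustration
only and is not the PmWiki markup rule itself; RELAXED_ANCHOR and
find_anchors are hypothetical names.

```python
import re
import unicodedata

# Relaxed rule: the first character may be any Unicode letter, and the
# rest may be Unicode word characters, "-", ":" or ".".
# [^\W\d_] means "a word character that is neither a digit nor underscore".
RELAXED_ANCHOR = re.compile(r"\[\[#([^\W\d_][\w\-:.]*)\]\]")

def find_anchors(text: str) -> list[str]:
    """Return the anchor names found in [[#...]] markup (hypothetical helper)."""
    # Normalize to NFC so an accented letter is a single code point.
    return RELAXED_ANCHOR.findall(unicodedata.normalize("NFC", text))

print(find_anchors("See [[#àdroite]] and [[#top]]"))
```

A name such as "àdroite" is now picked up as an anchor and would be
emitted unchanged, which is exactly what makes the resulting HTML
invalid under the HTML 4 rule.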