[Pmwiki-users] Re: null characters or pattern breaking characters

Christian Ridderström chr
Wed Jan 14 05:56:23 CST 2004


I forgot to send this for a few days, so I hope it's still useful :-)
/Christian

On Mon, 12 Jan 2004, Patrick R. Michaud wrote:

> I've done a bit more research and several comments come to mind:
> 
> 1.  It'd be really handy if the "null character" sequence began
> with a character that's already not considered to be part of a valid
> URI.  In PmWiki that set is currently
>         space  <  >  [  ]  "  '  (  )

Why would it be handy?

(In my original suggestion, the null-token would be substituted with a 
'null-token-character' before any other substitutions are done. Then the 
usual substitutions etc are performed and finally, during the actual 
output of HTML-code the 'null-token-character' is just removed.)
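
A rough sketch of that three-step idea, in Python purely for illustration. Everything named here is an assumption and not PmWiki code: `&NULL;` as the token, a NUL byte as the internal 'null-token-character', and a deliberately simplified WikiWord pattern standing in for "the usual substitutions".

```python
import re

NULL_TOKEN = "&NULL;"   # hypothetical markup typed by the author
NULL_CHAR = "\x00"      # hypothetical internal placeholder character

# Simplified stand-in for a WikiWord pattern; NOT PmWiki's real pattern.
WIKIWORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def render(text: str) -> str:
    # Step 1: swap the null token for the placeholder before anything else.
    text = text.replace(NULL_TOKEN, NULL_CHAR)
    # Step 2: the usual substitutions; the placeholder breaks up patterns.
    text = WIKIWORD.sub(
        lambda m: '<a href="{0}">{0}</a>'.format(m.group(0)), text
    )
    # Step 3: drop the placeholder when producing the final output.
    return text.replace(NULL_CHAR, "")


print(render("WikiWords"))        # the whole word becomes the link
print(render("WikiWord&NULL;s"))  # the link stops before the "s"
print(render("Wiki&NULL;Word"))   # no link at all
```

Since the placeholder is only removed at the very end, it should not matter where in the markup the token is placed.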

> 2.  After re-reading RFC 2396 and RFC 2732, it's apparent that there are
> characters that are not allowed in URIs that PmWiki currently
> allows.  In particular, the following characters are not allowed 
> in the path component of URIs:
>         space  <  >  "  {  }  |  \  ^  `
> Of course, this doesn't mean that there aren't people and systems that
> build URIs using these characters (e.g., the vertical bar)--it just
> means that those URIs aren't technically valid.  So, there's a reasonable
> argument to be made that PmWiki should add each of the above to the
> URI delimiter

I agree that pmwiki definitely shouldn't include the characters above in 
URIs (the user will just have to write them using %xx).
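
To illustrate what adding those characters to the delimiter set would do, here is a small Python sketch. The pattern is a hypothetical stand-in (the name `find_uri` and the regex are my assumptions, not PmWiki's actual URI pattern); it simply stops at the first excluded character.

```python
import re

# Characters from the two lists quoted above: the ones PmWiki is said to
# exclude already, plus the ones the RFCs disallow in the path component.
EXCLUDED = " <>[]\"'(){}|\\^`"

# Hypothetical URI pattern: a scheme, then everything up to the first
# whitespace or excluded character.
URI = re.compile(r"https?:[^\s" + re.escape(EXCLUDED) + r"]+")


def find_uri(text):
    """Return the first URI-like match in text, or None if there is none."""
    m = URI.search(text)
    return m.group(0) if m else None


print(find_uri("[[http://www.bla.org]]"))   # brackets end the URI
print(find_uri("http://www.bla.org/a|b"))   # the vertical bar ends the URI
print(find_uri("http://www.bla.org''''s"))  # single quotes end the URI
```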

>, which would likely eliminate much of the need for the
> null character sequence in the first place (unless I'm missing a case).

If you mean situations where it's *necessary* to use a null-token, that 
might be true, but there are still cases where using a null token is very 
convenient. For instance, I just saw that this is how I 
should quote directives:

	[=[[=]include:...]]

which looks a bit confusing to me, and you have to be careful how you 
nest the brackets... OTOH, with a null token (in any of the three 
spellings below), any of these alternatives could be used:

	token = &NULL;           token = &;           token = ``
	[[&NULL;include:...]]    [[&;include:...]]    [[``include:...]]
	[&NULL;[include:...]]    [&;[include:...]]    [``[include:...]]
	[[include:...]&NULL;]    [[include:...]&;]    [[include:...]``]

where they have the advantage that you can put the null token anywhere. 

----
A parallel idea: Maybe backticks could escape directives? E.g.

	`[[include:...]]

I'm not sure about the implementation though... what would this produce:

	`[[http://www.bla.org [[http://www.bla.org]]]]
----

> 3.  On the other hand, PmWiki sometimes departs from rigorously following
> a standard in order to be consistent with common practice or meet other 
> goals.  For example, parentheses and single quotes *are* valid characters in
> a URI, but PmWiki excludes them from the URI sequence because they're
> more commonly used in PmWiki as delimiters than as components of other
> URIs.  So, just as PmWiki disallows some characters that the URI spec
> allows, there may be practical reasons that PmWiki should continue to
> allow characters that the URI spec disallows.

I think it's perfectly fine for PmWiki's URI patterns to be *overly* 
restrictive and only swallow a subset of the allowed characters, 
because the user can always write the unusual characters using %xx.

(we should add some documentation of this though, and a small table of 
what the codes are for some of the unusual (but valid) characters)
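
As a starting point for such a table, the codes can be generated mechanically. This is only a sketch in Python; the helper name `percent_code` is made up, and the character list is just the union of the two lists quoted earlier in this message.

```python
def percent_code(ch):
    """Return the %xx escape for a single ASCII character."""
    return "%{:02X}".format(ord(ch))


# Characters mentioned earlier in this thread as URI delimiters or as
# invalid in the path component of a URI.
CHARS = [" ", "<", ">", "[", "]", '"', "'", "(", ")",
         "{", "}", "|", "\\", "^", "`"]

for ch in CHARS:
    print(repr(ch), percent_code(ch))
```

So, for instance, a space would be written as %20 and a vertical bar as %7C.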

> 4.  Finally, after writing #3 above it occurs to me that we already have
> a null character sequence that would work:  ''''  (four single quotes).
> PmWiki already excludes single quotes from the URI pattern, and four
> single quotes becomes an empty italics sequence.  In fact, this is the
> common "null character" sequence in many existing wikis, which use it
> for pluralization and alternate endings of WikiWord''''s.  
> http://www.pmichaud.com/wiki/Test/InterLinkPattern demonstrates that
> this works as desired.

It works sometimes, but sometimes it fails... see this page 
for examples:
	http://www.pmichaud.com/wiki/Test/NullTokenTests

/Christian


-- 
Dr. Christian Ridderström, +46-8-768 39 44       http://www.md.kth.se/~chr




