[pmwiki-users] New feature proposal - need help with markup selection
Joachim Durchholz
jo at durchholz.org
Wed Mar 16 06:42:56 CST 2005
Hi all,
here's a large-ish feature proposal. I have done all the feasibility
research and will implement it for a Wiki that I set up for a friend,
but I'd like to ask here for improvements and whether it's useful as an
inclusion in the official PmWiki distribution.
Here goes:
I propose two extensions with a very useful synergy: substitutions, and
repeat groups.
Substitutions are supposed to work as follows:
* Writers insert markup that assigns a value to a symbol.
* Writers use the symbol in other places of the page. When rendering,
PmWiki substitutes the values for the symbols.
This has the following advantages:
* Largeish texts can be defined once and for all, without need for
typing the same thing over and over again. (Especially useful if the
full text contains a lot of markup, e.g. some complicated
* The "print" skin could leave the symbols in, and print them as
footnotes at the bottom of the page. (The Wiki installed at the Ubuntu
site does this, and it's a nice idea.) I'm not sure whether this
"outsourcing" should be done just in the "print" skin, or in general.
* It can be used to decouple table layout from table contents, like this:
|| &TH1 || &TH2 ||
|| &TD1a || &TD2a ||
|| &TD1b || &TD2b ||
&TH1 = foo
&TH2 = bar
&TD1a = This is a very lengthy text that goes into the very first table
cell.
&TD2a = This is a very lengthy text that goes into the second cell of
the first line.
&TD1b = This is a very lengthy text that goes into the first cell of the
second line.
&TD2b = This is a very lengthy text that goes into the second cell of
the second line.
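To make the substitution idea concrete, here is a minimal sketch in
Python (not PmWiki code). It assumes, purely for illustration, that a
definition is a line starting with "&NAME =" in column 1, that non-blank
lines following it continue the definition, and that a blank line ends
it. The names split_definitions and substitute are hypothetical.

```python
import re

# Assumed syntax: "&NAME = value" in column 1 defines a symbol,
# "&NAME" anywhere else in the page uses it.
DEF_RE = re.compile(r"^&(\w+)\s*=\s*(.*)$")
USE_RE = re.compile(r"&(\w+)")


def split_definitions(page):
    """Separate the page body from the '&NAME = value' definitions."""
    body, defs, current = [], {}, None
    for line in page.splitlines():
        m = DEF_RE.match(line)
        if m:
            current = m.group(1)
            defs[current] = m.group(2)
        elif current is not None and line.strip():
            # Non-blank line right after a definition: continuation text.
            defs[current] += " " + line.strip()
        else:
            # Ordinary body line (a blank line also ends a definition).
            current = None
            body.append(line)
    return "\n".join(body), defs


def substitute(page):
    """Replace each symbol in the body by its definition."""
    body, defs = split_definitions(page)
    # Symbols without a definition are left untouched.
    return USE_RE.sub(lambda m: defs.get(m.group(1), m.group(0)), body)
```

Leaving undefined symbols untouched is a design choice of this sketch
only; a real recipe might prefer to flag them.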
Repeat groups are supposed to work as follows:
* There's new markup, say (:do:)...(:default:)...(:done:). It can be
nested (not sure how to implement that best using regexps though...
depends on the details of how rules are applied: if they are applied
repeatedly, and from the beginning of the text, until they don't find an
applicable substitution point, then it's no problem; otherwise I'll
probably have to do some serious programming).
* The stuff between (:do:) and (:default:) is examined for the use of
symbols. For each symbol, the appropriate expansion is substituted in.
This "eats up" one definition for each symbol (it isn't available in
later rounds). This substitute-the-symbols-and-eat-the-definition
process is repeated until no more definitions are left (yes, I'm
assuming there are multiple definitions for symbols). If one symbol has
fewer definitions than others, then its last definition is reused until
all symbols have run out of definitions. (The example below will
clarify this.)
* If there is no definition for any symbol, then the text between
(:default:) and (:done:) is used. Likewise if the text between (:do:)
and (:default:) doesn't contain any symbols.
Here's an example:
--- snip ---
|| Term || Definition ||
(:do:)
|| &T || &D ||
(:default:)
|| &ND ||
(:done:)
&ND = There are no definitions yet.
&T = foo
&D = This is the archetypical [[metasyntactic variable]] used in
[[Computer Science]] lectures.
&T = bar
&D = This is used if the lecturer needs a second [[metasyntactic variable]].
&T = baz
&D = This is used if the lecturer needs a third one. Last seen in the
form of "function foo (bar: baz)".
&T = fump
--- snip ---
The nice thing here is that it makes it easy to add additional rows to a
table, with minimal syntactic fuss.
I purposely inserted an error: there's no &D for the last &T. The last
line of the generated table will, hence, contain a repetition of the
last &D definition - an obvious error, but easily corrected by adding a
line that reads
--- snip ---
&D =
--- snip ---
leaving the last table line conveniently empty until the author finds
something witty to write for "fump" :-)
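The expansion rule for a repeat group can be sketched as follows, again
in Python rather than as PmWiki code. It assumes the definitions have
already been collected into a mapping from symbol name to a list of
values in page order. The function name expand_repeat is hypothetical,
and two readings are my own assumptions: the default part is used only
if none of the symbols in the loop body has a definition, and symbols in
the default part get their first definition.

```python
import re

USE_RE = re.compile(r"&(\w+)")


def expand_repeat(do_body, default_body, defs):
    """Expand one repeat group.

    do_body      -- text between the loop start and (:default:)
    default_body -- text between (:default:) and (:done:)
    defs         -- dict: symbol name -> list of definitions, in page order
    """
    # Symbols used in the loop body that have at least one definition.
    used = [s for s in USE_RE.findall(do_body) if defs.get(s)]
    if not used:
        # No symbols, or none of them defined: fall back to the default
        # part (assumption: its symbols take their first definition).
        def first_value(m):
            values = defs.get(m.group(1))
            return values[0] if values else m.group(0)
        return USE_RE.sub(first_value, default_body)

    # Repeat until the symbol with the most definitions has run out.
    rounds = max(len(defs[s]) for s in used)
    rows = []
    for i in range(rounds):
        def pick(m, i=i):
            values = defs.get(m.group(1))
            if not values:
                return m.group(0)  # undefined symbol stays as written
            # A symbol that has run out reuses its last definition.
            return values[min(i, len(values) - 1)]
        rows.append(USE_RE.sub(pick, do_body))
    return "\n".join(rows)
```

With four definitions for T and three for D, the fourth row repeats the
third D value, which is the deliberate error in the example above.
Adding an empty fourth D definition gives an empty cell instead.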
There's an interesting extension, though I have no current plans to
actually implement that: Definitions could refer to a database query,
and PmWiki could iterate down the resulting rows and take the
definitions from there. Of course, this extension won't help those with
no databases, but those who do have one could benefit hugely from
this kind of functionality.
Another interesting alternative would be taking definitions from text
files. This should please the database-challenged among us :-)
Issues known to need input/clarifications/critique
--------------------------------------------------
* The &VARIABLE markup is a bit ugly. I would have preferred * over &,
but I think some people will want to have the Mozilla mail&news markups
with * for bold, _ for underline, and / for italics. There are plenty of
other special characters that could be used, but I have no idea why one
should be preferable over another one.
* I have no idea what to allow or forbid in the variable names
themselves. Just uppercase letters is probably too rigid.
* I don't know whether a markup that starts with a & sign in column 1
collides with any other markups - it doesn't collide with those of
PmWiki itself (I have checked), but I haven't looked into other markups.
* I've been thinking about "alternating symbols". E.g. define two
symbols that are used alternatively (say, they might contain the
background color definition for table rows). This would eliminate the
need for some other markup. I have no idea yet what the markup for the
symbol definitions should look like, though. "&&VARIABLE =" as a "start
of the repeat group" marker maybe?
Implementation
--------------
Easy to do as a cookbook recipe, since it can be done from the markup
functions. (Hats off to PM who chose a rule-based markup system, and
used partial ordering constraints to establish evaluation order - it's
exactly the way such things should be done and are almost never done.
I'm impressed, and that's indeed a rare thing).
Might wander into the core once PM decides to use the mechanism for
other core functionality, but I'm happy to leave it as it is.
Nested loops are a bit hairy. My first idea was to use an inside-out
strategy, finding loop constructs that don't contain other looping
keywords (easy to do as a regex: search for (:do:)...(:done:) that
doesn't contain a (:do:) inside, substitute, rinse and repeat until no
(:do:) is left).
Unfortunately, that doesn't work: an inner loop may get replicated and
filled with variables defined in the outer loop, so outer loops have to
be done first. Which, in turn, means that the regex that finds the loop
must count opening and closing loop keywords and declare "end of loop"
as soon as the open-close keyword count returns to zero. Standard
regexes can't count, so I'm going to have to rely on nonstandard tricks
- and that could become hairy. If anybody already has done something
like that, I'd love to hear about the best techniques...
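One way around the counting problem, sketched here in Python as an
illustration only, is to let the regex find just the keywords and do
the counting in ordinary code. The sketch assumes the keywords are
spelled (:do:), (:default:) and (:done:); the function name
find_outer_loop is hypothetical. It returns the first outermost loop,
so an expander could process that one and then handle the replicated
inner loops in a later pass.

```python
import re

# The regex only tokenizes; nesting depth is tracked by hand.
TOKEN_RE = re.compile(r"\(:(do|default|done):\)")


def find_outer_loop(text):
    """Locate the first outermost loop in text.

    Returns (start, end, do_body, default_body), where text[start:end]
    is the whole loop including its keywords, or None if there is no
    complete loop.
    """
    depth = 0
    open_m = None
    default_m = None
    for m in TOKEN_RE.finditer(text):
        kind = m.group(1)
        if kind == "do":
            if depth == 0:
                open_m, default_m = m, None
            depth += 1
        elif depth == 0:
            continue  # stray (:default:) or (:done:) outside any loop
        elif kind == "default":
            # Only a (:default:) at depth 1 belongs to the outer loop.
            if depth == 1 and default_m is None:
                default_m = m
        else:  # "done"
            depth -= 1
            if depth == 0:
                body_end = default_m.start() if default_m else m.start()
                do_body = text[open_m.end():body_end]
                default_body = text[default_m.end():m.start()] if default_m else ""
                return open_m.start(), m.end(), do_body, default_body
    return None  # unbalanced or no loop at all
```

Inner loops stay verbatim inside do_body, so they can be expanded after
the outer loop has replicated them.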
Comments? Critique? Encouragement? All are most welcome :-)
Regards,
Jo