[Pmwiki-users] Markup

Pedro Miller Rabinovitch miller
Wed Aug 27 16:17:16 CDT 2003


<IMHO>
On Wednesday 27 August 2003 18:34, you wrote:
> On Wed, Aug 27, 2003 at 05:50:45PM -0300, Pedro Miller Rabinovitch wrote:
> > It's pretty easy, when you think about it -- we either keep
> > the server-side version in a different, specialized storage format (e.g.,
> > XML, whatever) or convert different input markup into the local flavor.
>
> It's only easy as long as you're able to come up with a reliable
> translation to convert from the specialized storage format back to a
> readable markup-- i.e., you need a *bidirectional* translator that can
> preserve all of the markup semantics.  The need for bidirectional

Right. I've worked with unidirectional translation from XML and Lua to HTML, TeX 
and text. Of course, bidirectional *is* tougher, you're correct. However, 
pmwiki.php _already_ implements, say, PmW to HTML conversion. Say we changed 
it to read PmW and generate XML, or any other format for that matter. That 
should be pretty easy, when you think about it: instead of outputting 
<h1>\1</h1>, just output <heading level="1">\1</heading> or its equivalent 
(<h1></h1>, of course, would still be acceptable!). Converting the XML to HTML 
for display would be easy: just plug in the original substitutions. Converting 
the XML back to PmW... see below.
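A minimal sketch of that idea, in Python rather than pmwiki.php's PHP. The rule tables and function names are made up for illustration and are not PmWiki's actual substitutions; escaping of &, < and > is left out to keep it short.

```python
import re

# Hypothetical rules: wiki pattern -> intermediate XML (not PmWiki's real tables).
WIKI_TO_XML = [
    (re.compile(r"^!(.*)$", re.M), r'<heading level="1">\1</heading>'),
    (re.compile(r"'''(.+?)'''"), r"<strong>\1</strong>"),  # must run before ''
    (re.compile(r"''(.+?)''"), r"<emph>\1</emph>"),
]

# The "original substitutions", now applied to the XML instead of the wiki text.
# <strong> is already valid HTML, so it needs no rule here.
XML_TO_HTML = [
    (re.compile(r'<heading level="1">(.*?)</heading>'), r"<h1>\1</h1>"),
    (re.compile(r"<emph>(.*?)</emph>"), r"<em>\1</em>"),
]


def apply_rules(rules, text):
    """Apply each (pattern, replacement) pair in order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


def wiki_to_xml(text):
    return apply_rules(WIKI_TO_XML, text)


def xml_to_html(xml):
    return apply_rules(XML_TO_HTML, xml)


print(wiki_to_xml("!Pay attention!"))
print(xml_to_html(wiki_to_xml("Always make ''sure''")))
```

Only the replacement strings change between the two tables; the matching machinery stays the same.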

> translation is the tough part-- although it's relatively easy to create a
> translator that goes from A to B, in most contexts it's very difficult to
> come up with a reverse-translator that can reliably go from B back to A.

Correct, in lots of contexts that is the case. However, here we need to 
convert things from

<heading level="1">Pay attention!</heading>
<ordered-list>
  <list-item>Look <strong>both</strong> sides before crossing <link type="internal">BusyStreets</link>;</list-item>
  <list-item>Always make <emph>sure</emph> you're up-to-date with the <link type="internal">Safety Guide</link>.</list-item>
</ordered-list>

...to...

!Pay attention!

*Look '''both''' sides before crossing BusyStreets;
*Always make ''sure'' you're up-to-date with the {{Safety Guide}}.

...or...

h1:Pay attention!

- Look *both* sides before crossing [BusyStreets];
- Always make _sure_ you're up-to-date with the [Safety Guide].
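The XML-to-wiki direction for the example above can be sketched as one tree walk plus a small table per output style. This is only an illustration: the <page> root element, the style names "pmw" and "alt", and the WikiWord test are assumptions, and escaping is ignored.

```python
import re
import xml.etree.ElementTree as ET


def pmw_link(target):
    # Assumed rule: WikiWords link by themselves, anything else needs {{...}}.
    if re.fullmatch(r"(?:[A-Z][a-z]+){2,}", target):
        return target
    return "{{" + target + "}}"


# Hypothetical style tables: element name -> function wrapping the inner text.
STYLES = {
    "pmw": {"heading": "!{}\n".format, "list-item": "*{}".format,
            "strong": "'''{}'''".format, "emph": "''{}''".format,
            "link": pmw_link},
    "alt": {"heading": "h1:{}\n".format, "list-item": "- {}".format,
            "strong": "*{}*".format, "emph": "_{}_".format,
            "link": "[{}]".format},
}

BLOCK_CONTAINERS = {"page", "ordered-list"}


def render(node, style):
    if node.tag in BLOCK_CONTAINERS:
        # Whitespace between block children is layout only; drop it.
        return "\n".join(render(child, style) for child in node)
    # Inline content: the node's text, then each child followed by its tail.
    inner = (node.text or "") + "".join(
        render(child, style) + (child.tail or "") for child in node
    )
    wrap = style.get(node.tag)
    return wrap(inner) if wrap else inner


SOURCE = """<page>
<heading level="1">Pay attention!</heading>
<ordered-list>
  <list-item>Look <strong>both</strong> sides before crossing <link type="internal">BusyStreets</link>;</list-item>
  <list-item>Always make <emph>sure</emph> you're up-to-date with the <link type="internal">Safety Guide</link>.</list-item>
</ordered-list>
</page>"""

tree = ET.fromstring(SOURCE)
print(render(tree, STYLES["pmw"]))
print(render(tree, STYLES["alt"]))
```

Adding a third output style would mean adding one more table, not another tree walker.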

> Example 1:  Save a Word document as HTML, then load the HTML document back
> into Word.  The document doesn't look the same because the translations
> are imperfect.

Yes. But there's a world of difference between those two formats (they serve 
different purposes), whereas our XML specs (or whatever is used) would be 
naturally close to Wiki-markup.

> It might be possible in wiki environments, where the markup language is
> somewhat limited, but even here there can be some problems.

There you go.

> Example 2:  User A prefers {{Link}} for links, while user B prefers
> "[Link]". How does A write "[Link]" in her text --i.e., the word "Link" in
> brackets but is not a link?  

The way she'd rather do it, i.e., according to her chosen markup-style. See, 
the system *knows* how to translate her markup from <style> to <intermediate 
language>. Hence, *it* knows whether "[Link]" means, FOR THIS USER, 
"<link>Link</link>" or "[Link]".

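To make the "FOR THIS USER" point concrete, here is a small sketch in which the same input is parsed differently depending on the author's chosen style. The style names and link patterns are assumptions for illustration only.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical per-style link syntax; each user picks one style.
LINK_PATTERNS = {
    "pmw": re.compile(r"\{\{(.+?)\}\}"),  # user A: {{Link}}
    "alt": re.compile(r"\[(.+?)\]"),      # user B: [Link]
}


def to_intermediate(text, style):
    """Convert one line of wiki text to the intermediate form."""
    pattern = LINK_PATTERNS[style]
    out, pos = [], 0
    for match in pattern.finditer(text):
        out.append(escape(text[pos:match.start()]))  # plain text before the link
        out.append("<link>" + escape(match.group(1)) + "</link>")
        pos = match.end()
    out.append(escape(text[pos:]))  # plain text after the last link
    return "".join(out)


print(to_intermediate("See [Link]", "pmw"))    # See [Link]
print(to_intermediate("See [Link]", "alt"))    # See <link>Link</link>
print(to_intermediate("See {{Link}}", "pmw"))  # See <link>Link</link>
```

The brackets typed by user A never reach the stored form as a link, because only her own style's parser ever reads her input.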
> Does the specialized storage format have to go ahead and convert her
> "[Link]" to some sort of internal representation, even though it has no
> special meaning in her markup, so that it doesn't suddenly become treated
> like a link when B starts editing the pages she wrote?

Depends on whether or not "[Link]" has any special meaning in the internal 
representation. In the XML example above, for instance, it doesn't, so it 
isn't converted. When it's time to output PmW-style, it's preserved as is. If 
we're outputting another wiki style where brackets mean something, it would 
be converted to \[Link\] (or whatever that wiki-style uses to escape 
characters), in order to avoid creating a link where there was none.
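A rough sketch of that output step, assuming a backslash escape and made-up per-style tables of special characters (a real wiki style would supply its own escaping convention):

```python
# Hypothetical: characters that would be read as markup in each output style.
# The "pmw" entry is empty because, as assumed here, brackets mean nothing
# in that style; the backslash convention itself is also an assumption.
SPECIAL = {
    "pmw": "",
    "alt": "\\[]*_",
}


def escape_text(text, style):
    """Escape plain text so the target style will not read it as markup."""
    specials = SPECIAL[style]
    return "".join("\\" + ch if ch in specials else ch for ch in text)


print(escape_text("[Link]", "pmw"))  # [Link]
print(escape_text("[Link]", "alt"))  # \[Link\]
```

The escaping is applied only to text nodes of the internal representation, so real links still come out as [Link] in the bracket style.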

> It becomes even more difficult in systems where the languages can be
> customized (as PmWiki's markup language is customizable)--there has
> to be a way to add the customizations (and their reverses) into the
> specialized storage format.

True. However, I do think this work would clear up the internal markup 
representation a lot (i.e., the InlineMatches and DoubleBrackets business), 
actually *helping* people to customize the Wiki markup.

> So, I don't think it'd be all that easy to do.  

Well, I meant "easy" in the Grand Scheme of Things (tm), not as in 
"easy-easy". >:)

(in my hierarchy, 'easy' is pretty close to 'possible'. Just one of the many 
things that have driven me to my present closeness to insanity >;) )

> However, if it's done
> I think that clearly one ought to use an XML-like structure for the
> internal format--the trick is coming up with a system that can
> reliably perform translations between an XML internal representation,
> various custom wiki markups, and XHTML.  

Yeah. On the other hand, it would both be useful as a module for other 
projects, and create opportunities for new "print/publish" actions that could 
generate, say, TeX source, etc.

I know it's a tough cookie. But I think it would push the envelope in the 
whole wiki-area.
</IMHO>

All said and done, I'd love to see this at work. >:)

Cheers,

	Pedro.


