[pmwiki-users] RFC: Core candidate offerings

Patrick R. Michaud pmichaud at pobox.com
Sat Apr 1 02:08:08 CST 2006


On Sat, Apr 01, 2006 at 06:54:06PM +1200, Allister Jenks wrote:
> However, in the case of the treating of *paragraphs* of text I still don't
> believe it is necessary.  I feel strongly about this case.  After all,
> paragraphs are the cornerstone of HTML.

I can agree with the idea that paragraphs are a cornerstone of HTML.
Unfortunately, I also think that HTML's view of paragraphs is,
by far, the most broken part of HTML.

HTML's definition of the <p> tag says that it is not allowed to
contain other block markup, such as <ul> or <pre>.  Semantically
this is equivalent to saying that an author is not allowed to
place a list, quotation, or display in the middle of a sentence
or paragraph.

Semantically, such a notion is preposterous, and I don't know why
W3C allowed this brain-damaged interpretation of <p> to persist
into HTML 4 and XHTML 1.0.  Even W3C's own documentation for HTML has
paragraphs that contain lists and displays.  See, for example, the 
bottom of http://www.w3.org/TR/html4/struct/lists.html#h-10.3.1, 
where the text reads (the text itself isn't important here --
just look at the sentence structure):

    Definition lists, created using the DL element, 
    generally consist of a series of term/definition 
    pairs (although definition lists may have other applications). 
    Thus, when advertising a product, one might use a definition list:

        [example of a definition list]

    defined in HTML as:

        [preformatted block of HTML code to produce above list]

The definition of <p> in HTML, and as it's coded in the HTML
of the lists.html page, means that the "paragraph" must 
semantically end at the colon after "list:", and that the 
phrase "defined in HTML as:" is a paragraph unto itself.    
(Yes, it's actually coded in the web page as 
"<p>defined in HTML as:</p>".) 

Baloney.  My freshman English Composition teachers would've 
given me an "F" for something like this.

It may seem a trivial matter, but this stupid restriction of HTML
has a very real and profound impact on PmWiki.  Consider the 
following markup:

    Lorem ipsum dolor sit amet, consectetuer adipiscing elit. 
    Etiam diam tellus, dapibus vitae, sagittis eu, aliquam 
    sit amet, sapien. Donec rutrum ante sed metus. 

    Vestibulum ante ipsum primis in faucibus orci 
    luctus et ultrices posuere cubilia Curae; 
    Pellentesque commodo ultricies nisi. Fusce vitae quam 
    * Duis dolor, 
    * Mauris tincidunt, 
    * nisi sit amet posuere facilisis 
    risus mauris posuere sem, id venenatis magna mi 
    id sem.  Phasellus viverra mollis nibh. 

    Aliquam lobortis sollicitudin ligula. Donec cursus. 
    Suspendisse potenti. Sed laoreet est vitae enim mollis lobortis. 

Although the author intended this to be three paragraphs (as
indicated by the blank lines), HTML doesn't allow the list in
the second paragraph to be embedded inside of <p>...</p>.
Thus, PmWiki can't rely solely on blank lines to figure out
where to place paragraph markers, but must instead also look
for any form of block markup and make sure that any paragraphs
are terminated/restarted around them.

So, HTML requires an artificial <p> tag to be placed where 
"risus mauris posuere..." begins.  And since it's an artificial
(as opposed to semantic) tag, this means that any margin-top
that we may define for <p> is automatically added at that point
as well, even though it's really not the beginning of
a paragraph as the author wrote or intended it.
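
To make the splitting concrete, here is a minimal sketch in Python
of the kind of logic described above.  It is not PmWiki's actual
code; the function name render() and the rule that only lines
starting with "*" count as block markup are assumptions made for
illustration.

```python
import re

# Assumption for this sketch: a line starting with "*" is a list item.
LIST_ITEM = re.compile(r"^\*\s*(.*)$")

def render(markup: str) -> str:
    """Convert blank-line-separated text with '*' list lines to HTML.

    Hypothetical illustration only, not PmWiki's implementation.
    """
    out = []    # finished HTML blocks
    para = []   # lines of the <p> currently being collected
    items = []  # items of the <ul> currently being collected

    def flush_para():
        if para:
            out.append("<p>" + " ".join(para) + "</p>")
            para.clear()

    def flush_list():
        if items:
            lis = "".join(f"<li>{item}</li>" for item in items)
            out.append("<ul>" + lis + "</ul>")
            items.clear()

    for raw in markup.splitlines():
        line = raw.strip()
        match = LIST_ITEM.match(line)
        if match:
            # HTML forbids <ul> inside <p>, so the paragraph must end here.
            flush_para()
            items.append(match.group(1))
        elif not line:
            # A blank line is the author's own paragraph break.
            flush_list()
            flush_para()
        else:
            # Text right after a list starts a new, artificial <p>.
            flush_list()
            para.append(line)

    flush_list()
    flush_para()
    return "\n".join(out)
```

Feeding it three blank-line-separated paragraphs, the second of
which has a list in the middle, yields four <p> elements rather
than three.  The extra one is the artificial paragraph that begins
after the list.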

Now then, some may claim that lists and displays ought to be 
set apart vertically from the surrounding text anyway, but 
such a claim is really beside the point.  Suppose that 
we're using novel-style paragraphs where the first line 
of each paragraph is indented, as in:

    p { margin:0; text-indent:40px; }

Then "risus mauris posuere" (and the "defined in HTML as:"
phrase from the W3C document) end up being indented by 40px
because they had to be encoded in HTML as the first line of a 
paragraph.  Visually and semantically, this is completely wrong 
and wholly an artifact of HTML's broken <p> tag definition.  And 
the only way to "fix" it is for the author to do something even more
unnatural and modify the <p> tag (using class= or style=)
to indicate that it's "not really a new paragraph".
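
As a rough illustration of that workaround, the Python sketch below
adds a class to any <p> that directly follows a closing list or
<pre> tag, so that a rule such as "p.continued { text-indent:0; }"
could cancel the indent.  The function name mark_continuations()
and the class name "continued" are hypothetical.  The heuristic
also cannot tell a real new paragraph after a list from a
continuation, which is exactly why the burden ends up on the author.

```python
import re

def mark_continuations(html: str, css_class: str = "continued") -> str:
    """Tag each <p> that immediately follows a block element.

    Hypothetical heuristic: it assumes every such <p> continues the
    preceding paragraph, which is not always what the author meant.
    """
    pattern = r"(</(?:ul|ol|dl|pre)>\s*)<p>"
    # \1 keeps the closing tag and whitespace; only the <p> is changed.
    return re.sub(pattern, rf'\1<p class="{css_class}">', html)
```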

So, although paragraphs may be a cornerstone of HTML, 
they're still really screwed up.

> But, at the end of the day, if there is some way of influencing the amount
> of space even at a page level (but preferably per paragraph) then we will
> have solved the problem and can move on.

There is already a way to do it at the page level -- just 
define a per-page stylesheet that changes the .vspace margin
to whatever you prefer that page to have.  :-)

For that matter, you can do it for just parts of a page using
a >>div<<:

    >>float=right class=smallvspace<<

where smallvspace is defined with

    .smallvspace p.vspace { margin-top:0.5em; }

This says that the vspace elements within the >>smallvspace<< div
should only produce 0.5em of vertical space instead of
whatever the normal default is.

> Perhaps when the successor to HTML arrives we can 
> look at this again.  :-)

Well, XHTML 2.0 is already being defined, and notably it fixes
the <p> tag so that it's allowed to contain block elements,
tacitly admitting the mistake that was made in HTML.

So, what we're really waiting on is widespread adoption of
XHTML 2.0.  But I wouldn't hold my breath waiting for it...

Pm