[pmwiki-users] Automatically add anchors to heading tags?

H. Fox haganfox at users.sourceforge.net
Thu Jul 26 14:36:00 CDT 2007


[This is a long post due to lack of sufficient time to shorten it.]

On 7/26/07, Mike Shanley <thebarmy at omniversalism.com> wrote:
> H. Fox wrote:
> > On 7/25/07, Mike Shanley <thebarmy at omniversalism.com> wrote:
> >> I really, really, really don't like the idea of using the titles as
> >> anchors, because of how much you lose.
> > How much do we lose? Can you be specific about what is lost by using
> > headings as anchors?  Maybe I'm missing something...
>
> Basically, it's a matter of simplicity.

To me this is a gain, not a loss.

Let's use the PmWiki.Wikistyles page as an example.  Without looking at
the source of the page I can figure out what all of the anchor names
would be.

   #Wiki_style_basics
   #Scopes
   #Wikistyle_attributes
   #Applying_wikistyles_to_block
   #Enabling_Styles
   #Custom_style_shortcuts
   #Predefined_style_shortcuts
   #Examples
   #Known_Issues
   #See_Also

Simple.

> I am under the impression that
> no special characters will transfer,

This doesn't seem like much of a loss compared to numeric links.  A
quick look at the HTML source[1] will reveal the encoded characters if
they exist.

> no special formatting within the
> text will be allowed,

Virtually no loss there.

> and then there's always the danger of just how
> many misspellings and changed headings will ruin our best laid plans.

This is certainly a valid point, but compared to "unintentional link
rot" that flows from calculated numeric links I still think it's a
gain considering the trade-off.

Maybe strtolower() can help with this.  If the above anchors were

   #wiki_style_basics
   #scopes
   #wikistyle_attributes
   #applying_wikistyles_to_block
   #enabling_styles
   #custom_style_shortcuts
   #predefined_style_shortcuts
   #examples
   #known_issues
   #see_also

then changing capitalization wouldn't change an anchor name.
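
To illustrate, here is a minimal sketch in Python (PmWiki itself is
PHP, where strtolower() would do the lowercasing).  The helper name
anchor_from_heading is made up for this example and is not part of
PmWiki; it only trims, lowercases, and joins words with underscores.

```python
def anchor_from_heading(heading: str) -> str:
    """Hypothetical helper: trim, lowercase, join words with underscores."""
    return "_".join(heading.strip().lower().split())


# Recapitalizing a heading leaves the anchor unchanged.
print(anchor_from_heading("Enabling Styles"))  # enabling_styles
print(anchor_from_heading("Enabling styles"))  # enabling_styles
```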

> Remember that people are not going to see that the anchor text is
> slightly different from the heading itself...

Only when it *is* slightly different, which I suspect rarely happens.

Even then it's not a disadvantage compared to numeric links, since with
those you need to view the HTML source to see what the calculated link
number happens to be.

Maybe this annoyance can be mitigated by stripping out selected
characters, like ",", "(", and ")" (those being special characters
that appear in headings within the PmWiki documentation pages).  I
think characters like those should be stripped out rather than
converted to HTML entities for at least two reasons:

* It makes it easier to guess the anchor without looking at the HTML source, and
* If a heading starts with e.g. a "(" then the anchor text will not
start with a non-alphanumeric character.
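
A rough sketch of the stripping step, in Python for brevity.  The
character set ",()" is only the one mentioned above; the name
strip_special and the sample heading are invented for illustration.

```python
STRIP_CHARS = ",()"  # assumed set; which other characters to strip is undecided


def strip_special(text: str, chars: str = STRIP_CHARS) -> str:
    """Hypothetical helper: delete the listed characters instead of encoding them."""
    return text.translate(str.maketrans("", "", chars))


# A heading that starts with "(" no longer starts with a non-alphanumeric character.
print(strip_special("(Advanced) options, tips"))  # Advanced options tips
```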

> By itself, this reduces
> it's usefulness to, basically, just people in-the-know.

Isn't this also true of calculated numeric anchor names?  I'd say it's
even more so for those.  With names based on converted heading text at
least an ID won't change unless someone changes an existing heading.

A clued-in author can always manually insert the old anchor to avoid
the trouble that arises from changing a heading name.  (This doesn't
work for calculated numeric anchors, since it would result in
duplicates.)

Btw, I'd say the usefulness of *most* PmWiki markup is only available
to authors in-the-know.  If that were a deal-killer we wouldn't have
PageList Templates, for example.

> At least that is my impression. I'm somewhat new around here.

Thanks!  That helped a lot.  Now the list posted by Petko Yotov can be
expanded to make anchor names (fragment identifiers to be more
precise) more intuitive (predictable, author-friendly) and less
subject to unintentional change.

* wiki to plain text (all links and styles removed)
* whitespace trimmed ("= Title =" and "=Title=" are the same)
* alpha characters to lower case
* certain special characters stripped (e.g. ",()" and others TBD)
* in case there is a repeated heading in the same page (including
  "included" pages), a _2 or _3 etc. is added
* spaces replaced with underscores
* the string is "urlencoded"
* "%"-signs are replaced with dots
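
The list above could be sketched roughly as follows.  This is Python
rather than PmWiki's PHP, and it is only an illustration under several
assumptions: the heading has already been reduced to plain text (the
first item is not implemented), the stripped set is just ",()", the
duplicate suffix is applied last, and make_anchor and seen are
made-up names.

```python
from urllib.parse import quote

STRIP_CHARS = ",()"  # assumption: the full set of stripped characters is TBD


def make_anchor(heading: str, seen: dict) -> str:
    """Hypothetical sketch; expects a heading already converted to plain text."""
    text = heading.strip()                                     # trim whitespace
    text = text.lower()                                        # alpha to lower case
    text = text.translate(str.maketrans("", "", STRIP_CHARS))  # strip special chars
    text = "_".join(text.split())                              # spaces -> underscores
    text = quote(text, safe="")                                # "urlencode"
    text = text.replace("%", ".")                              # "%" -> "."
    count = seen.get(text, 0) + 1                              # count repeated headings
    seen[text] = count
    return text if count == 1 else f"{text}_{count}"


seen = {}  # one dict per rendered page, shared by "included" pages
for heading in ["Wiki style basics", " Examples ", "Examples",
                "Scopes (advanced)", "Café"]:
    print(make_anchor(heading, seen))
# wiki_style_basics
# examples
# examples_2
# scopes_advanced
# caf.C3.A9
```

One thing the sketch does not handle: a heading whose own text already
ends in "_2" could collide with a generated suffix.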

Hagan

[1] Marc Cooper posted a Firefox tip to the list some time ago: You
can highlight a portion of a page, right-click it, and select "View
Selection Source".  I use this frequently.  Thanks Marc!


