[Pmwiki-users] DevQ: $PageTitlePattern
Jonathan Scott Duff
duff
Thu May 13 09:06:28 CDT 2004
On Thu, May 13, 2004 at 03:28:08PM -0700, Steven Leite wrote:
> I need this translated please:
>
> $PageTitlePattern="[[:upper:]][[:alnum:]]*(?:-[[:alnum:]]+)*";
$PageTitlePattern="
  [[:upper:]]      # Note 1
  [[:alnum:]]*     # Note 2
  (?:              # Note 3
    -              # Note 4
    [[:alnum:]]+   # Note 5
  )*               # Note 6
";
1. Matches any single uppercase character as defined by the current
locale (e.g., if you're using a French locale, the set of uppercase
characters will be different than if using an American locale). The
outer [ ] delimit a character class, the [:upper:] is a POSIX
designation for uppercase characters.
2. Matches zero or more alphanumeric characters as defined by the
current locale. A postfix * means "match the preceding thing zero or
more times".
3. Starts a non-capturing group. In Perl, when you enclose a piece of a
regular expression in parentheses, once that portion is matched, it
is "captured" by assignment to one of $1, $2, $3, etc. The number
indicates which set of parentheses. In order to turn this behavior
off, you need to put ?: immediately after the opening (. This will
group that part of the regular expression, but not capture it.
4. Matches a single - character.
5. Matches one or more alphanumeric characters. A postfix + means "match
the preceding thing one or more times".
6. The end of the non-capturing group. Since it is followed by a *, the
entire group is matched zero or more times.
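The annotated pattern above can be tried out directly. What follows is a minimal Python sketch, not PmWiki's own code. Python's re module does not understand POSIX bracket classes such as [[:upper:]], so the sketch assumes an ASCII (C locale) reading and substitutes [A-Z] and [A-Za-z0-9]. The names PAGE_TITLE and CAPTURING are made up for illustration.

```python
import re

# Assumption: ASCII stand-ins for the POSIX classes, because Python's re
# module has no [[:upper:]] or [[:alnum:]].
PAGE_TITLE = re.compile(r"""
    [A-Z]            # Note 1: one uppercase letter
    [A-Za-z0-9]*     # Note 2: zero or more alphanumerics
    (?:              # Note 3: start of a non-capturing group
      -              # Note 4: a literal hyphen
      [A-Za-z0-9]+   # Note 5: one or more alphanumerics
    )*               # Note 6: the group repeats zero or more times
""", re.VERBOSE)

# The same pattern with a plain (capturing) group, to contrast with Note 3.
CAPTURING = re.compile(r"[A-Z][A-Za-z0-9]*(-[A-Za-z0-9]+)*")

print(PAGE_TITLE.groups)   # number of capturing groups: 0
print(CAPTURING.groups)    # number of capturing groups: 1
# A repeated capturing group keeps only its last repetition.
print(CAPTURING.fullmatch("A-foo-bar").group(1))   # -bar
```

Both patterns accept the same strings. The only difference is whether the hyphenated part is saved for later use.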
So, some strings that the above pattern will match:
A
A123
Afoo
A-a
A-a-a
A-foo-bar
Afoo-bar-baz
etc.
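To check these examples, along with a few strings that do not match, here is a small Python sketch. It assumes ASCII equivalents for the two POSIX classes (Python's re module lacks them) and assumes the pattern is meant to cover the whole title, which is why it uses fullmatch. The helper name is_page_title is hypothetical, and how PmWiki itself anchors the pattern is not shown here.

```python
import re

# Assumption: [A-Z] and [A-Za-z0-9] stand in for [[:upper:]] and [[:alnum:]].
PATTERN = re.compile(r"[A-Z][A-Za-z0-9]*(?:-[A-Za-z0-9]+)*")

def is_page_title(s):
    """Return True if the whole string matches the pattern."""
    return PATTERN.fullmatch(s) is not None

matches = ["A", "A123", "Afoo", "A-a", "A-a-a", "A-foo-bar", "Afoo-bar-baz"]
non_matches = ["a", "1A", "A-", "A--a", "-A", "A_b", ""]

for s in matches:
    print(repr(s), is_page_title(s))   # True for each
for s in non_matches:
    print(repr(s), is_page_title(s))   # False for each
```

The non-matching strings fail for different reasons: no leading uppercase letter ("a", "1A", "-A", ""), a hyphen not followed by an alphanumeric ("A-", "A--a"), or a character outside both classes ("A_b").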
> Besides the regex translation, I also want to know specifically what
> is included in the character classes: [[:upper:]] and [[:alnum:]],
> (and what do they not include).
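The characters included in [[:upper:]] and [[:alnum:]] depend on the
locale (and its character set) in use. In the default C/POSIX locale,
[[:upper:]] is A-Z and [[:alnum:]] is A-Z, a-z, and 0-9. Neither class
includes the hyphen, the underscore, whitespace, or punctuation.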
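As a rough illustration of that dependence, the Python sketch below compares an ASCII-only reading of the two classes (what the C/POSIX locale gives) with Python's Unicode-aware str.isupper(), used here only as a stand-in for a locale with accented letters. The names UPPER_ASCII and ALNUM_ASCII are made up for the example.

```python
import string

# Assumption: in the C/POSIX locale the classes cover ASCII only.
UPPER_ASCII = set(string.ascii_uppercase)                 # A-Z
ALNUM_ASCII = set(string.ascii_letters + string.digits)   # A-Z, a-z, 0-9

for ch in ["A", "z", "7", "-", "_", " ", "É"]:
    print(repr(ch),
          "upper(ASCII):", ch in UPPER_ASCII,
          "alnum(ASCII):", ch in ALNUM_ASCII,
          "upper(Unicode):", ch.isupper())
```

The hyphen, underscore, and space fall outside both classes either way, while "É" counts as uppercase only under the wider definition.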
-Scott
--
Jonathan Scott Duff
duff at pobox.com