[Pmwiki-users] DevQ: $PageTitlePattern

Jonathan Scott Duff duff
Thu May 13 09:06:28 CDT 2004


On Thu, May 13, 2004 at 03:28:08PM -0700, Steven Leite wrote:
> I need this translated please:
> 
> $PageTitlePattern="[[:upper:]][[:alnum:]]*(?:-[[:alnum:]]+)*";

	$PageTitlePattern="
		[[:upper:]]		# Note 1
		[[:alnum:]]*		# Note 2
		(?:			# Note 3
		   -			# Note 4
		   [[:alnum:]]+		# Note 5
		)*			# Note 6
	";

1. Matches any single uppercase character as defined by the current
   locale (e.g., if you're using a French locale, the set of uppercase
   characters will be different from the set in an American locale).
   The outer [ ] delimit a character class; the inner [:upper:] is the
   POSIX name for the class of uppercase characters.
2. Matches zero or more alphanumeric characters as defined by the
   current locale. A postfix * means "match the preceding thing zero or
   more times".
3. Starts a non-capturing group. In Perl, when you enclose a piece of a
   regular expression in parentheses, whatever that piece matches is
   "captured" into one of $1, $2, $3, etc. The number indicates which
   set of parentheses, counting opening parentheses from the left. To
   turn this behavior off, put ?: immediately after the opening (. This
   still groups that part of the regular expression, but does not
   capture it.
4. Matches a single - character.
5. Matches one or more alphanumeric characters. A postfix + means "match
   the preceding thing one or more times".
6. The end of the non-capturing group. Since it is followed by a *, the
   entire group is matched zero or more times.
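
The annotated layout above can actually be run in engines that have an
"extended" or "verbose" mode, where whitespace and # comments inside the
pattern are ignored. Here is a minimal sketch in Python, offered only as
an illustration. Python's re module has no [[:upper:]]-style classes, so
they are spelled out as ASCII ranges (an assumption that corresponds to
the C locale), and the names PAGE_TITLE and CAPTURING are made up for
this example.

```python
import re

# Verbose-mode version of the pattern. ASCII ranges stand in for the POSIX
# classes (assumption: C locale), since Python's re lacks [[:upper:]] etc.
PAGE_TITLE = re.compile(r"""
    [A-Z]              # Note 1: one uppercase letter
    [A-Za-z0-9]*       # Note 2: zero or more alphanumerics
    (?:                # Note 3: start of non-capturing group
       -               # Note 4: a literal hyphen
       [A-Za-z0-9]+    # Note 5: one or more alphanumerics
    )*                 # Note 6: group repeated zero or more times
""", re.VERBOSE)

# Same pattern with a plain ( ) group, to show what ?: turns off.
CAPTURING = re.compile(r"[A-Z][A-Za-z0-9]*(-[A-Za-z0-9]+)*")

m = PAGE_TITLE.fullmatch("Afoo-bar-baz")
print(m.group(0), PAGE_TITLE.groups)   # whole match; 0 capturing groups

m = CAPTURING.fullmatch("Afoo-bar-baz")
print(m.group(1), CAPTURING.groups)    # last repetition "-baz"; 1 group
```

Both patterns accept the same strings; the only difference is whether
the hyphenated part is also saved as a numbered group.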

So, some strings that the above pattern will match:

	A
	A123
	Afoo
	A-a
	A-a-a
	A-foo-bar
	Afoo-bar-baz

etc.
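
If you want to check candidates yourself, something like the following
sketch will do. It assumes the pattern has to match the whole string
(how PmWiki anchors it is not shown in this thread), it uses ASCII
ranges in place of the POSIX classes as they would behave in the C
locale, and is_page_title is a hypothetical helper name.

```python
import re

# ASCII stand-in for the original pattern (assumption: C-locale classes).
PAGE_TITLE = re.compile(r"[A-Z][A-Za-z0-9]*(?:-[A-Za-z0-9]+)*")

def is_page_title(s):
    """Return True if the whole string s matches the pattern."""
    return PAGE_TITLE.fullmatch(s) is not None

# The strings listed above, which should all be accepted.
matching = ["A", "A123", "Afoo", "A-a", "A-a-a", "A-foo-bar", "Afoo-bar-baz"]

# Strings that should be rejected: empty, lowercase or digit first,
# trailing or doubled hyphen, leading hyphen, space, underscore.
rejected = ["", "afoo", "1A", "A-", "A--b", "-A", "A b", "A_b"]

for s in matching + rejected:
    print(repr(s), is_page_title(s))
```

The rejected list shows the two rules that are easy to miss: the first
character must be uppercase, and every - must be followed by at least
one alphanumeric character.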

> Besides the regex translation, I also want to know specifically what
> is included in the character classes: [[:upper:]] and [[:alnum:]],
> (and what do they not include).

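The characters included in [[:upper:]] and [[:alnum:]] depend on the
locale (and therefore the character set) in effect. In the default
C/POSIX locale, [[:upper:]] is A-Z and [[:alnum:]] is A-Z, a-z and 0-9.
Neither class includes whitespace, punctuation, or the underscore.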
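
To see the membership concretely, here is a rough sketch in Python. It
is only an analogy: Python's str methods follow Unicode rather than a
POSIX locale, so the sketch restricts itself to ASCII to mimic what the
classes contain in the C locale. The variable names are made up for the
example.

```python
# Restrict to the 128 ASCII characters to mimic the C locale (assumption).
ascii_chars = [chr(i) for i in range(128)]

# Rough equivalents of [[:upper:]] and [[:alnum:]] over ASCII.
upper = "".join(c for c in ascii_chars if c.isupper())
alnum = "".join(c for c in ascii_chars if c.isalnum())

print(upper)   # the 26 letters A-Z
print(alnum)   # the digits 0-9, then A-Z, then a-z

# Outside ASCII the answer changes, which is the point about locales:
# accented letters count, while _ and - never do.
print("É".isupper(), "é".isalnum(), "_".isalnum(), "-".isalnum())
```

In a locale whose character set includes accented letters, characters
such as É would be expected to fall into [[:upper:]] as well.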

-Scott
-- 
Jonathan Scott Duff
duff at pobox.com


