[pmwiki-users] Group and page name aren't recognized by pmwiki.php

Patrick R. Michaud pmichaud at pobox.com
Wed Mar 23 13:39:20 CST 2005


On Wed, Mar 23, 2005 at 12:44:07PM +0100, Joachim Durchholz wrote:
> >>The .htaccess file looks like this:
> >>  RewriteEngine on
> >>  RewriteCond %{REQUEST_URI} !^/*(pmwiki\.php|pub)/+.*$
> >>  RewriteRule ^(.*)$ pmwiki.php/$1 [L,NS]
> >
> >This last RewriteRule is bad, because it's going to try to redirect
> >everything to pmwiki.  That's probably not what you want.  Most
> > of the CleanUrls recipes take advantage of the fact that group
> > names begin with an uppercase letter.
> 
> Well, this construction is quite exactly what I want, since Group names 
> can begin with other things than just A-Z (for example, they can start 
> with umlauts). I think the recipes are often naive in this regard.

Assuming ISO-8859-1, one can always fix the RewriteRule to work with
leading accented characters by using

   RewriteRule ^([A-ZÀ-Þ].*)$ pmwiki.php?n=$1 [L,qsappend]

I don't know if Apache has support for named character classes; if so
then one could use

   RewriteRule ^([[:upper:]].*)$ pmwiki.php?n=$1 [L,qsappend]

but this didn't work on my server.

> For this reason, I consider it safer to exclude what I know must be 
> excluded (i.e. pmwiki.php and pub, and probably the uploads directory as 
> well), and consider everything else a group/page name.

You probably also want to exclude robots.txt, and perhaps favicon.ico.
In general, it's better to specify what you want to include as opposed
to what you want to exclude, but every site has its own needs.
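
As a rough illustration of the exclusion approach with those extra files
left alone, an .htaccess along these lines might be a starting point.  It
is only a sketch: it assumes the wiki sits at the document root, that
uploads live in a directory named uploads, and it uses the query-string
target rather than the PathInfo one.

```apache
# Sketch only; assumes the wiki is served from the document root.
RewriteEngine On
# Leave the script itself, pub/, and uploads/ (assumed name) alone.
RewriteCond %{REQUEST_URI} !^/*(pmwiki\.php|pub|uploads)(/|$)
# Leave robots.txt and favicon.ico alone as well.
RewriteCond %{REQUEST_URI} !^/*(robots\.txt|favicon\.ico)$
# Treat everything else as a Group/Page name, passed as ?n=...
RewriteRule ^(.*)$ pmwiki.php?n=$1 [L,QSA]
```

Every additional static file or directory on the site needs its own
exclusion here, which is the maintenance cost of excluding rather than
including.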

> I also made some disturbing observations about the consistency of syntax
> usage in PmWiki itself and in skins. At least those skins that I have
> downloaded use PathInfo syntax for links on the template page (so these
> skins would break on a query syntax-only site);

Wrong--they absolutely *don't* break.  PmWiki internally converts 
PathInfo-style urls to query-syntax urls whenever $EnablePathInfo 
is not set.

> >If there's a reason why the simple recipe in Cookbook.CleanUrls
> >won't work, let me know and we'll go from there.
> 
> Yup, I found it. It's this code in pmwiki.php, near line 225:
> 
> if (!$pagename &&
>   preg_match('!^'.preg_quote($_SERVER['SCRIPT_NAME'],'!').'/*([^?]*)!',
>     $_SERVER['REQUEST_URI'],$match))
>   $pagename = urldecode($match[1]);
> 
> The first parameter of preg_match evaluates to a regex that matches 
> something like '/path/to/wiki/pmwiki.php' (with an optional appended 
> '/Group/Page' and/or a '?QueryString'); however, rewriting just elided 
> the script name, so it doesn't match.

Ah, I see.  And that's because you were using the PathInfo syntax
in the mod_rewrite.  Using the query-string syntax means that this
code never needs executing (because $pagename is already set).

> Doing some case analysis of Apache's behaviour and googling for the CGI 
> specification however showed me that there's an environment variable 
> that contains just what PmWiki needs: PATH_INFO.
> [...]
> However, looking at the CGI spec revealed that PATH_INFO is required to 
> have precisely the value that PmWiki wants (modulo the initial slash 
> that needs to be stripped off).
> So I'm baffled: why doesn't PmWiki use PATH_INFO?

Because it doesn't work on a lot of sites.  Don't believe every
specification you read: although the CGI spec clearly says that
this is what PATH_INFO is supposed to have, there are a *lot* of
web server environments, including Apache and PHP, that do not
follow this part of the spec.  IIS is another one.

Apache 2.0.30 introduced an AcceptPathInfo directive; unfortunately
the default setting on many servers/PHP environments is that urls
with a PATH_INFO component result in a 404 Not Found error.  The only
way to fix this is to set AcceptPathInfo On, which assumes that the
wiki administrator has privileges to do so.  (For more details, see
http://httpd.apache.org/docs-2.0/mod/core.html#acceptpathinfo ).
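
For admins who do have the necessary privileges, a minimal sketch of
enabling it from .htaccess could look like the following.  It assumes the
script is named pmwiki.php and that the server permits this directive in
.htaccess (it falls under AllowOverride FileInfo); otherwise it has to go
in the main server configuration.

```apache
# Sketch: accept trailing path information after the script name, so a
# request such as /pmwiki.php/Group/Page reaches pmwiki.php.
<Files "pmwiki.php">
    AcceptPathInfo On
</Files>
```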

Even for those sites that have a PATH_INFO variable set, it's often 
*not* the value defined by the CGI specification.  If PHP is running
in cgi-script mode, then PATH_INFO can end up being the entire url
path (which is treated as an argument to the php executable), and not
just the portion of the url that comes after what we think of as
the script name.

Things get much worse in IIS.

If you go back and search the archives you'll see that PmWiki has 
always had a strong preference for using PATH_INFO, but I eventually 
had to add an $EnablePathInfo option (default on) so that servers
that didn't support PATH_INFO could still run using the query string.
But this still left a lot of new admins wondering why PmWiki wasn't
working on their site, so in 2.0.beta8 I changed the default for
$EnablePathInfo to off, so that an initial install generally works
everywhere and then wiki admins can easily try turning $EnablePathInfo
on to test if it will work on their server.  Regardless, PmWiki knows
how to convert between PATH_INFO urls and query string urls, and
nearly always does the right thing depending on the value of 
$EnablePathInfo.

Pm


