[pmwiki-users] Group and page name aren't recognized by pmwiki.php
Joachim Durchholz
jo at durchholz.org
Tue Mar 22 15:59:15 CST 2005
Hi all,
I'm having serious trouble getting the "nice URL" recipes from the
Cookbook to work. I'm not sure where exactly the problem is; I suspect
a misunderstanding on my side and/or a PmWiki bug.
The ultimate goal: all pages should be reachable under the URL
http://www.maquaris.de/Group/Page. (This implies that a "pub" group
cannot be created, because PmWiki accesses skins and other resources
under http://www.maquaris.de/pub. That's not a serious problem since
PmWiki uppercases the first letter of any group name, but if there were
a name clash, I'd simply disallow the "pub" group.)
Since I need the short URLs to be displayed in the browser's location
bar, Redirect is out. Alias doesn't work either, because it insists on
finding .../Group/Pagename as a file or directory, and these don't exist.
That leaves me with RewriteRule.
The .htaccess file looks like this:
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/*(pmwiki\.php|pub)/+.*$
RewriteRule ^(.*)$ pmwiki.php/$1 [L,NS]
The first line is just to activate the other Rewrite* directives.
The RewriteCond directive exempts all URLs that start with pmwiki.php
or pub from rewriting. I was unsure about the actual number and position
of slashes, and users might get adventurous and add extraneous slashes
in the URL bar of the browser, so I made it recognize zero or more
slashes before and one or more slashes after the pmwiki.php|pub item.
The negation makes sure that the following rule is activated only if the
pattern does *not* match, i.e. the following rule activates iff the URL
doesn't already start with pmwiki.php or pub as the first path element.
The RewriteRule simply captures the entire URL and prepends
pmwiki.php/ to it. The [L] option stops any later rewrite rules from
being applied to this request, and the [NS] option skips the rule for
internal subrequests. (These options are largely paranoia and "keeping
the stuff compatible with future changes"; I think that neither case can
actually happen in my current configuration.)
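To make the intended effect of the condition and the rule concrete, here is a minimal sketch that emulates them in plain PHP. The function name emulate_rewrite is hypothetical, and the code assumes that in .htaccess context one leading "/" is removed before the RewriteRule pattern is applied; it only mimics the two directives and is not how Apache evaluates them internally.

```php
<?php
// Hypothetical helper: emulates the RewriteCond/RewriteRule pair above.
function emulate_rewrite(string $requestUri): string
{
    // RewriteCond %{REQUEST_URI} !^/*(pmwiki\.php|pub)/+.*$
    // If the pattern matches, the negated condition fails and the
    // URL passes through unchanged.
    if (preg_match('!^/*(pmwiki\.php|pub)/+.*$!', $requestUri)) {
        return $requestUri;
    }
    // Assumption: the directory prefix "/" is stripped before the
    // rule pattern ^(.*)$ sees the path.
    $path = (string) substr($requestUri, 1);
    // RewriteRule ^(.*)$ pmwiki.php/$1
    return '/pmwiki.php/' . $path;
}

$samples = ['/', '/News/News', '//News/News', '/favicon.ico',
            '/pub/skins/gemini/font-sans.css', '/pmwiki.php/News/News'];
foreach ($samples as $uri) {
    echo $uri, ' -> ', emulate_rewrite($uri), "\n";
}
```

Under these assumptions, /pub/... and /pmwiki.php/... come back unchanged, while every other path gets /pmwiki.php/ put in front of it. A doubled leading slash survives the rewrite as /pmwiki.php//News/News.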
(This RewriteCond - RewriteRule construction is necessary because the
RewriteRule needs the matched pattern for its $1 variable, but a
negative pattern doesn't give me a variable - it looks only for things
that are not there. Oh, maybe I could have used $0 to recall the entire
request URI, giving me
RewriteRule !^/*(pmwiki\.php|pub)/+.*$ pmwiki.php/$0 [L,NS]
without the need for a RewriteCond - well, something to test tomorrow.)
The Apache log tells me that this all actually works (irrelevant
information snipped to avoid line wrap):
mod_gzip: r->uri=[/pmwiki.php/] OK
"GET / HTTP/1.1" 200 2631
"GET /pub/skins/gemini/layout-standard.css HTTP/1.1" 304 -
"GET /pub/skins/gemini/rightbar-narrow.css HTTP/1.1" 304 -
"GET /pub/skins/gemini/font-sans.css HTTP/1.1" 304 -
"GET /pub/skins/gemini/blue-color.css HTTP/1.1" 304 -
"GET /pub/skins/pmwiki/pmwiki-32.gif HTTP/1.1" 304 -
"GET /pub/skins/gemini/blue70-top.jpg HTTP/1.1" 304 -
The mod_gzip output is interesting only in that it seems to give the
script URI after rewriting, while the GET request is still logged with
the script URL before rewriting.
The 304 response codes mean "Not Modified": the browser already has
these files in its cache and the server is telling it to reuse them,
which is why the graphics are displayed just fine. I think this is
unrelated to the problem I'm having.
Now this all works fine, but when I call up any other page, I *still*
get the wiki's home page!
Here's the Apache log:
mod_gzip: r->uri=[/pmwiki.php//News/News] OK
"GET //News/News HTTP/1.1" 200 2634
Cannot add header information - [due to debug output]
mod_gzip: r->uri=[/pmwiki.php/] OK
"GET / HTTP/1.1" 200 2631
"GET /pub/skins/gemini/layout-standard.css HTTP/1.1" 200 5904
"GET /pub/skins/gemini/rightbar-narrow.css HTTP/1.1" 200 723
"GET /pub/skins/gemini/font-sans.css HTTP/1.1" 200 1736
"GET /pub/skins/gemini/blue-color.css HTTP/1.1" 200 2170
"GET /pub/skins/gemini/blue70-top.jpg HTTP/1.1" 200 704
"GET /pub/skins/pmwiki/pmwiki-32.gif HTTP/1.1" 200 1127
Cannot add header information - [due to debug output]
mod_gzip: r->uri=[/pmwiki.php/favicon.ico] OK
"GET /favicon.ico HTTP/1.1" 200 2637
I have added debug output; the code that extracts the URI information
now looks like this:
$pagename = $_REQUEST['n'];
echo ("pagename from _REQUEST[n] = $pagename<br>");
if (!$pagename) $pagename = $_REQUEST['pagename'];
echo ("pagename from _REQUEST[pagename] = $pagename<br>");
// Fall back to whatever follows SCRIPT_NAME in REQUEST_URI.
if (!$pagename &&
    preg_match('!^'.preg_quote($_SERVER['SCRIPT_NAME'],'!').'/?([^?]*)!',
      $_SERVER['REQUEST_URI'],$match))
  $pagename = urldecode($match[1]);
// Debug only: repeat the pattern and subject so they can be echoed.
$x1 = '!^'.preg_quote($_SERVER['SCRIPT_NAME'],'!').'/?([^?]*)!';
$x2 = $_SERVER['REQUEST_URI'];
echo ("pagename from matching $x2 to $x1 = $pagename<br>");
if (preg_match('/[\\x80-\\xbf]/',$pagename))
  $pagename=utf8_decode($pagename);
echo ("pagename after utf8 decode = $pagename<br>");
// Strip trailing characters that are neither alphanumeric nor high-bit.
$pagename = preg_replace('![^[:alnum:]\\x80-\\xff]+$!','',$pagename);
echo ("pagename after stripping trailing characters = $pagename<br>");
(i.e. those "echo" lines produce the debug output).
Pm, could you take a look at this? I suspect it's the regex that tries
to find SCRIPT_NAME in the REQUEST_URI, and the assumption underlying
that regex was invalidated by the rewrite stuff; but I don't know enough
about the assumption to propose a patch, let alone fix the problem.
(Besides, I don't know what other considerations go into URL parsing -
different HTTPD servers behave differently, so any correction that I
might think of would probably break on other servers.)
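The suspicion about the SCRIPT_NAME regex can be checked in isolation. The following is a minimal sketch, assuming SCRIPT_NAME is /pmwiki.php and REQUEST_URI still holds the URL as the browser sent it (as the "GET //News/News" log line suggests). The helper pagename_from_uri is hypothetical and not a PmWiki function; it only mirrors the matching step from the code quoted above.

```php
<?php
// Hypothetical helper mirroring the SCRIPT_NAME/REQUEST_URI matching step.
function pagename_from_uri(string $scriptName, string $requestUri): string
{
    // Same pattern as in the quoted code: script name, optional slash,
    // then everything up to a "?" is captured as the page name.
    $pattern = '!^' . preg_quote($scriptName, '!') . '/?([^?]*)!';
    if (preg_match($pattern, $requestUri, $match)) {
        return urldecode($match[1]);
    }
    return ''; // no match: the page name stays empty
}

// Assumed value; the real one depends on the server configuration.
$scriptName = '/pmwiki.php';

// Long URL: the request URI starts with the script name.
var_dump(pagename_from_uri($scriptName, '/pmwiki.php/News/News'));
// Short URL as logged: the script name never appears in the request URI.
var_dump(pagename_from_uri($scriptName, '//News/News'));
```

With these assumed values the first call yields News/News, while the second yields an empty string because the short URL does not begin with the script name. An empty page name would be consistent with the home page being served for every short URL.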
I have left the server at maquaris.de open, diagnostics are turned on.
Any help would be appreciated.
Regards,
Jo