[pmwiki-users] Improve export-to-html recipe

Petko Yotov 5ko at 5ko.fr
Thu Nov 13 08:00:35 PST 2025


Added to the documentation:

   --reject-regex '[?&]action='

This should block action links while allowing page links like 
?n=Main.HomePage when the wiki has not enabled $EnablePathInfo.
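
To compare the two patterns from this thread, here is a small sketch that
approximates the check with grep -E. It assumes wget tests the pattern
against the complete URL and accepts a match anywhere in it. The classify
helper and the example.org URLs are made up for illustration and are not
part of wget or PmWiki.

```shell
#!/bin/sh
# Sketch only: approximates wget's --reject-regex decision with grep -E.
# classify is a hypothetical helper. Usage: classify REGEX URL
# Prints "reject" if REGEX matches anywhere in URL, otherwise "fetch".
classify() {
  if printf '%s\n' "$2" | grep -Eq -e "$1"; then
    echo reject
  else
    echo fetch
  fi
}

OLD='(.*)\?(.*)'     # rejects every URL that contains a "?"
NEW='[?&]action='    # rejects only URLs that carry an action= parameter

# Example URLs (assumed, not taken from a real wiki)
for url in \
  'https://example.org/pmwiki.php?n=Main.HomePage' \
  'https://example.org/pmwiki.php?n=Main.HomePage&action=edit' \
  'https://example.org/Main/HomePage?action=diff' \
  'https://example.org/Main/HomePage'
do
  printf '%s old=%s new=%s\n' "$url" \
    "$(classify "$OLD" "$url")" "$(classify "$NEW" "$url")"
done
```

Only the first URL is treated differently: the older pattern rejects it
because of the "?", while the new one lets it through.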

Petko



On 13/11/2025 16:42, Johan Bengtsson wrote:
> Ok, I see now that $EnableRobotCloakActions was missing the dollar
> sign in our config.php, resulting in a PHP parse error!
> 
> But still  --reject-regex '(.*)\?(.*)'  is useful with wget.
> 
> -Johan
> 
> Med vänlig hälsning / Kind Regards
> CHALMERS
> Johan Bengtsson
> Supporttekniker IT | Support Technician IT
> Chalmers verksamhetsstöd | Chalmers Operations Support
> IT-avdelningen | IT Office | IT-support | IT Support
> +46(0)31 772 6734
> Chalmers tekniska högskola | Chalmers University of Technology
> Chalmers teknikpark / Sven Hultins Gata 9B
> SE-412 96
> Göteborg | Gothenburg
> Sverige | Sweden
> www.chalmers.se
> 
> 
> ________________________________________
> From: Petko Yotov <5ko at 5ko.fr>
> Sent: 13 November 2025 15:18
> To: Johan Bengtsson
> Cc: pmwiki-users
> Subject: Re: [pmwiki-users] Improve export-to-html recipe
> 
> For known bot user agents, $EnableRobotCloakActions and $RobotActions
> would cause links to be rewritten so that ?action= URLs do not appear in
> the HTML source, and the wiki would return 403 Forbidden, not 500, if
> the agent tries to access a forbidden action.
> 
> Moreover, wget is not among the default known bot user agents, so it
> wouldn't activate the rewriting or blocking.
> 
> An internal server error 500 often appears when there is a problem with
> one of the scripts, either an error or a PHP incompatibility in local
> configuration, in a recipe, or in the core.
> 
> It may also be unrelated to PmWiki, say a web application firewall
> configuration.
> 
> Are there any fatal error details in the PHP error log?
> 
> Petko
> 
> On 13/11/2025 14:38, Johan Bengtsson wrote:
>> Hi,
>> 
>> I tried https://www.pmwiki.org/wiki/Cookbook/ExportHTML but found that
>> it didn't work so well.
>> 
>> "EnableRobotCloakActions = 1; "  caused our wiki to just show an HTTP
>> error 500.
>> 
>> This worked:
>> 
>> wget --recursive --level 5 --no-clobber \
>>      --page-requisites -e robots=off --adjust-extension \
>>      --span-hosts --convert-links \
>>      --restrict-file-names=ascii --remote-encoding=ISO-8859 \
>>      --domains ourwiki.portal.se --no-parent \
>>      --no-host-directories --reject-regex '(.*)\?(.*)' \
>>      ourwiki.portal.se/FooWiki/SomeGroup
>> 
>> The important bit is --reject-regex '(.*)\?(.*)'   which stops
>> downloading of action= links.
> 


