[pmwiki-users] Improve export-to-html recipe
Petko Yotov
5ko at 5ko.fr
Thu Nov 13 08:00:35 PST 2025
Added to the documentation:
--reject-regex '[?&]action='
This should block action links while allowing page links like
?n=Main.HomePage when the wiki has not enabled $EnablePathInfo.
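
To see the difference between the two patterns without crawling anything, here is a minimal sketch that runs both against a few sample URLs. It assumes that `grep -E` is a fair stand-in for wget's default POSIX regex matching, where a match anywhere in the URL counts. The example.org URLs and the helper name `would_reject` are made up for illustration.

```shell
#!/bin/sh
# Sketch only: approximate wget's --reject-regex decision with grep -E.
# Assumption: wget's default --regex-type (posix) behaves like an ERE search.

# would_reject REGEX URL
# Exit status 0 if the URL matches the regex (wget would skip it), 1 otherwise.
would_reject() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

broad='(.*)\?(.*)'      # pattern from the original command: any URL with a "?"
narrow='[?&]action='    # pattern added to the documentation: only action= links

# Hypothetical sample URLs, with and without $EnablePathInfo-style paths.
for url in \
    'https://wiki.example.org/pmwiki.php?n=Main.HomePage' \
    'https://wiki.example.org/pmwiki.php?n=Main.HomePage&action=edit' \
    'https://wiki.example.org/Main/HomePage?action=print' \
    'https://wiki.example.org/Main/HomePage'
do
    for re in "$broad" "$narrow"; do
        if would_reject "$re" "$url"; then
            verdict=reject
        else
            verdict=fetch
        fi
        printf '%-14s %-7s %s\n' "$re" "$verdict" "$url"
    done
done
```

The broad pattern rejects every URL that carries a query string, including plain ?n= page links, so on a wiki without $EnablePathInfo it would skip the pages themselves. The narrow pattern only rejects URLs where action= follows a ? or an &.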
Petko
On 13/11/2025 16:42, Johan Bengtsson wrote:
> Ok, I see now that $EnableRobotCloakActions was missing the dollar
> sign in our config.php, resulting in a PHP parse error!
>
> But still --reject-regex '(.*)\?(.*)' is useful with wget.
>
> -Johan
>
> Med vänlig hälsning / Kind Regards
> CHALMERS
> Johan Bengtsson
> Supporttekniker IT | Support Technician IT
> Chalmers verksamhetsstöd | Chalmers Operations Support
> IT-avdelningen | IT Office | IT-support | IT Support
> +46(0)31 772 6734
> Chalmers tekniska högskola | Chalmers University of Technology
> Chalmers teknikpark / Sven Hultins Gata 9B
> SE-412 96
> Göteborg | Gothenburg
> Sverige | Sweden
> www.chalmers.se
>
>
> ________________________________________
> From: Petko Yotov <5ko at 5ko.fr>
> Sent: 13 November 2025 15:18
> To: Johan Bengtsson
> Cc: pmwiki-users
> Subject: Re: [pmwiki-users] Improve export-to-html recipe
>
> For known bot user agents, $EnableRobotCloakActions and $RobotActions
> would cause links to be rewritten to not show ?action URLs in the HTML
> source, and to return an error 403 Forbidden, not 500, if the agent
> tries to access a forbidden action.
>
> Moreover, wget is not among the default known bot user agents, so it
> wouldn't activate the rewriting or blocking.
>
> An internal server error 500 often appears when there is a problem with
> one of the scripts, either an error or a PHP incompatibility in local
> configuration, in a recipe, or in the core.
>
> It may also be unrelated to PmWiki, say a web application firewall
> configuration.
>
> Are there any fatal error details in the PHP error log?
>
> Petko
>
> On 13/11/2025 14:38, Johan Bengtsson wrote:
>> Hi,
>>
>> I tried https://www.pmwiki.org/wiki/Cookbook/ExportHTML but found that
>> it didn't work so well.
>>
>> "EnableRobotCloakActions = 1; " caused our wiki to just show an HTTP
>> error 500.
>>
>> This worked:
>>
>> wget --recursive --level 5 --no-clobber \
>> --page-requisites -e robots=off --adjust-extension \
>> --span-hosts --convert-links \
>> --restrict-file-names=ascii --remote-encoding=ISO-8859 \
>> --domains ourwiki.portal.se --no-parent \
>> --no-host-directories --reject-regex '(.*)\?(.*)' \
>> ourwiki.portal.se/FooWiki/SomeGroup
>>
>> The important bit is --reject-regex '(.*)\?(.*)' which stops
>> downloading of action= links.
>
> _______________________________________________
> pmwiki-users mailing list
> pmwiki-users at pmichaud.com
> http://www.pmichaud.com/mailman/listinfo/pmwiki-users