[pmwiki-users] Questions about PTV and PTV mass edit

ABClf languefrancaise at gmail.com
Thu Jul 4 12:34:08 PDT 2024


Thank you for your comments.
As for extending PTVs to more languages and accented words, it will be
nice for people using such alphabets;
nonetheless, I may still want to find and replace something other than a failing PTV.

As text files are a flexible way to store data, it might be convenient
to do text replacements with a text editor.
I understand PmWiki's flat files are a bit special though, and working
directly on them is not so easy.

From inside PmWiki, WikiSh used to offer a basic sed tool (sed
's/myname/yourname/g' Profiles.*), and in an ideal world it would be
convenient to have a search-and-replace tool working from inside
PmWiki itself.

Here is my very rudimentary test:
run a new PmWiki instance in some folder for testing purposes;
copy a bunch of page files from my website into its wiki.d;
go to the test URL and navigate: it works. Fine.

Now I connect to the server with ssh;
search (grep) for the pattern I want to change in the wiki.d files;
check the results: they look good.

Now I change that pattern (one word) to see what happens in real life:
grep -rl %0apréface: | xargs sed -i 's/%0apréface:/%0apreface:/g'
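
For safety, the same substitution can be previewed first without sed's -i,
so nothing is written (a sketch assuming GNU sed; same pattern as above):

grep -rl '%0apréface:' wiki.d/ | xargs sed -n 's/%0apréface:/%0apreface:/gp'

This only prints the lines as they would look after the change and leaves
the files untouched.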

The change is done; the files' timestamps are updated to the time of the
change (sadly sed has no option to preserve file times); .pageindex
doesn't seem to mind (it is not regenerated); browsing the pages still
works; the replacement is not in the page history, of course; and PmWiki
keeps showing the page's last-modified time according to time= when it exists.
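
If keeping the original file times matters, a small loop around sed could
save and restore each mtime with touch -r (a sketch, assuming GNU or BSD
touch and space-free file names, which is the case for wiki.d page files):

for f in $(grep -rl '%0apréface:' wiki.d/); do
  touch -r "$f" /tmp/stamp                     # remember the original mtime
  sed -i 's/%0apréface:/%0apreface:/g' "$f"    # do the replacement
  touch -r /tmp/stamp "$f"                     # restore the mtime afterwards
done

That would keep each file's timestamp consistent with the time= attribute
stored inside the page.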

It might look rustic or brutal, but it seems to work (of course it
works: a PmWiki flat file needs only 2 lines, plus 1 line for the link
list – see PmWiki/PageFileFormat).
The question is what price there is to pay for:
losing the replacement step in the history, having .pageindex out of date,
and time= differing from the file timestamp. Probably FastCache, if you
use it, will have to regenerate its cached files when the pages are browsed.
Anything else?
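
About the index: as far as I understand, .pageindex is only a cache, so if
it drifts out of date it can simply be deleted and PmWiki will rebuild it
progressively as pages are searched or listed (a long run on a big site, as
noted below):

rm wiki.d/.pageindex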

To Klaric: if your script covers the search-and-replace part, I don't
think I need it, because the patterns I may want to replace are very
limited; a big thank you for your offer ;).
That said, using Text Import is an interesting way of getting this work
done from within PmWiki itself. I will run some tests later; just tell me,
if possible:
1/ what did you download (the file in wiki.d, or the action=source page)?
2/ does Text Import keep the history, for edited pages which already
had a history?
(I have used Text Import before, but only for importing a bunch of
ordinary plain-text files created in a text editor, not for text files
coming from PmWiki.)

I understand it's a lot safer to have a ready-to-use backup.
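
Something as simple as a tar snapshot of wiki.d taken just before the
operation should do (a sketch; adjust the path to the real wiki directory):

tar czf wiki.d-before-replace.tar.gz wiki.d/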

Gilles.


On Thu, 4 Jul 2024 at 07:15, Petko Yotov <5ko at 5ko.fr> wrote:
>
> There is this recipe International PTVs that might work without
> mass-editing files:
>
>    https://www.pmwiki.org/wiki/Cookbook/InternationalPTVs
>
> I've added this recipe to the core candidates.
>
> The bits where iPTVs don't yet work, such as pagelist ordering or
> pagelist templates, should be made to work. :-)
>
> Alternatively, we can refactor PageTextVar() to allow pre-processing of
> the page text before PTVs are extracted. This could allow you to define
> variable replacements removing the accents, so that a text like this is valid:
>
>    édition : Larousse
>
> Then in the wiki you use {$:edition} without accents instead of
> {$:édition}.
>
>
> Note that in both cases FastCache may not detect a change in config.php
> that modifies the pages' HTML without the pages themselves being modified.
>
> As for the mass editing, I second Gregor's implied suggestion to not do
> this directly on the live website. I'd also recommend having a backup
> snapshot from before the operation, ready to be restored.
>
> Petko
>
> On 04/07/2024 00:39, ABClf wrote:
> > I understand your answer, thank you.
> > Here is more information, to complete my question, in case someone
> > has advice.
> >
> > At this first stage, what I am planning to do is very basic and tiny
> > replacements;
> > for example, I would like to be able to replace things such as:
> >
> > édition :
> > with
> > edition :
> >
> > I guess I have around 20 terms to replace, affecting a few
> > thousand pages.
> >
> > For such a basic task (one-word replacements, no multiline, no links
> > affected), I would expect unix tools to be effective – as long as you
> > know how to use them safely ;) – rather than playing with the PmWiki API
> > (first: far beyond my knowledge; second: I may have a lot of pages
> > to process, and it looks like overkill to edit/save hundreds or
> > thousands of pages for a one (or ten) word replacement).
> >
> > Also, for such a cosmetic/minor/negligible task, I don't mind losing
> > the diff; I do restore pages sometimes, but not often, and I should
> > be able to double-check the restored copy.
> >
> > I understand it is much better to preserve the timestamps during the
> > replacement (sure, I don't want to touch the unix time= data, and
> > indeed I do use the FastCache recipe).
> >
> > In any case, it is true that PmWiki's page file format is somewhat tricky
> > (to me, it is one of the very few things I regret about PmWiki), but the
> > planned task would have a very minimal impact and is expected to be
> > quite easy.
> >
> > I didn't dare use Anomen's XMLPageStore for my main site (if
> > something bad happens, will I be able to fix it? what if there is no more
> > support in case of a PHP incompatibility?), but I do use it with success
> > for a private and less important part. The wiki.d files created
> > with that script are a lot cleaner. No idea about the security
> > issues.
> >
> > As for .pageindex creation, it takes a long time, but it worked the last
> > time it ran; nevertheless, I'm not sure the server likes the experiment
> > very much, and because it is quite a big and long task in my case, I
> > limit myself to regenerating .pageindex only when needed.
> >
> > Thank you,
> > Gilles.
> >
> >
> > On Wed, 3 Jul 2024 at 22:03, Petko Yotov <5ko at 5ko.fr> wrote:
> >>
> >> On 03/07/2024 21:48, Petko Yotov wrote:
> >> > PTVs are normally not indexed in .pageindex, only the page names, last
> >> > modified times, and the list of words for searches.
> >>
> >> Also the link targets from the page to other pages or to categories
> >> are
> >> stored in .pageindex.
> >>
> >> Petko


