[pmwiki-users] Questions about PTV and PTV mass edit

ABClf languefrancaise at gmail.com
Mon Jul 8 17:37:19 PDT 2024


My latest test went as follows:

I tried ImportText, following Gregor's suggestion, and have
a few remarks and one issue (which might be related to my own config).

To get the source of every targeted page (minus the diff and PmWiki
header lines), I first tried the bash script suggested in
PmWiki/PageFileFormat,
without success: my wiki.d has per-group customization plus
subfolders, and I was not able to make it work recursively.
A few hours later, ChatGPT was helpful enough to give me working
code for my case; here it is, slightly modified:

#!/bin/bash

# Source directory containing the page files to convert
SOURCE_DIR="/Users/gilles/zbackup/www/dev6/wiki.d/Source"
# Target directory for the converted files
TARGET_DIR="/Users/gilles/tmp/pmwikiexport"

# Make sure the target directory exists
mkdir -p "$TARGET_DIR"

# Loop over the files in the source directory and its subfolders
# (unless globstar is enabled, "**" behaves like "*", so the second
# pattern reaches one level of subfolders)
for file in "$SOURCE_DIR"/* "$SOURCE_DIR"/**/*
do
  # Only process regular files
  if [ -f "$file" ]; then
    # Extract the file name without the path
    filename=$(basename "$file")
    # Keep the text= line, decode it with sed and save the result
    # in the target directory
    grep '^text=' "$file" | sed -e 's/^text=//' -e 's/%0a/\n/g' \
      -e 's/%25/%/g' > "$TARGET_DIR/$filename"
  fi
done

Not very fast but it worked.
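
For page files nested at any depth, the same extraction could be sketched with find and awk. This is only a sketch under assumptions: the helper names decode_page_text and export_tree are made up for illustration, dot files such as .pageindex are skipped, and "%3c" is additionally decoded to "<", which to my understanding PmWiki also escapes when it writes page files.

```shell
#!/bin/bash
# Sketch only: decode_page_text and export_tree are hypothetical helper names.

# Print the decoded wiki text of one PmWiki page file.
decode_page_text() {
  awk '/^text=/ {
    sub(/^text=/, "")    # drop the "text=" prefix
    gsub(/%0a/, "\n")    # encoded newline
    gsub(/%3c/, "<")     # encoded "<" (assumption, see above)
    gsub(/%25/, "%")     # encoded "%", decoded last on purpose
    print
  }' "$1"
}

# Walk src recursively and write one decoded file per page into dst.
# The output is flat, like the script above: page file names are
# expected to be unique (Group.Name).
export_tree() {
  local src=$1 dst=$2 file
  mkdir -p "$dst"
  find "$src" -type f ! -name '.*' | while IFS= read -r file; do
    decode_page_text "$file" > "$dst/$(basename "$file")"
  done
}
```

It would be called as export_tree "$SOURCE_DIR" "$TARGET_DIR". Decoding "%25" last matters: a literal "%0a" typed in a page is stored as "%250a", and decoding it earlier would turn it into a line break.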

Now I have my wiki.d pages converted to source files on my computer,
ready to be edited.
Four words are replaced in every file.
It looks OK at first sight.

I upload the files by FTP to the /import folder, then browse a page and add ?action=import.

The script prints a message which is not accurate (it doesn't list the files
in /import).
No matter, the import is processed in the background.
A few minutes later, the pages are imported (a little more than 4300 very
light pages). Good.

Everything looks fine, but there is a pretty annoying issue with the
{$LastModifiedSummary} text saved by ImportText: rather
than a revision number and the numeric diff size, as
expected, plus the ImportText signature, I get a crazy line of this kind
(https://www.languefrancaise.net/Source/503?action=diff):

08/07 par ImportText - (-2) N (-2) N (-3) N (-2) N (-2) N (-2) N (-3)
N (-3) N (-2) N (-2) N (-2) N (-3) N (-2) N (-3) N (-3) N (-2) N (-3)
N (-2) N (-2) N (-4) r3 (-3) N (-4) r2 (-4) r2 (-2) N (-2) N (-2) N
(-4) r4 (-3) N (-2) N (-2) N (-2) N (-2) N (-4) r2 (-4) r3 (-2) N (-2)
N (-2) N (-3) N (-3) N (-3) N (-4) r3 (-2) N (-2) N (etc.)
(many more lines, for many more files).

Could this be fixed? Or is my config the culprit? Do I have to
comment out the revision script before running ImportText?

In my config, I have:

$DiffKeepNum               = 50; # keep at least 50 revisions (default is 20)
(...)
$EnableRCDiffBytes         = 1; # Show bytes difference from previous version
(...)
# Revision counter
$FmtPV['$RevCount'] = '$page["rev"]';
# Page creation date https://www.pmwiki.org/wiki/PmWiki/PageVariables
# add page variable {$PageCreationDate} in format yyyy-mm-dd
#$FmtPV['$PageCreationDate'] = 'strftime("%Y-%m-%d", $page["ctime"])';
#If you like the same format that you define in config.php with $TimeFmt use
$FmtPV['$Created'] = "strftime(\$GLOBALS['TimeFmt'], \$page['ctime'])";

# Revision number in history
# (old version) array_unshift($EditFunctions, "NumeroRevision");
InsertEditFunction("NumeroRevision", '>SaveChangeSummary');
function NumeroRevision($pagename,$page,&$new)
{
 global $EnablePost, $ChangeSummary, $Now;
 if (!$EnablePost) return;
 $rev = intval(@$page['rev'])+1;
 if($rev==1) $ajout = "N";
 else $ajout = "r$rev";
 $new['csum'] .= " $ajout";
 $new["csum:$Now"] .= " $ajout";
 $ChangeSummary .= " $ajout";
}

Also, in my case, since I only want to search and replace a few
words, which have nothing to do with the page content, I'm not sure
it's a good idea to have that tracked in the page history (4000 edits,
that's quite a lot).
So, in case ImportText gets an update someday, would it be doable to
have a switch, in config for example, to choose whether the changes are
saved in the page history or not?

In any case, ImportText will be handy for non-trivial edits (adding or
removing a link, multiline edits, inserting a paragraph, etc.);
for trivial edits, if you don't need the PmWiki saving process, a cautious
search and replace in the wiki.d files might be OK.
Or, as suggested by Hans, use Fox search and replace.
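
A cautious search and replace of that kind could be sketched as follows: take a snapshot first, rewrite only the files that contain the string, and restore each file's modification time afterwards. This is a sketch under assumptions, not a tested procedure for a live wiki: replace_in_pages is a made-up helper name, the search and replacement strings are assumed to contain no characters special to sed, and the backup path must not exist yet.

```shell
#!/bin/bash
# Sketch only: replace_in_pages is a hypothetical helper name.
# Assumes "from" and "to" contain no characters special to sed
# (no "/", "&", "\" or regex metacharacters).
replace_in_pages() {
  local dir=$1 from=$2 to=$3 backup=$4 file stamp
  # Snapshot first; refuse to reuse an existing backup path.
  if [ -e "$backup" ]; then
    echo "backup already exists: $backup" >&2
    return 1
  fi
  cp -Rp "$dir" "$backup" || return 1
  # Rewrite only the files that actually contain the string.
  grep -rlF -- "$from" "$dir" | while IFS= read -r file; do
    stamp=$(mktemp)
    touch -r "$file" "$stamp"          # remember the original mtime
    sed "s/$from/$to/g" "$file" > "$stamp.new" &&
      cat "$stamp.new" > "$file"       # overwrite in place, keeping permissions
    touch -r "$stamp" "$file"          # put the original mtime back
    rm -f "$stamp" "$stamp.new"
  done
}
```

For example: replace_in_pages wiki.d '%0apréface:' '%0apreface:' wiki.d.bak. Like the plain grep and sed one-liner, this changes every matching line in the file, including the stored history lines, and it leaves .pageindex alone.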

Note to Hans.
Sorry Hans, I did not see your suggestion in time (it was put in
spam, I don't know why).
I do use Fox for different things, but I won't use it in the current
case. The changes to be made are trivial, but the number of pages is too
large. What makes ImportText interesting here is that it should
still complete if interrupted, and it keeps a list of what has been done.
Also, I believe FoxTextReplace is to be used with caution (I'm prone
enough to making mistakes).

Thank you.


On Fri, 5 Jul 2024 at 17:05, Petko Yotov <5ko at 5ko.fr> wrote:
>
> On 04/07/2024 21:34, ABClf wrote:
> > Now change that pattern (one word) to see what will happen in real life
> > :
> > grep -rl %0apréface: | xargs sed -i 's/%0apréface:/%0apreface:/g'
>
> This may work well.
>
> I'd still recommend making a backup snapshot in case something goes
> wrong.
>
> > The change is done; the files' timestamps are updated to the time of
> > the change (sadly sed has no option to preserve file time); pageindex
> > doesn't seem to be affected (it is not regenerated); browsing pages
> > still works; the replacement is not in the history, of course; PmWiki
> > keeps showing the page's last modified time according to time= if it exists.
> >
> > It might look rustic or brutal but it seems to work (sure it
> > works, a PmWiki flat file needs only 2 lines, plus 1 line for the link list –
> > PmWiki/PageFileFormat);
> > the question is what price there is to pay:
> > losing the replacement step in the history,
>
> May not be a problem.
>
> The problem with external edits and page history is when there are
> changes on multiple lines not included in the internal page history. If
> someone tries to restore an earlier version from the ?action=diff page,
> it may or may not work. The diff restore-merge function restores the
> lines per the line numbers at the time of every edit, starting with the
> latest text. If at some point the text was externally modified and the
> line numbers were changed, it may restore earlier versions wrong.
>
> I feel in your case, single word replacement, this will not be a
> problem.
>
> > have pageindex not up to date,
>
> Not likely a problem. It may affect search results for "préface" or
> "preface"
>
> If UnaccentUTF8 is enabled, as you only remove accents, pageindex
> doesn't need updating:
>
>    https://www.pmwiki.org/wiki/Cookbook/UnaccentUTF8
>
> If .pageindex returns pages that do not match the terms, PmWiki will
> reindex these pages.
>
> > time= different than file timestamp. Probably fastcache, if you use
> > it, will have to regenerate cached files when they are browsed.
>
> Looking closer into FastCache, it would only delete the cache and
> regenerate the page after you edit it on the wiki.
>
> Depending on $FastCacheInvalidateAllOnUpdate, changing any page should
> delete all cache (by default) or will update only the cache for the
> edited page.
>
> This means, visitors may still see the previously cached page before
> your mass editing. If the PTVs are visible on the rendered page,
> visitors may see in your pages "préface:" instead of "preface:".
>
> I don't expect this to be a major problem: purists on a linguistic
> website are unlikely to be offended when a word is written correctly.
>
> OTOH the word is not capitalized so they may well be. :-P
>
> > Something else ?
>
> I'd still recommend making a backup snapshot in case something goes
> wrong. :-)
>
> Petko


