[pmwiki-users] Sync of local pmwiki with remote pmwiki

Joachim Durchholz jo at durchholz.org
Fri Jul 22 16:22:07 CDT 2005


Radu wrote:

> Just a few pointers, cos I'm still looking for the answer. :)
> 
> Component Software has a set of tools for doing quite accurate diffs 
> (AFAICT, smaller diffs than the gnu diff)
> http://www.componentsoftware.com/Products/RCSBrowser/index.htm
> 
> And for even more precise diffs, try WinMerge
> http://winmerge.sourceforge.net/
> 
> These are good if you have both file trees on the same windows machine, 
> and they're both very time-consuming on large directories (not something 
> you'd like to do in a cron job).

They should run quickly if there are few changes. If they don't, they 
are broken ;-p

I'm not sure whether using external software is a good idea. (More on 
that below.)

> Unfortunately, I don't know of any tool (Unix or otherwise) that does 
> automatic merging of changes between two changed files. cvs and rcs lock 
> files when they're open for editing.

No, not at all.
CVS doesn't lock. If you try to save an edited file to the source 
tree that somebody else edited in the meantime, you're presented with a 
merged file, with the conflicting lines highlighted.
RCS will lock files if requested to do so - we don't need to request 
that ("If it hurts, don't do it" the doctor says *g*). It comes with a 
"merge" command that will do the merges, and present any conflicts in a 
merged file. (If that sounds similar to CVS, that's no accident: CVS 
uses RCS for file storage, diffs, and merges.)

The problem with using RCS is that it does a line-by-line comparison. 
That's usually too coarse a granularity, since a line of wiki markup is 
typically a whole paragraph, so it amounts to a by-paragraph comparison. 
Which means that PmWiki would have to translate the raw markup into 
files with one line per sentence, run the merge tool over them, and then 
identify which consecutive lines in the merged result need to be joined 
back into paragraph lines.
Doesn't sound very attractive to me :-)
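
To make the round trip concrete, here is a rough sketch in Python (not 
PmWiki code). It assumes one paragraph per line, a naive rule that a 
sentence ends at ".", "!" or "?" followed by whitespace, and a made-up 
sentinel line (PARA_END) to remember where paragraphs end. All names are 
hypothetical.

```python
import re

PARA_END = "\x1e"  # hypothetical sentinel line marking the end of a paragraph


def to_sentence_lines(text):
    """Turn paragraph-per-line text into one sentence per line."""
    out = []
    for paragraph in text.split("\n"):
        # Naive split: sentence-ending punctuation followed by whitespace.
        out.extend(s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s)
        out.append(PARA_END)
    return out


def to_paragraph_lines(lines):
    """Inverse step: join sentence lines back into paragraph lines."""
    paragraphs, current = [], []
    for line in lines:
        if line == PARA_END:
            paragraphs.append(" ".join(current))
            current = []
        else:
            current.append(line)
    if current:  # trailing sentences without a sentinel
        paragraphs.append(" ".join(current))
    return "\n".join(paragraphs)
```

Note that the round trip only restores the text exactly when sentences 
are separated by single spaces, and a merge that touches the sentinel 
lines themselves would need extra care. That is part of why this route 
looks unattractive.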

OTOH there's already a diff in PHP, and it's already in PmWiki (or so 
I've been told). Writing the merge function should be relatively 
straightforward.
The algorithm is like this:
1. Identify the last common base version that both edits started from.
2. Do a diff from the common base to each of the two submitted versions. 
   (Use a form of the diff output that preserves full context.)
3. Scan the diffs linearly:
   - Sections that are unchanged in both edits can be copied over as-is.
   - If you find a change, check whether there was a change to the 
     corresponding part in the other diff; if there was none, copy over 
     the change.
   - If there was a change in the corresponding part, you have an edit 
     conflict. Insert an obtrusive marker in the output, followed by 
     both versions of the conflicting part, and continue.
4. After the scan is complete, if there was no conflict, save the page; 
   otherwise return the merged result to the user in an edit page.

This algorithm isn't exactly trivial, but it's not rocket science either :-)
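
A minimal sketch of such a merge, in Python rather than PHP and using 
the standard difflib module instead of PmWiki's own diff (which I 
haven't checked). It assumes the common base version is already known, 
works on lists of lines, and borrows CVS-style conflict markers; the 
function names are made up for illustration.

```python
from difflib import SequenceMatcher


def _unchanged(base, other):
    """Map each base index that survives unchanged to its index in `other`."""
    mapping = {}
    matcher = SequenceMatcher(None, base, other, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = block.b + k
    return mapping


def merge3(base, mine, theirs):
    """Three-way merge of line lists. Returns (merged_lines, had_conflict)."""
    in_mine = _unchanged(base, mine)
    in_theirs = _unchanged(base, theirs)
    # Base lines kept by both edits act as synchronisation points.
    anchors = [i for i in range(len(base)) if i in in_mine and i in in_theirs]

    merged, conflict = [], False
    b = m = t = 0  # start of the current unresolved chunk in each version
    for i in anchors + [None]:  # None stands for "end of all three files"
        if i is None:
            b_end, m_end, t_end = len(base), len(mine), len(theirs)
        else:
            b_end, m_end, t_end = i, in_mine[i], in_theirs[i]
        base_chunk = base[b:b_end]
        mine_chunk = mine[m:m_end]
        theirs_chunk = theirs[t:t_end]

        if mine_chunk == base_chunk:
            merged += theirs_chunk  # only "theirs" changed (or nobody did)
        elif theirs_chunk == base_chunk or mine_chunk == theirs_chunk:
            merged += mine_chunk  # only "mine" changed, or both agree
        else:
            conflict = True  # both changed the same part differently
            merged += (["<<<<<<< mine"] + mine_chunk + ["======="]
                       + theirs_chunk + [">>>>>>> theirs"])

        if i is not None:
            merged.append(base[i])  # the line common to all three versions
            b, m, t = i + 1, m_end + 1, t_end + 1
    return merged, conflict
```

With base ["a", "b", "c"], one edit changing "b" to "B" and the other 
appending "d", this yields ["a", "B", "c", "d"] with no conflict. If 
both edits change "b" differently, the two variants come back between 
the markers and the caller would send the user back to the edit page.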

Regards,
Jo



