[pmwiki-users] FW: cookbook "ShellTools" (was: Include specific lines of text on a page)

Peter & Melodye Bowers pbowers at pobox.com
Mon Jan 21 09:27:38 CST 2008


> Why create separate 'head' and 'tail' markup expressions? Why not just
> one, if one is needed at all?

It could certainly be a single tool.  There's some advantage to keeping a
consistent vocabulary with an existing toolset.  Those who are familiar with
Linux get an immediate return on investment rather than going through a
learning curve.  (And a higher percentage of Linux shell users would use
these tools, as opposed to the general pmwiki public, simply due to the
mindset and familiarity.)  But as has been said elsewhere, there is no
compelling reason to keep the same names, options, etc.  For instance, I
lobbied for "grep", but that's already pretty cryptic compared to a more
intuitive "find" or "extract-text" or ...

> >> Looking again at the (:include ..:) options I realise that in the
> >> grep.php script I lost the ability to source text not just from whole
> >> pages, but from specified sections. I think it would be useful to gain
> >> that too.
> 
> > This would be a very powerful feature - I have no idea how it's
> > implemented, but I've been reading enough of your stuff in the forum to
> > know that you are very experienced in that field.  And this may be one
> > of the places where it makes a lot of sense to deviate from simple
> > text-file-processing and do something wiki-specific.
> 
> This could be part of the 'head and tail' tool or function.

Certainly it makes sense to reuse code wherever possible.  My tendency in
this area is to say "the simpler the better" for the individual tools and
then let the user combine them in complicated ways.  But that may be
counter-intuitive in a pmwiki context, and in that case I've got no problem
trying to put together a one-tool-fits-all solution -- but if we go that way
then it makes sense to move away from the shell terminology completely.  A
"grep" which combines sed and head and tail is going to be more confusing to
a shell person who expects these names to lead to the
simple-tools-combined-in-complex-ways kind of solution.
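
For illustration, the two approaches might look something like this in page
markup (the names and options here are hypothetical, not anyone's actual
recipe syntax).  Simple tools, nested freely:

{(tail -n 2 (head -n 5 (grep "B" file?)))}

versus one combined tool with its own options:

{(extract "B" file? head=5 tail=2)}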

Maybe we're looking at two completely different functional specs, and maybe
what is needed is two different cookbooks.  Perhaps you could continue with
the tool you've already made great strides on (preferably with a different
name?), and if I can borrow from your foundation maybe I'll continue a
slower development of a wiki-sh or shelltool cookbook...

> > If I wanted to find the first n lines or the last n lines there could
> > be an insertion of a (head...) or (tail...) at any level outside the
> > (grep...).  I don't know if I'm adequately showing how powerful these
> > simple tools can be when given the flexibility to order them and
> > combine them in a freeform way...
> 
> yes you do. I just don't know an efficient way for multiple page
> processing having separate markup expressions.

The version of tail I most recently submitted demonstrates that you are not
re-processing multiple files.  You are using the nesting functionality to
process the files once (at the innermost level of the nesting), and then at
each succeeding outer level of nesting you are only processing the OUTPUT of
the next level inside...  Your "grep" already allows for multiple files
using wildcards.  When I surrounded the "grep" call with "tail" it did not
go and look at the files again -- it only looked at the matching lines which
your "grep" returned...
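
To make that concrete, here is a rough PHP sketch -- not the code from
either of our recipes, and the function names and argument handling are
assumptions for illustration only.  PmWiki's markupexpr.php evaluates the
innermost (...) first and hands its text result to the enclosing
expression, so the outer function never touches the original pages or files:

<?php
# Hypothetical "grep": when handed text (e.g. the output of an inner
# expression) it just filters the lines; only when no inner expression
# supplies text would a real recipe go off and read pages or files.
function ME_Grep($pattern, $text) {
  $out = array();
  foreach (explode("\n", $text) as $line)
    if (strpos($line, $pattern) !== false) $out[] = $line;
  return implode("\n", $out);
}

# Hypothetical "tail": keep the last $n lines of whatever text it is
# given; it never re-reads the files that produced that text.
function ME_Tail($n, $text) {
  return implode("\n", array_slice(explode("\n", $text), -$n));
}

# Hypothetical registration; a real recipe would also parse options
# such as -n out of $args before calling the function.
$MarkupExpr['grep'] = 'ME_Grep($args[0], $args[1])';
$MarkupExpr['tail'] = 'ME_Tail($args[1], $args[2])';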

Let's say I have a file named "fileA" containing

XYZ line 1
ABC line 2
DEF line 3
BBB line 4

And a file named "fileB" containing

QWE line 1
ERT line 2
CVB line 3

Then I execute this command:

{(tail -n 2 (grep "B" file?))}

Grep looks out there, finds the 3 lines with a B in them, and returns them
to the markup expression, so the invisible in-between step looks like this:

{(tail -n 2 (ABC line 2
BBB line 4
CVB line 3))}

Then "tail" doesn't care at ALL where those 3 lines came from -- it just
goes ahead and finds the last 2 of those 3 lines and the ultimate output
(having processed through the lines ONCE) is as follows:

BBB line 4
CVB line 3
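
In terms of the sketch above, the same trace in plain PHP would look
something like this (again purely illustrative -- the string contents stand
in for whatever the real grep read from disk):

$fileA = "XYZ line 1\nABC line 2\nDEF line 3\nBBB line 4";
$fileB = "QWE line 1\nERT line 2\nCVB line 3";

# Inner step: grep scans the files once and keeps only the matching lines.
$matches = ME_Grep("B", $fileA . "\n" . $fileB);
# $matches is now "ABC line 2\nBBB line 4\nCVB line 3"

# Outer step: tail sees only that text; the files are never re-read.
echo ME_Tail(2, $matches);   # prints the last 2 lines: BBB line 4, CVB line 3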

Maybe I'm misunderstanding the concern about multiple page processing?  Or
perhaps I didn't communicate clearly enough before how I envisioned the
nesting was going to work...

This functionality exists in the "tail" I submitted most recently, so it
can be tested...

-Peter
