Without wandering too far into pontificating and math theory, one of the
things I learned long ago, at the dawn of the personal computing age, is
that any data set can be searched and sorted in a very short time if it is
arranged into no more than 7 (ideally) logical groups, with each group
subdivided the same way at every level below that.

This applies to real-world data sets as well, whether or not they lend
themselves directly to a mathematical sort.

Take 1,000 pages.  If they can be logically split into 7 subsets, that
gives you roughly 140 pages to deal with in each. Dividing those once again
into seven subsets gives you only about 20 pages at the third level and
about 3 at the fourth. The challenge is to give careful thought to the top seven
categories.  If you find that too much is falling into one category, then
that category is too broad; if too little, then it can probably best be
combined with another.
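
To make the arithmetic concrete, here is a rough Python sketch. The
function names and the even 7-way split are my own assumptions for
illustration; real categories are never this tidy, and the sizes are
rounded up to whole pages.

```python
def group_sizes(total_items, fanout=7, depth=3):
    """Largest group size at each level, assuming an even split.

    Hypothetical helper: index 0 is the whole pile, index 1 is one
    of the top-level groups, and so on down.
    """
    sizes = [total_items]
    for _ in range(depth):
        # Integer ceiling division: a leftover page still needs a home.
        sizes.append((sizes[-1] + fanout - 1) // fanout)
    return sizes


def levels_needed(total_items, fanout=7, leaf_size=1):
    """How many splits until a group holds at most leaf_size items."""
    levels = 0
    size = total_items
    while size > leaf_size:
        size = (size + fanout - 1) // fanout
        levels += 1
    return levels


print(group_sizes(1000))    # [1000, 143, 21, 3]
print(levels_needed(1000))  # 4
```

So under these assumptions four splits take you from 1,000 pages down to
a single page.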

For my research on Civil War soldier records I have 6 million men to deal
with.  These can be logically split into Union and Confederate, then by
state, then by unit of service (regiment or battalion), then by company.
So in 4 levels I can drill down from 6 million names to a list of 100 or
so names, which sorts logically on last name.
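
The same drill-down can be sketched as nested dictionaries in Python.
This is only an illustration of the idea, not how my own records are
stored; the field order, the function names, and the sample rows
(placeholder states, units, and names) are all invented for the example.

```python
import bisect


def build_index(records):
    """Nest records as side -> state -> unit -> company -> [last names].

    Each record is a (side, state, unit, company, last_name) tuple;
    that layout is an assumption made for this sketch.
    """
    index = {}
    for side, state, unit, company, last_name in records:
        roster = (index.setdefault(side, {})
                       .setdefault(state, {})
                       .setdefault(unit, {})
                       .setdefault(company, []))
        # Keep each company roster sorted by last name as it grows.
        bisect.insort(roster, last_name)
    return index


def roster_for(index, side, state, unit, company):
    """Four lookups from the top; an empty list if the path is missing."""
    return (index.get(side, {})
                 .get(state, {})
                 .get(unit, {})
                 .get(company, []))


# Invented placeholder rows, not real soldiers.
SAMPLE = [
    ("Union", "State X", "Unit 1", "A", "Name C"),
    ("Union", "State X", "Unit 1", "A", "Name A"),
    ("Union", "State X", "Unit 1", "B", "Name B"),
    ("Confederate", "State Y", "Unit 2", "A", "Name D"),
]

index = build_index(SAMPLE)
print(roster_for(index, "Union", "State X", "Unit 1", "A"))
```

Each lookup only has to choose among a handful of keys at its level,
which is the whole point of keeping the groups small.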

The war itself also breaks down into logical groupings: places (10,000
elements), events (6,000 elements), and dates (1,500 days). Places break
out into states and counties.  Events break out into battles, skirmishes,
campaigns, and naval actions, and dates break out into weeks and months.
Again any given data item can be reached at 3 or 4 levels from the top
once you understand the structure.

If you're stymied as to how to group your data elements, take a look
online at the Library of Congress (LOC) classifications. These eggheads
spend their entire lives putting knowledge into piles.  It may (probably
will) surprise you that someone has already organized your pile for you.

John Rigdon

