[pmwiki-users] bibliographies revisited
christian.ridderstrom at gmail.com
Sun Oct 1 11:13:18 CDT 2006
On Thu, 28 Sep 2006, John Rankin wrote:
> >I don't have time to follow up on this properly right now, so this is
> >just a quick reply.
OK, I've finally had time to look at your web pages, read the thread
properly, and reply. Actually, I've already sent in a few
replies (taken from an initial version of this post). It's still a bit
long, so when possible we should try and split it up into sub-threads for
I would like to begin with an overall question to keep in mind regarding
your suggested plan/solution:
Does it scale?
The scaling I consider here concerns questions such as:
* Will it work well in practice when the bibliographic database
(bib-db) grows to hundreds of entries?
* Can different authors maintain and use their own bib-db:s?
* Can an author maintain and use several separate bib-db:s?
Achieving scalability *is* difficult, so I'm not saying that we must
require all of the above answers to be yes. We should however be clear on
which questions are answered with a 'no, not initially', and which will be
I am especially worried that choosing a user interface and storage
mechanism now will make it difficult to scale up the system later.
Compatibility and interoperability with BibTeX?
In an ideal world, it would be possible to take a bunch of BibTeX files
and import them to a wiki where they are then used and possibly modified.
It would also be possible to go the other way and export (parts of) the
bibliographic information from the wiki back to the same bunch of BibTeX
files.
Phrased differently, the wiki would allow an online storage mechanism for
the bibliographic information that people could use collaboratively. We
should consider questions such as:
* To what extent will it be possible to import .bib-files?
Are there restrictions on citation keys?
* Will it be possible to go back and forth between .bib-files and the
wiki?
* How do we maintain correspondence between .bib-files and bibliographic
information on the wiki?
And now to reply to your post:
> >> > 1. what is the downside to storing each citation in its own wiki
> >> > page, where the unique reference id becomes the page name?
> >> I've never done much in the way of BibTex or bibliographies, so my
> >> comments should be discounted accordingly. However, I know that
> >> sometimes it's a pain if an author is limited to a one-to-one
> >> correspondence between pages and data.
> >My reaction is that I'd probably be very annoyed if I was forced to use
> >a one-to-one mapping. My experience is that it's a PITA to do it that
> >Using *virtual* pages on the other hand might be a good (parallel)
> I don't understand this comment, I'm afraid.
I simply mean that we could have a system where the bibliography database
(bib.db) is stored in one wiki page (or .bib-file), but *pretend* that
each citation entry resides on a separate page. Let's assume that the
bib.db is actually stored in the page Site.BibDB. Also assume we have a
virtual group called Bibliography. By going to the page
the system will extract the entry in the bibliography database that
corresponds to Ridderstrom2003-LLC and generate the page on demand. Note
that we could easily use a separate URI argument (eg ?bibstyle=unsrt) to
specify a specific style for the citation.
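
To make the virtual page idea concrete, here is a minimal sketch, assuming the whole bib-db is available as one string of BibTeX text. It is written in Python purely for illustration (PmWiki itself is PHP), and the names split_entries and virtual_page, as well as the placeholder entries, are made up for this example. It ignores complications such as "@" characters inside field values.

```python
import re

# Start of an entry such as "@article{SomeKey," (the key runs up to the comma).
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s{}]+)\s*,")


def split_entries(bibdb_text):
    """Map each citation key to the raw text of its entry."""
    entries = {}
    for match in ENTRY_START.finditer(bibdb_text):
        depth = 0
        # Walk forward from the "@" until the entry's braces balance again.
        for pos in range(match.start(), len(bibdb_text)):
            char = bibdb_text[pos]
            if char == "{":
                depth += 1
            elif char == "}":
                depth -= 1
                if depth == 0:
                    entries[match.group(2)] = bibdb_text[match.start():pos + 1]
                    break
    return entries


def virtual_page(bibdb_text, key):
    """Return the entry a virtual page for `key` would show, or None."""
    return split_entries(bibdb_text).get(key)


# Placeholder database text; the fields are invented for the example.
BIBDB = """
@phdthesis{Ridderstrom2003-LLC,
  note = {placeholder entry}
}
@misc{Example2006,
  note = {another {nested} placeholder}
}
"""

print(virtual_page(BIBDB, "Ridderstrom2003-LLC"))
print(virtual_page(BIBDB, "NoSuchKey"))  # None: the UI could offer an "add entry" form
```

The point of the sketch is that the lookup works on the citation key as a plain string, so the storage side does not by itself impose page-name restrictions on keys; any such restriction would come from how the key is passed in the URI.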
Note: The key I used for my thesis is actually
so I wouldn't be able to refer to my thesis using a page name.
In a similar fashion, going to a page that does not correspond to an entry
in the Bibliography database could allow for a simple way to let the user
add a new entry to the database without having to edit the database
So in essence I mean that we could use the virtual pages as a mechanism
for the user interface. However, the issue with restrictions on citation
keys makes me reluctant to use this interface. I would literally have to
change hundreds of keys if I were to re-use my bibliographic database.
> >The rest I'll hopefully have time to read and give comments on
> While commenting, perhaps give thought to how much of a PITA
> this would be. There appears to be a *lot* more development to
> support a multiple citations per page solution, as you lose
> the ability to re-use a whole lot of existing code.
Hmm... are you sure about that? I suggest we discuss that separately -
I've posted about it in a new thread.
> Why, exactly, would it be a PITA? Is removing this pain worth increasing
> the budget by 10%, 20% ... 100% ... whatever it takes?
Primarily it is my gut feeling... and from the gut, quite a bit of extra
work on the *backend* would be worth it. My instinct is that you wouldn't
regret it. On the other hand, if you don't do it, I think it is likely
that you will regret it.
Looking at it differently, my instinct is that you should keep the backend
very similar to BibTeX files, but consider the paradigm of "one page, one
citation" to be part of the user interface.
As an aside, if you are interested I could send you copies of the BibTeX
files that I used during my PhD. They were useful for much more than just
maintaining a database of bibliographic information. For instance, I used
to add abstracts for each entry, as well as my private notes after reading
a paper. That let me use the database to also remember what the paper was
about. I also kept my information in different BibTeX files (4-5 files?).
One of these files was sort of a "macro"-file, that defined stuff like the
names of journals or conferences. This was useful in order to make sure
that journal names etc. were written in a consistent manner in the
bibliography. Something else I used to annotate in each entry was the
affiliation(s) of the author(s). This was useful to see what work was
being done by which groups.
Reading the above, I realize that having an abstract/summary is not a
reason for not storing a single citation per page. I also realize that
(:pagelist:) could be used efficiently to get a list of all entries that
for instance correspond to a certain affiliated group. Note that in
practice this almost requires you to use macros for the affiliation name;
otherwise you run into all the problems of misspellings :-(
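
The macro idea (one shared file of names that entries refer to by a short label) could be sketched roughly as follows, assuming the macro file uses BibTeX's @string definitions with double-quoted values. This is Python for illustration only; read_macros and expand are hypothetical helper names, and the journal and conference names are invented.

```python
import re

# Matches definitions such as: @string{jfoo = "Journal of Foo"}
STRING_DEF = re.compile(
    r'@string\s*\{\s*([A-Za-z][\w-]*)\s*=\s*"([^"]*)"\s*\}',
    re.IGNORECASE,
)


def read_macros(macro_file_text):
    """Collect @string macros; names are folded to lower case."""
    return {name.lower(): value
            for name, value in STRING_DEF.findall(macro_file_text)}


def expand(value, macros):
    """Expand a bare field value if it names a macro, else return it as-is."""
    return macros.get(value.strip().lower(), value)


# Invented macro file contents, for the example only.
MACRO_FILE = '''
@String{jfoo = "Journal of Foo"}
@string{cbar = "Conference on Bar"}
'''

macros = read_macros(MACRO_FILE)
print(expand("JFoo", macros))       # the macro's full name
print(expand("{Literal}", macros))  # braced values are left untouched
```

With something like this, an affiliation or journal is spelled out in exactly one place, which is what makes a later listing by affiliation reliable.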
So the reason I can think of right now is that sometimes you really wish
to be able to edit a lot of entries at once (in the same edit form). A
very simple example is if I maintain a list of URIs in the bibliography
database. In this case I think that I'd prefer keeping a whole bunch of
URIs in a single page. If I had to click up a different page for each URI,
I'd probably get annoyed quickly. Or let's say I'm editing more complex
entries, and I have to click between *lots* of different pages just to get
consistent naming etc. (A typical case is that you want to make sure the
name of an author is spelled correctly and consistently, and that you
consistently use either the full first name or just an initial.)
I also think that it might get very annoying to have the change history of
modifications to a bibliography database spread out over lots of different
pages. Let's say you are editing the database (kept on many pages) in
order to get consistent naming, and then you decide -oops- that it was
wrong. You now have to revert the history of each and every one of those
For the URIs, a similar thing might be that a domain name changes, and you
now have to update all the URIs pointing to that domain. With all the URIs
in a single page, the change is relatively easy - and nicely kept in the
history. With a single URI per page, it'll get tedious.
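
As a rough illustration of why the single-page case is easy: if all entries live in one text, the domain change is one substitution and one history entry. The sketch below is Python for illustration only; rename_domain is a hypothetical helper and the domain names are placeholders.

```python
import re


def rename_domain(bibdb_text, old_domain, new_domain):
    """Replace old_domain with new_domain in http(s) URLs.

    Returns the new text and the number of URLs changed.
    """
    pattern = re.compile(
        r"(https?://)" + re.escape(old_domain) + r"(?=[/:}\"\s]|$)"
    )
    return pattern.subn(lambda m: m.group(1) + new_domain, bibdb_text)


# Placeholder entries, invented for the example.
PAGE = (
    "@misc{A, url = {http://old.example.org/paper.pdf}}\n"
    "@misc{B, url = {http://old.example.org}}\n"
    "@misc{C, url = {http://notold.example.org/other.pdf}}\n"
)

new_page, changed = rename_domain(PAGE, "old.example.org", "new.example.org")
print(changed)
print(new_page)
```

With one URI per page, the same change would mean one edit (and one history entry to revert) per page.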
The most important point is however the compatibility with BibTeX files,
i.e. being able to go back and forth between the bibliographic information
on the wiki and what you have as local BibTeX files. Please note that it
also is important to be able to have a few different BibTeX files - I
don't think it would be enough to get a correspondence to a single BibTeX
file, and not just because you want to have a separate (BibTeX) file that
defines macros for consistent naming.
I can see that the easy way right now is to store a single citation on a
single wiki page. This would quickly let us create a wiki system with a
single bibliography database for the entire wiki (and hence all authors).
Assuming we have an import mechanism from a BibTeX file, I can also see
how you could automatically populate the pages in the group Bibliography/.
We do however have to think about the reverse process. If the citations in
Bibliography/ come from two different BibTeX files, I'd want to be able
to update the BibTeX files with the relevant citations from Bibliography/.
This requires that each entry "knows" which BibTeX file it belongs to (or
something like that). We could of course maintain a correspondence between
one group and one BibTeX file, but then we lose some of the advantages of
having Bibliography/ as a site-wide bibliographic database.
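
One way to picture an entry that "knows" its file: each imported entry carries the name of the .bib-file it came from, and export simply groups by that name. This is only a sketch in Python under that assumption; the "source" and "raw" fields, the function name export_by_source and the file names are all invented for the example.

```python
from collections import defaultdict


def export_by_source(entries):
    """Group entries by originating .bib file and build each file's text."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["source"]].append(entry["raw"])
    # Entries keep their wiki-side order within each file.
    return {name: "\n\n".join(raws) + "\n" for name, raws in grouped.items()}


# Placeholder entries as they might look after import (hypothetical layout).
ENTRIES = [
    {"source": "thesis.bib", "raw": "@misc{KeyA,\n  note = {placeholder}\n}"},
    {"source": "control.bib", "raw": "@misc{KeyB,\n  note = {placeholder}\n}"},
    {"source": "thesis.bib", "raw": "@misc{KeyC,\n  note = {placeholder}\n}"},
]

for filename, text in export_by_source(ENTRIES).items():
    print(filename)
    print(text)
```

Where the source name is kept (a field in the entry, a page attribute, or the group name) is exactly the design question raised above; the sketch only shows that some such tag is enough to drive the export.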
That loss might of course be inevitable, as authors probably want their
own bibliographic databases. I've got pages in a separate group on
pmwiki.org, and I'd probably expect to be able to have a separate
bibliographic database for those pages. Perhaps even more than one.
> Is multiple citations per page a "must have on day one" or
> a "could be added later" requirement?
I would say it is a "must be able to have later on, or it'll bite me" in
the sense that I'll be restricted. My gut also says that being able to
synchronize between a set of local BibTeX files, and a set of bibliography
databases on the wiki is necessary (later on).
(As an aside, needing different bibliography groups might be a motivation
for having a hierarchical group structure - then each group could have
its own Bibliography/ sub-group if that is desired).
> In particular, check
> This attempts to define a practical work plan of small chunks, in a
> logical build sequence. Feel encouraged to add to or resequence the
> plan, but please don't remove any of the items. My feeling is that the
> first 3 stages are about 2:1:2 in relative size.
Your plan is probably workable, but I'm worried the system will have some
serious limitations. As for the limitations, that could be ok if you are
ready to consider the system a useful first prototype, i.e. something
that's testing the waters and getting you familiar with what users need. I
would however be prepared to do a complete re-design of the whole thing
after having some practical experience with using it.
But it also depends very much on what you are trying to achieve. Our goals
are probably different, and maybe my aims are set much too high after being
used to the full power of BibTeX?
I think that the "safe" plan would be slightly different. For instance,
I'd start with being able to import *and* export a *set* of .bib-files as
I expect that functionality can become difficult to add later on.
I am not completely convinced that the best way in the long run is to have
one or more bibliography groups, but I agree that it is a good starting
point - certainly good enough for a usable prototype system. The main
advantage is that we get a lot of functionality for free. I am worried
about restrictions on scalability from that system though...
Anyway, getting into implementation details before agreeing on the scope
is probably a bad idea... I would however at least investigate the idea of
storing the bibliographic databases in one or more .bib-files. It might be
possible to re-use a lot of existing code for many of the tasks.
Christian Ridderström, +46-8-768 39 44 http://www.md.kth.se/~chr