May 27, 2010

An immodest and hopefully obvious proposal for electronic citations

I had a thought today after reading about Barnes & Noble's new iPad app, which allows customers to lend and borrow purchased books. I haven't heard whether annotations travel with a loaned book, but it strikes me that academics who need to cite locations in ebooks and people interested in annotation technology both need a way to refer to locations within electronic documents.

The problem for academics looking for citation conventions is that we're all used to page numbers, which give us a way to identify a location manually by flipping through pages (or by hunting for a letter or other archival document within a file folder). Do we really need that sort of human-navigated location specificity? If we can search for text inside a document, we certainly don't. But some reference format is still needed, and I think there would be an easy way to create another convention that would serve both academic purposes and ereader technology:


What's that, you ask?

location/file number (within envelope, 1 if no envelope)/file size/file checksum (using some conventional algorithm)

Given a particular edition (i.e., an uncorrupted file in a recognized format with a known file size and checksum), this would give a precise location. With a different edition, the approximate location within the file and the first part of the quoted passage should be sufficient for finding the passage quickly. Let's call the three numbers a brief spot location reference and the numbers plus the quotation the spot location reference. What if you're referring to a passage?
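To make the spot reference idea concrete, here is a rough sketch of how one might be built and then resolved. Everything specific in it is my assumption rather than part of the proposal: the location is taken to be a byte offset, SHA-256 stands in for "some conventional algorithm," and the function names `spot_reference` and `resolve` are made up for illustration.

```python
import hashlib


def spot_reference(data: bytes, offset: int, file_number: int = 1) -> str:
    """Build location/file number/file size/checksum for one file.

    Assumptions: location is a byte offset, checksum is SHA-256.
    """
    if not 0 <= offset < len(data):
        raise ValueError("offset is outside the file")
    checksum = hashlib.sha256(data).hexdigest()
    return f"{offset}/{file_number}/{len(data)}/{checksum}"


def resolve(reference: str, quote_start: str, data: bytes) -> int:
    """Return the byte offset of the quoted passage in data, or -1."""
    offset_s, _file_number, size_s, checksum = reference.split("/")
    offset, size = int(offset_s), int(size_s)
    if size <= 0:
        raise ValueError("file size in reference must be positive")

    # Same edition (checksum matches): the stored location is exact.
    if hashlib.sha256(data).hexdigest() == checksum:
        return offset

    # Different edition: scale the location to the new file size, then
    # pick the occurrence of the quotation nearest to that guess.
    needle = quote_start.encode("utf-8")
    guess = offset * len(data) // size
    before = data.rfind(needle, 0, guess + len(needle))
    after = data.find(needle, guess)
    candidates = [p for p in (before, after) if p != -1]
    return min(candidates, key=lambda p: abs(p - guess), default=-1)
```

With a made-up file `old` and a revised copy `new` that has extra front matter, `resolve(spot_reference(old, 9), "The rain", new)` would search outward from the scaled position rather than trusting the stored offset.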


I know I'll be torn limb-from-limb by my fellow historians, until I point out the following:

When Patto/d her hat./
This passage shows the protagonist's commitment to blah blah blah yadda yadda yadda./
Sherman Dorn/20100527080312-0500

That's the range reference, the first and last ten characters of the (theoretical) passage, the annotation text, the annotation author, and the timestamp of the annotation. And there, ladies and gentlemen, is a format for annotating electronic materials. It does not require changing the EPUB format, just a separate file of annotations and ereader software that can put each annotation in the right place (using the start and end of the passage for disambiguation). Annotations stored this way can be shared, accumulated, analyzed, etc.
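As a hedged sketch of what the ereader side might do with such a record, the snippet below finds a passage from its first and last ten characters and reads the timestamp. The function names `locate_passage` and `parse_timestamp` are hypothetical, the `near` argument is my stand-in for a position hint from the range reference, and I am assuming the timestamp is year-month-day-hour-minute-second followed by a UTC offset, which is what the example above looks like.

```python
from datetime import datetime
from typing import Optional, Tuple


def parse_timestamp(stamp: str) -> datetime:
    """Parse a stamp like 20100527080312-0500 (assumed layout)."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S%z")


def locate_passage(
    text: str, first: str, last: str, near: int = 0
) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the passage that opens with `first`
    and closes with `last`, or None if it cannot be found."""
    # Try from the hinted position first, then fall back to the whole text.
    start = text.find(first, near)
    if start == -1:
        start = text.find(first)
    if start == -1:
        return None
    # The closing characters must appear at or after the opening ones.
    end = text.find(last, start)
    if end == -1:
        return None
    return start, end + len(last)
```

Matching on both ends of the passage is what lets the software tell apart two passages that happen to open with the same ten characters.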

There may be important reasons why this wouldn't work, but I can't think of them at the moment.

