Since 2010, I’ve been maintaining a list of “interesting esoterica” – papers, books, essays and poems that I find interesting entirely on their own merits. It’s mainly bits of esoteric maths – hence the name – but I’ve also included quite a few things just because they have amusing titles. The main idea is that when I’m talking to someone and want to show them a cool thing that I’ve half-remembered, I can look up the exact reference: I’ve shared the paper “Orange peels and Fresnel integrals” more times than I can count (probably the same as the number of times I’ve eaten an orange).
Back when I started the collection, I used the free “research manager” Mendeley to keep track of things – it offered an easy way to press a button while looking at a paper and instantly save all the right metadata about it to my personal library. Over the years, the experience of using Mendeley has either got worse or not improved as much as I’d like it to: I use my desktop PC less than I used to, and the desktop app hasn’t improved much anyway; the web interface to my library has steadily lost features as they rewrote it; and because I set the Interesting Esoterica collection up as a group, Mendeley won’t let me download copies of PDFs that I uploaded to it. There’s always been the little niggle that I’m giving data to Mendeley, a private company (and now a subsidiary of Elsevier), who can decide what I’m allowed to do with it.
And what I want to do with my Interesting Esoterica collection isn’t what Mendeley is normally used for – I’m not writing research papers which need to have citations formatted in the correct style; I’m not working with a group of co-authors; and some of the stuff I’m collecting isn’t classic “research material” at all. What I do want is to present the things I’ve found nicely – the collection is an aesthetic project as much as anything.
While scraping the dates, I noticed that the Mendeley web interface uses American-style dates. How unenlightened!
I decided that keeping all of my information in a .bib file would be the best way to keep it usable, no matter what happens in the future. The BibTeX format does what I want anyway – record references to things I’ve seen – and if I’d made up a database scheme, I’d have had to think about how to generate a .bib export anyway.
The hosting service that I use for my personal sites, as well as The Aperiodical, only supports yucky PHP scripting, not lovely Python, so I set about hacking together something which would parse my .bib file and display its contents in a web page without being able to rely on any of the good existing Python libraries to process BibTeX. There are a few PHP libraries which claim to parse BibTeX, including phpBibLib, which I’ve used before for a work thing. On running it over the file I exported from Mendeley, however, I discovered that it doesn’t really parse BibTeX, it just does some global string replacements, and some of the fruitier accented character commands in my file broke it. I took a look at the source code to see if I could fix it, but decided to start again from scratch: to get phpBibLib to parse properly would involve effectively rewriting it anyway, and it isn’t released under a licence that would let me share my changes easily.
Fairly shortly I had a script which loaded my .bib file and rendered a summary in HTML. My first attempt at writing a formal parser using a PEG library was far too slow to be usable, but once I’d written that I was able to rewrite a more low-level parser which is fast enough to parse the entire file each time a page is loaded. If you’re interested in using my parser, it’s on my github repository.
Next, I added forms to edit existing entries and add new ones. One of the great attractions of Mendeley was that I could use a bookmarklet to add the page I’m looking at to my library, so that was the first feature to go in. Then it turned out that the Mendeley data wasn’t that great – papers from the arXiv didn’t always have the right metadata and links to PDF versions attached, and a few other entries must have just been PDFs that I put in the desktop app, so I had to track down relevant URLs based on their titles. That took a couple of hours.
Finally, I could work on making the site look nice. I made the index page list all the entries in a patchwork design, with each item painted using a different colour drawn from a desaturated palette. On the detail page for an individual entry, I made sure the title, authors and abstract were prominent, and when the entry had a reference to a PDF file I embedded it on the right hand side. To be completely open and make sharing even easier, I show the BibTeX for each entry at the bottom of the page, and there’s a link to download the whole .bib file at the bottom of the index page. In an inspired last touch, I added an “I’m feeling scholarly” button next to the search form, to take you to a random entry. That’s my favourite thing about the collection: whenever you pick a random item from it, you’ll find something really interesting and entertaining. I’ve had a lot of fun rediscovering things that I’d added a few years ago and almost completely forgotten.
So, that’s the story of how I liberated my data from a benevolent cloud service, and made a site to present it just the way I want. You can have a look at my Interesting Esoterica collection at its new home, read.somethingorotherwhatever.com.
If you’d like to do something similar with your .bib file, I’ve put all my code on GitHub at github.com/christianp/bib-site. I’ve set it up so you can easily change how the site looks, without meddling with PHP code. All you need is a web host that can run PHP – no database or weird CGI stuff required.