Calibre ISBN detector

I recently bought a Sony PRS-600 (a fairly good reader with a nice touch screen; if you read Polish, see my review) and became interested in ebook management. For a Linux user, the only reasonable option seems to be Calibre, a useful application which not only lets me manage my reader but also provides a well-designed ebook database.

One of Calibre's nice features is that once you enter a book's ISBN, plenty of useful information (canonical versions of the author name and book title, description, cover, even tags) can be downloaded automatically. But, for some reason, the application does not detect the ISBN itself. I repeated the sequence (open a book, go a page or a few down, copy the ISBN, go back to Calibre, open the book data, paste the ISBN) a few times and decided it was boring and could be automated.

So I wrote a short script which performs this very action.

Purpose

The script analyses the Calibre database (it assumes Calibre is already installed and properly configured), looks for books without an ISBN, then tries to find each ISBN by scanning the book's leading pages. If an ISBN is found, the script saves it (updates the Calibre metadata of the given book). No other metadata changes are performed.
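To illustrate the "find an ISBN in the leading pages" step, here is a minimal sketch of how a candidate could be spotted in extracted text and checked against the standard ISBN-10 and ISBN-13 check digits. It is not the code of the script itself: the names ISBN_RE, is_valid_isbn10, is_valid_isbn13 and find_isbn are hypothetical, and the sketch assumes the number is printed right after the word "ISBN" and is separated by hyphens only.

```python
import re

# Assumption: the number follows the word "ISBN" (optionally "ISBN-10"/"ISBN-13"),
# and its groups are separated by hyphens, not spaces.
ISBN_RE = re.compile(r"ISBN(?:-1[03])?[:\s]*([0-9][0-9\-]{8,15}[0-9Xx])")

DIGITS = "0123456789"


def is_valid_isbn10(s):
    """Check the ISBN-10 check digit: weighted sum (10..1) divisible by 11."""
    if len(s) != 10:
        return False
    if not all(c in DIGITS for c in s[:9]) or s[9] not in DIGITS + "X":
        return False
    total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
    return total % 11 == 0


def is_valid_isbn13(s):
    """Check the ISBN-13 check digit: weights 1,3,1,3,... sum divisible by 10."""
    if len(s) != 13 or not all(c in DIGITS for c in s):
        return False
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(s))
    return total % 10 == 0


def find_isbn(text):
    """Return the first valid ISBN found in text (hyphens removed), or None."""
    for match in ISBN_RE.finditer(text):
        candidate = match.group(1).replace("-", "").upper()
        if is_valid_isbn10(candidate) or is_valid_isbn13(candidate):
            return candidate
    return None
```

Validating the check digit is a deliberate choice here: copyright pages are full of digit runs (years, print runs, catalogue numbers), and rejecting anything that fails the checksum keeps most of them out.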

Later on, the ISBN can be used to grab the book metadata and/or the book cover inside the Calibre GUI. Just start Calibre and look for books with an ISBN set and metadata missing, for example using a query like:

 isbn:~[0-9] not publisher:~[a-z]

(The above means: isbn contains some digit, publisher does not contain any letter.) Then select the appropriate books (I prefer to handle them in batches of no more than 10-20 so I can review the changes easily), right-click, expand the Edit Metadata Information submenu and pick Download Metadata (or some other Download option).

Prerequisites

The script has been developed and used on Ubuntu Linux. It should work on other platforms, including Windows and Mac, if the necessary tools are installed, but I haven't tested it there.

Calibre must be installed, properly configured and have some books in the database (otherwise it does not make sense to run the script). The calibredb command must be in PATH (alternatively, the CALIBREDB variable at the beginning of the script can be modified to contain the full path to calibredb).

Tools providing the following commands:

pdftotext
catdoc
djvutxt

must be installed and present in PATH. On Ubuntu or Debian they can be installed from the standard repositories; just install the following packages: poppler-utils, catdoc, djvulibre-bin, either using a GUI or by running

$ sudo apt-get install poppler-utils catdoc djvulibre-bin
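Before running anything, it may be handy to confirm that the commands listed above are actually reachable. The sketch below is an illustration only, not part of the script: it uses shutil.which (so it assumes a modern Python 3, not the Python 2.6 the script itself targets), and the helper names missing_tools and extraction_command are hypothetical. The command lines it builds are my assumption about how the tools could be asked for the leading pages of a book; the script may call them differently.

```python
import shutil

# Commands named in this section, plus calibredb, which must also be in PATH.
REQUIRED_TOOLS = ["calibredb", "pdftotext", "catdoc", "djvutxt"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of tools that cannot be found in PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def extraction_command(path, max_pages=10):
    """Build a command that prints the text of a book to standard output.

    Assumed invocations (not taken from the script):
      pdftotext -f 1 -l N file.pdf -   first N pages, "-" means stdout
      djvutxt -page=1-N file.djvu      first N pages
      catdoc file.doc                  whole document (no page option used)
    Returns None for formats none of the three tools handles.
    """
    lower = path.lower()
    if lower.endswith(".pdf"):
        return ["pdftotext", "-f", "1", "-l", str(max_pages), path, "-"]
    if lower.endswith(".djvu"):
        return ["djvutxt", "-page=1-%d" % max_pages, path]
    if lower.endswith(".doc"):
        return ["catdoc", path]
    return None


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found in PATH: " + ", ".join(missing))
    else:
        print("All required tools are available.")
```

The returned list can be passed directly to subprocess, which avoids shell quoting problems with file names containing spaces.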

Python 2.6 is required (the script uses features of tempfile and subprocess introduced in 2.6). The lxml library must also be installed. On Debian or Ubuntu, just install the python2.6 and python-lxml packages, for example by:

$ sudo apt-get install python2.6 python-lxml

Download and Installation

The script is available here (to download it, just click raw and save the file as guess_and_add_isbn.py in any folder of your choice).

Usage

Open a terminal or console, check that PATH is set properly, then run:

$ python guess_and_add_isbn.py

and wait for the script to finish.

Note: it may take some time, especially on bigger databases.

The script can be run while Calibre is running (it will notify the running Calibre about data changes). There is a minor annoyance in that case: every time a book is updated, Calibre refreshes the book list and forgets which books were selected, so I do not recommend searching or editing books while the script is running.

The script can be safely re-run (for example after new books are added).

Source code

Official repository: http://bitbucket.org/Mekk/calibre_utils
