Calibre utils update

Some updates to my Calibre helper scripts (which I described in an earlier post): the ISBN guessing script now covers a wider range of file formats, and there are two new scripts. One converts all .doc files to .rtf; the other cross-checks the disk directory against the Calibre database and reports documents not registered in Calibre.

ISBN guessing improvements

The guess_and_add_isbn.py script now also handles .chm files. arCHMage is used to extract the text, so it must be installed. It is packaged on Debian and Ubuntu, so this is enough:

$ sudo apt-get install archmage

The regular expressions used to locate ISBNs were slightly improved; some variations that were formerly missed are now handled.
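To give an idea of what ISBN guessing involves, here is a minimal sketch. It is not the code from guess_and_add_isbn.py: the pattern, the function names (is_valid_isbn, guess_isbn) and the decision to validate check digits are all assumptions made for illustration. It looks for text that follows the word "ISBN", strips hyphens and spaces, and accepts the candidate only if the ISBN-10 or ISBN-13 checksum is correct.

```python
import re

# Illustrative pattern (an assumption, not the script's actual regex):
# "ISBN", an optional "-10"/"-13" tag, optional colon/whitespace, then a run
# of digits, hyphens and spaces ending in a digit or X.
ISBN_CANDIDATE = re.compile(
    r"ISBN(?:-1[03])?[:\s]*([0-9][0-9Xx\- ]{8,16}[0-9Xx])",
    re.IGNORECASE,
)


def is_valid_isbn(digits):
    """Check the ISBN-10 or ISBN-13 check digit of a separator-free string."""
    if len(digits) == 10:
        total = 0
        for position, char in enumerate(digits):
            if char == "X" and position == 9:
                value = 10  # X is only allowed as the final check character
            elif char.isdigit():
                value = int(char)
            else:
                return False
            total += (10 - position) * value
        return total % 11 == 0
    if len(digits) == 13 and digits.isdigit():
        # Weights alternate 1, 3, 1, 3, ... across all 13 digits.
        total = sum(
            int(char) * (1 if position % 2 == 0 else 3)
            for position, char in enumerate(digits)
        )
        return total % 10 == 0
    return False


def guess_isbn(text):
    """Return the first checksum-valid ISBN found in text, or None."""
    for match in ISBN_CANDIDATE.finditer(text):
        digits = re.sub(r"[\s-]", "", match.group(1)).upper()
        if is_valid_isbn(digits):
            return digits
    return None
```

Validating the check digit is a cheap way to reject random digit runs that merely look like an ISBN, at the cost of skipping books whose printed ISBN contains a typo.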

Finally, while the script is running, it now properly reports the number of parsed files.

Convert docs to RTF

The convert_docs_to_rtf.py script locates all books that have a .doc format but no .rtf and creates the .rtf versions. I wrote it because Calibre itself cannot convert .doc files, but it can convert RTF to ebook formats.
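The "has .doc but no .rtf" selection can be sketched as below. This is only an illustration under an assumption: it walks a directory tree and compares file names, while the real script may well ask the Calibre database instead. The function name docs_missing_rtf is hypothetical, and the conversion step itself is not shown.

```python
import os


def docs_missing_rtf(root):
    """Yield paths of .doc files that have no same-named .rtf file beside them."""
    for dirpath, _dirnames, filenames in os.walk(root):
        # Compare case-insensitively so "Book.RTF" counts as a match for "book.doc".
        lowered = {name.lower() for name in filenames}
        for name in sorted(filenames):
            base, ext = os.path.splitext(name)
            if ext.lower() == ".doc" and (base + ".rtf").lower() not in lowered:
                yield os.path.join(dirpath, name)
```

Each path it yields is a candidate to hand over to the converter.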

OpenOffice is used for the conversion, so it must be installed. The ootools library must be installed as well; the simplest way to do that is:

$ easy_install ootools

Note: the script can report obscure errors (Segmentation fault) while shutting down. I haven't tracked the cause down (it is either a bug in one of the libraries or my misuse of them), but it is harmless: it happens after all conversions are finished, while the helper objects are being destroyed.

Find hanging books

The find_books_missing_in_database.py script cross-checks the Calibre database against the contents of the Calibre disk folder and reports any books that are present on disk but not registered in Calibre (and so not found by searches and generally invisible in the interface).
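The core of such a cross-check is a set difference between "files on disk" and "files the database knows about". The sketch below assumes the registered paths have already been collected into a list, relative to the library folder; how the real script reads them from Calibre is not shown here. The function name find_unregistered and the list of ignored bookkeeping files are assumptions for illustration.

```python
import os

# Files Calibre keeps for its own bookkeeping; they are not book formats.
# (This ignore list is an assumption of the sketch.)
IGNORED_NAMES = ("metadata.db", "metadata.opf", "cover.jpg")


def find_unregistered(library_dir, registered, ignored_names=IGNORED_NAMES):
    """Return sorted relative paths of files on disk that are not in `registered`."""
    known = {os.path.normpath(path) for path in registered}
    missing = []
    for dirpath, _dirnames, filenames in os.walk(library_dir):
        for name in filenames:
            if name in ignored_names:
                continue
            relative = os.path.relpath(os.path.join(dirpath, name), library_dir)
            if os.path.normpath(relative) not in known:
                missing.append(relative)
    return sorted(missing)
```

The comparison here is case-sensitive on purpose: a file registered as "Book.epub" but stored as "book.epub" shows up in the report, which is exactly the kind of mismatch worth knowing about.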

I wrote the script after I corrupted my repository a little by putting it on Dropbox and using it from two machines. For some reason my Calibre installations disagreed about upper and lower case in names, Dropbox made things worse by disallowing files whose names differed only by letter case, and I ended up with a database containing some books without any format, plus unregistered books on the disk. The script can also be used as a general health-checking tool.

Dropbox is very cool; I sync many files with it and wholeheartedly recommend it. This is just one of those corner cases where it does not fit.
