I have been using F-Spot for some time, but decided to try digiKam (I am starting to feel that albums are not such a bad idea after all, and I like the autodetection of new and changed images). As it turned out, digiKam failed to import the F-Spot tags (even though I was working in Write metadata to file mode), so I had to do some scripting to copy the F-Spot tags to the digiKam database.
Below is the crucial script, along with some extra comments.
Importing images
digiKam insists on using the predefined image directory. Its name depends on the KDE language version: in my case it is called ~/Obrazki, in case of the English locale ~/Pictures. F-Spot allows one to configure such a dir, but defaults to ~/Photos.
So, to transfer my images from F-Spot to digiKam, I had to make my images visible to digiKam (and have it import them).
It turned out to be trivial. I just removed my ~/Obrazki (aka ~/Pictures) and symlinked the root of my photo directory there instead:
$ rmdir ~/Obrazki
$ ln -s ~/Photos ~/Obrazki
(I could also have moved the dir, but this way F-Spot keeps working, should I want to go back.)
Then, I started digiKam. It automatically spotted the new folders and images, popped up a progress bar and imported all my photos. They all showed up properly in digiKam, with the correct date and time.
But without tags!
So I wrote a short Python script, which copied all the tags for me.
Copying tags
Both F-Spot and digiKam share the same idea of storing and managing the program data. Individual photos are stored on disk, in a specified directory, while metadata (file names, locations, tags, ...) are kept in an SQLite database. Those databases can be found in ~/.gnome2/f-spot/photos.db and ~/Obrazki/digikam3.db (or Pictures/digikam3.db, or something similar, depending on your language).
So, to copy the tags, I had to read them from photos.db and write them to digikam3.db. Of course, those databases use different data structures but, as it turned out, they are very similar.
To analyze the data structure I used commands like:
$ sqlite3 ~/.gnome2/f-spot/photos.db
sqlite> .tables
... list of tables here ...
sqlite> .schema tags
... tags table definition here ...
sqlite> select id, name from tags;
... bzzz ....
and
$ sqlite3 ~/Obrazki/digikam3.db
sqlite> .tables
... list of tables here ...
sqlite> .schema Images
... Images table definition here ...
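The same inspection can be done from Python with the standard sqlite3 module, which is handy when you want to compare both databases in one go. This is only a minimal sketch: the helper names list_tables and table_columns are made up for illustration, and the in-memory database with two tables is a stand-in so the example runs anywhere. Point sqlite3.connect() at a copy of photos.db or digikam3.db to inspect the real thing.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [row[0] for row in rows]


def table_columns(conn, table):
    """Return the column names of one table, in declaration order."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    # A table name cannot be bound as a parameter, so it is quoted by hand.
    quoted = '"%s"' % table.replace('"', '""')
    return [row[1] for row in conn.execute("PRAGMA table_info(%s)" % quoted)]


# Stand-in database (an assumption, not the real F-Spot schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER)")
conn.execute("CREATE TABLE photo_tags (photo_id INTEGER, tag_id INTEGER)")

for table in list_tables(conn):
    print(table, table_columns(conn, table))
```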
It turned out that (apart from differently named tables and fields) there are just a few important differences between those databases:
- F-Spot uses the full file path to identify the image (photos.uri), while digiKam identifies the image by the base file name (without directory) and the album id. The album id references a separate albums table, where albums are defined by the path relative to the main digiKam photos directory. So, while in F-Spot some photo has uri file:///home/marcink/photos/2008/01/08/100_1989.JPG, in digiKam the same photo has name 100_1989.JPG and belongs to the album with url /2008/01/08/.
- F-Spot uses a two-level tag hierarchy (category and tag), represented by the is_category and category_id fields in the tags table, while digiKam allows arbitrary tag trees (pid, or parent id, in the Tags table).
After some reconsideration I decided to ignore the latter (it is easier to drag and drop tags in digiKam after the import than to write hierarchy-copying code).
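The first difference boils down to splitting one F-Spot URI into an album url and a file name. Here is a minimal standalone sketch of that step in Python 3. The function name split_fspot_uri and the photos_root parameter are made up for illustration, and percent-decoding the URI is an assumption about how F-Spot stores paths. Note that the album url returned here has no trailing slash, so it is worth checking what the url column of the Albums table really contains in your database.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def split_fspot_uri(uri, photos_root):
    """Split an F-Spot file URI into (album_url, file_name).

    photos_root is the filesystem directory that digiKam treats as the
    root of the collection (hypothetical parameter).
    """
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError("not a file URI: %s" % uri)
    # Assumption: special characters in the URI are percent-encoded.
    path = unquote(parts.path)
    root = photos_root.rstrip("/")
    if not path.startswith(root + "/"):
        raise ValueError("%s is outside %s" % (path, root))
    relative = path[len(root):]  # for example /2008/01/08/100_1989.JPG
    album_url, file_name = posixpath.split(relative)
    return album_url, file_name


print(split_fspot_uri(
    "file:///home/marcink/photos/2008/01/08/100_1989.JPG",
    "/home/marcink/photos"))
```

Stripping only the leading prefix, rather than replacing it everywhere in the string, avoids surprises when the prefix text happens to occur again deeper in the path.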
The script
Here is my script. It heavily uses SQLAlchemy - the great database access layer for Python. I actually used it, and it properly copied all my tags.
You can try running it as it is (maybe turning the final commit into a rollback on the first run, to see what the script is actually doing), or use it as a basis for some similar tool. I'd recommend closing digiKam before it is run.
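To illustrate the commit-versus-rollback trick in isolation, here is a minimal sketch using the standard sqlite3 module instead of SQLAlchemy. The helper run_in_transaction, its dry_run flag and the add_tag function are hypothetical names, and the one-table database is a stand-in, not the real digiKam schema.

```python
import sqlite3


def run_in_transaction(conn, work, dry_run=True):
    """Run work(conn), then commit its changes or undo them on a dry run."""
    try:
        result = work(conn)
    except Exception:
        conn.rollback()
        raise
    if dry_run:
        conn.rollback()  # changes were visible inside work(), now discarded
    else:
        conn.commit()
    return result


def add_tag(conn):
    """Insert one tag and report how many tags the transaction sees."""
    conn.execute("INSERT INTO Tags (pid, name) VALUES (0, 'holidays')")
    return conn.execute("SELECT count(*) FROM Tags").fetchone()[0]


# Stand-in database with a simplified Tags table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tags (id INTEGER PRIMARY KEY, pid INTEGER, name TEXT)")
conn.commit()

seen_inside = run_in_transaction(conn, add_tag, dry_run=True)
left_after = conn.execute("SELECT count(*) FROM Tags").fetchone()[0]
print("rows seen during dry run:", seen_inside)
print("rows left after dry run:", left_after)
```

During a dry run the work function still sees its own inserts, so any progress messages it prints remain meaningful, while the database file stays untouched.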
You can also download this script.
#!/usr/bin/python
# -*- coding: utf-8 -*-

import os

home = os.environ['HOME']
fspot_db = "sqlite:////%(home)s/.gnome2/f-spot/photos.db" % locals()
digikam_db = "sqlite:////%(home)s/Obrazki/digikam3.db" % locals()

# Prefix to remove from F-Spot urls to get the digikam URLs.
fspot_mount = r'file:///home/marcin/Photos'

# Turn into true to see all SQL commands
ECHO_DB = False

############################################################

from sqlalchemy import create_engine, MetaData
from sqlalchemy import Column, Table, \
     DateTime, UnicodeText, Integer, String, Unicode,\
     ForeignKey
from sqlalchemy.sql import func, join
from sqlalchemy.orm import mapper, sessionmaker, \
     deferred, backref, relation
from sqlalchemy.exceptions import InvalidRequestError

############################################################
# F-Spot database
############################################################

fspot_engine = create_engine(fspot_db, echo=ECHO_DB)
fspot_metadata = MetaData()

fspot_tags_table = Table(
    "tags", fspot_metadata,
    Column("name", Unicode()),
    autoload = True, autoload_with = fspot_engine)
# id, name, category_id, is_category, sort_priority, icon

fspot_photo_tags_table = Table(
    "photo_tags", fspot_metadata,
    Column("photo_id", Integer, ForeignKey('photos.id')),
    Column("tag_id", Integer, ForeignKey('tags.id')),
    autoload = True, autoload_with = fspot_engine)
# photo_id, tag_id

fspot_photos_table = Table(
    "photos", fspot_metadata,
    Column("uri", Unicode()),
    autoload = True, autoload_with = fspot_engine)
# id, time, uri, description, roll_id,
# default_version_id, rating

class FSpotTag(object):
    pass

class FSpotPhoto(object):
    pass

mapper(FSpotTag, fspot_tags_table, properties = {
    'icon' : deferred(fspot_tags_table.c.icon), # Big
    'photos' : relation(
        FSpotPhoto, secondary = fspot_photo_tags_table),
    })
mapper(FSpotPhoto, fspot_photos_table, properties = {
    'tags' : relation(
        FSpotTag, secondary = fspot_photo_tags_table),
    })

############################################################
# Digikam database
############################################################

digikam_engine = create_engine(digikam_db, echo=ECHO_DB)
digikam_metadata = MetaData()

digikam_images_table = Table(
    "Images", digikam_metadata,
    Column("name", Unicode()),
    Column("dirid", Integer(), ForeignKey("Albums.id")),
    Column("datetime", Unicode()), # I had problems with date format
    autoload = True, autoload_with = digikam_engine)
# id, name, dirid, caption, datetime

digikam_albums_table = Table(
    "Albums", digikam_metadata,
    Column("url", Unicode()),
    Column("date", Unicode()),
    autoload = True, autoload_with = digikam_engine)
# id, url, date, caption(NULL), collection(NULL), icon(NULL)

digikam_tags_table = Table(
    "Tags", digikam_metadata,
    Column("name", Unicode()),
    autoload = True, autoload_with = digikam_engine)
# id, pid, name, icon, iconkde

digikam_image_tags_table = Table(
    "ImageTags", digikam_metadata,
    Column("imageid", Integer, ForeignKey('Images.id')),
    Column("tagid", Integer, ForeignKey('Tags.id')),
    autoload = True, autoload_with = digikam_engine)
# imageid, tagid

class DigikamTag(object):
    def __init__(self, name):
        self.name = name
        self.iconkde = "tag"
        self.pid = 0

class DigikamPhoto(object):
    pass

mapper(DigikamTag, digikam_tags_table, properties = {
    'icon' : deferred(digikam_tags_table.c.icon),
    'photos' : relation(
        DigikamPhoto, secondary = digikam_image_tags_table),
    })
mapper(DigikamPhoto,
       digikam_images_table.join(digikam_albums_table),
       properties = {
    'tags' : relation(
        DigikamTag, secondary = digikam_image_tags_table),
    # I do not use this column and I had problems decoding it
    # on some photos (invalid utf-8). So let's skip it
    'caption' : deferred(digikam_images_table.c.caption),
    })

############################################################
# Connect to the database
############################################################

FSpotSession = sessionmaker(
    autoflush = True, transactional = True)
FSpotSession.configure(bind = fspot_engine)

DigikamSession = sessionmaker(
    autoflush = True, transactional = True)
DigikamSession.configure(bind = digikam_engine)

fspot_session = FSpotSession()
digikam_session = DigikamSession()

############################################################
# Creating missing tags and caching them
############################################################

digikam_tags = {}
for f_tag in fspot_session.query(FSpotTag).all():
    d_tag = digikam_session.query(DigikamTag).filter_by(
        name = f_tag.name).first()
    if not d_tag:
        print "Creating missing Digikam tag %s" % f_tag.name
        d_tag = DigikamTag(f_tag.name)
        digikam_session.save(d_tag)
    digikam_tags[ f_tag.name ] = d_tag

############################################################
# Copying image tags
############################################################

import os.path

def fspot_uri_to_digikam(f_uri):
    """
    Takes F-Spot file uri. Returns the tuple
    (dir-uri, filename) to be used on Digikam
    """
    if not f_uri.startswith(fspot_mount):
        raise Exception("Don't know how to handle image %s" % f_uri)
    rest = f_uri.replace(fspot_mount, "")
    return (os.path.dirname(rest), os.path.basename(rest))

# Note: I have only minority of my photos tagged. Therefore
# I prefer to iterate starting from tags. With well tagged
# collection it may be better to iterate over photos, then tags
for f_tag in fspot_session.query(FSpotTag).all():
    d_tag = digikam_tags[ f_tag.name ]
    for f_photo in f_tag.photos:
        # Lookup the same photo in Digikam database
        (d_album_uri, d_photo_name) = fspot_uri_to_digikam(f_photo.uri)
        try:
            d_photo = digikam_session.query(DigikamPhoto).filter_by(
                name = d_photo_name, url = d_album_uri).one()
        except InvalidRequestError:
            print "No photo in digikam db. Removed? Digikam: (%s,%s), F-Spot: %s" % (d_album_uri, d_photo_name, f_photo.uri)
            continue
        # Append tag (without checking, I am working on empty tag database)
        print "Adding tag %s to photo %s in album %s" % (d_tag.name, d_photo_name, d_album_uri)
        d_photo.tags.append(d_tag)

############################################################
# The END
############################################################

fspot_session.flush()
digikam_session.flush()

fspot_session.rollback()
digikam_session.commit()
Possible extensions
Here are a few loose ideas for improving this script:
- Handle incremental tag copying - just check for the tag's presence on the destination image before adding it.
- Copy tags from digiKam to F-Spot - just reverse the loops, iterate over digiKam photos and tags and save them to the F-Spot database.
- Make a two-way sync.
- Handle tag hierarchies after all.
Should anybody work on them, I'd be glad to hear about it.
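As a possible starting point for the first idea, here is a minimal sketch of the "check before adding" step, written against plain sqlite3 rather than the SQLAlchemy mappings used in the script. It assumes an ImageTags table with imageid and tagid columns, as listed in the script comments. The function name add_tag_if_missing is made up, and the in-memory table is a stand-in that ignores whatever constraints or triggers the real digiKam schema may define.

```python
import sqlite3


def add_tag_if_missing(conn, image_id, tag_id):
    """Link a tag to an image unless that link already exists.

    Returns True when a new row was inserted, False otherwise.
    """
    exists = conn.execute(
        "SELECT 1 FROM ImageTags WHERE imageid = ? AND tagid = ?",
        (image_id, tag_id)).fetchone()
    if exists:
        return False
    conn.execute(
        "INSERT INTO ImageTags (imageid, tagid) VALUES (?, ?)",
        (image_id, tag_id))
    return True


# Stand-in table (an assumption, simplified from the script comments).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ImageTags (imageid INTEGER, tagid INTEGER)")

first = add_tag_if_missing(conn, 1, 7)   # new link, gets inserted
second = add_tag_if_missing(conn, 1, 7)  # same link again, skipped
print(first, second)
```

With such a check in place, the copy could be re-run after tagging more photos in F-Spot without piling up duplicate rows.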