Aperture 3: Stop suggesting duplicates on import!

I am an amateur photographer who uses Aperture 3 to manage my photos and my photography workflow. I have a decent archive of 12,000+ referenced files scattered across a few disks, and I am happy with how things are organized.

The software was finally working for me instead of the other way around, until one day I modified my images outside Aperture. Then I ran into the problem where Aperture thinks I need to import thousands of images that I had already imported, organized into projects, folders and albums, tagged with metadata and processed.

Google revealed tons of people reporting the same fault, but no ready solution. Still, the information in these posts gave me ideas to devise a workaround. I decided to share it in case it helps another soul.

Below I provide some background info to help you understand what caused the problem and how I fixed it. I encourage you to read it, because you should know what you are doing. At least, the theory behind the workaround might help you better understand how it could work for you.


It all started when I noticed that AP3 was having a really hard time removing keywords I had attached to photos. Removing keywords from the keyword HUD is supposed to remove them from all associated items, right? Nope, they kept coming back. A metadata batch change to clear all keywords on hundreds of photos? Didn’t help. It only cleared keywords on a number of images, but not on thousands of others. I reckon Aperture processes only those it can populate at that very moment and does not even bother with the rest. Repairing or rebuilding the library did not help either.

So I tried another method: I wrote all IPTC metadata back to the originals, used Phil Harvey’s brilliant ExifTool to clear all the keywords, then had Aperture update its metadata from the originals.

Well, clearing the keywords was successful, but having Aperture read the metadata back from the originals did not work. I still do not know why. I did not care, because I had a bigger issue: I noticed that, although the database looked 100% healthy, the import dialog was saying that all previously imported photos were waiting to be reimported into the Aperture DB. What the heck? Was the “do not import duplicates” option not working, or had my tampering with the image files behind Aperture’s back caused Aperture to no longer recognize them?

Still, Aperture listed no file as offline or missing. Indeed, “Reveal in Finder” pointed to the right images on disk. So why did Aperture think it needed to import them?

This is when I started poking around the Aperture internal database to learn the principles it uses to decide whether two images are identical. I created a dummy database, tossed in a few (referenced) images, observed what changed in the database tables, changed the files in a number of ways outside Aperture, then checked what Aperture listed in the import dialog. The test results are:

Aperture did not think it needed to reimport images when I changed:

  • File access time
  • File modification time
  • File creation time
  • The name of the image file
  • The location of the image file.

Indeed, Aperture detected when I renamed a file or moved it between folders, updating its database accordingly. Nice feature, but surprising, since the manual clearly states “Aperture checks the filename of the master to determine if an image is a duplicate of one in the library”.

Aperture incorrectly detected the same image as a new one only when the file size had changed. And the file size does tend to change when you add or remove metadata or change the image creation time outside AP3 (with tools like ExifTool).

This led me to believe that as long as the unique file identifier on disk (the inode ID, or file serial number) and the file size stay the same, Aperture will correctly identify the image as a duplicate and hence exempt it from the list of files to be imported. In my case, deleting keywords from my images with ExifTool had changed their sizes, and the fileSize values in the Aperture database no longer agreed with them.
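The inode-plus-size idea can be illustrated with a small Python sketch. It only inspects what the file system reports and says nothing about what Aperture actually does internally. The helper names `file_fingerprint` and `demo` are hypothetical, and the "metadata edit" is simulated by appending bytes in place, which is an assumption: a real tool may rewrite the file in a different way.

```python
import os
import tempfile


def file_fingerprint(path):
    """Return the (inode, size) pair the file system reports for a file."""
    st = os.stat(path)
    return st.st_ino, st.st_size


def demo():
    """Rename, touch and grow a scratch file, recording its fingerprint each time."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "IMG_0001.jpg")
        with open(original, "wb") as f:
            f.write(b"x" * 1000)
        before = file_fingerprint(original)

        # Renaming within the same file system keeps inode and size.
        renamed = os.path.join(tmp, "renamed.jpg")
        os.rename(original, renamed)
        after_rename = file_fingerprint(renamed)

        # Changing access and modification times keeps inode and size.
        os.utime(renamed, (0, 0))
        after_touch = file_fingerprint(renamed)

        # Simulated metadata edit: appending in place keeps the inode
        # but changes the size.
        with open(renamed, "ab") as f:
            f.write(b"y" * 24)
        after_edit = file_fingerprint(renamed)

    return before, after_rename, after_touch, after_edit


if __name__ == "__main__":
    for label, value in zip(("before", "rename", "touch", "edit"), demo()):
        print(label, value)
```

In this sketch the fingerprint survives the rename and the timestamp change, and only the size component differs after the simulated edit, which matches the pattern seen in the tests above.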

So, theoretically updating the file size info in the database to the new file size on disk should have helped. And it did!

The shell script below compares the on-disk file size of the specified images to the file size kept in the Aperture database and, when they do not match, updates the value in the database to the value on disk.
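The compare-and-update logic just described can be sketched in Python as follows. This is not the shell script itself, only a minimal illustration under assumptions: the text above names only the `fileSize` value, so the table name `RKMaster`, the columns `imagePath` and `uuid`, and the idea that stored paths are relative to some `root` folder are all assumptions to check against your own database. The `demo` function builds a throwaway SQLite file rather than touching a real library.

```python
import os
import sqlite3
import tempfile

# Assumed schema names; only `fileSize` is mentioned in the text above.
TABLE = "RKMaster"         # assumption
PATH_COLUMN = "imagePath"  # assumption
KEY_COLUMN = "uuid"        # assumption


def sync_file_sizes(db_path, root, dry_run=True):
    """Return (path, stored_size, disk_size) for every mismatch.

    With dry_run=False the stored size is also updated to the size on disk.
    """
    changes = []
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            f"SELECT {KEY_COLUMN}, {PATH_COLUMN}, fileSize FROM {TABLE}"
        ).fetchall()
        for key, image_path, stored_size in rows:
            full_path = os.path.join(root, image_path)
            if not os.path.isfile(full_path):
                continue  # leave missing or offline files alone
            disk_size = os.path.getsize(full_path)
            if disk_size != stored_size:
                changes.append((image_path, stored_size, disk_size))
                if not dry_run:
                    conn.execute(
                        f"UPDATE {TABLE} SET fileSize = ? WHERE {KEY_COLUMN} = ?",
                        (disk_size, key),
                    )
        conn.commit()
    finally:
        conn.close()
    return changes


def demo():
    """Build a scratch database with one stale size and run the sync on it."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, size in (("a.jpg", 100), ("b.jpg", 250)):
            with open(os.path.join(tmp, name), "wb") as f:
                f.write(b"\0" * size)
        db_path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(db_path)
        conn.execute(
            f"CREATE TABLE {TABLE} "
            f"({KEY_COLUMN} TEXT PRIMARY KEY, {PATH_COLUMN} TEXT, fileSize INTEGER)"
        )
        conn.executemany(
            f"INSERT INTO {TABLE} VALUES (?, ?, ?)",
            [("u1", "a.jpg", 100), ("u2", "b.jpg", 300)],  # b.jpg is stale
        )
        conn.commit()
        conn.close()

        preview = sync_file_sizes(db_path, tmp, dry_run=True)
        applied = sync_file_sizes(db_path, tmp, dry_run=False)
        remaining = sync_file_sizes(db_path, tmp, dry_run=True)
    return preview, applied, remaining
```

Defaulting to `dry_run=True` is a deliberate choice in this sketch: the first pass only reports what would change, so the list can be reviewed before anything is written.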

Messing around with any database you have not designed yourself is a very bad idea, so use this script at your own risk.
Make sure you back up your Aperture Library, or at least the database (it usually resides at Pictures/Aperture\ Library.aplibrary/Database/apdb), beforehand, and keep that backup long enough to be sure nothing is broken.
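One way to take such a backup is a timestamped copy of the database folder, sketched here in Python. The helper names `backup_directory` and `demo` are hypothetical, and the sketch assumes the folder is copied while nothing else is writing to it (presumably with Aperture closed). The demo runs on a scratch folder, not on a real library.

```python
import os
import shutil
import tempfile
import time


def backup_directory(source_dir, backup_root):
    """Copy source_dir into backup_root under a timestamped name; return the copy's path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    name = os.path.basename(os.path.normpath(source_dir))
    target = os.path.join(backup_root, f"{name}-backup-{stamp}")
    shutil.copytree(source_dir, target)  # raises if the target already exists
    return target


def demo():
    """Back up a scratch folder and report whether the copied file matches."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "apdb")
        os.makedirs(source)
        with open(os.path.join(source, "Library.db"), "wb") as f:
            f.write(b"demo contents")

        backups = os.path.join(tmp, "backups")
        os.makedirs(backups)
        target = backup_directory(source, backups)

        with open(os.path.join(target, "Library.db"), "rb") as f:
            copied = f.read()
        return os.path.basename(target), copied
```

Because the copy gets a new name on every run, repeated backups do not overwrite each other, which makes it easier to keep an old one around for a while.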

Furthermore, I am not a developer, and my coding abilities are limited to simple scripting. I tested the code below only up to the point where it worked for me. It is not foolproof, it is probably prone to errors, and I will not try to write a better version. It is up to the more talented members of the community to turn it into state-of-the-art code if they are willing to do so.

An example would be:

Enjoy!
