Aperture 3: Stop suggesting duplicates on import!
I am an amateur photographer who uses Aperture 3 to manage my photos and my photography workflow. I have a decent archive of 12000+ referenced files scattered across a few disks, and I am happy with how things are organized.
The software was finally working for me instead of the other way around, until one day I modified my images outside Aperture. After that, Aperture kept insisting that I needed to import thousands of images that I had already imported, organized into projects, folders and albums, tagged with metadata and processed.
Google revealed tons of people reporting the same fault, but no ready solution. Still, the information in these posts gave me ideas for devising a workaround. I decided to share it in case it helps another soul.
Below I provide some background info to help you understand what caused the problem and how I fixed it. I encourage you to read it, because you should know what you are doing. At least, the theory behind the workaround might help you better understand how it could work for you.
It all started when I noticed AP3 was having a really difficult time removing keywords I had attached to photos. Removing keywords from the keyword HUD is supposed to remove them from all associated items, right? Nope, they kept coming back. Metadata batch change to clear all keywords on hundreds of photos? Didn’t help. It cleared the keywords on some images, but not on thousands of others. I reckon Aperture processes only those it can populate at that very moment and does not even bother with the rest. Repair/rebuild library did not help either.
So I tried another method: wrote all IPTC metadata back to the originals, used Phil Harvey’s brilliant ExifTool to clear all the keywords, then had Aperture update metadata back from the originals.
Well, clearing the keywords was successful, but having Aperture read the metadata back from the originals did not work. I still do not know why. I did not care, because I had a bigger issue: I noticed that, although the database looked 100% healthy, the import dialog was saying that all previously imported photos were waiting for me to reimport them into the Aperture DB. What the heck? Was the “do not import duplicates” option not working, or had my tampering with the image files behind Aperture’s back caused Aperture to no longer recognize them?
Still, Aperture listed no file as offline or missing. Indeed, “Reveal in Finder” pointed to the right images on disk. Then why did Aperture think it needed to import them?
This is when I started poking around the Aperture internal database, to learn the principles it uses to decide whether two images are identical or not. I created a dummy database, tossed a few (referenced) images in, observed what changed in the database tables, then started changing files in a number of ways outside Aperture and checked what Aperture listed in the import dialog. The test results are:
Aperture did not think it needed to reimport images when I changed:
- File access time
- File modification time
- File creation time
- The name of the image file
- The location of the image file.
Indeed, Aperture detected when I renamed a file or moved it around folders, updating its database accordingly. Nice feature, but surprising since the manual clearly states “Aperture checks the filename of the master to determine if an image is a duplicate of one in the library”.
Aperture incorrectly detected the same image as a new one only when the file size had changed. And the file size does tend to change when you add/remove metadata or change the image creation time outside AP3 (with tools like ExifTool).
This led me to believe that as long as the unique file identifier on the disk (inode id or the file serial number) and the file size stay the same, Aperture will correctly identify the image as a duplicate, and hence exclude it from the list of files to be imported. In my case, deleting keywords from my images with ExifTool had changed their sizes, and the fileSize values in the Aperture database no longer agreed with them.
So, theoretically updating the file size info in the database to the new file size on disk should have helped. And it did!
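The two properties this theory relies on, the inode and the file size, can be watched from a shell. The snippet below is only a sketch on a throwaway file: `file_identity` is a hypothetical helper name, the file names are made up, and it says nothing about what Aperture does internally. It uses `ls -i` and `wc -c` so that it is not tied to one flavor of `stat`.

```sh
#!/bin/sh
# Print "<inode> <size in bytes>" for a file, using only POSIX tools.
# (file_identity is a hypothetical helper, not part of Aperture or ExifTool.)
file_identity() {
    inode=$(ls -i "$1" | awk '{print $1}')
    size=$(wc -c < "$1" | tr -d ' ')
    echo "$inode $size"
}

# Demo on a throwaway file (made-up name, not a real image).
tmp=$(mktemp -d)
printf 'pretend image data' > "$tmp/IMG_0001.jpg"
echo "original:       $(file_identity "$tmp/IMG_0001.jpg")"

# Renaming within the same volume keeps both inode and size.
mv "$tmp/IMG_0001.jpg" "$tmp/renamed.jpg"
echo "after rename:   $(file_identity "$tmp/renamed.jpg")"

# Appending bytes in place keeps the inode but changes the size.
printf ' plus extra bytes' >> "$tmp/renamed.jpg"
echo "after resizing: $(file_identity "$tmp/renamed.jpg")"

rm -rf "$tmp"
```

The first two lines of output should show the same pair of numbers, while the third keeps the inode and shows a larger size, which is the situation my edited images were in.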
The shell script below compares the file size of the specified images on disk to the file size kept in the Aperture database and, when they do not match, updates the value in the database to the value on disk.
Make sure you back up your Aperture Library, or at least the database (usually resides at Pictures/Aperture\ Library.aplibrary/Database/apdb), beforehand, and keep that backup long enough to be sure nothing is broken.
Furthermore, I am not a developer and my coding abilities are limited to simple scripting. I tested the code below only up to the point it worked for me. It is not foolproof, it is probably prone to errors, and I will not try to write a better version. It is up to the more talented members of the community to turn it into state-of-the-art code if they are willing to do so.
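One simple way to take such a backup is a timestamped copy of the whole library package. This is a minimal sketch: `backup_library` is a hypothetical helper name, and the library location in the usage note is an assumption, so adjust it to wherever your library lives. It is presumably safest to quit Aperture before copying.

```sh
#!/bin/sh
# Copy a library folder to "<name>.backup-YYYYmmdd-HHMMSS" next to the original
# and print the path of the copy. (backup_library is a hypothetical helper.)
backup_library() {
    src=${1%/}    # drop a trailing slash, if any
    if [ ! -d "$src" ]; then
        echo "No such library: $src" >&2
        return 1
    fi
    dest="$src.backup-$(date +%Y%m%d-%H%M%S)"
    # -R copies the folder recursively, -p preserves times and permissions
    cp -Rp "$src" "$dest" || return 1
    echo "$dest"
}
```

For example, `backup_library "$HOME/Pictures/Aperture Library.aplibrary"` would leave a dated copy beside the original, which can be deleted once you are sure nothing is broken.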
```sh
#!/bin/bash
echo "**************************************************************"
echo "* Shell script to update the file size parameter of master"
echo "* images in the Aperture database to the actual values on disk"
echo "**************************************************************"

# Argument 1: "full/path/to/image_file_or_folder"
# The argument must be:
#   1. surrounded in double quotes to prevent shell expansion of wildcard characters
#   2. a full path to the target from the volume root excluding the initial slash (/)
#      because this is how Aperture keeps image file paths in the DB

### Check argument count
if [ $# -lt 1 ]
then
    echo "Error in $0 - Invalid Argument Count"
    echo "Syntax: $0 \"full/path/to/image_file_or_folder\""
    echo ""
    exit 1
fi

### Set variables
IMAGEPATH=$1
IMAGENAME=$(basename "$1")
IMAGEFOLD=$(dirname "$1")

### Retrieve Aperture DB Location
# Code for retrieving Aperture DB location is reused from
# http://fiveyears62.com/2010/05/10/new-database-access-the-database-isnt-locked-anymore-in-aperture-3-0-3/
PREFERENCE=""
function setPreference {
    PREFERENCE=$(osascript \
        -e 'tell application "System Events"' \
        -e "return value of property list item \"$1\" of property list file ((path to preferences as Unicode text) & \"$2\")" \
        -e 'end tell')
    # Expand a leading ~ to the home folder
    if [ "${PREFERENCE:0:1}" = "~" ] ; then
        PREFERENCE="$HOME${PREFERENCE:1}"
    fi
}
setPreference "LibraryPath" "com.apple.Aperture.plist"
LIBRARY="$PREFERENCE"
DB="$LIBRARY/Database/Library.apdb"
echo "...Using database $DB"

# IFS is Bash's internal field separator.
# It needs to be set to a newline character instead of the default value of
# whitespace, otherwise the find command output will not be properly parsed
OLDIFS=$IFS
IFS=$'\n'

### Retrieve list of files and start processing
FILELIST=$(find "$IMAGEFOLD" -name "$IMAGENAME")
i=0
for f in $FILELIST
do
    # Retrieve file size on disk
    FILESIZE=$(stat -f "%z" "$f")
    # Double any single quote in the path so it cannot break the SQL string
    SQLPATH=$(printf '%s' "$f" | sed "s/'/''/g")
    # Retrieve file size in Aperture DB
    FILESIZEINDB=$(sqlite3 "$DB" "SELECT fileSize FROM RKMaster WHERE imagePath = '$SQLPATH';")
    # If a record is found, and if values don't match update the value in DB
    if [ -n "$FILESIZEINDB" ]
    then
        if [ "$FILESIZEINDB" -ne "$FILESIZE" ]
        then
            updateresult=$(sqlite3 "$DB" "UPDATE RKMaster SET fileSize = $FILESIZE WHERE imagePath = '$SQLPATH';")
            echo "Updating $f db record: Size in db is ${FILESIZEINDB}, size on disk is ${FILESIZE}"
            i=$((i + 1))
        else
            echo "Skipping $f: File size match."
        fi
    else
        echo "Skipping $f: Image not in Aperture."
    fi
done
echo "$i records updated"

# Restore IFS
IFS=$OLDIFS
exit 0
```
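An example would be the following. Run it from the root of the volume that holds the images, because find resolves the path relative to the current directory:

```sh
cd /
aperturefixdupsimport "Users/myuser/Pictures/raw/*.cr2" 2>&1 | tee -a results.log
```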
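If you would rather see what is going to change before touching the database, a read-only pass is easy to sketch. The function names `sql_quote` and `list_size_mismatches` are hypothetical. The RKMaster table and its imagePath and fileSize columns are the ones used by the script above, and the snippet assumes the `sqlite3` command-line tool is available. It only runs SELECT statements.

```sh
#!/bin/sh
# Double any single quote so a value can sit inside a SQL string literal.
sql_quote() {
    printf '%s' "$1" | sed "s/'/''/g"
}

# Usage: list_size_mismatches DB FILE...
# Prints one line per file whose size on disk differs from RKMaster.fileSize.
# Files without a record in the database are silently skipped.
list_size_mismatches() {
    db=$1
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue
        on_disk=$(wc -c < "$f" | tr -d ' ')
        in_db=$(sqlite3 "$db" "SELECT fileSize FROM RKMaster WHERE imagePath = '$(sql_quote "$f")' LIMIT 1;")
        [ -n "$in_db" ] || continue
        [ "$in_db" -eq "$on_disk" ] || echo "$f: db=$in_db disk=$on_disk"
    done
}
```

Called from the volume root as, say, `list_size_mismatches "/path/to/Library.apdb" Users/myuser/Pictures/raw/*.cr2` (both paths are placeholders), it should list only the files the update script would touch.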
Enjoy!