In Part One we looked at how to download our photos from Google Photos to a local drive. Now we'll look at how to archive them into human-readable folders that can be included in daily snapshot backups.
Our Problem
Now we have all our photos downloaded, but I really wanted things to live on my NAS with the rest of my important files. This would also let me potentially delete photos from Google Photos but keep things locally.
I wanted this to be "automated" enough that I could edit a fairly basic file to update the links between Google Albums and local directories. Also, I didn't want to copy + paste a script for each directory.
I keep all meaningful photos in Albums, so I focused on syncing the albums rather than individual photos. Copying all the photos would be fairly trivial with rsync.
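For reference, that "copy everything" approach would just be a single rsync call, something like the sketch below (both paths are placeholders, not the ones from my setup):
# Mirror the whole gphotos-sync download to the NAS, preserving timestamps and permissions
rsync -a /path/to/gphotos-sync-output/photos/ /mnt/nas/Photos/All/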
The CSV
The first part was the CSV file. I created a fairly basic format with no header row and the following columns:
- User (More on this later)
- Google Album Name
- Local Path
I added a User field to allow multiple Google Photos accounts to use this script, in case my wife has photos she'd like to put there too. The script below switches on this field and just swaps out the source path it pulls from.
Here’s an example line from my CSV:
"grumpy","Houston","Grumpy/Trips/2019.07 - Houston"
The Script
Here’s the bash script I wrote:
#!/bin/bash
set -e

###
# Config Variables
###
GRUMPY_PHOTOS_PATH='<source_photos_path>'
MAP_FILE='/.../photo-paths.csv'
OWNCLOUD_PATH='/.../shared/Photos/'
###
# End of Config Variables
###

while IFS="," read -r user album dest; do
    # Clean out the quotes we have padding the CSV
    temp="${user%\"}"
    temp="${temp#\"}"
    user=$temp
    temp="${album%\"}"
    temp="${temp#\"}"
    album=$temp
    temp="${dest%\"}"
    temp="${temp#\"}"
    dest=$temp

    # Get the right source based on the user column.
    # Reset SRC each time so an unknown user can't reuse the previous line's source.
    SRC=""
    if [ "$user" == "grumpy" ]; then
        SRC="$GRUMPY_PHOTOS_PATH"
    fi

    # If we don't have an SRC, skip this line
    if [ "$SRC" == "" ]; then
        echo "Skipping line with user $user"
        continue
    fi

    # Find the album directory (album dirs are named "<digits> <album name>")
    while IFS= read -r -d $'\0'; do
        # Strip the numeric date prefix to get the bare album name
        DIRNAME="$( basename "$REPLY" )"
        DIRNAME="${DIRNAME:5}"
        if [ "$DIRNAME" == "$album" ]; then
            # Clean up some variables to make things sane
            SRCDIR="$REPLY/"
            DESTDIR="$OWNCLOUD_PATH/$dest/"
            echo "Syncing $SRCDIR to $DESTDIR..."
            sudo -u www-data mkdir -p "$DESTDIR"
            rsync -rLog --delete --chown="www-data:www-data" "$SRCDIR" "$DESTDIR"
            touch "$DESTDIR/__DO_NOT_ADD_FILES_HERE__"
        fi
    done < <( find "$SRC/albums" -regex ".*/[0-9]+ .+" -type d -print0 )
done < "$MAP_FILE"

echo "Fixing ownership..."
chown -R www-data: "$OWNCLOUD_PATH"
Here’s what that script does:
- Get a list of links from the CSV.
- For each link, find the corresponding Google Album. This is a bit more complex than it sounds because the albums have dates prepended to their directory names and can change location when photos are added (see the snippet just after this list).
- Rsync the source album to the destination. We have --delete set to prevent any duplicate files (noted below) and -L to copy the symlinks (also noted below).
- We put an empty file in each directory to warn people not to put files directly there.
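The date-prefix stripping is just bash substring expansion. With a made-up directory name in the shape the script expects (digits, a space, then the album name), it behaves like this:
# Hypothetical album directory name; ${DIRNAME:5} drops the first five characters
DIRNAME="0704 Houston"
echo "${DIRNAME:5}"   # prints: Houston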
There's also a couple of variables you'll need to edit at the top:
- GRUMPY_PHOTOS_PATH (but you can rename this) - This is the source of photos coming from gphotos-sync. You'll want this path to be the base where you are syncing photos, i.e. the directory that contains both the photos and the albums directories.
- MAP_FILE - The location of the CSV file you'll use to map everything. In my case, this also lives on my NAS.
- OWNCLOUD_PATH - This is the location on the NAS where everything should live.
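If my wife's account ever does get added (the User column from the CSV), the user switch near the top of the loop just needs another source variable and another branch. A sketch, with made-up names:
# Hypothetical second source path for another gphotos-sync account
WIFE_PHOTOS_PATH='<second_source_photos_path>'
# ...then inside the read loop, extend the switch:
if [ "$user" == "grumpy" ]; then
    SRC="$GRUMPY_PHOTOS_PATH"
elif [ "$user" == "wife" ]; then
    SRC="$WIFE_PHOTOS_PATH"
fi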
Why --delete
One bug that was reported with gphotos-sync is that it can sometimes re-download photos when their metadata is modified. This could leave stale or duplicate files around in our use case, so we clean them up.
Also, if a photo is deleted from the album, I’d like to remove it from the server too. Worst case, things are still backed up and snapshotted daily.
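If you're nervous about --delete, rsync can preview a sync without changing anything. A quick check along these lines (using the same variables as the script) will list what would be copied or removed:
# -n is --dry-run, -i itemizes each change; nothing is actually written or deleted
rsync -rLni --delete "$SRCDIR" "$DESTDIR"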
Why -L
By default, gphotos-sync will just place symlinks to the photos in the albums directory. This is nice because it doesn't duplicate data, but in our case we want to copy the actual photo data. -L tells rsync to follow links and copy the actual file rather than just the link.
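You can see this for yourself by looking at an album directory before syncing; the entries are symlinks pointing back into the photos tree. A quick way to confirm (path taken from the config variable above):
# List a few of the symlinks that -L will resolve into real files during the sync
find "$GRUMPY_PHOTOS_PATH/albums" -type l | head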
Wrapping Up
That script isn't the best, but it works for what I need it for. It runs daily right after the photos are downloaded, and well ahead of our offsite backups. The advantage of all this is that our photos not only get downloaded locally but also end up in another offsite backup, in case things go really wrong elsewhere.
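The scheduling itself is just cron. A sketch of the kind of crontab entries involved (the times and script names here are made up):
# m h dom mon dow  command
0 2 * * * /usr/local/bin/download-google-photos.sh   # gphotos-sync wrapper from Part One (hypothetical name)
0 4 * * * /usr/local/bin/sort-photos-to-nas.sh       # the script from this post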