This is one more post in the continuing saga of backing up my fairly important personal data. Over the years I’ve accumulated a good collection of music. Much of it comes from ripped CDs that I own and can easily recover if the digital files are lost. The rest of my music is not as easily restored, having been purchased from Amazon, iTunes, and eMusic. I already back up important files regularly using an rsync job run out of a daily crontab. I’d like to expand the cron job to back up my music as well.
To complicate matters, I don’t want all of my music backed up, just the purchased music. I have been good about using the Comments field in the ID3 tag to record the source of the music, so I wrote a short script (my first attempt at Python) to filter on that. The Python script creates a new directory tree of symbolic links that mirrors the layout of the music directory but contains only the purchased tracks. The rsync command then follows those links and backs up the full files.
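The post does not show the rsync invocation itself, so the following is only a sketch of how the "follow the symlinks" step might look when driven from Python. The function names, the link tree path, and the backup destination are all hypothetical; `--archive` and `--copy-links` are standard rsync options, the latter telling rsync to copy the file a symlink points to rather than the link.

```python
import subprocess


def build_rsync_command(link_tree, backup_dest):
    """Build an rsync command that copies the files behind the symlinks.

    Both arguments are placeholders for illustration, not paths from the post.
    """
    # A trailing slash makes rsync copy the contents of link_tree
    # instead of creating a link_tree directory inside backup_dest.
    source = link_tree.rstrip("/") + "/"
    return ["rsync", "--archive", "--copy-links", source, backup_dest]


def run_backup(link_tree, backup_dest):
    # check_call raises CalledProcessError if rsync exits with a non-zero status.
    subprocess.check_call(build_rsync_command(link_tree, backup_dest))
```

Without `--copy-links`, the `--archive` option would copy the symlinks themselves, which is useless as a backup because the links would point at files that only exist on the original machine.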
The script uses the mutagen library. Here’s the code-y goodness. It’s under a BSD license, so do with it what you want as long as my name stays attached.
#!/usr/bin/python
#
# Copyright (c) 2010, Eric Friedrich
# All rights reserved.
# Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
# - Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
# - Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other
#   materials provided with the distribution.
# - Neither the name of Eric Friedrich nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior
#   written permission.
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
# INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
# TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
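
import os
import sys

import mutagen
import mutagen.mp3  # imported explicitly so mutagen.mp3.MP3 resolves below

# Strings in the ID3 comment that mark a track as purchased
saveTags = ["eMusic", "Amazon.com", "iTunes Music Store"]


def filter_comment(comment):
    # True if the comment mentions any of the purchase sources
    for tag in saveTags:
        if comment.find(tag) != -1:
            return True
    return False
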

def create_symlink(root, file):
    # os.walk only yields absolute roots when its top argument is absolute,
    # so make the link target absolute here; a relative target would dangle.
    orig = os.path.abspath(os.path.join(root, file))
    # Mirror the file's position under the source tree inside the link tree
    link_dir = os.path.join(sys.argv[2], os.path.relpath(root, sys.argv[1]))
    link = os.path.join(link_dir, file)
    # lexists also catches a dangling link left over from an earlier run
    if os.path.lexists(link):
        return
    if not os.path.exists(link_dir):
        os.makedirs(link_dir)
    print("Creating symlink %s" % link)
    os.symlink(orig, link)


if len(sys.argv) < 3:
    print('Usage: mp3_backup.py <source_dir> <dest_dir>')
    sys.exit(1)

if not os.path.exists(sys.argv[2]):
    print("Destination directory does not exist, creating")
    os.mkdir(sys.argv[2])

print('Searching %s for mp3 files\n' % sys.argv[1])
for root, dirs, files in os.walk(sys.argv[1]):
    for file in files:
        path = os.path.join(root, file)
        try:
            mp3_file = mutagen.File(path)
            # print(path)
        except IOError:
            print("Skipping file %s, no header" % path)
            continue

        # Keep only MP3s whose English comment names a purchase source
        if isinstance(mp3_file, mutagen.mp3.MP3) and \
           "COMM::'eng'" in mp3_file and \
           filter_comment(mp3_file["COMM::'eng'"].text[0]):
            create_symlink(root, file)
#EOF
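
The script above targets Python 2 and the mutagen release that was current in 2010. As a rough sketch only, here is how the same two steps (checking the comment, mirroring the file as a symlink) might look in Python 3. The names `PURCHASE_SOURCES`, `is_purchased`, `link_into_tree`, and `scan` are made up for this example. It also assumes that the exact `COMM::'eng'` key may be formatted differently in other mutagen releases, so it asks for every comment frame through `getall("COMM")` instead of looking up one key.

```python
import os

# Same purchase markers as saveTags in the script above
PURCHASE_SOURCES = ("eMusic", "Amazon.com", "iTunes Music Store")


def is_purchased(tags):
    """Return True if any ID3 comment frame names a purchase source.

    `tags` is any object with a mutagen-style getall(); None means no ID3 tag.
    """
    if tags is None:
        return False
    for frame in tags.getall("COMM"):
        for text in frame.text:
            if any(source in text for source in PURCHASE_SOURCES):
                return True
    return False


def link_into_tree(src_root, dest_root, path):
    """Symlink `path` at the same relative location under dest_root."""
    link = os.path.join(dest_root, os.path.relpath(path, src_root))
    # lexists is also true for a dangling link left by an earlier run
    if os.path.lexists(link):
        return link
    os.makedirs(os.path.dirname(link), exist_ok=True)
    os.symlink(os.path.abspath(path), link)
    return link


def scan(src_root, dest_root):
    """Walk src_root and link every purchased MP3 into dest_root."""
    # Third-party import kept local so the helpers above work without mutagen
    from mutagen.mp3 import MP3, HeaderNotFoundError

    for root, _dirs, files in os.walk(src_root):
        for name in files:
            path = os.path.join(root, name)
            try:
                audio = MP3(path)
            except HeaderNotFoundError:
                continue  # not an MP3 file
            if is_purchased(audio.tags):
                link_into_tree(src_root, dest_root, path)
```

Unlike the original, this version checks every comment frame rather than only the English one with an empty description, so it may match slightly more files.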