Hi, I'm Paul Finn

I'm a full stack web developer with a preference for Python. I also enjoy discussions on entrepreneurship and software marketing.


Handcrafted in Portsmouth, NH

Python & Heroku: An Easy Method For Removing EXIF Data

This past week I built a small web service that consumes email messages with image attachments and returns a minimalist single-page photo album.

This was purely an exercise (to solve a very once-in-a-while problem) and not the beginnings of an entrepreneurial effort. Regardless, I still felt it was important to ensure that all uploads are stripped of their EXIF data, as this service is open to the public.

Why? EXIF data can contain many bits of identifying information. With each photo taken, cameras embed information such as timestamps, shutter speed and GPS coordinates. The inclusion of GPS coordinates is often undesirable for amateur photographers and raises possible privacy and security issues. It's also fair to say that a majority of the general public is not aware that today's smartphones and cameras add this information.

(In December 2012, anti-virus developer John McAfee was arrested in Guatemala while on the run from authorities. A magazine posted an exclusive interview, with photos, to its website. Some of these photos contained the GPS coordinates of his location in the EXIF data. Armed with this information, authorities were able to find and arrest him.)

A little bit of searching yielded a small handful of Python libraries that were perfectly capable of reading EXIF data but not clearing it. I should note that I was dismissing any solutions which involved PIL (the Python Imaging Library), which has given me nothing but trouble every time I try to get it installed with the necessary image format extensions.

After some more research I found a post on a message board suggesting that you could remove all EXIF information with the help of an ImageMagick command-line tool called mogrify. While I was initially afraid of introducing a large and potentially complicated library to my little side project, I found out that Heroku's Cedar stack, where my code was going to be hosted, includes ImageMagick with each instance. Relying on mogrify seemed like a viable solution.

To use the command-line tool mogrify, I had to start a new subprocess inside my Flask application to run the command with the required arguments.

import subprocess

#...your code here...
# Write the attachment to disk and close the file before mogrify touches it
with open(filename, "wb") as f:
    f.write(data)

try:
    # check_call raises CalledProcessError when mogrify exits with a non-zero status
    subprocess.check_call(['/usr/bin/mogrify', '-strip', filename])
    #Proceed to upload image, etc
except (OSError, ValueError, subprocess.CalledProcessError) as e:
    #Handle as necessary (re-raised here as a placeholder)
    raise

What's great about this approach is that it's pretty portable: ImageMagick is cross-platform and you just need to make sure you have the proper path to mogrify. If you're sure you have ImageMagick installed on your system, but not sure where it lives, run the following command and it will echo the proper path.

>> whereis mogrify
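
If you would rather not hard-code the path, the same lookup can be done from Python. This is only a minimal sketch: the helper name find_mogrify and its fallback argument are my own illustrative choices, not part of the original app. It relies on shutil.which from the standard library (Python 3.3+), which searches the directories on your PATH.

```python
import shutil


def find_mogrify(default="/usr/bin/mogrify"):
    """Return the path to mogrify found on PATH, or the fallback default.

    The helper name and the fallback path are illustrative assumptions.
    """
    # shutil.which returns None when the executable is not on PATH
    return shutil.which("mogrify") or default
```

The returned path can then be used as the first element of the argument list passed to subprocess. Note that the fallback is not checked for existence, so a missing ImageMagick install will still surface as an OSError when the command runs.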

I've verified this solution against images originating from an iPhone 4S and images created by Adobe Lightroom. Both of these images showed EXIF information in Windows Explorer before processing and contained no EXIF data after processing. Be sure to test this in a more thorough manner with your preferred tools if necessary.
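
For an automated check, one option is to look for the EXIF segment directly in the JPEG bytes. The sketch below is not what I used for the manual verification above; it is a hypothetical helper (has_exif is my own name) that walks the JPEG header segments and reports whether an APP1 segment starting with the "Exif" identifier is present. It assumes a well-formed JPEG and only inspects the segments before the image data, so treat it as a quick sanity check rather than a full validator.

```python
import struct


def has_exif(jpeg_bytes):
    """Return True if the JPEG data contains an APP1 segment holding EXIF."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # malformed stream; stop scanning
        marker = jpeg_bytes[pos + 1]
        if marker == 0xFF:
            pos += 1  # fill byte before a marker; skip it
            continue
        if marker in (0xDA, 0xD9):
            break  # start of scan or end of image: no more header segments
        # Segment length is big-endian and includes its own two bytes
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length
    return False
```

Reading a processed file in binary mode and passing its contents to has_exif should return False once the metadata has been stripped.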

Also worth noting: in the example shown above mogrify overwrites the file and does not make any copies.
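
If you need to keep the original, one approach is to copy the file first and run mogrify on the copy. The following is a minimal sketch under that assumption; the function name strip_exif_copy, the default mogrify path and the run parameter are hypothetical and exist only for illustration (run makes it easy to swap in a stand-in during testing).

```python
import shutil
import subprocess


def strip_exif_copy(src, dst, mogrify="/usr/bin/mogrify",
                    run=subprocess.check_call):
    """Copy src to dst and strip metadata from the copy, leaving src untouched."""
    # Copy the file contents so mogrify never sees the original
    shutil.copyfile(src, dst)
    # mogrify edits in place, so it only modifies the copy at dst
    run([mogrify, "-strip", dst])
    return dst
```

The trade-off is extra disk usage for every upload, which may or may not matter depending on how long you keep the originals around.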

While I was initially afraid of the time commitment that would be needed to implement a reliable EXIF-stripping solution for my Flask app, using ImageMagick's mogrify tool turned out to be a perfect solution. It's fast, comes pre-installed on Heroku's Cedar stack (currently the default) and I was able to add it to my photo processing workflow in just minutes.


I hope this was helpful. If you have comments or feedback, please email me at pvfinn at gmail.com. Thanks!
