Downloading and Saving Image to ImageField in Django


Finding a way to programmatically save images to a Django model has been a real pain in the rear. There is almost no comprehensive documentation on simply associating a downloaded image with a model’s ImageField, and even once you clear that hurdle there is a lot of mess to deal with if you want to verify that what you’re saving is an actual image and not a corrupt one. End frustration.

Here is some code that can be used to:

  • download an image from a URL
  • verify that it is actually a non-corrupt image
  • save the image to Django’s media folder
  • associate that image with a model class using Django’s ImageField

Most of the examples I found save the image to a temp folder, check the image there, then move it to a public folder once it’s verified. I want to do all my file checks entirely in memory – without writing – before choosing whether or not to save the downloaded image.

Just about all of this is going to happen in models.py, and very little is going to happen in whatever script or view wants to download and save the image.

import urllib2 # Used to download images
import urlparse # Cleans up image urls
import cStringIO # Used to imitate reading from byte file
import copy # Copies instances of Image
from PIL import Image # Holds downloaded image and verifies it
from django.core.files.base import ContentFile # Wraps in-memory bytes so the ImageField can save them
from django.db import models

# Create your models here.

class Product(models.Model):
    prod_id = models.CharField('Product ID', max_length=20)
    prod_img = models.ImageField('Product Image', upload_to='products', null=True, blank=True)
    # upload_to will be a folder within your MEDIA_ROOT, created if it doesn't exist
    # Don't forget to set MEDIA_ROOT and MEDIA_URL in your project's settings.py
    # null=True because not all Products will have an image
    # blank=True because I don't want admin to throw up if I try to save a Product without filling out the ImageField
    # IMPORTANT PART: Below we redefine the save() method for this class.
    # If save(some_url) is called, we fetch the image at that URL and save it to the ImageField prod_img.
    # (There are possibly more elegant things you can do than passing the image's URL to save(). You should be able to move the parts around afterward to do what you want.)
    def save(self, url='', *args, **kwargs):
        if self.prod_img != '' and url != '': # Don't do anything if we don't get passed anything!
            try:
                image = download_image(url) # See function definition below; raises if the download or verification fails
                filename = urlparse.urlparse(url).path.split('/')[-1]
                image_io = cStringIO.StringIO() # Will make a file-like object in memory that you can then save
                image.save(image_io, format=image.format)
                self.prod_img.save(filename, ContentFile(image_io.getvalue()), save=False) # Set save=False otherwise you will have a looping save method
            except Exception as e:
                # A failed download or an invalid image is reported, but the Product itself is still saved below
                print ("Error trying to save model: saving image failed: " + str(e))
        # We've gotten the image into the ImageField above; now we actually need to save the model.
        # super() gets the parent of class Product (models.Model) and runs its save() method, which saves the Product like normal.
        super(Product, self).save(*args, **kwargs)

def download_image(url):
    """Downloads an image and makes sure it's verified.

    Returns a PIL Image if the image is valid, otherwise raises an exception.
    """
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0'} # More likely to get a response if server thinks you're a browser
    request = urllib2.Request(url, headers=headers)
    response = urllib2.urlopen(request, timeout=10)
    image_data = cStringIO.StringIO(response.read()) # StringIO imitates a file, needed for verification step
    img = Image.open(image_data) # Creates an instance of PIL Image class - PIL does the verification of file
    # Verify a copy of the image, not the original. PIL requires you to open the image again after verify(),
    # and once we read() the urllib2.urlopen response we can't read it again without remaking the request
    # (i.e. downloading the image again). Rather than do that, we duplicate the PIL Image in memory.
    img_copy = copy.copy(img)
    if valid_img(img_copy):
        return img
    else:
        # Maybe this is not the best error handling...you might want to just provide a path to a generic image instead
        raise Exception('An invalid image was detected when attempting to save a Product!')

def valid_img(img):
    """Verifies that an instance of a PIL Image Class is actually an image and returns either True or False."""
    img_format = img.format # Renamed from "type" so the built-in type() isn't shadowed
    if img_format not in ('GIF', 'JPEG', 'JPG', 'PNG'):
        return False
    try:
        img.verify()
        return True
    except Exception: # verify() raises if the file looks broken
        return False
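
For comparison, here is a minimal sketch of the same in-memory check written for Python 3 and Pillow. It is not the code above. It assumes you keep the downloaded bytes around and open them twice, once to verify and once to get a usable image, since Pillow's documentation says an image has to be reopened after verify(). The name open_verified_image and the ALLOWED_FORMATS tuple are my own placeholders.

```python
import io

from PIL import Image

# Hypothetical whitelist, mirroring the formats checked in valid_img()
ALLOWED_FORMATS = ("GIF", "JPEG", "PNG")


def open_verified_image(data):
    """Return a usable PIL Image built from `data`, or raise ValueError."""
    try:
        # First pass: verify() checks integrity; this Image is discarded afterwards
        probe = Image.open(io.BytesIO(data))
        image_format = probe.format
        probe.verify()
    except Exception as exc:
        raise ValueError("not a valid image: %s" % exc)
    if image_format not in ALLOWED_FORMATS:
        raise ValueError("unsupported image format: %s" % image_format)
    # Second pass: a fresh Image over the same in-memory bytes, safe to load or save
    return Image.open(io.BytesIO(data))
```

Because the bytes stay in memory, nothing is written to disk and nothing is downloaded twice. You could pass the same bytes straight to ContentFile instead of re-encoding the image with image.save().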

The first trick is understanding how to save an image to the Django model. In this case, the image file doesn’t exist locally yet (it’s on another server), so if we just call save(), there is no image to save. When we redefine/intercept the save method, we essentially cut save() off and perform a few steps first: 1) use the external URL that is passed to save() to request the image, 2) take the server’s response and make sure it’s an image, 3) if it is, save it to our own server, and 4) assign the image we just saved to the ImageField. Then we let save() continue. If the image isn’t valid, we don’t populate the ImageField and we let save() continue so that the Product can still be saved. You could just as easily do something more elegant, like create a CharField for an external image’s URL and have the save method read from that, avoiding the need to pass a parameter to save(). It’s up to you, but all the parts should be there.

Now anytime you save a Product model you just pass an external URL to save like so:

# In a script or view
# Some code to get your image's URL 
p = Product(prod_id='123')
p.save(external_img_url)
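
One thing to watch is the filename, which comes from the last segment of the URL's path. That segment can be empty (a URL ending in a slash) or percent-encoded. Here is a small Python 3 sketch of a more defensive version. The function name filename_from_url and the fallback name "image" are hypothetical, not part of the code above.

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(url, default="image"):
    """Return the last path segment of `url`, ignoring query string and fragment."""
    path = urlparse(url).path
    # Decode %xx escapes first, then take the final segment of the decoded path
    name = posixpath.basename(unquote(path))
    # Fall back to a placeholder when the path has no final segment
    return name or default
```

For example, filename_from_url("http://example.com/media/img.gif?size=large") gives "img.gif", and a URL ending in "/" gives the fallback.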

That’s it. Are there redundant parts in this code? Quite possibly. But after spending the better part of a day on this, I’m ready to move on and clean this up later.

Note that this doesn’t check to see if the image exists already. I’ll have to do that later, and will update the code once it’s done. Right now if img.gif exists, this will create img_1.gif, img_2.gif, and so on.
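
One possible way to approach the duplicate problem, which I haven't wired into the code above, is to name the file after its contents rather than its URL, so the same image always maps to the same filename. Here is a rough sketch. The function name content_based_name and the 16-character truncation are assumptions for illustration.

```python
import hashlib
import os


def content_based_name(data, original_name):
    """Build a filename from the SHA-256 of `data`, keeping the original extension."""
    digest = hashlib.sha256(data).hexdigest()
    extension = os.path.splitext(original_name)[1].lower()
    # 16 hex characters keeps names short; use the full digest if you want to be safer
    return digest[:16] + extension
```

With a name like that, you could ask the field's storage whether the file already exists under the upload_to folder (Django storages have an exists() method) and skip the save if it does.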
