Friday, April 27, 2012

Metadata for images and SEO

So, some of us on Twitter have been talking about images and metadata (you know, one of my fave topics). Anyhow, I realized I was running out of room on Twitter, so I thought maybe a blog post was in order. ;-D By the way, I do have a chapter in Semantic Web Technologies and Social Search for Librarians that covers EXIF data and Flickr specifically, with step-by-step instructions and lots of great examples. You might also want to check out some of my presentations on SlideShare, too; lots of good intros to metadata there.

One of the most confusing things about image metadata is that it consists of three parts: embedded metadata (internal, within the actual file coding); structural/administrative metadata (often external, though things like copyright can be embedded; it also includes the file type); and descriptive metadata (generally external), which we add ourselves: things like description (duh, right?), title, keywords/tags, who is in the photo, etc. To add another level of complexity, there is also the website metadata about images: things like the alt text, description, title attribute, that sort of thing. (For a quick list of things to include, see "10 things you can do to optimize image search". Note: these are all website metadata relating to images, not actually image metadata, but it is all important metadata.)

Website metadata is what search engines have traditionally indexed. But some sites (like Flickr) harvest the embedded metadata from an image and also let us create even more metadata, and all of that metadata is then made available to search engines via Flickr.
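To make the website side a bit more concrete, here is a rough sketch in Python (standard library only) that pulls the alt text and title off each image tag in a page and reports what is missing. The page fragment, file names, and function names are all made up for illustration; a real audit would probably want to look at captions and surrounding text as well.

```python
from html.parser import HTMLParser


class ImageMetadataParser(HTMLParser):
    """Collect the website metadata attached to each <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        self.images.append({
            "src": attributes.get("src"),
            "alt": attributes.get("alt"),
            "title": attributes.get("title"),
        })


def collect_image_metadata(html):
    """Return one dict of src/alt/title per image found in the HTML."""
    parser = ImageMetadataParser()
    parser.feed(html)
    return parser.images


# Hypothetical page fragment, invented for this example.
sample_html = """
<p>Spring in the garden</p>
<img src="tulips-2012.jpg" alt="Red tulips in a garden" title="Tulips">
<img src="IMG_0042.jpg">
"""

for image in collect_image_metadata(sample_html):
    missing = [name for name in ("alt", "title") if not image[name]]
    print(image["src"], "missing:", ", ".join(missing) or "nothing")
```

The second image in the sample has no alt text or title, so a search engine would have little more than the file name to go on.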

A bit more about embedded metadata
  • is created in the process of creating the image. Many phones will embed location and date information, so that when you upload the image to Flickr or some other website, the site will automatically know where and when it was taken. It might even know what kind of camera took the photo (it's not really magic, it's metadata). This data is called EXIF (found in JPGs, and in some other file formats too). It is sort of like a date stamp, but inside your image vs. on your image. This metadata can be changed with an EXIF editor, but it is often left as it is.
  • is readable by some websites, Flickr and Picasa, among others. 
  • Search engines currently do not use EXIF a lot. Google Images added EXIF data in July 2011, and some database search tools can index it, but it is still somewhat limited in usefulness (at the moment) in the larger search engine world (even the Google advanced search doesn't include choices to help search and limit with EXIF). What most search engines use/crawl is the metadata wrapped around the image. (Given that Facebook currently strips out EXIF, its search would only cover the descriptive metadata that you add.)
  • Some external metadata is also created during the image creation process. E.g., when you take a photo or scan an image, it is saved with a file name. That file name, the size, and the file type are external types of metadata, and they are more easily accessible to the average user. We change file names all of the time, though we can't change the file type without using a converter, because the software wouldn't be able to "understand" the file. (Just a little tip: if you have a corrupted SD card, chances are you changed metadata for a JPG but not for its matching RAW file, or in some way made the image unreadable via the metadata. If you are shooting in RAW + JPG and delete a JPG on an SD card, you must delete the RAW file, too. Your camera will usually do this for you; however, not all software will.)
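Here is a small Python sketch (standard library only) of that last point about file names vs. file types. It gathers the external metadata for a file and then peeks at the first few bytes to see what the file really is, whatever the name says. The function names and the demo file are my own inventions for illustration, and the signature list only covers three common formats.

```python
import os
import tempfile

# Leading "magic" bytes that identify a few common image formats.
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
}


def sniff_image_type(data):
    """Guess the real file type from the first bytes, ignoring the name."""
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return "unknown"


def external_metadata(path):
    """Gather metadata that lives outside the image data itself."""
    with open(path, "rb") as handle:
        header = handle.read(16)
    return {
        "file_name": os.path.basename(path),
        "extension": os.path.splitext(path)[1].lower(),
        "size_bytes": os.path.getsize(path),
        "sniffed_type": sniff_image_type(header),
    }


# Demo: write a few PNG-looking bytes to a file that is named like a JPG.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "holiday.jpg")
    with open(path, "wb") as handle:
        handle.write(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8)
    info = external_metadata(path)
    print(info)
```

The demo file is called holiday.jpg, but the bytes inside still say PNG: renaming changes the external metadata, not the file itself, which is why you need a converter to really change the type.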
Here is a snippet of the EXIF from my image (this is from Notepad, so the squares are just blanks). If you look carefully, you can see the "Exif" label followed by "II" (the flag that says which byte order the data is stored in), and that it recorded the type of camera I used, the f-stop, the exposure and more... (much more)
and the actual image when viewed through a source that can "understand" and display a JPG....
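If you would rather not squint at Notepad, the same digging can be sketched in Python (standard library only). This is a minimal, simplified reader, not a full EXIF parser: it assumes the Exif block sits in an APP1 segment near the start of the JPEG, and it only reads the camera Make and Model text tags from the first tag directory. The demo bytes and the "DemoCam" camera name are invented so the example runs without a real photo.

```python
import struct

TAG_NAMES = {0x010F: "Make", 0x0110: "Model"}  # camera maker and model


def find_exif_block(jpeg):
    """Return the TIFF-style block inside a JPEG's Exif segment, or None."""
    if not jpeg.startswith(b"\xff\xd8"):
        return None  # not a JPEG
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        payload = jpeg[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return payload[6:]
        if marker == 0xDA:  # picture data starts here; stop looking
            return None
        pos += 2 + length
    return None


def read_camera_tags(tiff):
    """Read the byte order plus Make/Model from the first tag directory."""
    order = {b"II": "<", b"MM": ">"}.get(tiff[:2])
    if order is None:
        return {}
    (ifd_offset,) = struct.unpack(order + "I", tiff[4:8])
    (entry_count,) = struct.unpack(order + "H", tiff[ifd_offset:ifd_offset + 2])
    found = {"byte_order": tiff[:2].decode("ascii")}
    for index in range(entry_count):
        start = ifd_offset + 2 + 12 * index
        tag, kind, count = struct.unpack(order + "HHI", tiff[start:start + 8])
        if tag in TAG_NAMES and kind == 2:  # type 2 = ASCII text
            if count <= 4:  # short values are stored inside the entry
                raw = tiff[start + 8:start + 8 + count]
            else:  # longer values are stored elsewhere, at an offset
                (offset,) = struct.unpack(order + "I", tiff[start + 8:start + 12])
                raw = tiff[offset:offset + count]
            found[TAG_NAMES[tag]] = raw.rstrip(b"\x00").decode("ascii", "replace")
    return found


def build_demo_jpeg():
    """Assemble a tiny made-up JPEG header holding one Exif tag."""
    make = b"DemoCam\x00"  # hypothetical camera maker, 8 bytes
    tiff = b"II" + struct.pack("<HI", 42, 8)  # byte order, 42, directory at 8
    tiff += struct.pack("<H", 1)  # one directory entry
    tiff += struct.pack("<HHII", 0x010F, 2, len(make), 26)  # Make, text at 26
    tiff += struct.pack("<I", 0)  # no further directory
    tiff += make
    payload = b"Exif\x00\x00" + tiff
    header = b"\xff\xd8\xff\xe1" + struct.pack(">H", len(payload) + 2)
    return header + payload + b"\xff\xd9"


print(read_camera_tags(find_exif_block(build_demo_jpeg())))
```

To try it on a real photo, read the file's bytes in binary mode and pass them to find_exif_block; if it returns None, the file either is not a JPEG or has no Exif segment this sketch can find.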
So, yes, even an image is just all data... ;-)

1 comment:

Real Estate and More said...

Awesome information! I'm a photographer, so would it be good for me to add copyright information and type of shoot (e.g., Real Estate Photography Denver) to my metadata now? I assume that some day the EXIF will be read by the big guys, right?