Monday, April 21, 2008

Flickr Machine Tags Revisited

Over the weekend we added the ability to browse by different fields in accessCeramics. While this isn't a hard problem technically, it became clear that Flickr machine tags couldn't handle it very well.


It came down to this. To get a list of the different values the images had for glaze, for example, we would have to do the following:

  • Make an API call to get a list of images in the collection

  • For each image, make an API call to get the list of tags & machine tags

  • Loop over each image's tags, use a regular expression to find the 'glaze' machine tag, and store its value in a PHP array

  • Sort the array and print the results.
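
To make the last two steps concrete, here is a rough sketch in Python rather than the PHP we used. It skips the API calls entirely and works on hard-coded tag lists, so the sample data, the 'ceramics' namespace, and the function name distinct_values are all made up for illustration; only the namespace:predicate=value shape of a machine tag comes from Flickr.

```python
import re

# Machine tags have the shape namespace:predicate=value.
MACHINE_TAG = re.compile(
    r"^(?P<namespace>[a-z][a-z0-9_]*):(?P<predicate>[a-z][a-z0-9_]*)=(?P<value>.+)$",
    re.IGNORECASE,
)


def distinct_values(tags_per_image, predicate):
    """Collect the sorted, de-duplicated values of one machine tag predicate."""
    values = set()
    for tags in tags_per_image:
        for tag in tags:
            match = MACHINE_TAG.match(tag)
            if match and match.group("predicate").lower() == predicate:
                values.add(match.group("value"))
    return sorted(values)


# Hypothetical tag lists, one per image, standing in for the per-image API results.
sample_tags = [
    ["vase", "ceramics:glaze=celadon", "ceramics:technique=wheel thrown"],
    ["ceramics:glaze=shino"],
    ["bowl", "ceramics:glaze=celadon"],
]

print(distinct_values(sample_tags, "glaze"))
```

Even in this toy form, the real cost is hidden: the tag lists would each come from a separate API call, one per image in the collection.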


Or, if the metadata were in a local database like MySQL, we would have to:

  • Make an SQL query to the database, and process & print the results
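
For comparison, a sketch of that single query. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs anywhere, and the images table, its columns, and the sample rows are assumptions for illustration, not our actual schema.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, title TEXT, glaze TEXT)")
conn.executemany(
    "INSERT INTO images (title, glaze) VALUES (?, ?)",
    [("Vase", "celadon"), ("Tea Bowl", "shino"), ("Bowl", "celadon"), ("Tile", None)],
)


def glaze_values(conn):
    """Return every distinct glaze value, sorted by the database."""
    rows = conn.execute(
        "SELECT DISTINCT glaze FROM images WHERE glaze IS NOT NULL ORDER BY glaze"
    ).fetchall()
    return [row[0] for row in rows]


print(glaze_values(conn))
```

The de-duplication and sorting that took a loop and an array on the Flickr side happen inside the one query here.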



In addition, the API doesn't support partial or truncated machine tag searches. For example, if I searched for 'sculpture', I wouldn't get 'figurative sculpture' in my result set. Our users would have to type the exact value to find anything, which isn't good design.
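
With the metadata in a SQL database, a partial match like this is a LIKE pattern. The sketch below again uses Python's sqlite3 module in place of MySQL, and the table, the object_type column, and the rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, title TEXT, object_type TEXT)"
)
conn.executemany(
    "INSERT INTO images (title, object_type) VALUES (?, ?)",
    [
        ("Seated Figure", "figurative sculpture"),
        ("Tea Bowl", "vessel"),
        ("Untitled Form", "sculpture"),
    ],
)


def search_object_type(conn, term):
    """Return titles whose object_type contains the search term anywhere."""
    # Note: a literal % or _ in the term would also act as a wildcard here.
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT title FROM images WHERE object_type LIKE ? ORDER BY title",
        (pattern,),
    ).fetchall()
    return [row[0] for row in rows]


print(search_object_type(conn, "sculpture"))
```

A search for 'sculpture' now returns both the plain and the 'figurative sculpture' records. A leading wildcard like this can't use an ordinary index, which is fine at our size but worth keeping in mind as the collection grows.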

Because of situations like this, we decided to migrate the metadata to a MySQL database. We'll still generate machine tags in Flickr as part of the process, in case API methods for querying them are added in the future. But storing all metadata in a local DB lets us browse and search with a single query instead of one API call per image, which should make the site faster and more capable, especially as it grows to the volume we hope it will.

When this project began, we were hoping Flickr would be the database layer. To a certain extent it still is, but it just doesn't offer the same flexibility as a fairly simple MySQL database. It's still a cool idea, and hopefully Flickr will make some enhancements in the way machine tags are used and queried. With some of the recent chatter about an academic Flickr, perhaps the conversation that will move Flickr in that direction has already begun.
