S3

In the life of a web application, there comes a point where that shared hosting account just isn't good enough (and you found out because your provider kicked you off), or your server just isn't able to pull the queries from the database fast enough. Then one day, you finally get the filesystem error EMLINK, which you have a VERY hard time googling.

The cause is simple: EMLINK means "too many links", and you have just created the maximum number of subdirectories that a single directory can hold. Surprisingly, this is not a commonly reported issue with file_column, acts_as_attachment or attachment_fu, and I'm shocked that it isn't. So, what do you do when you're faced with scalability issues and your image handling plugin is broken?

THROW IT ALL AWAY!

That's what I had to do. Recently we worked on a site that was getting hammered so hard that we decided to put the images on S3. Then we found the ultimate weakness of S3: it can't easily handle batch processing. We used the AWS::S3 library for most of the file movement, but we found that if we made a mistake, it would cost us hours to get the files back.

Eventually, the day was saved by JetS3t and Cockpit. Using JetS3t, we were finally able to get through all the S3 issues. (Actually, Dave saved the day; my computer kept running out of memory.) We managed to get S3 support into the app, and all we had to do was sacrifice File Column and replace it with this:


def user_image=(blob)
  # Assumes the aws-s3, RMagick and uuidtools libraries are already loaded
  # (for example from config/environment.rb).

  # Establish the S3 connection
  AWS::S3::Base.establish_connection!(:access_key_id => AWS_ACCESS_KEY_ID, :secret_access_key => AWS_SECRET_ACCESS_KEY)

  # Build a unique key prefix: images/<datestamp>/<uuid>/
  datestamp = Time.now.strftime('%d%m%Y')
  identifier = UUID.random_create.to_s
  object_path = 'images/' + datestamp + '/' + identifier + '/'
  object_key = object_path + blob.original_filename

  # Record the filename and the public base URL on the model
  self.image = blob.original_filename
  self.image_dir = 'http://s3.amazonaws.com/bucket/' + object_path
  image_data = blob.read

  # Send the original file to S3
  AWS::S3::S3Object.store(object_key, image_data, 'bucket', :access => :public_read)

  # Resize to a thumbnail that fits within 96x96
  img = Magick::Image.from_blob(image_data).first
  thumbnail = img.resize_to_fit!(96, 96)

  # Store the thumbnail under a 'thumb/' prefix beside the original
  thumb_key = object_path + 'thumb/' + self.image
  AWS::S3::S3Object.store(thumb_key, thumbnail.to_blob, 'bucket', :access => :public_read)
end
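
Reading the images back is just string concatenation on the two columns the setter fills in. Here is a rough sketch of what the reader side might look like. The UserImage struct and the image_url / thumbnail_url names are made up for illustration (a Struct stands in for the real model), and the example URL is invented:

```ruby
# Hypothetical stand-in for the model; it only carries the two columns
# that user_image= writes (image and image_dir).
UserImage = Struct.new(:image, :image_dir) do
  # Full-size image: public base URL plus the original filename.
  def image_url
    image_dir + image
  end

  # Thumbnail: stored under the 'thumb/' prefix beneath the same base URL.
  def thumbnail_url
    image_dir + 'thumb/' + image
  end
end

# Invented example values, shaped like what the setter would store.
record = UserImage.new('cat.jpg', 'http://s3.amazonaws.com/bucket/images/01022008/some-uuid/')
puts record.image_url
puts record.thumbnail_url
```

Filenames with spaces or odd characters would still need URL-escaping before going into a page; this sketch doesn't deal with that.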

However, if you have to do S3, I would highly recommend using a long key so that you can sort your results better based on that key. The biggest gotcha I found when adding S3 integration to my Rails app was including AWS::S3. If you require it and also include it, it will break your routing, which can cause hours of headaches, especially if you are working on something else at the time. In the end, we learned that S3 is a misnomer: for a large number of files, it's far from simple.
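
On the long-key point: S3 hands keys back in plain string order, so a key only sorts by date if the date part is written year first. The '%d%m%Y' stamp in the method above sorts by day of the month instead. Here's a minimal sketch of one way to build such a key. The sortable_object_key helper is hypothetical, and SecureRandom.uuid is just a standard-library stand-in for UUID.random_create:

```ruby
require 'securerandom'

# Hypothetical helper (not from the original app). Builds a key shaped like
#   images/YYYY/MM/DD/<uuid>/<filename>
# so that plain string ordering matches chronological ordering.
def sortable_object_key(filename, time = Time.now, identifier = SecureRandom.uuid)
  datestamp = time.strftime('%Y/%m/%d') # year first, then month, then day
  "images/#{datestamp}/#{identifier}/#{filename}"
end

# Fixed times and identifiers so the ordering is easy to see.
older = sortable_object_key('cat.jpg', Time.utc(2007, 12, 31), 'aaaa')
newer = sortable_object_key('dog.jpg', Time.utc(2008, 1, 2), 'bbbb')
puts [newer, older].sort
```

The slashes also mean you can list a single day or month by prefix (say, everything under images/2008/01/) rather than walking the whole bucket.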