24
Jan

Amazon’s storage solutions have been a hotly debated issue recently at the Octabox water cooler. My partner Adam believes that S3 is the best thing since sliced bread, and I think it’s a lot of hype with little real value. The truth is probably somewhere in between.

Let’s give Amazon’s service a head-to-head comparison with our hosting provider, ServInt:

Amazon:

  • Provides storage
  • Provides bandwidth
  • Charges per request (is this a new model? I’ve never heard of it)
  • Sells books

ServInt:

  • Provides storage
  • Provides bandwidth
  • Provides application and database servers
  • Is a managed solution

I’ll focus on storage and bandwidth and consider the rest added value, starting by building several plausible real-world scenarios.

1. Small art gallery:
We’ll start with a small but not obscure art gallery. It stores about 12 GB of images and videos, and its transfer is about 200 GB per month.

At ServInt, the only choice is the Essentials package (a bit of overkill, but the cheapest solution ServInt has to offer). The monthly charge is fixed at $50, barring any unforeseen sudden increase in bandwidth or storage. ServInt actually allows accounts to go over their limits and charges for the difference ($0.25 per GB of bandwidth, $3 per GB of storage).
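The overage rule is easy to put in code. This is only a rough sketch: the two overage rates come from the paragraph above, but the plan limits in the example call are made-up placeholders, since the post does not list the actual limits of the package.

```python
# Overage rates as quoted in the post (USD per GB over the plan limit).
BANDWIDTH_OVERAGE_PER_GB = 0.25
STORAGE_OVERAGE_PER_GB = 3.00

def servint_monthly_cost(base_fee, storage_gb, bandwidth_gb,
                         storage_limit_gb, bandwidth_limit_gb):
    """Fixed monthly fee plus a charge for any usage above the plan limits."""
    extra_storage = max(0, storage_gb - storage_limit_gb)
    extra_bandwidth = max(0, bandwidth_gb - bandwidth_limit_gb)
    return (base_fee
            + extra_storage * STORAGE_OVERAGE_PER_GB
            + extra_bandwidth * BANDWIDTH_OVERAGE_PER_GB)

# Hypothetical limits (20 GB storage, 250 GB bandwidth), for illustration only.
within_limits = servint_monthly_cost(50, 12, 200, 20, 250)
traffic_spike = servint_monthly_cost(50, 12, 300, 20, 250)
print(within_limits, traffic_spike)
```

As long as usage stays inside the limits, the bill is just the base fee; only the excess is billed at the overage rates.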

At Amazon, the calculation is relatively simple:
(12 GB storage x $0.15) + (200 GB download traffic x $0.18)
+ (25 GB upload traffic x $0.10) + (800,000 GET requests x $0.01 per 1,000)
+ (100,000 PUT requests x $0.01 per 1,000) = $49 and change.
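For anyone who wants to plug in their own numbers, here is a minimal sketch of the same calculation. The rates are simply the ones used in the formula above (I have not checked them against Amazon's current price list), and the function name is my own invention.

```python
# Per-unit rates as used in the post's formula (USD).
STORAGE_PER_GB = 0.15
DOWNLOAD_PER_GB = 0.18
UPLOAD_PER_GB = 0.10
REQUEST_PER_1000 = 0.01  # the post applies the same rate to GET and PUT

def s3_monthly_cost(storage_gb, download_gb, upload_gb, get_requests, put_requests):
    """Return each line item of the monthly estimate plus the total."""
    items = {
        "storage": storage_gb * STORAGE_PER_GB,
        "download": download_gb * DOWNLOAD_PER_GB,
        "upload": upload_gb * UPLOAD_PER_GB,
        "get_requests": get_requests / 1000 * REQUEST_PER_1000,
        "put_requests": put_requests / 1000 * REQUEST_PER_1000,
    }
    items["total"] = sum(items.values())
    return items

# Scenario 1: the small art gallery.
gallery = s3_monthly_cost(12, 200, 25, 800_000, 100_000)
for name, cost in gallery.items():
    print(f"{name:>12}: ${cost:.2f}")
```

With the gallery's numbers the line items are $1.80, $36.00, $2.50, $8.00 and $1.00, which add up to the "$49 and change" above.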

The Amazon charge is composed of ~80% bandwidth and ~19% requests.

* GET requests were derived by dividing the download traffic by an average high-resolution image size of 250 KB. PUT requests are the upload traffic divided by the same image size.
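The request estimate in the footnote can be sketched like this. I am assuming decimal units (1 GB = 1,000,000 KB), which reproduces the rounded request counts in the formula; the helper name is hypothetical.

```python
def estimate_requests(traffic_gb, avg_object_kb):
    """Approximate the request count as traffic divided by average object size.

    Assumes decimal units: 1 GB = 1,000,000 KB.
    """
    return round(traffic_gb * 1_000_000 / avg_object_kb)

get_requests = estimate_requests(200, 250)  # download traffic / image size
put_requests = estimate_requests(25, 250)   # upload traffic / image size
print(get_requests, put_requests)
```

This is a crude model, since it treats every request as fetching one average-sized file, but it is enough for a ballpark figure.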

Round One: Draw

2. The upstart video sharing community:
These have been popping up everywhere recently. Supposing one is marginally successful, its storage is about 30 GB and its transfer about 800 GB. This time we’ll take the VSPro Deuce account at ServInt, which is fixed at $150 per month. The calculation for Amazon is the same as before, and it comes to $160 and change.

The Amazon charge is composed of ~95% bandwidth and ~2% requests.

* This time GET requests were based on the traffic divided by the average web-encoded movie size, about 3 MB.

Round Two: Advantage, ServInt

3. The up and coming social network:
XYZ social network has been doing well recently and has plenty of users uploading and sharing photos and so forth, putting its current needs at 220 GB of storage and 3,000 GB of bandwidth. It requires a dedicated server, such as ServInt’s simplest enterprise solution, which clocks in at $500 per month. Again, calculating for Amazon, we come to $1,021 and change.

The Amazon charge is composed of ~57% bandwidth and ~40% requests.

* This time the Amazon bill exploded because I put the average user-uploaded image size at 80 KB (including thumbnails), which increases the number of requests by a large factor over the previous cases.
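To see why file size matters so much, here is a small sketch of the request charge incurred per GB downloaded at each scenario's average file size. It uses the request and download rates from the first calculation, assumes decimal units, and treats 3 MB as 3,000 KB; the function name is made up for the example.

```python
# Rates as used in the post (USD); one request rate is applied to GETs.
REQUEST_PER_1000 = 0.01
DOWNLOAD_PER_GB = 0.18

def get_cost_per_gb(avg_object_kb):
    """Request charge incurred per GB downloaded, assuming 1 GB = 1,000,000 KB."""
    requests_per_gb = 1_000_000 / avg_object_kb
    return requests_per_gb / 1000 * REQUEST_PER_1000

# Average object sizes from the three scenarios (3 MB taken as 3,000 KB).
for label, size_kb in [("gallery image", 250), ("web video", 3000), ("social photo", 80)]:
    surcharge = get_cost_per_gb(size_kb)
    print(f"{label:>13}: ${surcharge:.4f} in requests per GB, "
          f"on top of ${DOWNLOAD_PER_GB:.2f} for the transfer itself")
```

At 80 KB per file the request charge works out to roughly $0.125 per GB, about two thirds of the bandwidth rate, versus about $0.04 at 250 KB and a third of a cent at 3 MB.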

Round Three: ServInt by TKO

Excel style:

[Graph: Amazon vs. ServInt]

[Table: Amazon vs. ServInt figures]

So what does this all mean?

First of all, it’s interesting to note that most of Amazon’s costs are bandwidth related. The other significant cost factor is the number of requests, which is most pronounced for sites that serve small files at high transfer volumes. In fact, storage costs comprised on average only 3.2% of the total costs!

These figures support my view that Amazon is an expensive alternative to standard web hosting. Let’s not forget that the hosting packages mentioned actually include a hosting server, providing the backend to run the site. The only case in which Amazon seems like a viable alternative, considering its reputation for speed and reliability, might be the heavier media types, such as video and audio streams.

There are some who would obviously dispute my conclusions, such as SmugBlog (marketing ploy?) and actual small blogs (not to mention my partner in crime, Adam).

Categories: Hosting

8 Comments »

  1. Certainly not a marketing ploy. :)

    I think if you read my S3 posts, and the slides I present at conferences, you’ll see that I say over and over that S3 is great for storage, but not (yet?) great for serving.

    You don’t mention whether ServInt guarantees at least 3 copies in two datacenters - that’s a major part of the equation that’s often overlooked. Amazon does, so if data durability is important, other solutions can quickly get more expensive if you have to replicate storage.

    Comment by Don MacAskill — 24 Jan @ 4:30 am

  2. That’s why I put it with a question mark ;)
    Anyway, we’ve been using ServInt for close to 3 years now, and we’ve had 0 downtime. While I’m no expert on their specific agreement, according to their website they provide N+1 redundancy in their data center and a money-back guarantee on downtime. I guess Amazon’s reputation must really be worth it to justify the price difference.

    Comment by Eran Galperin — 24 Jan @ 4:43 am

  3. On a different note, I hate sliced bread.

    Comment by Adam Benayoun — 24 Jan @ 11:50 am

  4. It’s not downtime I’m worried about - it’s dataloss. And it only takes once to be significant.

    N+1 isn’t as good as N+2(+). :)

    You should find out, for sure, how many copies of your data exist. I’d love to know myself. If they really do provide 3 copies in 2 different datacenters, that’s awesome. If not, it’s not a fair comparison.

    Comment by Don MacAskill — 24 Jan @ 8:00 pm

  5. What you are talking about is not redundancy, it’s backups. The N+1 redundancy is for uptime purposes - meaning a server could go down and you would still not experience downtime.
    Regarding backups, I do not know the exact technical details, but we had some incidents where a programmer accidentally deleted a database table or a customer removed some files unintentionally, and they easily rolled back our data to whatever time frame we needed. Does Amazon provide rollbacks?
    Note that this backup procedure also covers the database, which is something that Amazon obviously does not.

    Relying completely on your provider for your data loss prevention is wrong in my view. Offline storage is very cheap and if your business is your data, you should store your data periodically on your own machines.

    You might consider dropping ServInt a line and asking them about their data guarantees if you’re really interested ;)

    Comment by Eran Galperin — 24 Jan @ 8:37 pm

  6. When you’re dealing with hundreds of terabytes or more, backups and redundancy blur together, particularly when all of it needs to be online at all times.

    Offline storage isn’t very cheap at all.

    And lots of companies use Amazon for their databases, including backups, so they obviously do. It’s trivial to use S3 as a backup for whatever data you want, including your databases. Zmanda even offers S3 as a backup target.

    Comment by Don MacAskill — 24 Jan @ 9:25 pm

  7. Got me there, Don ;) I actually found in my own little research that for storage only, Amazon is great (read the name of the post :)). However, for actively hosted content it’s pricier than the alternatives.

    Regarding databases, you can obviously store your backups at Amazon manually or use an intermediary such as Zmanda, but Amazon does not provide automatic rollback protection for your active database.

    In conclusion, as I wrote in my post: good for storage, bad for data serving.

    Comment by Eran Galperin — 24 Jan @ 9:53 pm

  8. I think that one of the strongest advantages of S3 is the fact that it can handle virtually any amount of traffic. So if your video sharing site’s traffic peaks at certain times, you’re not affected.

    Comment by Arik — 25 Jan @ 1:54 pm
