On ways in which DynamoDB won’t scale


Note: I originally wrote this post on another blog in June 2018, then moved it over here.

A lot of the work I’ve done for the past two months has revolved around one huge DynamoDB table and a few smaller tables complementing the huge one. I thought I’d share some of what I learned in the process.

I’m not going to be talking about the basics of DynamoDB (eventual consistency, avoiding scans and hot partitions, keeping items under 1KB, etc.). For the most basic advice, you should check out the DynamoDB docs.

Spoiler alert: the main takeaway from this post will be the following.

“DynamoDB will not scale unless you know what you’re doing from the very beginning.”

A lot of people have mentioned this already [1][2][3]. Hot partitions are probably the biggest problem here, but I’m not going to talk about that (yet).

Maximum query return size

Queries return up to 1MB of data (and this is before any filtering is done).

Now, imagine you’re building some kind of social network where people post photos. You’ll save the following data.

  • Entity: Photo
    • UserId
    • Timestamp – moment when the photo was added to the database.
    • Thumbnail
    • PhotoLink – something to help you retrieve the actual photo stored in S3.
    • Likes – number of likes/shares/upvotes/whatever
    • Title
    • Description

And your most important query will be getting the last few photos posted by a specific user.

Now, if you don’t know what you’re doing and decide to believe in the allegedly seamless scalability and ease of use of DynamoDB, you may consider having a single DynamoDB table with UserId as the partition key and Timestamp as the sort key, and having everything else as attributes. That way you can make a Query with Limit=K and ScanIndexForward=false to get the last K photos of a user.

Of course, DynamoDB will overwrite any item with the same UserId and Timestamp, which is something you probably don’t want to happen. But you can fix that by appending a fixed-size random suffix to the Timestamp, or something of the sort. Let’s otherwise ignore that in the name of simplicity.
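For reference, one way to build such a sort key (a sketch assuming Timestamp is stored as a string; the zero-padding is what keeps lexicographic order chronological):

```python
# Sketch: append a fixed-size random suffix so two photos uploaded by the
# same user in the same millisecond don't collide. Assumes the sort key
# is a string; zero-padding keeps string order equal to time order.
import time, secrets

def make_sort_key():
    millis = int(time.time() * 1000)
    suffix = secrets.token_hex(2)       # 4 hex chars, fixed width
    return f"{millis:013d}#{suffix}"    # e.g. "1528204800000#a3f9"
```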

Now suppose you also want to add a feature where users can see their most-liked photos. This is where the 1MB limit on the query return size kicks in. You could implement it by querying for all of a user’s photos and sorting them by Likes in memory. But once a single user has more than a few MB of data, you’ll have to fetch it through several sequential requests of up to 1MB each instead of a single query, and that will degrade the user experience significantly.
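This is roughly what that degraded path looks like: a loop of paginated queries followed by an in-memory sort. The function and attribute names below are illustrative assumptions; the key point is that each page needs the previous page’s LastEvaluatedKey, so the requests are inherently sequential.

```python
# Sketch of "fetch everything, sort in memory" once a user's photos
# exceed 1MB: each query page returns at most 1MB, and the next page
# can't be requested until the previous one has come back.
from boto3.dynamodb.conditions import Key

def most_liked_photos(table, user_id):
    items, start_key = [], None
    while True:
        kwargs = {"KeyConditionExpression": Key("UserId").eq(user_id)}
        if start_key:
            kwargs["ExclusiveStartKey"] = start_key
        resp = table.query(**kwargs)
        items.extend(resp["Items"])
        start_key = resp.get("LastEvaluatedKey")  # set iff more data remains
        if start_key is None:
            break
    return sorted(items, key=lambda item: item["Likes"], reverse=True)
```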

In the case I’ve been working on, there was no noticeable difference between getting 0.1MB of data (i.e. setting a pretty low Limit) and 0.9MB of data in a single query, but once we went over 1MB and had to make two sequential requests, the time pretty much doubled, as expected. This means sequential reads are a no-go if you’re striving for fast response times.

Global Secondary Indexes are incredibly expensive

To run that query efficiently, you’ll have to add a Global Secondary Index with UserId as the partition key and Likes as the sort key, which essentially doubles the cost of your table: a GSI has its own provisioned throughput, and every write to the table is replicated to the index, consuming write capacity there too.
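Defining that index looks something like this (a sketch using the boto3 client; the index name and capacity numbers are assumptions):

```python
# Sketch: adding the UserId/Likes GSI to an existing provisioned table.
# Index name and capacity units are illustrative.
import boto3

boto3.client("dynamodb").update_table(
    TableName="Photos",
    AttributeDefinitions=[
        {"AttributeName": "UserId", "AttributeType": "S"},
        {"AttributeName": "Likes", "AttributeType": "N"},
    ],
    GlobalSecondaryIndexUpdates=[{
        "Create": {
            "IndexName": "UserId-Likes-index",
            "KeySchema": [
                {"AttributeName": "UserId", "KeyType": "HASH"},
                {"AttributeName": "Likes", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
            "ProvisionedThroughput": {
                "ReadCapacityUnits": 5,   # the index has its own throughput,
                "WriteCapacityUnits": 5,  # billed on top of the base table
            },
        }
    }],
)
```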

So why not add a local secondary index instead? First of all, that won’t save you much money if your expenses come mostly from write capacity. But there’s something worse.

Item collections are capped at 10GB

If you’re not very aware of this restriction, you could end up in a world of
pain.

When a table has a local secondary index, no item collection (the set of items sharing the same partition key value) can exceed 10GB in total. Applied to this case, that means you can’t have more than 10GB of data for a single user.

In this case, it’s fairly unlikely normal users will ever reach that limit, but a bot posting every 30 seconds could get there quite fast (depending on the item sizes). And in other applications the limit could be very much within reach.
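To make “quite fast” concrete, here’s a back-of-the-envelope calculation. All numbers are assumptions (10KB per item; larger items hit the cap proportionally sooner):

```python
# Back-of-the-envelope: how long until a bot posting every 30 seconds
# fills a 10GB item collection? All numbers are assumptions.
item_size = 10 * 1024                      # 10KB per item
items_per_day = 24 * 60 * 60 // 30         # one post every 30s -> 2880/day
bytes_per_day = item_size * items_per_day  # ~29.5MB/day
days_to_cap = 10 * 1024**3 / bytes_per_day
print(f"~{days_to_cap:.0f} days to hit 10GB")  # ~364 days
```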

Conclusions

You might have thought that the entity we were trying to model (Photo) required queries simple enough that modeling it with DynamoDB would be trivial, but you would have thought wrong. If you need queries that are at all more complex than retrieving a single item given its key, you should not take the decision to use DynamoDB lightly.

