The Amazon DynamoDB team is back with another useful feature hot on the heels of encryption at rest. At AWS re:Invent 2017 we launched global tables and on-demand backup and restore of your DynamoDB tables, and today we’re launching continuous backups with point-in-time recovery (PITR).
You can enable continuous backups with a single click in the AWS Management Console, a simple API call, or with the AWS Command Line Interface (CLI). DynamoDB can back up your data with per-second granularity and restore to any single second from the time PITR was enabled up to the prior 35 days. We built this feature to protect against accidental writes or deletes. If a developer runs a script against production instead of staging or if someone fat-fingers a DeleteItem call, PITR has you covered. We also built it for the scenarios you can’t normally predict. You can still keep your on-demand backups for as long as needed for archival purposes but PITR works as additional insurance against accidental loss of data. Let’s see how this works.
Continuous Backup
To enable this feature in the console we navigate to our table and select the Backups tab. From there simply click Enable to turn on the feature. I could also turn on continuous backups via the UpdateContinuousBackups API call.
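For anyone who prefers the API route, here is a minimal sketch of that UpdateContinuousBackups call using Boto 3. The helper name enable_pitr is my own invention for illustration, and the table name is just the one from this post. The helper takes the client as an argument, so you pass in whatever boto3.client("dynamodb") gives you.

```python
def enable_pitr(client, table_name):
    """Turn on point-in-time recovery for one table and return the PITR status.

    `client` is expected to be a Boto 3 DynamoDB client; `enable_pitr` itself is
    a hypothetical helper name, not part of Boto 3.
    """
    response = client.update_continuous_backups(
        TableName=table_name,
        PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
    )
    description = response["ContinuousBackupsDescription"]
    return description["PointInTimeRecoveryDescription"]["PointInTimeRecoveryStatus"]


# Example usage (needs the boto3 package and AWS credentials):
#   import boto3
#   print(enable_pitr(boto3.client("dynamodb"), "VerySuperImportantTable"))
```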
After continuous backup is enabled we should be able to see an Earliest restore date and Latest restore date.
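The same two dates can be read programmatically. The sketch below assumes a Boto 3 DynamoDB client and uses its describe_continuous_backups call; restore_window is a hypothetical helper name I made up for this example.

```python
def restore_window(client, table_name):
    """Return (earliest, latest) restorable datetimes, or None if PITR is off.

    `client` is expected to be a Boto 3 DynamoDB client; `restore_window` is a
    hypothetical helper name used only for this illustration.
    """
    response = client.describe_continuous_backups(TableName=table_name)
    pitr = response["ContinuousBackupsDescription"].get(
        "PointInTimeRecoveryDescription", {}
    )
    # The restorable timestamps are only meaningful while PITR is enabled.
    if pitr.get("PointInTimeRecoveryStatus") != "ENABLED":
        return None
    return pitr["EarliestRestorableDateTime"], pitr["LatestRestorableDateTime"]


# Example usage (needs the boto3 package and AWS credentials):
#   import boto3
#   print(restore_window(boto3.client("dynamodb"), "VerySuperImportantTable"))
```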
Let’s imagine a scenario where I have a lot of old user profiles that I want to delete. I really only want to send service updates to our active users based on their last_update date. I decided to write a quick Python script to delete all the users that haven’t used my service in a while.
import boto3
table = boto3.resource("dynamodb").Table("VerySuperImportantTable")
items = table.scan(
    FilterExpression="last_update >= :date",
    ExpressionAttributeValues={":date": "2014-01-01T00:00:00"},
    ProjectionExpression="ImportantId"
)['Items']
print("Deleting {} Items! Dangerous.".format(len(items)))
with table.batch_writer() as batch:
    for item in items:
        batch.delete_item(Key=item)
Great! This should delete all those pesky non-users of my service that haven’t logged in since 2013. So, I run it and... CTRL+C CTRL+C CTRL+C CTRL+C (interrupt the currently executing command).
Yikes! Do you see where I went wrong? I’ve just deleted my most important users! Oh, no! Where I had a greater-than sign, I meant to put a less-than! Quick, before Jeff Barr can see, I’m going to restore the table. (I probably could have prevented that typo with Boto 3’s handy DynamoDB conditions: Attr("last_update").lt("2014-01-01T00:00:00").)
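For the curious, here is a rough sketch of what a more careful version of the lookup might have looked like, using that Attr condition. Two things here are my own additions rather than part of the script in this post: the helper names scan_all and find_stale_keys are hypothetical, and the loop follows LastEvaluatedKey because a single Scan call returns at most 1 MB of data, so one call may not see the whole table. The sketch only collects keys and deletes nothing.

```python
def scan_all(table, **scan_kwargs):
    """Yield every item from a scan, following LastEvaluatedKey across pages.

    `table` is expected to be a Boto 3 DynamoDB Table resource.
    """
    while True:
        page = table.scan(**scan_kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return
        # Resume the scan where the previous page stopped.
        scan_kwargs["ExclusiveStartKey"] = last_key


def find_stale_keys(table, cutoff="2014-01-01T00:00:00"):
    """Collect keys of users whose last_update is strictly before `cutoff`."""
    # Imported lazily so scan_all stays usable on its own.
    from boto3.dynamodb.conditions import Attr

    return list(
        scan_all(
            table,
            FilterExpression=Attr("last_update").lt(cutoff),
            ProjectionExpression="ImportantId",
        )
    )


# Example usage (needs the boto3 package and AWS credentials):
#   import boto3
#   table = boto3.resource("dynamodb").Table("VerySuperImportantTable")
#   keys = find_stale_keys(table)
#   print("Would delete {} items".format(len(keys)))  # review before deleting
```

Printing the count and eyeballing a few keys before handing them to batch_writer is a cheap sanity check that would have caught my flipped comparison.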
Restoring
Luckily for me, restoring a table is easy. In the console I’ll navigate to the Backups tab for my table and click Restore to point-in-time.
I’ll specify the time (a few seconds before I started my deleting spree) and a name for the table I’m restoring to.
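The console is not the only way to do this. As a sketch, the same restore can be requested through Boto 3's restore_table_to_point_in_time call. The helper name restore_to_point_in_time, the target table name, and the timestamp in the usage comment are all placeholders I chose for illustration; substitute your own.

```python
from datetime import datetime, timezone


def restore_to_point_in_time(client, source_table, target_table, restore_time):
    """Request a restore of `source_table` into a new table named `target_table`.

    `client` is expected to be a Boto 3 DynamoDB client. `restore_time` must be
    a timezone-aware datetime so there is no ambiguity about which second is meant.
    Returns the status of the new table as reported in the response.
    """
    if restore_time.tzinfo is None:
        raise ValueError("restore_time must be timezone-aware")
    response = client.restore_table_to_point_in_time(
        SourceTableName=source_table,
        TargetTableName=target_table,
        RestoreDateTime=restore_time,
    )
    return response["TableDescription"]["TableStatus"]


# Example usage (needs the boto3 package and AWS credentials; the target table
# name and the timestamp below are placeholders):
#   import boto3
#   when = datetime(2018, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
#   restore_to_point_in_time(
#       boto3.client("dynamodb"),
#       "VerySuperImportantTable",
#       "VerySuperImportantTable-restored",
#       when,
#   )
```

The restore always lands in a new table, so the application still has to be pointed at the restored table (or the data copied back) afterwards.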
For a relatively small and evenly distributed table like mine, the restore is quite fast.
The time it takes to restore a table varies based on multiple factors, and restore times are not necessarily correlated with the size of the table. If your dataset is evenly distributed across your primary keys, you’ll be able to take advantage of parallelization, which will speed up your restores.
Learn More & Try It Yourself
There’s plenty more to learn about this new feature in the documentation.
Pricing for continuous backups is detailed on the DynamoDB Pricing Pages. Pricing varies by region and is based on the current size of the table and indexes. For example, in US East (N. Virginia) you pay $0.20 per GB based on the size of the data and all local secondary indexes.
A few things to note:
- PITR works with encrypted tables.
- If you disable PITR and later reenable it, you reset the start time from which you can recover.
- Just like on-demand backups, there are no performance or availability impacts to enabling this feature.
- Stream settings, Time To Live settings, PITR settings, tags, Amazon CloudWatch alarms, and auto scaling policies are not copied to the restored table.
- Jeff, it turns out, knew I restored the table all along because every PITR API call is recorded in AWS CloudTrail.
PITR is available in the US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), and South America (Sao Paulo) Regions starting today.
Let us know how you’re going to use continuous backups and PITR on Twitter and in the comments.
– Randall