
Oops!... I Deleted My Blog


Lucia Gomez

1/9/2024


One night while I was hanging out with friends, I wanted to show them some of the design changes I had made to my blog. I pulled up my website, navigated to my blog, and saw... "No posts found." Slight panic. I logged into my database to see what was up, and saw that my blog post dataset was empty. Three years' worth of posts had been wiped, with no backups available.

After going through the five stages of grief, I eventually came up with a few last-ditch fixes to try. Two stressful hours later, I had figured out what happened and restored all of my blog posts. Twenty-four hours later, I had a process in place to make sure this never happens again!

Cause

First of all, I had assumed that MongoDB automatically creates backups of my database, saving snapshots of what the data looked like at certain points in time. When I rushed to the Backups tab and saw an empty table, I got a painful lesson in checking that data is in fact being backed up.

[Image: no backups listed] I stared at the empty Backups page, and it stared back

Next, I was confused as to how this happened. A few days earlier I had been working on writing unit tests for my server to verify that my database operations and API endpoints were working as expected (more on that later). I used an in-memory test database for this, and was very paranoid about not touching any of my production data. As I wrote the tests I repeatedly checked my website to make sure my blog posts were intact. Still, I eventually saw in my database's logs that all posts were deleted by me, around the time when I was working on these tests.

[Image: the problem is me]

It's good practice to clean up after your tests run, so I was clearing out my test databases after each test run:

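// Runs once after all the tests have finished
after(async () => {
  // Deletes every document in whichever database PostsModel is connected to
  await PostsModel.deleteMany({})
  await closeDB()
})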

This had to have been the culprit. My test script was set up to connect only to the test database while in test mode. But at some point I must have accidentally run it without setting NODE_ENV=test, which would have pointed the cleanup at my production database.
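A guard like the following would turn that mistake into a failed test run instead of an empty database. It is a minimal sketch: the function name `assertSafeToWipe` and the convention that test database names end in `_test` are assumptions made for illustration, not part of the original setup.

```javascript
// Hypothetical guard: throws unless both the environment and the database
// name indicate a test run. The "_test" suffix is an assumed naming convention.
function assertSafeToWipe(nodeEnv, dbName) {
  if (nodeEnv !== 'test') {
    throw new Error(`Refusing to wipe data: NODE_ENV is "${nodeEnv}", not "test"`)
  }
  if (typeof dbName !== 'string' || !dbName.endsWith('_test')) {
    throw new Error(`Refusing to wipe data: "${dbName}" does not look like a test database`)
  }
}
```

The cleanup hook would call it before `deleteMany`, passing `process.env.NODE_ENV` and the name of the connected database (with Mongoose models, for instance, that could be `mongoose.connection.name`). Checking two independent signals means a single forgotten setting is not enough to reach production data.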

Recovery

I refused to believe that I had permanently lost three years' worth of posts. I started looking for other places aside from my MongoDB database where I could find a backup.

A few months ago I had to migrate my database from MySQL hosted on ClearDB to MongoDB Atlas, because ClearDB was increasing their prices. During this process I had to download a copy of my data as a JSON file. As luck would have it, this file was still sitting in my Downloads folder. It had almost all of my posts, except for one I had written a few days earlier. Panic was starting to subside.

I kept looking for a way to restore that one missing post, since I had spent a few hours writing it and didn't want to redo it. To restore the rest of my posts from the miraculous JSON file, I had to connect to my database via Compass, MongoDB's desktop app for interacting with databases. I poked around the user interface a bit and noticed a tab called oplog. It held several million entries, seemingly every operation I had ever performed on my database. This is where I saw the fateful operation that deleted all of my posts.

[Image: delete logs] The top 2 lines mean this was a delete operation acting on my list of posts

But! It also meant that I could see every operation that updated or added to my database. A few months ago I added an auto-save feature to my blog that saved my draft posts every 10 seconds as I typed. Thanks to this feature, I could look at recent logs and see versions of drafts for my missing post. I clicked through a couple hundred draft logs until I found one from right before I published the post, and I was able to restore the entire post from the contents of this log!
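Clicking through hundreds of entries by hand works, but the same search can be sketched in a few lines. The example below is an illustration under assumptions, not what was actually run: the entries are simplified stand-ins for oplog documents (real ones carry more fields, and the shape of an update's `o` field varies by MongoDB version), `time` is a plain number standing in for the BSON timestamp in `ts`, and the `blog.posts` namespace and `lastWriteBefore` helper are hypothetical.

```javascript
// Returns the most recent insert ('i') or update ('u') in namespace `ns`
// that happened before `cutoff`, or null if there is none.
function lastWriteBefore(entries, ns, cutoff) {
  return entries
    .filter((e) => e.ns === ns && (e.op === 'i' || e.op === 'u') && e.time < cutoff)
    .sort((a, b) => b.time - a.time)[0] ?? null
}

// Simplified sample data: three auto-saves followed by the delete.
const entries = [
  { op: 'i', ns: 'blog.posts', time: 100, o: { title: 'Draft', body: 'Hel' } },
  { op: 'u', ns: 'blog.posts', time: 110, o: { $set: { body: 'Hello wor' } } },
  { op: 'u', ns: 'blog.posts', time: 120, o: { $set: { body: 'Hello world' } } },
  { op: 'd', ns: 'blog.posts', time: 130, o: { _id: 1 } },
]

const firstDelete = entries.find((e) => e.op === 'd' && e.ns === 'blog.posts')
const latest = lastWriteBefore(entries, 'blog.posts', firstDelete.time)
console.log(latest.o.$set.body) // prints "Hello world"
```

The idea is the same as the manual search: find the delete, then take the newest write that came before it.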

[Image: mongodb oplog] My heart jumped when I saw the first log containing part of a draft

Backup Process

Successfully restoring all of my blog posts was cause for celebration, but I wasn't satisfied until I had made sure this problem could never happen again. I wanted to automatically back up my data in case of future catastrophe, and apparently to idiot-proof against myself.

After some more digging (and a lot of sleep), I learned that Atlas does offer automatic backups, but not on the free tier that I'm on.

[Image: option to enable backups] My cluster is on the M0 (free) tier, not M2+

So I got to work on creating my own process for regular, automatic backups. Here's an overview of what I did:

  • Wrote a bash script that connects to MongoDB and uses the mongodump command to generate a timestamped backup folder
  • Tested this script locally on my computer, and it worked! But I wanted it to run automatically once per day, so I looked into scheduling it to run on my Heroku server, which is live 24/7
  • Got the mongodump command to run on my Heroku server. This took some craftiness. Heroku runtime environments can be configured with buildpacks to install necessary dependencies, like the MongoDB command-line tools in this case. I found a buildpack for this, but it was outdated. I forked it and edited the script to use a more recent version
  • Heroku's filesystem is ephemeral. Any backup files that the server generated would only exist temporarily. So after generating the backup files, the script needed to upload them to the cloud somewhere
  • I created an AWS S3 bucket to store these backups in the cloud
  • Wrote a JS script to connect to my AWS bucket and upload the backup files. The backup generation bash script initiates this upload after completing the mongodump
  • Scheduled this script to run every day at 3am using Heroku Scheduler (similar to cron)
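The steps above can be sketched roughly as follows. This is not the actual bash and JS scripts described in the list: for brevity it folds both into one Node script and dumps to a single gzipped archive file instead of a folder, so the upload is one object. The helper names, the option names passed to `runBackup`, and the `.gz` key format are all assumptions for illustration. It relies on the `mongodump` binary being on the PATH and on the `@aws-sdk/client-s3` package being installed.

```javascript
const { spawnSync } = require('node:child_process')
const fs = require('node:fs')
const path = require('node:path')

// Timestamped name such as "backup-2024-01-09T03-00-00Z"
// (milliseconds dropped, colons replaced because they are awkward in paths).
function backupName(date) {
  return 'backup-' + date.toISOString().replace(/\.\d{3}Z$/, 'Z').replace(/:/g, '-')
}

// mongodump arguments: dump everything reachable via `uri` into one gzipped archive.
function mongodumpArgs(uri, archivePath) {
  return [`--uri=${uri}`, `--archive=${archivePath}`, '--gzip']
}

// Runs mongodump, then uploads the archive to S3. All option names are hypothetical.
async function runBackup({ uri, bucket, region, workDir, now = new Date() }) {
  const name = backupName(now)
  const archivePath = path.join(workDir, `${name}.gz`)

  const result = spawnSync('mongodump', mongodumpArgs(uri, archivePath), { stdio: 'inherit' })
  if (result.error) throw result.error
  if (result.status !== 0) throw new Error(`mongodump exited with code ${result.status}`)

  // Third-party package, required here so the helpers above work without it installed.
  const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3')
  const client = new S3Client({ region })
  await client.send(new PutObjectCommand({
    Bucket: bucket,
    Key: `${name}.gz`,
    Body: fs.readFileSync(archivePath), // fine for small dumps; stream larger ones
  }))
  return name
}
```

A scheduled job would call `runBackup` with the connection string and bucket name read from environment variables. Because the filesystem is ephemeral, the local archive needs no cleanup; only the uploaded copy persists.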

That's it! My backups are automatically uploading to AWS every night at 3am.

[Image: AWS bucket] One timestamped database backup stored on AWS

Next Steps

  • Purge backups after a certain amount of time. A few months? Keep the past 30 days + one per older month? I'll have to experiment. But I don't want to take up too much storage space with thousands of backups
  • Write a restore script using the mongorestore command. Hopefully I'll never need it but... just in case
  • Set up an email alert system to tell me if my database is ever empty? If I hadn't gone to show my friends my blog, I wouldn't have known it was empty. I'd like to know as soon as this happens
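For the first item, one candidate policy ("keep the past 30 days + one per older month") could look like the sketch below. It only decides which backups to delete, given their dates; actually listing and deleting the objects in the bucket is left out. The function name and the choice to keep the newest backup of each older month (by UTC calendar month) are assumptions, since the policy is still undecided.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000

// Groups dates by UTC calendar month, e.g. "2024-0" for January 2024.
function monthKey(date) {
  return `${date.getUTCFullYear()}-${date.getUTCMonth()}`
}

// Returns the backups that can be deleted: anything older than `keepDays`,
// except the newest backup of each older calendar month.
function backupsToDelete(backupDates, now, keepDays = 30) {
  const cutoff = now.getTime() - keepDays * DAY_MS
  const old = backupDates.filter((d) => d.getTime() < cutoff)

  // Find the newest old backup in each month; those are the ones to keep.
  const newestPerMonth = new Map()
  for (const d of old) {
    const kept = newestPerMonth.get(monthKey(d))
    if (!kept || d > kept) newestPerMonth.set(monthKey(d), d)
  }
  return old.filter((d) => newestPerMonth.get(monthKey(d)) !== d)
}
```

With nightly backups this settles at about 30 recent backups plus one per older month, which keeps storage growth slow without losing long-term history.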

While I didn't technically drop a database table, I still feel obligated to include this famous XKCD comic:

[Image: drop tables comic]
