A while ago, there was a post on Hacker News, which asked for advice on how to do backups:
If someone is a less able, cognitively impaired, or perhaps very young - It would be good to see what advice HN would give them for protecting their data from ransomware, theft, or disaster.
That actually got me thinking - in software development, we sometimes count on external service providers to handle our backups, other times we might expect platforms like VMware to do them for us, while other times there are a variety of complex tools, everything from backing up entire file systems, to going down to the level of using specialized tools to backup database instances. But what do you do, when you want simple backups?
So, to that end, i shared a few of my personal thoughts, and som information about my own backup solutions in that discussion, as well as decided to turn it into a blog post of my own. Without further ado, let's get into it!
The absolute simplest option that i can think of: have a few large HDDs or SSDs that you connect to your device with an USB enclosure and copy over all of your data.
It's also reasonably simple to automate that process with something like cron and rsync. If it's not possible to automate that process, or it's too cumbersome, then just do it manually and keep a log somewhere.
Example:
2021.10.01 - copied over /home and /data to HDD1
2021.09.01 - copied over /home and /data to HDD3
2021.08.01 - copied over /home and /data to HDD2
2021.07.01 - copied over /home and /data to HDD1
2021.06.01 - copied over /home and /data to HDD3
2021.05.01 - copied over /home and /data to HDD2
...
Pros:
Cons:
If a slightly more complex solution is okay, then you might use some of the software that's out there, built with the purpose of automating backups. Some of the solutions are paid, others are free, but the general idea is the same - you set up some directories which you'd like to backup, you set a schedule and perhaps some rules on what to back up and how and let the software run whenever.
Example:
Pros:
Cons:
Now, this is a bit more complicated, but since any regular computer can become a server and since HDDs are pretty cheap anyways, spending a few days setting up a backup solution can sometimes be worth it, if the people can spend some time following guides or reading the documentation.
Essentially, you'd set up your own server, with an OS of your choice (personally i'd suggest Debian or Ubuntu LTS) and would install some software package, that could connect to the devices that you'd like to back up and would pull data from them at a set schedule. Alterting options are also available should anything go wrong.
Example:
Pros:
Cons:
There are more complicated setups out there, such as file systems with snapshotting, RAID to avoid individual disk failures, storage pools etc., which may or may not be worth it, based on the complexity vs the benefit that they provide.
NAS solutions and cloud solutions for storage can also be explored, as long as security isn't forgotten about - for most people, both of those can be good options and can be combined with any of the alternatives.
Integrity is surprisingly hard to get right and as long as you have multiple backups of the same files over time, it's not always worth it to worry about it too much. Ideally, check the files that matter to you in the backups manually, like whether your master's thesis was backed up correctly.
Version control systems are also surprisingly nice for smaller files, like the aforementioned thesis - with something like GitLab and the aforementioned server backups, that introduces more redundancy and versioning in the mix, with tools that most developers will be familiar with.
Restoring backups is important, yet not often considered much. If you use something like BackupPC, you absolutely need to test whether you can properly download or restore the files that you've backed up into it, otherwise it's useless.
Lastly, you should remember the 3-2-1 rule of backups:
The 3-2-1 rule can aid in the backup process. It states that there should be at least 3 copies of the data, stored on 2 different types of storage media, and one copy should be kept offsite, in a remote location.
(from 3-2-1 Rule)
In my experience, that's why file/archive based solutions are perhaps the best option, because they're easy to carry over to other storage mediums.
Not only that, but the above options actually combine with one another rather nicely. Right now, i use manual backups of the data that matters to me on my phone (connect through USB, transfer files monthly), use the backup software to automatically propagate my files across multiple drives, then use BackupPC to pull my files to a backup server, which also has multiple mirrored drives that incrementally synchronize with rsync and cron. Then i also use a few Nextcloud instances which copy my local files to my own VPSes, which coincidentally are also pulled down by BackupPC. And then i also copy some of my keychains and other important files on local storage mediums - SD cards or memory sticks, as well as encrypted containers on almost every device that i use (VeraCrypt). In my eyes, the simplest solutions that don't require specialized hardware are perhaps the best ones.
Thus, my risk analysis looks like this:
And it's also possible to plan further contingencies, if needed: