Backup: what, how often & where to store.

Are you prepared to lose your data? Why not? It can happen. Do you have a plan for archiving vital data, and for how and where to store it securely?

Costs of making backups

Is it worth spending time and money on making backups, or can we assume they are unnecessary and that everything will be fine without them?

Making a backup of "everything" may be impossible and in most cases is not required. However, the important data should be copied every night.

You will discover how important backups are the moment you lose crucial files. It is hard to overstate the importance of keeping copies, yet in practice people rarely make backups, let alone treat the matter seriously.

Are you an exception to this rule? Then stop wasting your time reading this!

What should be duplicated?

It is not as hard as it looks. Imagine that yesterday you lost your laptop, your mobile phone, or your main server. Try to continue working with a spare one, but with a brand-new operating system and no personal data: no browser history and not a single e-mail.

It will quickly become clear what you need the most and what is crucial for you. You will also get a sense of how painful it is to lose data. Write down the list of data sources you need and make a plan for how to back them up and how often.

Unique data that cannot be recovered from anywhere other than your copy is the most valuable. The article you wrote last week is probably more important than the newest movie you plan to watch next weekend. The movie can always be recovered; at worst, you can buy it again.

How often should I make a copy?

Now you know which data sources are most important to you, so it is time to decide how often each of them should be copied.

Generally, it depends on two factors:

  • how often this data is modified
  • how important this part of the data is for the usability of other data

Data modified very often should be duplicated daily or even more often. For instance, it may be a good idea to make a daily copy of the e-mail application storage.
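Such a nightly copy can be scheduled with cron. The sketch below assumes a Maildir-style mail store in ~/Mail and a /backup directory; both paths are assumptions, so adjust them to your setup.

```shell
# Hypothetical crontab entry (add it with `crontab -e`): every night at
# 02:30, pack the mail store into a dated, compressed archive.
# Note: `%` is special in crontab and must be escaped as `\%`.
30 2 * * * tar -czf /backup/mail-$(date +\%F).tar.gz -C "$HOME" Mail
```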

On the other hand, it might be impractical to make a daily copy of all your photos. Such a collection can occupy hundreds of gigabytes on disk. Moreover, photos aren't modified on a daily basis, so there is no point in backing up every picture every day.

Incremental backup of huge volumes

When dealing with an immense collection of valuable and unique data, e.g. a family photo archive, incremental backup is worth considering.

Making an incremental backup on Linux systems is pretty easy. The command below creates an ordinary tar archive of the directory /home/user/photos and saves it as /backup/photos-2021-04-05.tar. Depending on the amount of data, this first full archive can be pretty big. The --listed-incremental option tells tar where to store information about the state of the files, so that subsequent runs can detect changes.

tar --listed-incremental=/backup/photos-incremental-list.txt -cf /backup/photos-2021-04-05.tar /home/user/photos

The next run copies only new and modified files, so in most cases the archive will not be large. tar reads the snapshot file given by --listed-incremental=/backup/photos-incremental-list.txt to decide which files were modified, added, or deleted.

tar --listed-incremental=/backup/photos-incremental-list.txt -cf /backup/photos-2021-04-06.tar /home/user/photos 

The same command can be repeated each time photos are added or modified.
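In practice it is convenient to wrap the commands above into a small function that stamps each archive with the current date. This is only a sketch: the function name and the directory layout are assumptions, so adapt the paths to your setup.

```shell
#!/bin/sh
# backup_photos SRC DEST: run one incremental backup pass of directory SRC
# into DEST, keeping the tar snapshot file next to the archives.
# The first run produces a full archive; later runs contain only changes.
backup_photos() {
    src="$1"
    dest="$2"
    mkdir -p "$dest"
    tar --listed-incremental="$dest/incremental-list.txt" \
        -cf "$dest/photos-$(date +%F).tar" \
        -C "$(dirname "$src")" "$(basename "$src")"
}

# Example: backup_photos /home/user/photos /backup
```

Run it once a day (e.g. from cron) and each run produces a dated photos-YYYY-MM-DD.tar next to the previous ones.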

When something bad happens, for example some photos are lost or deleted by mistake, just restore the archives in order, starting from the full one:

tar --listed-incremental=/dev/null -xf /backup/photos-2021-04-05.tar
tar --listed-incremental=/dev/null -xvf /backup/photos-2021-04-06.tar

The --listed-incremental option is still required during extraction, but the snapshot file itself is no longer needed, which is why /dev/null is passed instead.

Data integrity

Some parts of the data must be consistent with the rest. For instance, on a web server it is important to back up the SQL database and the file system at the same time.

If possible, a backup should be made while the applications are stopped and nothing is being modified. In many cases, stopping the server is not an option. Then you have to compromise and copy the running system, which can cause some inconsistencies, but it is better than nothing.

On redundant systems, it may be possible to stop a half, then make a copy while the other half is working normally and then synchronize databases and filesystems.
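The "same moment in time" idea can be sketched as a function that dumps a MySQL database and archives the web root with one shared timestamp. The database name (shopdb), credentials handling, and paths are all assumptions; --single-transaction gives a consistent InnoDB snapshot without stopping the server.

```shell
# backup_site: dump the SQL database and archive the document root together,
# stamped with the same date, so both halves describe the same state.
# All names below are hypothetical; adapt them to your server.
backup_site() {
    stamp=$(date +%F)
    mysqldump --single-transaction shopdb > "/backup/shopdb-$stamp.sql" &&
    tar -cf "/backup/www-$stamp.tar" -C /var www
}
```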

Security of backup files

A backup should be stored on another computer in "another room". That is the minimum, but what if thieves break into your office and steal both PCs?

It would be much better to store the backup in a separate building with a separate electrical network. The best place for backup storage would be "over the ocean", as far as possible from the data source. Fortunately, such a scenario is technically feasible.

Amount of data in a backup

The less data in a backup, the easier it is to deal with. Less data means faster compression, less disk space, and faster recovery. All important data should be secured; in the same vein, unimportant data should be excluded from copying. That is why it is not a good idea to store 4K movies in the Documents folder.

It is not a big deal to store tiny backups over the ocean, on another continent. Assuming the amount of data is a few gigabytes, it would take a few dozen minutes to download it if necessary.
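Shipping such a small archive to a remote site can be a one-liner. The host, user, and paths below are assumptions; rsync transfers only what has changed, which keeps repeated uploads cheap.

```shell
# push_offsite FILE: copy a backup archive over SSH to a (hypothetical)
# machine on another continent. rsync can resume interrupted transfers
# and skips data the remote side already has.
push_offsite() {
    rsync -az --partial "$1" backup@remote.example.com:/srv/backups/
}

# Example: push_offsite /backup/photos-2021-04-06.tar
```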

Encryption - security of backed-up data

We make backups to stay safe. It would not be good if someone stole private data or company secrets. It is a frequent situation that the system itself is secured very well, with strong passwords and two-factor authentication, while at the same time the backups are stored as plain text.

If backups are not strongly encrypted, an outsider may learn passwords from them, and the backup will act as an open gate. Such situations can lead to break-ins into important systems and cause a lot of trouble.
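A simple way to close that gate is to encrypt every archive before it leaves the machine. The sketch below uses symmetric GPG encryption; the function name is an assumption, and in real use the passphrase should come from an agent or a protected key file rather than a command-line argument.

```shell
# encrypt_backup FILE PASSPHRASE: produce FILE.gpg encrypted with AES-256.
# Store only the .gpg file offsite; without the passphrase it is useless
# to a thief. Passing the passphrase as an argument is for illustration only.
encrypt_backup() {
    gpg --batch --yes --pinentry-mode loopback \
        --symmetric --cipher-algo AES256 \
        --passphrase "$2" --output "$1.gpg" "$1"
}
```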

10 March 2021 update

A serious accident took place at the OVH data center in Strasbourg tonight. A significant amount of data and equipment was destroyed by the fire and the firefighting operation. Even servers that weren't directly affected will stay offline for a few days.

"We have a major incident on SBG2. The fire declared in the building. Firefighters were immediately on the scene but could not control the fire in SBG2. The whole site has been isolated which impacts all services in SGB1-4. We recommend to activate your Disaster Recovery Plan."

https://twitter.com/olesovhcom/status/1369478732247932929

Cloud backup is not enough

OVH advised customers to activate "your Disaster Recovery Plan". But hey ... do you have one? Or have you relied on the hosting provider's recovery plan? Unfortunately, OVH's recovery plan does not seem to cover such an extensive failure, which means that in this situation you effectively have no recovery plan at all.

Just to be clear: I am not against cloud computing, and I am not fighting OVH. Still, in such a situation the data security of "cloud solutions" starts to look pale. The lesson for everybody is that you cannot blindly trust the assurances of service providers.

You need your own backup

You need your own recovery plan, no matter what the advertisements say. It is worth checking the strings attached and finding out what kinds of accidents are covered by insurance. For example, fire is excluded from OVH's insurance plans.

All Rights Reserved ©2005-2021

wiecko.com