Snapshots reliability

Discussions related to using VirtualBox on Linux hosts.
jeffcourteau
Posts: 13
Joined: 22. Mar 2017, 01:00
Primary OS: MS Windows 10
VBox Version: OSE other
Guest OSses: W2K12R2, Linux CentOS, FreeBSD, appliances, etc...
Location: Quebec City, QC, CA

Snapshots reliability

Post by jeffcourteau »

Hello guys,

I am running a production environment composed of 11 VMs on 3 physical server hosts running CentOS 7 and VirtualBox 5.18.something. I spent a whole lot of time putting things together and I don't want to lose this hard work. I take regular backups of my clients' stuff from scripts running inside the VMs onto an external physical machine, then to an external HDD for offsite fallback. I am well covered on that side.

Now the question is: Are snapshots reliable enough for regular backup tasks? Having the disks in a "crash consistent state" is an acceptable risk, and 1 or 2 machines unusable after a disaster is not such a big deal; most things are doubled anyway (2 DB servers, 2 web servers, 2 mail servers, 2 domain controllers). I just want to be sure I don't break my stuff while in the process of taking my backups.

Here is the way I see it:
- List running VMs on a host, send result to a file;
- Take a snapshot of all the VMs present in the file;
- List all the VMDKs or VDIs of the running VMs and send files paths to a file;
- Copy all the listed VMDKs and VDIs to my backup server;
- Merge all the VMs snapshots for continued operation.
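A minimal sketch of the two listing steps above, in Python. It only parses text; it assumes that `VBoxManage list runningvms` prints one `"name" {uuid}` pair per line and that `VBoxManage showvminfo <vm> --machinereadable` prints attached images as `key="value"` pairs. The function names are hypothetical, not part of VirtualBox.

```python
import re

# One line of `VBoxManage list runningvms` is assumed to look like: "web01" {uuid}
VM_LINE = re.compile(r'^"(?P<name>.*)" \{(?P<uuid>[0-9a-fA-F-]+)\}$')

# In `VBoxManage showvminfo <vm> --machinereadable`, attached images are assumed
# to appear as key="value" pairs; keep only values ending in .vdi or .vmdk.
DISK_LINE = re.compile(r'^"?[^"=]+"?="(?P<path>.+\.(?:vdi|vmdk))"$', re.IGNORECASE)


def parse_running_vms(output):
    """Return (name, uuid) pairs found in `list runningvms` output."""
    vms = []
    for line in output.splitlines():
        match = VM_LINE.match(line.strip())
        if match:
            vms.append((match.group("name"), match.group("uuid")))
    return vms


def parse_disk_paths(output):
    """Return .vdi/.vmdk paths found in machine-readable showvminfo output."""
    paths = []
    for line in output.splitlines():
        match = DISK_LINE.match(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths
```

The text to parse could come from `subprocess.check_output(["VBoxManage", "list", "runningvms"], universal_newlines=True)`. Once a snapshot exists, the path reported for an attachment may be the differencing image rather than the base disk, so it is probably safer to collect the disk paths before taking the snapshot.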

I essentially want to make sure I do not lose a server on the way. Are snapshots reliable enough for this kind of production use? Could I end up with a vboxmanage snapshot <VMNAME> delete <SNAPSHOTNAME> crashing, leaving my VM in an inconsistent state and forcing me to rebuild the VM from scratch after a simple backup?

I have seen a few posts from mpack that date "back then" in 2011, but nothing recent about the readiness of snapshots for production use.

Thanks in advance for your enlightenment!

J-F Courteau
mpack
Site Moderator
Posts: 39134
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Mostly XP

Re: Snapshots reliability

Post by mpack »

jeffcourteau wrote: Now the question is: Are snapshots reliable enough for regular backup tasks?
I'm afraid the question is misplaced, as snapshots have nothing to do with making backups. Snapshots are like branches in a river, they are not a second river. A backup is, by definition, when you have more than one of something - and ideally not all held in the same place.
Perryg
Site Moderator
Posts: 34369
Joined: 6. Sep 2008, 22:55
Primary OS: Linux other
VBox Version: OSE self-compiled
Guest OSses: *NIX

Re: Snapshots reliability

Post by Perryg »

My theory when using mission-critical guests, especially in a production environment, is a 3-step philosophy:
  1) never use snapshots.
  2) if you think you need snapshots go to step one.
  3) if you really think it will be a benefit to use snapshots go to step one.
While you might save a few bits using them, they will never replace the real thing. Most get a false sense of security when they use them, but you can search here and the rest of the Internet and see the real horrors of what happens when you rely on something that resembles a house of cards in a wind storm.
jeffcourteau

Re: Snapshots reliability

Post by jeffcourteau »

mpack wrote:
jeffcourteau wrote: Now the question is: Are snapshots reliable enough for regular backup tasks?
I'm afraid the question is misplaced, as snapshots have nothing to do with making backups. Snapshots are like branches in a river, they are not a second river. A backup is, by definition, when you have more than one of something - and ideally not all held in the same place.
Any "in-use-file-backup" technology uses some kind of snapshot for backup tasks. VSS (Volume Shadow Copy Service) is an example. Using snapshots as a versioning system is one use case, but backup should be another one. I never use snapshots for versioning, and I never stack snapshots. I only want to use them for the duration of the backup and, once it has completed, commit the snapshot (the vboxmanage snapshot [vmname] delete [snapshotname] function).

All I want to know is: is it reliable enough for regular, production use? If I use a single snapshot only for a few minutes and then commit it to continue operation without a snapshot, is there any chance of corruption?
mpack

Re: Snapshots reliability

Post by mpack »

jeffcourteau wrote: Any "in-use-file-backup" technology uses some kind of snapshot for backup tasks.
Only as a peripheral step. The VSS step is immediately followed by a physical copy to secondary storage. The latter is the backup step, the former is just a synchronization tool: a disk flush.

As to VirtualBox snapshot reliability, I think it's best if you just Googled for "snapshot problem site:forums.virtualbox.org".
Perryg

Re: Snapshots reliability

Post by Perryg »

Ask yourself: what are you going to do with the snapshot you backed up? It can't be associated with the base once you remove/delete the original snapshot, and it is totally useless on its own.

The only purpose I have seen that works as it should is to take a snapshot before you do something like an update, in case there is a failure, which allows you to fall back to the working version quickly. After a positive result one should always merge/delete the snapshot to bring the guest back to a stable environment.

In any case, you always have the right to do what you want, but you asked a question, and one that has plagued a lot of users. The answer is that they are not reliable for backups.

One other thing I have noticed, even when using the snapshot for the purpose I indicated: if the host is overly burdened when merging/deleting the snapshot, corruption can occur due to heavy I/O and/or other reasons. Nothing I know of short of a full backup is adequate for mission-critical machines.
socratis
Site Moderator
Posts: 27329
Joined: 22. Oct 2010, 11:03
Primary OS: Mac OS X other
VBox Version: PUEL
Guest OSses: Win(*>98), Linux*, OSX>10.5
Location: Greece

Re: Snapshots reliability

Post by socratis »

jeffcourteau wrote:Here is the way I see it:
- List running VMs on a host, send result to a file;
- Take a snapshot of all the VMs present in the file; (that's of the live, breathing VMs)
- List all the VMDKs or VDIs of the running VMs and send files paths to a file;
- Copy all the listed VMDKs and VDIs to my backup server;
- Merge all the VMs snapshots for continued operation. (something that may have changed since the time you took the snapshot)
You're talking about taking a live snapshot, which involves more than the changed sectors in the hard drive. It also involves the contents of the memory at the time that you took the snapshot, a ".sav" file.

On top of that, you're talking about copying a live system. Not good either.

I suggest you re-examine your backup strategy, especially if you don't want to shut down the VMs. There's a huge difference between taking a live backup and an off-line backup, even on real systems.

The question is: why involve snapshots at all? Shut down the VM, copy the whole thing, reboot the VM. It's called "scheduled maintenance".
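A rough sketch of that shut down, copy, restart sequence in Python. It only builds the command lines and waits for the VM to stop. The `acpipowerbutton` request relies on the guest honoring ACPI shutdown, and the helper names, the `cp -a` copy step and the timeout values are illustrative assumptions.

```python
import time


def cold_backup_commands(vm, vm_dir, backup_dir):
    """Build the three host commands for an off-line backup, in order."""
    return [
        # Ask the guest to shut down cleanly (the guest must react to ACPI).
        ["VBoxManage", "controlvm", vm, "acpipowerbutton"],
        # Copy the whole VM folder; run this only once the VM has stopped.
        ["cp", "-a", vm_dir, backup_dir],
        # Bring the VM back up without a GUI window.
        ["VBoxManage", "startvm", vm, "--type", "headless"],
    ]


def wait_until_stopped(vm, get_running_names, timeout_s=300, poll_s=5, sleep=time.sleep):
    """Poll until `vm` is no longer reported as running.

    `get_running_names` is any callable returning the names of running VMs,
    for example one that parses `VBoxManage list runningvms`.
    Returns True if the VM stopped, False if the timeout was reached.
    """
    waited = 0
    while vm in get_running_names():
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s
    return True
```

The wait sits between the first and second command: the copy should start only after `wait_until_stopped` returns True, and a False result is a cue to investigate rather than to copy anyway.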
jeffcourteau

Re: Snapshots reliability

Post by jeffcourteau »

Perryg wrote:Ask yourself, what are you going to do with the snapshot you backed up. It can't be associated with the base once you remove/delete the original snapshot and is totally useless on its own.
I do not want to back up the snapshot file; I want to back up the VDI that is unlocked once you have a snapshot running. That is the whole point of the thing.
- Start snapshot
- backup the VDI file
- commit snapshot

That way I would have a "point in time" backup of the VM. But the snapshot function doesn't seem to be reliable enough for any production use. So I'll simply forget about this.
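For reference, the three steps could be sketched in Python as below. The snapshot name, the plain `cp` copy and the injectable `run` parameter are illustrative assumptions. The `try`/`finally` only ensures the delete is attempted even if a copy fails; it does nothing to make the merge on a running VM any safer, which is the concern raised in this thread.

```python
import subprocess


def backup_with_temporary_snapshot(vm, disk_paths, dest_dir,
                                   snapshot_name="backup-tmp",
                                   run=subprocess.check_call):
    """Take a snapshot, copy the base disk images, then delete (merge) the snapshot.

    `run` executes one command given as an argument list; it is a parameter
    so the sequence can be exercised without VirtualBox installed.
    """
    # Start snapshot: from here on, guest writes go to a differencing image.
    run(["VBoxManage", "snapshot", vm, "take", snapshot_name, "--live"])
    try:
        # Back up the base images, which the VM no longer writes to.
        for path in disk_paths:
            run(["cp", path, dest_dir])
    finally:
        # Commit snapshot: merge the differencing image back into the base.
        run(["VBoxManage", "snapshot", vm, "delete", snapshot_name])
```

The disk paths passed in should be the base images as they were before the snapshot was taken, not the differencing image created by it.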
mpack

Re: Snapshots reliability

Post by mpack »

It's the step where you merge the current snapshot into a running VM that I would be most concerned about.
socratis wrote: The question is: why involve snapshots at all? Shut down the VM, copy the whole thing, reboot the VM. It's called "scheduled maintenance".
That would be my preference too. I suspect it would also cause the least disruption - everything else drags out the process.

But if the user is adamant about live backup, then IMHO it would be better to run a backup process inside the guest.
jeffcourteau

Re: Snapshots reliability

Post by jeffcourteau »

socratis wrote: The question is: why involve snapshots at all? Shut down the VM, copy the whole thing, reboot the VM. It's called "scheduled maintenance".
When you want to run 24/7, scheduled maintenance can be a lot more hassle. I can't afford even 1 minute of downtime: I run a random number generator used by universities, gambling sites and other businesses that need random numbers 24/7/365. I have a way to switch my DB servers without downtime, but it takes about an hour of manual work switching my master to a secondary and the secondary to a master. So using snapshots to free my VDIs while I back them up would have been a great option, if snapshots were reliable enough.

Maybe a "storage snapshot" would be a great addition to VBox. I don't need anything that's in memory, all I need is what's in the VDI.
jeffcourteau

Re: Snapshots reliability

Post by jeffcourteau »

Thanks for your help, guys. I know snapshots are a hot topic when it comes to having problems. I am simply not going to use them.

Have a good one!