Suggested "Best Practice": Backup VMs in a productive environment

Emma2
Posts: 51
Joined: 16. Feb 2021, 11:59

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by Emma2 »

scottgus1 wrote:may boot fine but have damaged data. This last case is the most insidious, and hardest to detect.
Ok, this sounds good... or rather bad if it happens :shock:
So, I am getting more and more convinced that I really have to shut down my VMs before "saving" them. (But actually, this is why I asked for this discussion: I want to understand it - which I could not from the older threads on that topic. So, thank you all for all your valuable contributions!)

Given the need to shut down the VMs and then "save" them: are there any objections to the following?
- take a snapshot (of the powered off machine)
- then "saving" it using rsync
I actually hope that this will save some copy time, because the "older" vdi files don't have to be copied again and again, only the newer differencing ones. Does this sound reasonable?
Emma2
Posts: 51
Joined: 16. Feb 2021, 11:59

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by Emma2 »

scottgus1 wrote:If you want to discuss this, please make a new topic, as forum rules require one subject per topic.
Wilco. (I only thought this belonged together, and I didn't want to flood the forum with new threads.)
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by scottgus1 »

Emma2 wrote:- take a snapshot (of the powered off machine)
This sounds promising, in the following usage:
1. shut down the VM
2. take a snapshot
3. restart the VM
4. copy the base disk
5. after confirming the copy, delete the snapshot while the VM is running.

Steps 1-3 would be fast, and the VM would be running on the snapshot disk while step 4 happens. All of the VM's clean-up and database closing needs would be handled by step 1. Step 5 merges the snapshot disk data back into the base disk, then the VM goes back to having no snapshots.

I just tested the process and found that it works. But it should be thoroughly tested on your end.

You will not want to have a chain of snapshots in your VMs. Snapshots make a VM more delicate: an accidental file delete can easily corrupt the chain, and stale data superseded by newer snapshots piles up and fills the host disk. The safest VM has no snapshots during normal day-to-day operation.

You would not rsync or file-copy the snapshot, only the base disk. The snapshot is a temporary setup that will enable continued running of the VM while the base disk is being copied and confirmed, then the snapshot disappears.

I have rsynced VM files. My rsyncs did the 60GB VMs fine but choked on the 280GB VM. I had to 7zip the 280GB into 50GB uncompressed chunks, rsync the chunks, then reassemble them at the offsite location. FWIW straight file copy will probably be faster than rsync if copying over gigabit ethernet or faster.
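
The chunk-and-reassemble idea does not strictly need 7zip. Below is a minimal sketch using the standard `split` and `cat` tools instead. This is a swapped-in technique, not what was actually run above, and the function names, chunk size, and paths are made up for illustration:

```shell
#!/bin/sh
# Sketch only: cut a large disk image into fixed-size chunks for transfer,
# then reassemble it at the destination. Function names are hypothetical.

# split_image FILE CHUNK_SIZE PREFIX
# CHUNK_SIZE is passed to 'split -b' (bytes; GNU split also accepts e.g. 50G).
split_image() {
    split -b "$2" "$1" "$3"
}

# join_image PREFIX OUTPUT
# split names its chunks PREFIXaa, PREFIXab, ... so the glob sorts them correctly.
join_image() {
    cat "$1"* > "$2"
}

# Example (not executed here):
#   split_image /vms/big/big.vdi 50G /staging/big.vdi.part.
#   ...transfer the chunks...
#   join_image /offsite/big.vdi.part. /offsite/big.vdi
```

After reassembly, compare a checksum of the rebuilt file against the original before trusting it.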
Emma2
Posts: 51
Joined: 16. Feb 2021, 11:59

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by Emma2 »

Great, this is starting to sound like an algorithm...

In accordance with what you wrote, I now suggest the following (with one additional step):

1. shut down the VM - to have it in a "safe state"
2. take a snapshot - to have an "immutable" virtual disk
3. copy the VM directory (except for the virtual disk files) - to backup that "safe shut down state"
4. restart the VM - to avoid a long downtime
5. copy all virtual disk files (except for the "running disk") - to actually copy all disk files belonging to the stopped machine
6. delete the live VM's helper snapshot - to avoid unnecessary snapshots there

Can we agree that this will create a safe backup as well as prevent unnecessary downtime? Or did I still miss something?

If this is it, I would still like some help in the actual implementation (and I assume this should be here, not in a new thread?):
(I know the following questions are more shell script related than VBox related, but I'd like to keep our solution in one place, just in case anyone else is looking for something similar.)

ad 1. if I order the VM to shut down by vboxmanage controlvm <vm> acpipowerbutton, how can I wait until it actually has shut down, maybe including a timeout for failure?
ad 2. done by: vboxmanage snapshot <vm> take backupSnapshot
ad 3. how can I "copy a whole directory except for vdi (or similar) files"?
ad 4. done by: vboxmanage startvm <vm> --type headless
ad 5. how can I find the "running disk" name, and how can I copy all vdi (or similar) except for the latter?
ad 6. done by: vboxmanage snapshot <vm> delete backupSnapshot

So, if we can solve those three remaining questions, I will be "happy & safe" ;-)
Martin
Volunteer
Posts: 2560
Joined: 30. May 2007, 18:05
Primary OS: Fedora other
VBox Version: PUEL
Guest OSses: XP, Win7, Win10, Linux, OS/2

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by Martin »

Exchange step 2 and 3. Taking a snapshot modifies the guest configuration file (.vbox) to register the snapshot.
Emma2
Posts: 51
Joined: 16. Feb 2021, 11:59

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by Emma2 »

Martin wrote:Taking a snapshot modifies the guest configuration file (.vbox) to register the snapshot.
I know, but if I "Exchange step 2 and 3.", is it "guaranteed" that the virtual disks I copy after taking the snapshot fit the VM configuration copied before it? I actually thought it would be more secure to have a "defined state" copied (i.e. all files from the same situation) - knowing well that I could or should delete the unnecessary snapshot after restoring the copied VM.
But you are right, if it is "safe" to exchange 2 and 3, this would spare deleting the snapshot in the backup - but is it safe?
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by scottgus1 »

Martin is correct. You want the no-snapshot version of the VM backed up.

The snapshot is a workaround method to allow copying the VM while the VM is running. The snapshot will be deleted thereafter. You would not want the backed-up configuration file expecting a snapshot to be there, then have no snapshot available because it gets deleted.

So the process would be:
Emma2 wrote:1. shut down the VM - to have it in a "safe state"
2. copy the VM directory (except for the virtual disk files) - to backup that "safe shut down state"
3. take a snapshot - to have a copyable (rather than "immutable") base virtual disk
4. restart the VM - to avoid a long downtime
5. copy all virtual disk files (except for the "running disk") - to actually copy all disk files belonging to the stopped machine
6. delete the live VM's helper snapshot - to avoid unnecessary snapshots there
Note the change about "immutable". It may just be terminology, but Virtualbox actually has a way to set a disk as "immutable". I don't know if you thought this was necessary, but you do not want to set the VM's disk as "immutable". Leave the disk as is.
Emma2 wrote:exchange 2 and 3, this would spare deleting the snapshot
No, deleting the snapshot is still necessary. The snapshot is what allows the base disk to be copied while the VM is running. Then the changed data in the snapshot disk needs to be merged back to the base disk after the backup is complete. Deleting the snapshot merges the data back in.
Emma2 wrote:[will this] create a safe backup
I have not used this method before, but it seems solid. As an alternative, run 3rd-party disk imaging software inside the VM and have it save disk images to shared folders on the network, so that you have a fallback if there is a glitch in the snapshot method above.
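
To make the ordering concrete, here is a rough shell sketch of the six steps. It is only a sketch under assumptions: the function name, snapshot name, and paths are placeholders, it assumes the .vbox file and all base .vdi disks sit directly in the VM folder, and it polls `showvminfo --machinereadable` for `VMState="poweroff"` rather than parsing the human-readable output. Test it on a throwaway VM first.

```shell
#!/bin/sh
# Sketch of the shutdown / copy config / snapshot / restart / copy disks / merge sequence.
# All names below are placeholders, not part of VirtualBox itself.
SNAP_NAME=backupSnapshot

# backup_vm VM_NAME VM_DIR DEST_DIR
backup_vm() {
    vm=$1; vm_dir=$2; dest=$3

    # 1. Ask the guest to shut down, then wait (up to 120 x 5 s) until it is off.
    vboxmanage controlvm "$vm" acpipowerbutton
    tries=0
    until vboxmanage showvminfo "$vm" --machinereadable | grep -q '^VMState="poweroff"'; do
        tries=$((tries + 1))
        [ "$tries" -le 120 ] || { echo "shutdown timed out, giving up" >&2; return 1; }
        sleep 5
    done

    # 2. Copy the configuration while it still describes a snapshot-free VM.
    mkdir -p "$dest" && cp -p "$vm_dir"/*.vbox* "$dest"/ || return 1

    # 3. Take the helper snapshot so the base disk stops changing.
    vboxmanage snapshot "$vm" take "$SNAP_NAME" || return 1

    # 4. Restart the VM; it now writes to the snapshot's differencing disk.
    vboxmanage startvm "$vm" --type headless || return 1

    # 5. Copy the base disk(s) only; the glob does not descend into Snapshots/.
    cp -p "$vm_dir"/*.vdi "$dest"/
    copy_status=$?

    # 6. Delete the helper snapshot (merging it back) even if the copy failed.
    vboxmanage snapshot "$vm" delete "$SNAP_NAME" || return 1

    return "$copy_status"
}

# Example (not executed here): backup_vm "vm name" "/vms/vm name" "/backup/vm name"
```

Unlike a full script, this sketch simply gives up if the guest never powers off; whether to force a poweroff at that point is a separate decision.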
Emma2
Posts: 51
Joined: 16. Feb 2021, 11:59

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by Emma2 »

Thank you, Scott.

- Point taken with the snapshot.
- The word "immutable" was my fault, I meant "a virtual disk which will not be changed by the running VM"
- "spare deleting the snapshot" actually meant "in the backed-up VM"; in the running VM it has to be deleted, got that

So, I am down to learning how to script my underlined tasks (in your new sequence steps 1, 2, and 5). Any suggestions?
(Otherwise, I will continue with my brand-new book on shell scripting... which I will do anyway ;-) )
arQon
Posts: 228
Joined: 1. Jan 2017, 09:16
Primary OS: MS Windows 7
VBox Version: PUEL
Guest OSses: Ubuntu 16.04 x64, W7

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by arQon »

> 4. Copy the VM to its backup destination: cp -vr <vm directory> <backup directory>

The later suggestion of rsync is clearly a much better option than this, though there are caveats re rsync's block-differencing that you probably need to read up on first.

If that doesn't work out well for you, the #2 choice is probably not a "mindless" cp, but the (very useful) clonevdi instead.
As with any backup approach, you first need to determine what's most important to you re speed/space tradeoffs, a lot of which comes down to what drives you're using, but when I moved my backup TARGET from SSD to spinning rust several years ago I found that using clonevdi with --compact was generally only trivially slower than just copying the whole image, and sometimes actually faster, simply because of the (generally large) reduction in blocks written. Even with many TB of storage available, anything that halves the output filesize for basically no additional time cost is a clear win.

Compressing the backups (7z -mx=1) is generally highly effective, but too time-intensive to be worth it unless you're absolutely desperate for space. Obviously it IS generally worth it if you want to move the backups offsite via WAN, or copy them to a USB stick, etc.
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by scottgus1 »

arQon wrote:The later suggestion of rsync
I have rsynced disk files to offsite locations, and the concept works well. But for speed's sake, to keep the temporary snapshot file as small as possible, the fastest copy method should probably be used. My experience is that straight file copy over gigabit network is faster than rsync. In fact, straight file copy to another drive in the host would be fastest, then an 'fc' to confirm the on-host copy, and an SHA256-like hash on the original for checking integrity on the offsite copies. Then delete the temporary snapshot. The on-host fc'd copy can then be copied over ethernet on-site or rsynced off-site; fc the on-site backups against the on-host backup, and use the hash to confirm off-site backups. While CloneVDI might give less data copied, the backup file would be different from the original and therefore not confirmable.
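
On a Linux host the same confirm-then-hash idea could look roughly like this. It is a sketch, assuming `cmp` as the counterpart of 'fc' and the coreutils `sha256sum` tool for the hash; the function names and the `.sha256` sidecar file are my own illustration:

```shell
#!/bin/sh
# Sketch only: confirm a local copy byte for byte, and keep a SHA-256
# checksum beside the file so remote copies can be checked later.

# verify_copy ORIGINAL COPY: succeeds only if both files are identical.
verify_copy() {
    cmp -s "$1" "$2"
}

# write_hash FILE: store the checksum in FILE.sha256, using a relative name
# so the checksum file stays valid when both are moved to another folder.
write_hash() {
    ( cd "$(dirname "$1")" && sha256sum "$(basename "$1")" > "$(basename "$1").sha256" )
}

# check_hash FILE: verify FILE against the FILE.sha256 stored next to it.
check_hash() {
    ( cd "$(dirname "$1")" && sha256sum -c "$(basename "$1").sha256" > /dev/null 2>&1 )
}

# Example (not executed here):
#   verify_copy /vms/vm/base.vdi /backup/vm/base.vdi && write_hash /backup/vm/base.vdi
#   ...send base.vdi and base.vdi.sha256 offsite, then there: check_hash /offsite/vm/base.vdi
```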
Emma2 wrote:how can I wait until it actually has shut down, maybe including a timeout for failure?
After 'acpipowerbutton', go into a time-delayed loop ('ping 127.0.0.1 /n 5' is an approximate 5 second delay), redirect the output of 'vboxmanage showvminfo "vm name" ' to a temporary disk file, then 'find' in that file the text 'State: powered off'. Once that text is found, come out of the loop and continue the backup process. Also increment a counter each time through the loop, and if the counter goes over a predetermined amount (# seconds delay * # counts = total time to allow for shutdown) then assume that the VM has seized and issue 'vboxmanage controlvm <vm> poweroff'. Be sure the VM OS reliably responds to 'acpipowerbutton' before assuming 'poweroff' is a decent fallback for a seize-up.
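
For a Linux host, the same loop might be sketched in shell as follows. The function names, the default timeout, and the poll interval are illustrative assumptions, and the sketch checks the `VMState="poweroff"` line from `showvminfo --machinereadable` instead of searching a temporary file:

```shell
#!/bin/sh
# Sketch only: wait for a VM to power off, forcing it off after a timeout.

# vm_is_off NAME: succeeds if VirtualBox reports the VM as powered off.
vm_is_off() {
    vboxmanage showvminfo "$1" --machinereadable | grep -q '^VMState="poweroff"'
}

# shutdown_vm NAME [TIMEOUT_SECONDS] [POLL_SECONDS]
# Returns 0 after a clean ACPI shutdown, 1 if a hard poweroff was needed.
shutdown_vm() {
    vm=$1
    timeout=${2:-600}
    interval=${3:-5}
    waited=0

    vboxmanage controlvm "$vm" acpipowerbutton
    while ! vm_is_off "$vm"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "VM '$vm' still running after ${timeout}s, forcing poweroff" >&2
            vboxmanage controlvm "$vm" poweroff
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Example (not executed here):
#   shutdown_vm "vm name" 600 5 || echo "forced poweroff: check the guest before trusting this backup"
```

Returning a different status for the forced case lets the calling script decide whether a backup taken after a hard poweroff is worth keeping.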
Emma2 wrote:how can I "copy a whole directory except for vdi (or similar) files"?
how can I find the "running disk" name, and how can I copy all vdi (or similar) except for the latter?
Technically, all you need is the .vbox file and the base disk file(s). The logs aren't important if the VM is steady. The Snapshots folder will be empty under normal running, and you won't copy the snapshot during the backup. (The 'Snapshots' folder will also contain saved states: don't use 'save-state' except when the host goes to battery backup, and don't back up a save-stated VM; restart it and shut it down normally first.) I would copy the .vbox-prev file too: it is the previous version of the .vbox file, kept when you make changes to a VM, and it can come in handy if the .vbox file is damaged. It is especially critical to keep the .vbox file safe if you dabble in Virtualbox disk encryption. 'Copy *.vbox*' and 'copy *.vdi' should get the files needed, if the .vbox and all base disks are in the same folder. If you keep a second VM base disk outside the VM folder, you'll either have to manually add a specific copy command for that disk or write a parser script to read the .vbox file for the disks the VM uses. The simpler the VM layout, the easier the backup script will be.
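
In shell terms, the two copy commands might be wrapped like this. It is a minimal sketch, assuming the .vbox file and every base .vdi disk sit directly in the VM folder; the function names and paths are placeholders:

```shell
#!/bin/sh
# Sketch only: copy the VM configuration and the base disks separately,
# so the disks can be copied later while the VM runs on its helper snapshot.

# copy_vm_config VM_DIR DEST_DIR: copies *.vbox and *.vbox-prev.
copy_vm_config() {
    mkdir -p "$2" && cp -p "$1"/*.vbox* "$2"/
}

# copy_base_disks VM_DIR DEST_DIR: copies top-level *.vdi files only.
# The glob does not descend into Snapshots/, so differencing disks are skipped.
copy_base_disks() {
    mkdir -p "$2" && cp -p "$1"/*.vdi "$2"/
}

# Example (not executed here):
#   copy_vm_config  "/vms/vm name" "/backup/vm name"
#   copy_base_disks "/vms/vm name" "/backup/vm name"
```

Because only the top level of the VM folder is matched, the Logs and Snapshots folders are left out without needing an explicit exclude list.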
Emma2
Posts: 51
Joined: 16. Feb 2021, 11:59

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by Emma2 »

I would rather go for a "normal" copy, too, i.e. cp or rsync.
arQon wrote:Compressing the backups (7z -mx=1) is generally highly effective, but too time-intensive to be worth it unless you're absolutely desperate for space. Obviously it IS generally worth it if you want to move the backups offsite via WAN, or copy them to a USB stick, etc.
This is a good point, not for my home base, but for backing up my remote server.
Emma2
Posts: 51
Joined: 16. Feb 2021, 11:59

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by Emma2 »

scottgus1 wrote:After 'acpipowerbutton', go into a time-delayed loop ('ping 127.0.0.1 /n 5' is an approximate 5 second delay), redirect the output of 'vboxmanage showvminfo "vm name" ' to a temporary disk file, then 'find' in that file the text 'State: powered off'. Once that text is found, come out of the loop and continue the backup process. Also increment a counter each time through the loop, and if the counter goes over a predetermined amount (# seconds delay * # counts = total time to allow for shutdown) then assume that the VM has seized and issue 'vboxmanage controlvm <vm> poweroff'. Be sure the VM OS reliably responds to 'acpipowerbutton' before assuming 'poweroff' is a decent fallback for a seize-up.
Thank you, your ping thing sounds like a good idea (although, as a bash newbie, I thought there might be some built-in "timeout function").
But... do I explicitly have to send vboxmanage controlvm <vm> poweroff - even if the guest OS does shutdown -P (Linux) or shutdown -s (Windows)?
scottgus1 wrote:Technically, all you need is the .vbox file and the base disk file(s). (...) The Snapshots folder will be empty under normal running, and you won't copy the snapshot during the backup.
This is true for my "normal" production (server) machines, but I do have some VMs in which I actually need snapshots. I have several "install test" machines simulating different update states, to make sure an update of my software works on "all" older versions. OK, these machines are not really important, and I could re-create them more or less easily, but that would take some time which I would like to save. I am quite sure that this can be solved by clever bash scripting. I admit, however, that this is probably beyond the scope of this forum, so I will not push further in that direction.
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by scottgus1 »

Emma2 wrote:do I explicitly have to send vboxmanage controlvm <vm> poweroff
'controlvm <vm> poweroff' is only needed if the timer count expires while waiting for the VM to report "State: powered off" from 'showvminfo'.

Here's a Windows batch file for the process I laid out:


rem 5 second delay per loop
rem 10 minutes for VM to shut down = 600 seconds / 5 = 120 loops
set VMseizedtime=120
set /a statecount=0
vboxmanage controlvm "vm name" acpipowerbutton
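:loop
rem wait roughly 5 seconds, then count this pass
ping 127.0.0.1 /n 5 > nul
set /a statecount += 1 > nul
vboxmanage showvminfo "vm name" > status.txt
rem the spacing after "State:" must match the showvminfo output exactly
find /i /n "State:           powered off" status.txt > nul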
if "%Errorlevel%"=="0" goto shutdowngood
if %statecount% GEQ %VMseizedtime% goto VMstuck
goto loop

:VMstuck
vboxmanage controlvm "vm name" poweroff
rem delay a bit for Virtualbox to settle
ping 127.0.0.1 /n 20 > nul

:shutdowngood
rem continue backup process
Emma2 wrote:I do have some VMs in which I actually do need snapshots.
The backup process for these VMs will be more complicated. You'd need to determine what files are in the VM folder before the snapshot is taken, then copy only those files. In Windows I'd try:

dir drive:\path\to\VM\Folder\Snapshots /b > dirlist.txt

Then parse dirlist.txt for the snapshot disks that exist before the backup snapshot is taken.
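
On a Linux host, the same bookkeeping could be sketched as below: record what is in the Snapshots folder before the backup snapshot is taken, then copy only the recorded files afterwards. The function names and the list file are hypothetical, and the sketch assumes file names without newlines (VirtualBox names snapshot disks by UUID):

```shell
#!/bin/sh
# Sketch only: back up the snapshot files that existed before the
# temporary backup snapshot was created, and nothing newer.

# list_snapshot_files VM_DIR: print the file names in VM_DIR/Snapshots, one per line.
list_snapshot_files() {
    if [ -d "$1/Snapshots" ]; then
        ls -1 "$1/Snapshots"
    fi
}

# copy_listed_snapshots VM_DIR LIST_FILE DEST_DIR: copy only the listed files.
copy_listed_snapshots() {
    mkdir -p "$3/Snapshots" || return 1
    while IFS= read -r name; do
        cp -p "$1/Snapshots/$name" "$3/Snapshots/" || return 1
    done < "$2"
}

# Example (not executed here):
#   list_snapshot_files "/vms/testvm" > /tmp/before.txt   # before 'snapshot take'
#   ...take the backup snapshot, restart the VM...
#   copy_listed_snapshots "/vms/testvm" /tmp/before.txt "/backup/testvm"
```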
arQon
Posts: 228
Joined: 1. Jan 2017, 09:16
Primary OS: MS Windows 7
VBox Version: PUEL
Guest OSses: Ubuntu 16.04 x64, W7

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by arQon »

scott's point about a compacted VDI not being trivially verifiable is a good enough one to withdraw that suggestion.

> You'd need to determine what files are in the VM folder before the snapshot is taken, then copy only those files.

This is where rsync DOES shine. Bookkeeping is a separate issue, but if the VM is running then those snapshots are all fair game to be copied BEFORE shutting down the VM, which could drastically shorten the amount of time it spends offline.
Emma2
Posts: 51
Joined: 16. Feb 2021, 11:59

Re: Suggested "Best Practice": Backup VMs in a productive environment

Post by Emma2 »

Yes. I now have collected all sorts of advice, so I will try to assemble it into an algorithm and will come back once finished.