Backups of only OS used space, not entire VDI?

kdiamond
Posts: 18
Joined: 21. Mar 2021, 01:33

Post by kdiamond »

Hi.

I understand how dynamic and fixed disk storage work. Eventually, both will fill up to the maximum size, even if the OS is actually using only 20% of it. The size never shrinks; it only grows.

An example: for a 20GB OS, I need to back up a 500GB VDI. :( Not so nice if you have 20 VMs.

So the question is: can I back up only the space the OS actually uses (20% of the VDI), without the disk sectors left behind by deleted files (80% of the VDI)? I'm guessing the host (VirtualBox) cannot do that (or maybe it could by using Guest Additions?). But some backup software running inside the guest OS might be able to do it, as it would on a physical machine. Networking PC to VDI or similar?

Does anyone have a suggestion in this direction?

Thank you

Br,
Dali
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: Backups of only OS used space, not entire VDI?

Post by scottgus1 »

kdiamond wrote:Can I do backups of only OS space used (20% of VDI)
You can compact the VDI, and then it will be back to the only-used size. 'vboxmanage modifymedium --compact' does this. Mpack's CloneVDI does this too, with the added advantage that it makes a compacted clone of the VDI and does not alter the original.

The disadvantage of such steps during backups is that you lose the ability to confirm the integrity of the backup via an FC file compare.

For compacting, the VM must be shut down completely, not save-stated, and must have no snapshots.
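
A minimal sketch of the compaction step with the shut-down requirement checked first. It assumes `VBoxManage` is on the PATH; the VM name and disk path passed in are hypothetical placeholders, and the helper functions are illustrative only. It checks only the power state and does not check for snapshots.

```python
import subprocess

def parse_vm_state(showvminfo_output):
    """Extract VMState from `VBoxManage showvminfo <vm> --machinereadable` output."""
    for line in showvminfo_output.splitlines():
        # Match "VMState=" exactly so "VMStateChangeTime=" is not picked up.
        if line.startswith("VMState="):
            return line.split("=", 1)[1].strip().strip('"')
    raise ValueError("VMState not found in showvminfo output")

def compact_command(vdi_path):
    """Build the VBoxManage command line that compacts one disk image in place."""
    return ["VBoxManage", "modifymedium", "disk", vdi_path, "--compact"]

def compact_if_powered_off(vm_name, vdi_path):
    """Compact the disk only if the VM reports the 'poweroff' state."""
    info = subprocess.run(
        ["VBoxManage", "showvminfo", vm_name, "--machinereadable"],
        capture_output=True, text=True, check=True,
    ).stdout
    state = parse_vm_state(info)
    if state != "poweroff":
        # Running, paused and saved VMs are all refused here.
        raise RuntimeError(f"VM {vm_name!r} is in state {state!r}; shut it down first")
    subprocess.run(compact_command(vdi_path), check=True)
```

Because this modifies the disk in place, it is worth having a verified copy of the VM folder before running it.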
kdiamond wrote:some internal OS backup software might be able to do that as it would do it on a physical machine. Networking
This also works. The VM can stay running, and it can be snapshotted. Save the backup image from the 3rd-party software running inside the VM to a real shared folder on the network, on a different physical drive than the one the VDI is on. Don't use VirtualBox Shared Folders for these backups.

Post by kdiamond »

Thank you so much Scottgus1.

You gave me really valuable info that I was missing.

Now I'm doing daily, weekly, and monthly backups of 12 VMs. Imagine how much backup space that takes as the VMs keep expanding. I didn't know about vboxmanage modifymedium compact. That should solve everything. It helps with backups, and from time to time I can manually delete snapshots and compact the VMs too.

I do backups like this.

1. Make a snapshot
2. Clone VM using that snapshot
3. Rar with filename timestamp (without compression) to backup drive (8TB HDD)
4. Delete Clone VM

I guess adding vboxmanage modifymedium --compact after step 2 would give me only-used-size backups.

I was looking at CloneVDI forum. A lot of activity going on there!!! Are there any other benefits of using this 3rd party tool over built-in VBox CLI commands? In general, I would avoid 3rd party tools unless those provide some additional value.

Thank you
Br,
Dali
mpack
Site Moderator
Posts: 39156
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Mostly XP

Post by mpack »

kdiamond wrote:I didn't know about vboxmanage modifymedium compact. That should solve everything.
You might be disappointed there. Compaction means to release any VDI blocks that the guest OS filesystem no longer needs. But, the gotcha is that VirtualBox doesn't understand guest filesystems, so it has no idea what blocks are needed. So you have to run a third party guest-specific tool inside the guest to zero fill all unused areas of the guest filesystem (that it can reach, i.e. it can't reach unpartitioned areas). This can be a long slow process, and it makes the VDI jump to max size, so you better have plenty of space available on the host.

The main difference between CloneVDI and VBoxManage is two-fold: (1) that the former does understand guest filesystems so doesn't need the semi effective zero fill kludge. (2) CloneVDI never modifies a disk "in place", it always makes a clone (optionally preserving the UUID) then modifies the clone. That's because in my mind the risk involved in performing radical surgery on your only copy of any data file is unacceptable. People say "well make a backup"... but then you are doing exactly what CloneVDI does, except with more work.

Bottom line, you can't do it without using third party tools, and CloneVDI makes lighter work of the whole thing. Plus CloneVDI is a GUI app.
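
To illustrate why the zero fill matters: a tool that does not understand the guest filesystem can only treat a block as unused if every byte in it is zero. The sketch below counts such blocks in a raw byte stream. The 1 MiB default block size and the function name are assumptions for illustration, and a real VDI has a header and block map that this ignores.

```python
import io

BLOCK_SIZE = 1024 * 1024  # assumed block size (1 MiB), for illustration only

def count_blocks(stream, block_size=BLOCK_SIZE):
    """Return (total_blocks, zero_blocks) for a raw byte stream."""
    total = zero = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        total += 1
        # Stale data from deleted files is non-zero, so such a block
        # still looks "in use" until the guest overwrites it with zeros.
        if block.count(0) == len(block):
            zero += 1
    return total, zero

if __name__ == "__main__":
    # Three tiny 4-byte "blocks": live data, leftovers of a deleted file,
    # and free space that has already been zero-filled.
    image = io.BytesIO(b"LIVE" + b"OLD!" + b"\x00" * 4)
    print(count_blocks(image, block_size=4))
```

Only the last of the three blocks in the example counts as releasable, even though the second one holds nothing the guest still needs.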

Post by kdiamond »

Thank you for the explanation mpack. Yes, it makes perfect sense.

Let me play with CloneVDI and will post in that forum with questions I might have.

Thank you

Br,
Dali

Post by scottgus1 »

Looks like you're on your way, kdiamond!

One thing I'll point out:
kdiamond wrote:I do backups like this.

1. Make a snapshot
2. Clone VM using that snapshot
3. Rar with filename timestamp (without compression) to backup drive (8TB HDD)
4. Delete Clone VM
Any backup routine that involves making snapshots at least complicates the VM and may leave you with a 'backup' that is the equivalent of pulling the power plug on the VM, with lost data as a result. So not really a restorable no-data-loss backup. Especially so if the VM was running. And the clones make new UUIDs, meaning the data in the backup is not identical to the original, so no integrity confirmation can be had, so no idea if the backup worked.

The best backup is a full folder copy of the VM folder while the VM is fully shut down, not save-stated, followed by a file compare of the copy and the original. To back up while the VM is running, use 3rd-party software to back up the data inside the VM.

Trying to compact the VM data using either method above will break the ability to file-compare the backups for an integrity check. It's up to you if such an integrity check is important. I'd consider I'm not backed up until I've integrity-checked. Disk space is rather cheap nowadays.
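
A rough sketch of the copy-then-compare routine described above, using Python's byte-for-byte `filecmp.cmp(..., shallow=False)` in place of the Windows FC command. The function names and folder paths are hypothetical, and the VM is assumed to be fully shut down before the copy starts.

```python
import filecmp
import shutil
from pathlib import Path

def compare_folders(original, copy):
    """Return relative paths of files that are missing from or differ in the copy."""
    original, copy = Path(original), Path(copy)
    mismatches = []
    for src_file in sorted(p for p in original.rglob("*") if p.is_file()):
        rel = src_file.relative_to(original)
        dst_file = copy / rel
        # shallow=False forces a full content comparison, not just size/mtime.
        if not dst_file.is_file() or not filecmp.cmp(src_file, dst_file, shallow=False):
            mismatches.append(rel.as_posix())
    return mismatches

def backup_and_verify(vm_folder, backup_folder):
    """Copy the whole VM folder, then compare every file against the original."""
    # copytree refuses to overwrite an existing destination folder.
    shutil.copytree(vm_folder, backup_folder)
    return compare_folders(vm_folder, backup_folder)
```

An empty list from `backup_and_verify` means every file in the copy matched the original; anything else names the files to investigate.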

Post by kdiamond »

Hi.

Here are my findings

Linux Ubuntu
Disk size 500GB, used 92GB
[Attachment: disk1.jpg]

VM folder = 170GB
CloneVDI = 151GB (using "Compact drive while copying")
VBox Clone VDI = 153GB
VBox Clone VDI after VBoxManage.exe modifymedium --compact = 153GB (I guess the compaction is already done when cloning the VM)

So I'm not getting even close to the 92GB of used data. Here we see that the host actually doesn't understand guest filesystems, as @mpack said.

So if my thinking is correct, the reasonable way to keep backups from getting too big is to find the right disk size and not allow it to expand beyond what is needed.

1. Limit disk sizes to a reasonable size for the OS purpose. Make it fixed. For example, if I currently use 92GB, then I can set it to a fixed size of 200GB. It will sooner or later hit the maximum, and then the OS will start to overwrite unused sectors. That means that no matter how much deleting, copying, or temporary-file churn there is, the VDI will not expand beyond 200GB, unlike now (500GB dynamic), where it will grow to 500GB even if only 92GB is used. Partitions can easily be expanded with partition tools at any time later on.

2. I can back up a 200GB image; as @scottgus1 said, "Disk space is rather cheap nowadays."

3. I can find a good in-guest backup tool that will make a VDI from the used space and upload it to a network drive. I have Windows and Linux machines. I don't think I would like to complicate things that much.

Br,
Dali

Post by kdiamond »

scottgus1 wrote: Trying to compact the VM data using either method above will break the ability to file-compare the backups for an integrity check. It's up to you if such an integrity check is important. I'd consider I'm not backed up until I've integrity-checked. Disk space is rather cheap nowadays.
I understand what you are trying to point out here. But I can not have servers down for the entire backup process. This can be one hour a day.

what I do now is:
1. Make a snapshot (this halts the VM for a few seconds only, no big deal)
2. Create a clone from this snapshot
3. Delete the snapshot
4. Back up the clone
5. Delete the clone

and yes, clones will start nicely every time. I was under the impression this process above makes a good backup as the snapshots are not changing while doing the cloning.
mpack wrote:Bottom line, you can't do it without using third party tools, and CloneVDI makes lighter work of the whole thing. Plus CloneVDI is a GUI app.
I tried CloneVDI. It works nicely. The backup took about 20 minutes, so I wonder what happens to new data written while the backup is being made. For example, I start the backup and during that time I copy new files onto the VM.

How does CloneVDI handle that? Is it anything similar to my steps above?

Just out of curiosity: I would not like to advertise other products here, but speaking of features, is there any other VM host that addresses backups differently, one that actually understands guest filesystems and has internal agents that can back up only the used space without stopping the VM? I would assume VMware could do that with Windows OS as it's all the same company including their NTFS? But I kind of doubt they would do it for Linux.

Thank you

Br,
Dali

Post by scottgus1 »

kdiamond wrote:Make it fixed.
Fixed isn't necessary and can bite you. Leave the disk dynamic and make it the desired final size. It'll get that big eventually but no bigger. Dynamic can also be made bigger later if the need comes up. Fixed cannot be made bigger later. (Fixed is reasonable, however, if a VHD is being used: the design flaw in VHD that can kill a dynamic drive file doesn't get triggered when the disk is fixed.)

I'm not sure why the VM disk containing 92GB isn't closer to 92GB after compacting though. Maybe there are other partitions besides the one in the screenshotted line?

Post by scottgus1 »

kdiamond wrote:what I do now is:
Yes, already posted:
kdiamond wrote:I do backups like this.
scottgus1 wrote:Any backup routine that involves making snapshots at least complicates the VM and may leave you with a 'backup' that is the equivalent of pulling the power plug on the VM, with lost data as a result.
The VM OS does not know it is being snapshotted and backed up. It keeps writing to its drive. Yet once the snapshot is taken, new data is written to the snapshot's drive and the parent drive stops being written to. So when the backup takes place, the unfinished data in the parent drive is all that's backed up, and unfinished writes aren't included. Therefore the data in the backup is incomplete, databases are dirty, open files aren't volume shadow copied, etc. When restored, the VM behaves exactly as if the power were cut on it for its last run. Not a good backup.

Thus:
scottgus1 wrote:The best backup is a full folder copy of the VM folder while the VM is fully shut down, not save-stated, followed by a file compare of the copy and the original. To back up while the VM is running, use 3rd-party software to back up the data inside the VM.
So if this case is required:
kdiamond wrote:I can not have servers down for the entire backup process.
Then use in-the-VM 3rd-party backup software, do a running-OS backup, and maybe do a full-shutdown backup with integrity check once a week or month.

Snapshot-based backups will cause trouble when restoring, and no one will be able to help put the VM back together again.
fth0
Volunteer
Posts: 5668
Joined: 14. Feb 2019, 03:06
Primary OS: Mac OS X other
VBox Version: PUEL
Guest OSses: Linux, Windows 10, ...
Location: Germany

Post by fth0 »

kdiamond wrote:1. Limit disk sizes to a reasonable size for the OS purpose. Make it fixed. For example, if I currently use 92GB, then I can set it to a fixed size of 200GB.
An alternative would be to use a dynamic disk and create a smaller logical volume or partition in the guest OS. For example, a logical volume of 200 GB (in the 500 GB dynamic disk image) would only grow up to 200 GB. Then you could either enlarge the logical volume to 300 GB or add another logical volume of 100 GB, to grow the limit to 300 GB, and so on.
scottgus1 wrote:I'm not sure why the VM disk containing 92GB isn't closer to 92GB after compacting though.
That's probably because the virtual disk image (VDI) format is organized in blocks of 1 MB. Each block represents 2048 sectors of 512 bytes each, and if at least one sector of a block is in use, the whole block is written to the (dynamic) virtual disk image. Note that disk compaction only eliminates whole blocks whose 2048 sectors are all currently unused.

I think mpack's CloneVDI is not different in this respect, it only knows which blocks are completely unused by analyzing the guest OS's file system instead of looking for zero-filled blocks. But I'm not 100% sure about that.
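
The block-granularity effect can be worked through with a little arithmetic. This is a sketch under the assumptions stated in the post (1 MiB blocks, 512-byte sectors, which gives 2048 sectors per block); the function name is made up for illustration.

```python
SECTOR_SIZE = 512                              # bytes per sector
BLOCK_SIZE = 1024 * 1024                       # bytes per image block (1 MiB)
SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE  # 2048

def allocated_bytes(used_sectors):
    """Bytes the image must keep if any block with a used sector is stored whole."""
    blocks_in_use = {sector // SECTORS_PER_BLOCK for sector in used_sectors}
    return len(blocks_in_use) * BLOCK_SIZE

if __name__ == "__main__":
    # 100 used sectors packed together: 50 KiB of data costs one block.
    contiguous = range(100)
    # 100 used sectors spread one per block: the same 50 KiB costs 100 blocks.
    scattered = range(0, 100 * SECTORS_PER_BLOCK, SECTORS_PER_BLOCK)
    print(allocated_bytes(contiguous) // BLOCK_SIZE, "MiB")
    print(allocated_bytes(scattered) // BLOCK_SIZE, "MiB")
```

The same amount of guest data can therefore occupy very different amounts of image space depending on how fragmented it is, which is one reason to defragment before compacting.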

Post by kdiamond »

Thank you. All things noted.

I know your concern about snapshots and clones, but it works 100% of the time. I have never had a single failure where a snapshot or clone would not start, or anything like that.

Yes, dynamic it is. But a smaller dynamic disk, so that it cannot expand so drastically. I have now managed to get a 170GB VDI down to 29GB on a Windows 10 OS. I cloned the existing VDI (500GB dynamic) to a new one (100GB dynamic).

How I did it:
1. Manually deleted all unused files (I used a folder-size app to find the biggest folders)
2. Ran Windows Disk Cleanup and deleted everything, including system backups
3. Emptied the recycle bin
4. Defragmented the drive
5. Wrote zeros to all the free disk space on drive C: (sdelete.exe c: -z)
6. Compacted the VDI (VBoxManage.exe modifymedium --compact "d:\path_to\image.vdi")
7. Created a new 100GB VDI and mounted it as a secondary drive
8. Booted from the UltimateBootCD live CD
9. Used EaseUs Disk Copy 2.3.1 to clone the drives
10. Booted from the new VDI

Maybe there is a better way, but it works. If you have any suggestions for improvement, please let me know.

Thank you

Br,
Dali

Post by mpack »

When the drive is 150GB I don't think the 1MB granularity will have such an extreme effect.

A brief scan doesn't reveal to me what the guest OS and filesystems are, and these discussions are not generic. Linux guests often have a swap partition, and that can't be compacted since AFAIK it doesn't contain a formal filesystem, nor is it empty. If I knew enough about Linux to identify it for certain as a swap partition and know somehow that the guest was not hibernated, then I imagine that one could simply discard the entire swap partition contents... but CloneVDI won't do anything so risky. Also kind of pointless, since Linux will quickly fill it with crap again.

P.s. try searching "snapshots problem site:forums.virtualbox.org".