sync

Discussions related to using VirtualBox on Linux hosts.
Quantr Peter
Posts: 3
Joined: 18. Apr 2021, 06:23

sync

Post by Quantr Peter »

Hi, how can I sync all files in real time to another Linux machine? I need a real-time backup solution. Do people usually use rsync? Thanks
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: sync

Post by scottgus1 »

Quantr Peter wrote:real-time sync all files to other Linux, I need a real-time backup solution.
There can be two ways to interpret this in the context of a Virtualbox VM: syncing the data inside the VM, or syncing the whole VM itself.

Syncing the data inside the VM:
This can be handled in exactly the same way you'd sync two or more computers on the network or over the internet: install 3rd-party syncing software (not provided by Virtualbox) and link the networked computers together in the 3rd-party software. A VM is simply another computer for this task: install 3rd-party syncing software inside the VM. A Bridged, NAT, NAT Network, or Host-Only network can be used depending on local, LAN, or internet connectivity needs. See Virtualbox Networks: In Pictures.

Syncing the whole VM itself:
This is not possible while the VM is running. The VM OS does not know when the host app running the backup is scanning the VM's data, and the VM's OS will not prepare databases, open files, and running write processes for on-the-fly backup. Corrupted data or dead VMs will result, and the backup cannot be confirmed because the VM OS is still live and changing the data. The VM must be shut down fully from within the VM's OS for a graceful shut-down before backing up the whole VM. Also remember to copy the VM's .vbox file. In fact, copy the whole VM folder.
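
As a rough illustration of the "shut down first, then copy the whole VM folder" advice, here is a minimal Python sketch. It assumes `VBoxManage` is on the PATH and relies on the `VMState=` line that `VBoxManage showvminfo --machinereadable` prints; the function names `vm_state` and `backup_vm_folder` and all paths are hypothetical, not anything provided by VirtualBox.

```python
import shutil
import subprocess
from pathlib import Path


def vm_state(machine_readable_info: str) -> str:
    """Pull the VMState value out of `showvminfo --machinereadable` text."""
    for line in machine_readable_info.splitlines():
        if line.startswith("VMState="):
            return line.split("=", 1)[1].strip().strip('"')
    return "unknown"


def backup_vm_folder(vm_name: str, vm_folder: Path, dest: Path) -> None:
    """Copy the whole VM folder (disks plus .vbox file) if the VM is powered off."""
    info = subprocess.run(
        ["VBoxManage", "showvminfo", vm_name, "--machinereadable"],
        check=True, capture_output=True, text=True,
    ).stdout
    state = vm_state(info)
    if state != "poweroff":
        # Refuse to copy a live VM: its disk files may still be changing.
        raise RuntimeError(f"VM {vm_name!r} is {state!r}; shut it down first")
    shutil.copytree(vm_folder, dest)  # dest must not exist yet
```

The check only guards against copying a VM that is not powered off; it does not replace a graceful shutdown from inside the guest OS.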

What backup software to use:
That's up to you, and is a subject of a web-search and feature comparisons.

I have rsynced Virtualbox VM disk files before, over the internet, for backup purposes. The VMs were fully shut down before rsyncing. Rsync was able to handle 60-ish GB files well. Rsync kept choking on the 280GB file, though; it had to be split into 50GB chunks before rsyncing, then reassembled at the destination. I would run SHA 256 hashes on the source and destination to confirm the backup. I also had backup software inside the VMs for live backup purposes, with copy and hash confirmation across the internet.
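
The split, reassemble, and hash-check routine described above could look roughly like the following Python sketch. This is not the tooling actually used in the post; the helper names (`sha256_of`, `split_file`, `join_files`) and the part-file naming scheme are made up for illustration.

```python
import hashlib
from pathlib import Path

BUF = 1024 * 1024  # read 1 MiB at a time so huge files never sit in memory


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in small buffers."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while block := f.read(BUF):
            digest.update(block)
    return digest.hexdigest()


def split_file(src: Path, out_dir: Path, chunk_size: int) -> list[Path]:
    """Split src into numbered parts of at most chunk_size bytes each."""
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with src.open("rb") as f:
        index = 0
        while True:
            remaining = chunk_size
            data = f.read(min(BUF, remaining))
            if not data:
                break  # end of source file
            part = out_dir / f"{src.name}.part{index:04d}"
            with part.open("wb") as out:
                while data:
                    out.write(data)
                    remaining -= len(data)
                    if remaining == 0:
                        break  # this part is full
                    data = f.read(min(BUF, remaining))
            parts.append(part)
            index += 1
    return parts


def join_files(parts: list[Path], dest: Path) -> None:
    """Concatenate the parts, in order, back into one file."""
    with dest.open("wb") as out:
        for part in parts:
            with part.open("rb") as f:
                while block := f.read(BUF):
                    out.write(block)
```

In use, one would compute `sha256_of()` on the source before splitting and again on the reassembled file at the destination, and treat the backup as good only if the two digests match.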
Quantr Peter
Posts: 3
Joined: 18. Apr 2021, 06:23

Re: sync

Post by Quantr Peter »

thanks
arQon
Posts: 228
Joined: 1. Jan 2017, 09:16
Primary OS: MS Windows 7
VBox Version: PUEL
Guest OSses: Ubuntu 16.04 x64, W7

Re: sync

Post by arQon »

scottgus1 wrote:Rsync was able to handle 60-ish GB files well. Rsync kept choking on the 280GB file, though; it had to be split into 50GB chunks before rsyncing, then reassembled at the destination.
@Scott, do you remember if the destination server was running an rsync daemon or not?

Without rsync at the *far* end, files you copy will always be sent in their entirety, so even "just" a 60GB vdi is a very substantial chunk of transfer. With rsync running at the far end *as well*, though, the whole file should be split into 4K chunks, and only the delta chunks need sending (much like a BitTorrent transfer).
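
To make the "only the delta chunks need sending" idea concrete, here is a deliberately simplified Python sketch. It compares fixed-position blocks by hash, with the 4K block size taken from the post as an assumption. Real rsync is more sophisticated: it uses a rolling weak checksum plus a strong checksum so it can match blocks at any offset, and it chooses its own block size. The function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Indices of blocks in `new` that differ from the same-position block in `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        i for i, digest in enumerate(new_digests)
        if i >= len(old_digests) or old_digests[i] != digest
    ]
```

Only the blocks whose indices come back from `changed_blocks()` would need to cross the wire; the receiver already holds the rest.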

Even a 280GB image *should* be fine with a server at the far end: that's still only about 70 million blocks. But if you did have one and you still had problems, that hints at a bug in rsync, so if you can remember I'd like to know. Thanks.

> I would run SHA 256 hashes on the source and destination to confirm the backup. I also had backup software inside the VMs for live backup purposes, with copy and hash confirmation across the internet.

Definitely worth doing (as long as you can ssh into the remote end), though an rsync server should already be checksumming each block.
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: sync

Post by scottgus1 »

arQon wrote:@Scott, do you remember if the destination server was running an rsync daemon or not?
Yes. Both ends were Windows hosts, running Synametrics' "Deltacopy", which installed a Cygwin-based rsync service. Later, I used the rsync service by itself, without going through Deltacopy directly. Could have been a bug...

I never did want to trust the whole backup process; the hash made me feel more comfortable that everything came over OK.
arQon
Posts: 228
Joined: 1. Jan 2017, 09:16
Primary OS: MS Windows 7
VBox Version: PUEL
Guest OSses: Ubuntu 16.04 x64, W7

Re: sync

Post by arQon »

Thanks.

Deltacopy is just a GUI frontend to rsync, so the underlying engine shouldn't have been modified. rsync's checksumming isn't great, but it's not naive either, so the odds of a collision on a changed block are basically zero: 50GB is only ~12 million blocks, which is about 30 orders of magnitude short of where collisions become likely. In other words, rsync shouldn't have failed even once (or more accurately, on a single VDI), let alone on several. I can see a bug causing problems, obviously, but the algorithm itself should still be solid.
(Even so, I wouldn't trust anything without local hashes either. It's a good habit to have. :P)
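
For anyone wanting to check the block-count arithmetic, a quick Python sketch follows. It assumes the 4K block size used in this thread and decimal gigabytes; `block_count` is just an illustrative helper.

```python
BLOCK_SIZE = 4 * 1024  # 4K blocks, as assumed in the thread
GB = 10 ** 9           # decimal gigabyte (an assumption; GiB gives slightly larger counts)


def block_count(size_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks needed to cover size_bytes (ceiling division)."""
    return -(-size_bytes // block_size)


print(block_count(50 * GB))   # 12207032, i.e. roughly 12 million
print(block_count(280 * GB))  # 68359375, i.e. roughly 70 million
```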
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: sync

Post by scottgus1 »

FWIW it was only one 280GB file that had the rsync problems. But that was the only super-large file I've tried to rsync. All the other VDIs rsync'd fine.