VirtualBox running VMs located on a NAS

Posted: 6. Aug 2020, 20:14
by Andrew@USOP
We're running a few VMs off a Windows server, but the VM files live on a Synology DS1515 (specs below).

There are 4x 1 Gigabit Ethernet NICs on the server, with NIC Teaming enabled, used to "talk" to the Synology.
The NICs' health is good, as is the health of all the hard drives.

What happens (and this is intermittent) is that:
* Around the time the daily backup of the VMs finishes (run by the Synology),
* Windows Server (which is running VirtualBox) reports a delayed write error,
* Which crashes the VMs (which are all running Ubuntu 19.04).

Shutting down and restarting the VMs clears the problem and life goes on. Less than ideal, though.

It appears on the surface of it that some sort of timeout limit is being breached, but I'm not at all sure where to look... VirtualBox? Windows Server (Teams)? NIC Drivers?
Am I even close in my presumption?

SPECS
=====
OS: Windows Server 2019 Standard
System: PowerEdge T630
CPU: 2 x Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz, 2098 MHz, 8 Core(s), 16 Logical Processor(s)
BIOS: UEFI
RAM: 128 GB
NIC: 4x Intel(R) Gigabit 4P I350-t
NAS: SYNOLOGY DS1515

Re: VirtualBox running VMs located on a NAS

Posted: 6. Aug 2020, 20:17
by scottgus1
Andrew@USOP wrote: Around the time the daily backup of the VMs finishes (run by the Synology),
* Windows Server (which is running VirtualBox) reports a delayed write error,
* Which crashes the VMs
Do I understand correctly that your NAS is backing up the guest VDI files while the guests are running?

Re: VirtualBox running VMs located on a NAS

Posted: 6. Aug 2020, 20:25
by Andrew@USOP
Yes, that's correct... maybe. I am really not sure what Synology backup is doing.

And 28 days out of 28/29/30/31 it works just fine.
This is not to say that I'm actually doing it completely wrong!

Re: VirtualBox running VMs located on a NAS

Posted: 6. Aug 2020, 20:43
by scottgus1
OK, thanks. Respectfully, aside from the Delayed Write errors, I think this backup routine needs to be rethought.

An OS keeps using and writing to its files over time. A backup run without the running OS's knowledge will make a skewed copy: each part of the copy captures whatever the OS happened to be writing at that moment, somewhere between the start and the end of the time the backup takes.
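To illustrate the skew, here is a toy model in Python. It is only a sketch under simplifying assumptions: the "disk" is a short list of block versions, the backup reads one block per tick, and the guest rewrites every block between reads. The function name and the block model are made up for illustration and say nothing about how the Synology backup or VirtualBox actually work.

```python
def simulate_live_copy(num_blocks=8):
    """Copy a toy 'disk' one block per tick while a guest keeps writing to it."""
    disk = [0] * num_blocks  # consistent state: every block is at version 0
    copy = []
    for tick in range(num_blocks):
        # The backup reads the next block in order...
        copy.append(disk[tick])
        # ...and before it reads the following one, the guest commits a write
        # that moves the whole disk to a new, internally consistent version.
        disk = [tick + 1] * num_blocks
    return copy, disk


copy, disk = simulate_live_copy()
print("backup copy:", copy)
print("final disk: ", disk)
```

In this model the disk is always in a state where all blocks carry the same version, yet the copy holds a different version in every block, so it matches no state the disk was ever in.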

Also, many OSes need to do some cleanup before shutting down: flushing caches, closing databases, etc. A backup taken while the OS is running has none of this housekeeping done. Such a backup looks just like an unexpected power loss with no UPS.

The kind of backup being run is rather like slowly scraping the needle across a playing record (or hitting the ffd>> skip button on a CD player or media player, if I sound too old :lol: ) then recording the sound coming out. It has snippets of gradually-progressing out-of-date data with lost caches and dirty databases.

And to top it off, the backup is tied to the host hardware. The guest OS sees the host CPU, so if you try to restore that backup to another host, the guest OS has to adjust to the new CPU it sees while also trying to recover from data that should all look the way it did at the beginning of the backup but doesn't, because the end of the backup finished some time later.

And, as a guess, the host OS can't write to sections the NAS is handling at the time and throws the Delayed Write on occasion.

Even if the NAS could do a complete immediate "snapshot" to backup the already-written data while writing new data to a temporary holding location, the backup is still of a running OS which will look like the power died on it when the backup is restored, which may result in lost data.

I run backups of live OSes using compatible 3rd-party in-the-OS software that can tell the OS what it is going to copy so the OS can wait, or shadow-copy things so the OS can proceed. Were your guests Windows, I'd recommend Macrium Reflect. Linux has to have something equivalent. Such a disk image can be restored by booting the imaging program's rescue CD ISO within the guest environment.

If you can shut down the guests, a simple backup copy of the guest folder, plus any guest drive files that are not in the guest folder, will do. Such a backup can be taken to any host and restored immediately.
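A minimal sketch of such a cold copy in Python, assuming the state is read from the text printed by `VBoxManage showvminfo "VM NAME" --machinereadable` (which includes a `VMState="..."` line). The function names and the folder paths in the usage note are hypothetical, and the sketch copies only the guest folder, so any drive files kept elsewhere would need a copy of their own.

```python
import shutil
from pathlib import Path


def vm_state(showvminfo_output):
    """Return the VMState value from `showvminfo --machinereadable` text, or None."""
    for line in showvminfo_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "VMState":
            return value.strip().strip('"')
    return None


def cold_copy(vm_folder, backup_root, showvminfo_output):
    """Copy the guest folder into backup_root, but only if the guest is powered off."""
    state = vm_state(showvminfo_output)
    if state != "poweroff":
        raise RuntimeError(f"guest is not powered off (state: {state})")
    source = Path(vm_folder)
    target = Path(backup_root) / source.name
    shutil.copytree(source, target)  # fails if target already exists
    return target
```

Usage would look something like `cold_copy(r"D:\VMs\web01", r"E:\Backups\2020-08-06", info_text)`, where `info_text` is the captured output of the `showvminfo` command and both paths are placeholders.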

Re: VirtualBox running VMs located on a NAS

Posted: 6. Aug 2020, 20:57
by Andrew@USOP
Age... I remember floppy disks and gas that cost 75c/gallon! Heh

What you wrote made perfect sense.
I should know better, but... when the younger set tell you that you're past your use by date, you say, "Ok, do it then"

Although truth be told, I wasn't that enthusiastic about getting involved in "not my area" etc

Now, I know I was right, I shall go forth and try to be average.
Will respond with how it all went.

Re: VirtualBox running VMs located on a NAS

Posted: 6. Aug 2020, 22:52
by Andrew@USOP
After looking at a few of the backups done as they currently are, surprisingly they were good if the VM didn't disconnect, but there were issues when it did.

Lesson learned...

The simple solution we're going with:

1. Set up a daily cron job in the VM(s) to get them politely shut down before the Synology Backup runs (hey, it works, is decently quick etc)
2. Set up a daily Windows Event** to restart the VM(s) - VBoxManage startvm "VM NAME" seems to do the trick.
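For anyone wanting the restart step to be a bit more careful, here is a rough Python sketch that the scheduled task could call instead of a bare `VBoxManage startvm`. It is not what we actually run. It assumes `VBoxManage` is on the PATH and that `VBoxManage list runningvms` prints one `"name" {uuid}` line per running VM; the function names are made up, and `--type headless` is my assumption for a task that runs with no desktop session.

```python
import subprocess


def parse_running_vms(list_output):
    """Collect VM names from `VBoxManage list runningvms` text."""
    names = set()
    for line in list_output.splitlines():
        line = line.strip()
        if line.startswith('"') and '" {' in line:
            names.add(line[1:line.rindex('" {')])
    return names


def start_commands(wanted, running):
    """Build one startvm command for each wanted VM that is not already running."""
    return [
        ["VBoxManage", "startvm", name, "--type", "headless"]
        for name in wanted
        if name not in running
    ]


def restart_missing(wanted):
    """Ask VirtualBox what is running, then start whatever is missing."""
    listing = subprocess.run(
        ["VBoxManage", "list", "runningvms"],
        capture_output=True, text=True, check=True,
    ).stdout
    for command in start_commands(wanted, parse_running_vms(listing)):
        subprocess.run(command, check=True)
```

Calling `restart_missing(["VM NAME"])` from the daily task would leave an already-running VM alone and start only the ones the cron job shut down.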

A quick test of this, and it seems that all is well.

Thanks for the calm helpful response.

** Normally a Windows event would be considered a Win10 update that was particularly dire, or perhaps not so dire.

Re: VirtualBox running VMs located on a NAS

Posted: 7. Aug 2020, 00:08
by scottgus1
That setup looks a lot better; you should get much more solid backups.

If your Windows 10 is Pro, you can use the Group Policy Editor to disable automatic reboot for updates:
gpedit.msc > Local Computer Policy > Administrative Templates > Windows Components > Windows Update > Configure Automatic Updates > set "Configure automatic updating" dropdown to 2 (Notify for download and auto install) or possibly 3 (Auto download and notify for install).

Re: VirtualBox running VMs located on a NAS

Posted: 8. Aug 2020, 00:27
by Andrew@USOP
Well, one day and counting... and it works a treat.

Thank you so much.