Snapshots on running machine fail

Discussions related to using VirtualBox on Linux hosts.
DdB
Posts: 114
Joined: 22. May 2010, 23:27
Primary OS: Debian other
VBox Version: PUEL
Guest OSses: many
Location: Germany

Snapshots on running machine fail

Post by DdB »

Hi,

I have gotten into the habit of always shutting down a guest, because I had problems with failing live snapshots. Now, while preparing for a change in hardware, I want to switch to a different workflow that avoids unnecessary boot storms. But I noticed that live snapshots almost always lead to a problem that looks like this one: viewtopic.php?f=7&t=88244&p=423520&hili ... ng#p422627.

I read through it and can say: I have NOT moved my VMs, ever. Their paths did sometimes change through grouping, though. All machines reside on the same filesystem; only the .VirtualBox/*.xml file is on another disk. Today I tried live snapshots on a dozen VMs, and only one machine succeeded (it was a Windows VM). The failing ones had different OSes (Debian, Ubuntu, and more), in different versions. Now I am asking: Should the live snapshots be leaving the machine running, or is aborting normal? Can I expect to restart such a VM in its running state, or is it normal that they just reboot from disk?

What kind of information can i provide to further track this issue down?
socratis
Site Moderator
Posts: 27329
Joined: 22. Oct 2010, 11:03
Primary OS: Mac OS X other
VBox Version: PUEL
Guest OSses: Win(*>98), Linux*, OSX>10.5
Location: Greece

Re: Snapshots on running machine fail

Post by socratis »

DdB wrote: since I had problems with failing live snapshots.
What "problems" would those be? You don't go into specifics unfortunately...
DdB wrote: But I noticed that live snapshots almost always lead to a problem that looks like this one
Almost always? And you quote a single instance, where the user got the whole operation wrong and was using a link? That's not an "almost always" case I'm afraid, not even close actually.
DdB wrote: Their paths did sometimes change through grouping, though.
In past iterations the Group feature of VirtualBox had issues. That's an area that they're addressing by actually re-writing/cleaning-up the API.
DdB wrote: All machines reside on the same filesystem; only the .VirtualBox/*.xml file is on another disk.
Why? Why not keep the VM all together, configuration and files, the whole thing. Makes it much easier to not-mess-up.
DdB wrote: I tried live snapshots on a dozen VMs, and only one machine succeeded ... The failing ones had different OSes
The guest OS has absolutely nothing to do with taking a snapshot, it's completely irrelevant.
DdB wrote: Should the live snapshots be leaving the machine running, or is aborting normal?
Leaving the machine? Where to? A better place with other VMs to play in the grass? ;) What are you doing? What are you trying to do? What's the end goal? What fails? How exactly?
DdB wrote: Can I expect to restart such a VM in its running state, or is it normal that they just reboot from disk?
What does "reboot from disk" mean? I'm not sure I understand what you're saying here...

I've used Live Snapshots many, many, many times. Not a single time have I ever had a single issue. Not sure what you have in mind or how you're doing, whatever it is you're doing. You'll need to get into more details.
Do NOT send me Personal Messages (PMs) for troubleshooting, they are simply deleted.
Do NOT reply with the "QUOTE" button, please use the "POST REPLY", at the bottom of the form.
If you obfuscate any information requested, I will obfuscate my response. These are virtual UUIDs, not real ones.

Re: Snapshots on running machine fail

Post by DdB »

Ok, many questions...

The history first: When I started using VirtualBox, I was still on Windows (Vista) and was investigating different Linux distributions. At that time, I suffered corruption of the vdi files several times, which made me look for a better filesystem than NTFS. Today I am using ZFS on Linux, which has its own snapshots and rollbacks. That gives me some sense of safety, although I only rarely use it to check past versions of VMs. Anyway: because under some rare circumstances I did have to go back to an earlier FILESYSTEM state on the host, I make heavy use of data separation (shared folders or secondary vdis, even something equivalent to partitioning) so that I can roll back only the parts I really want.

The current problem is this: when I try to snapshot a running VM, it stops running and the "one way street sign" is displayed next to it. Obviously it is not shut down properly; only the vdi remains, and rebooting from it without the saved machine state is still possible. I said "almost always" because one VM (the Win7 one) eventually created a valid snapshot and could start from it. The other 10 I tried all failed at snapshot creation time.
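
For anyone wanting to repeat this check from the command line rather than the GUI, here is a minimal sketch (it is not something posted in the thread). It assumes `VBoxManage` is on the PATH and that the `VMState` line of `showvminfo --machinereadable` reports values such as "running" or "aborted"; the helper names and the snapshot name `live-test` are made up for illustration.

```python
import subprocess
import sys


def parse_vm_state(machinereadable):
    """Return the VMState value from `VBoxManage showvminfo --machinereadable` text."""
    for line in machinereadable.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "VMState":
            return value.strip().strip('"')
    return "unknown"


def live_snapshot_cmd(vm, snapshot_name):
    """Build the command for a snapshot of a running VM (--live keeps it running)."""
    return ["VBoxManage", "snapshot", vm, "take", snapshot_name, "--live"]


def try_live_snapshot(vm, snapshot_name="live-test"):
    """Take a live snapshot, then report the exit code and the VM state afterwards."""
    take = subprocess.run(live_snapshot_cmd(vm, snapshot_name),
                          capture_output=True, text=True)
    info = subprocess.run(["VBoxManage", "showvminfo", vm, "--machinereadable"],
                          capture_output=True, text=True)
    return take.returncode, parse_vm_state(info.stdout)


if __name__ == "__main__":
    # Usage (hypothetical VM names): python live_check.py "Win7" "Ubuntu test"
    for vm_name in sys.argv[1:]:
        code, state = try_live_snapshot(vm_name)
        print(f"{vm_name}: snapshot exit code {code}, state afterwards: {state}")
```

A VM that is still "running" afterwards would count as a success here; anything else would correspond to the failure described above.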

About keeping it all in one place: It is vbox that uses the .VirtualBox folder, and it was ME who made it a separate filesystem in order to put it into the same ZFS hierarchy as the machines. See the following excerpt:

Code:

NAME                                                                       USED  AVAIL  REFER  MOUNTPOINT
MAIN/BCKP/vbox-snaps/vbox                                                  582G  64,5G   447G  /mnt/data/vbox
MAIN/BCKP/vbox-snaps/vbox/hidden                                          13,2M  64,5G   844K  /home/datakanja/.VirtualBox
MAIN/BCKP/vbox-snaps/vbox/inet                                             385M  64,5G   385M  /mnt/data/profile/inet
MAIN/BCKP/vbox-snaps/vbox/linuxfox                                         139M  64,5G   139M  /mnt/data/profile/linuxfox
MAIN/BCKP/vbox-snaps/vbox/linuxmail                                       16,4G  64,5G  16,1G  /mnt/data/profile/linuxmail
MAIN/BCKP/vbox-snaps/vbox/mail                                            15,6G  64,5G  15,6G  /mnt/data/profile/mail
MAIN/BCKP/vbox-snaps/vboxISOs                                             77,2G  64,5G  77,2G  /mnt/data/vbox/disk_images
This structure allows a single (filesystem) snapshot to run over the whole structure of "virtual" filesystems (a.k.a. paths) at once, which makes the timing exact: the "hidden" .VirtualBox folder always gets archived at the same time as the corresponding vbox and vboxISOs folders, and I make sure that no VM is running at that point in time. That has proven robust for many years now, and I see no reason to change the architecture. It eases configuration of settings, backup and restore, and, by the way, the whole thing lives on a RAID.
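
As an illustration of that routine, here is a minimal sketch, not the script actually in use. It assumes the common parent dataset is `MAIN/BCKP/vbox-snaps` (inferred from the listing above), that `zfs snapshot -r` is run with sufficient privileges, and that `VBoxManage list runningvms` prints one `"name" {uuid}` line per running VM; the function names and the snapshot label are made up.

```python
import subprocess


def running_vms(list_output):
    """Parse `VBoxManage list runningvms` text into a list of VM names."""
    names = []
    for line in list_output.splitlines():
        line = line.strip()
        if line.startswith('"') and line.count('"') >= 2:
            # The name is everything between the first and the last quote.
            names.append(line[1:line.rindex('"')])
    return names


def recursive_snapshot_cmd(parent_dataset, label):
    """Build a recursive ZFS snapshot command covering all child datasets."""
    return ["zfs", "snapshot", "-r", f"{parent_dataset}@{label}"]


def snapshot_if_idle(parent_dataset, label):
    """Snapshot the whole hierarchy at once, but only if no VM is running."""
    listing = subprocess.run(["VBoxManage", "list", "runningvms"],
                             capture_output=True, text=True, check=True).stdout
    busy = running_vms(listing)
    if busy:
        raise RuntimeError("VMs still running: " + ", ".join(busy))
    subprocess.run(recursive_snapshot_cmd(parent_dataset, label), check=True)


if __name__ == "__main__":
    # Assumed dataset name and an arbitrary label; adjust both before use.
    snapshot_if_idle("MAIN/BCKP/vbox-snaps", "before-upgrade")
```

Because the snapshot is recursive, the .VirtualBox dataset and the machine datasets are captured at the same instant, which is the point of the layout.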

Only the main config file lives under .VirtualBox; the VMs themselves do not.

It would take long to explain what I am doing, as that changes frequently. The only thing coming up is this: my next hardware may be able to run even more VMs concurrently. At the moment, I rarely run more than 3 at a time (they use memory pooling). But in the future I may have a lot more, and in order to avoid booting all of them at the same time, I would like to stop shutting them down every time. I used to use live snapshots while still on Windows (host), but since I moved to Linux, I have been unable to restart from those, which made me choose the full shutdown instead. This is the behavior I would like to change, but currently cannot.
Last edited by DdB on 19. Oct 2018, 22:17, edited 1 time in total.

Re: Snapshots on running machine fail

Post by socratis »

Can you try a test VM on a different filesystem? I have seen issues with ZFS filesystems before. Don't know what might be causing it, but try with an NTFS or an exFAT hard drive, an external USB3 might do for the test. Just don't expect high performance from an external USB3 HD...

Re: Snapshots on running machine fail

Post by DdB »

In short: nothing changed.

Longer story: I was excited, as I am in touch with the ZFS devs and would be surprised if they did not fix a real problem, so having a reproducer would be valuable. :-)
  • pick a VM that fails to create a live snapshot
  • create an empty ext4 partition
  • copy the entire VM folder over to that partition
  • create a single file in the folder AND on the partition in order to be able to tell the difference
  • mount the partition OVER its originating folder
  • verify that now only ext4 files get accessed
  • run the VM and snapshot
  • voilà (failing again)
So there is no reproducer for ZFS, and that is even without going the more complicated route of creating the ext4 filesystem inside a ZFS device.
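
The steps above can be sketched as a list of commands. This is only an illustration under stated assumptions, not the exact commands that were run: the device path, the staging mount point and the marker file names are hypothetical, and the function only builds and prints the commands instead of executing them, since `mkfs.ext4` destroys whatever is on the device.

```python
import shlex


def reproducer_plan(vm_dir, device, staging="/mnt/ext4-staging"):
    """Return the commands for the ext4 test, in order. Nothing is executed."""
    return [
        ["mkfs.ext4", device],                  # empty ext4 filesystem (wipes the device)
        ["mkdir", "-p", staging],               # temporary mount point for the copy
        ["mount", device, staging],
        ["cp", "-a", f"{vm_dir}/.", staging],   # copy the entire VM folder
        ["touch", f"{staging}/THIS_IS_EXT4"],   # marker that exists only on the partition
        ["touch", f"{vm_dir}/THIS_IS_ORIGINAL"],  # marker that exists only in the original
        ["umount", staging],
        ["mount", device, vm_dir],              # mount the partition OVER the original folder
        ["findmnt", "-n", "-o", "FSTYPE", "-T", vm_dir],  # report the filesystem type now in use
    ]


def render(steps):
    """Format the plan as shell lines, quoting paths that contain spaces."""
    return "\n".join(shlex.join(step) for step in steps)


if __name__ == "__main__":
    # Hypothetical paths; review every line before running anything as root.
    print(render(reproducer_plan("/mnt/data/vbox/testvm", "/dev/sdX1")))
```

After the last mount, only the ext4 marker file should be visible in the VM folder, which is how the two copies can be told apart before starting the VM and taking the snapshot.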

Re: Snapshots on running machine fail

Post by DdB »

Hi,

After two weeks, I am really missing additional input on the aforementioned issue.

Anyway... by now, I do have additional input myself:
I happened to notice that among my VMs there is one with an old live snapshot (more than 6 months old), and a test revealed that, even though the vbox version changed in the meantime, the machine can in fact restart just fine. And, surprise, even taking another fresh live snapshot succeeds, and a restart from there is also possible. Furthermore, the VM in question is NOT a Windows-based one but an Ubuntu-based one!

Among the almost 100 VMs I am keeping, this is the first non-Windows one where snapshotting a live machine seems to work. I immediately went on to retest another one, but the result was still the same: even just attempting to create a live snapshot brings the machine(s) down, obviously without creating a valid machine state.

Up to now, I have not had the spare time to make further observations. Just one thing to confirm: ALL my VMs reside on a compressed ZFS filesystem, which suggests that this detail is irrelevant.

Any advanced guidance or sensible suggestions for me so far?