Image filesystem corruption with 3.2.0 on Linux amd64 host

frank
Oracle Corporation
Posts: 3362
Joined: 7. Jun 2007, 09:11
Primary OS: Debian Sid
VBox Version: PUEL
Guest OSses: Linux, Windows
Location: Dresden, Germany

Re: Image filesystem corruption with 3.2.0 on Linux amd64 host

Post by frank »

It's not dangerous to use VirtualBox 3.2.4 on an ext4 partition in general, but it is dangerous to have the virtual disk image reside on an ext4 partition AND the host cache DISABLED. This setting is the default for all non-IDE controllers. Therefore, if you are using ext4 partitions, ensure that the host cache is enabled. After creating a new virtual machine, go to the VM settings and select the controller to which your virtual disk image is connected. If this is a SATA/SCSI/SAS controller, you have to re-enable the host cache (which is disabled by default for these controller types).
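For anyone who would rather script this than click through the settings dialog, here is a minimal sketch. It assumes that your VirtualBox version provides the `--hostiocache on|off` option of `VBoxManage storagectl`, and that the VM is powered off when the command runs. The VM name "centos5" and the controller name "SATA Controller" used in the usage note are placeholders, not names from this thread.

```python
import subprocess


def host_cache_command(vm_name, controller_name, enable=True):
    """Build the VBoxManage call that switches the host I/O cache
    for one storage controller of one VM."""
    return [
        "VBoxManage", "storagectl", vm_name,
        "--name", controller_name,
        "--hostiocache", "on" if enable else "off",
    ]


def set_host_cache(vm_name, controller_name, enable=True):
    """Run the command built above; raises CalledProcessError
    if VBoxManage exits with an error."""
    subprocess.run(host_cache_command(vm_name, controller_name, enable),
                   check=True)
```

Calling `set_host_cache("centos5", "SATA Controller")` would then turn the host cache on for that controller. Check the controller name in your own VM settings first, since it has to match exactly.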
davemb
Posts: 1
Joined: 13. Jun 2010, 12:14
Primary OS: Ubuntu other
VBox Version: OSE other
Guest OSses: Centos

Re: Image filesystem corruption with 3.2.0 on Linux amd64 host

Post by davemb »

This is not purely an ext4 issue. I just upgraded to VB 3.2.4 and hit problems on all my CentOS 5 guests: they flipped their root (ext3) partitions to read-only, and I could not boot them until I did a rescue CD boot and a forced fsck.

My host is updated Ubuntu 9.10, 2.6.31-22-generic with XFS filesystems (nobarrier option set).

So ext4 is nowhere to be seen on my system. Sorry I cannot post more debugging info; I need to go back to 3.1 :-(

VB is still the best virtualization system out there though :-)
ddehnhard
Posts: 4
Joined: 10. Jun 2010, 12:28
Primary OS: Debian other
VBox Version: PUEL
Guest OSses: nearly everything

Re: Image filesystem corruption with 3.2.0 on Linux amd64 host

Post by ddehnhard »

Hello Frank,
Thanks for the clarification. That information about the disabling of the "host cache" somehow slipped out of my summary. I wasn't convinced at first, though, so I invested some weekend hours in creating VMs and filling them with more than 4 GB (to check on that other mentioned bug with ext4 and async I/O). With a forced fs check on every reboot I only saw some orphaned inode repairs on the host and none on the guest; these are presumably harmless and probably connected to some minor ext4 bugs mentioned in my previous post. Then I installed a 2.6.34 kernel from liquorix.net, as that contains more ext4 bug fixes, and after that I had no errors on host or guest anymore.
So I can confirm that VirtualBox 3.2.4 on ext4 with the host cache enabled works well, and I won't revert to 3.1 for now (I'll post if something happens later on). I should add that the "delayed allocation" feature is disabled (nodelalloc) on my ext4 host filesystem, since it can delay writes for some time and I want to prevent the kind of data loss I recently experienced on this server after a premature shutdown of the guest (my main reasons for choosing ext4 over ext3 are fast fs checks and fast deletion of big files anyway).
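To see which filesystem and mount options apply to the directory holding a disk image, something along these lines might help. It is only a sketch: it parses text in the format of /proc/mounts and picks the longest matching mount point. The function name and the example path are made up for illustration, and mount points containing spaces (written as \040 in /proc/mounts) are not handled.

```python
import os


def find_mount(path, mounts_text):
    """Return (mount_point, fstype, options) for the mount that contains
    `path`, or None. `mounts_text` is the content of /proc/mounts."""
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed lines
        mount_point, fstype = fields[1], fields[2]
        options = fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # The longest mount point wins; on a tie the later entry wins,
            # because later mounts cover earlier ones (e.g. rootfs).
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fstype, options)
    return best
```

With `mounts_text = open("/proc/mounts").read()`, a call like `find_mount("/srv/vms/disk.vdi", mounts_text)` returns the filesystem type and the option list, so `"nodelalloc" in options` tells you whether delayed allocation is switched off for that mount.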

@davemb
Have you checked that "host cache" option? Maybe it works the same for you.
And if this is indeed the same bug, this might give the developers an indication about the cause.

Another note:
As a developer myself, I know how pressuring it can feel when users report bugs and demand fixes.
So I want to state that I do like VirtualBox and am glad that you have developed it and put it out into the world.
That is partly the reason I write here, in the hope that my findings help to solve this, so that not many
more people are affected by it and lose time (like me) or data (luckily not like me: although my personal virtual
VCS server was corrupted by this, I had all the data somewhere else).
I mean, the current stable Ubuntu Lucid has ext4 as default, and although VB OSE there is still 3.1 (presumably at least
3.2 in Ubuntu 10.10), a user of the PUEL version will stumble upon this by creating a default VM.

What about a warning dialog when starting a VM with this combination (VB is >=3.2.0 AND disk file is on ext4 AND
host cache is disabled AND kernel version is affected by this bug)?
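Just to make the proposed condition concrete, here is a rough sketch of the check. This is not how VirtualBox does it; the function names are invented, and since this thread does not say which kernel versions are affected, that part is simply passed in as a flag by the caller.

```python
import re


def parse_version(text):
    """Turn a string like '3.2.4' into a tuple (3, 2, 4),
    using the first three numbers found in it."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def should_warn(vbox_version, image_fstype, host_cache_enabled,
                kernel_affected):
    """True only when all four conditions of the proposal hold.
    `kernel_affected` must be decided by the caller (assumption)."""
    return (
        parse_version(vbox_version) >= (3, 2, 0)
        and image_fstype == "ext4"
        and not host_cache_enabled
        and kernel_affected
    )
```

For example, `should_warn("3.2.4", "ext4", False, True)` gives True, while the same call with the host cache enabled, with "xfs", or with version "3.1.8" gives False.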

OK, time to get back to my own work now...
Daniel
fixedwheel
Volunteer
Posts: 1699
Joined: 13. Sep 2008, 02:18

Re: Image filesystem corruption with 3.2.0 on Linux amd64 host

Post by fixedwheel »

Frank Mehnert wrote: It's not dangerous to use VirtualBox 3.2.4 on an ext4 partition in general, but it is dangerous to have the virtual disk image reside on an ext4 partition AND the host cache DISABLED. This setting is the default for all non-IDE controllers. Therefore, if you are using ext4 partitions, ensure that the host cache is enabled. After creating a new virtual machine, go to the VM settings and select the controller to which your virtual disk image is connected. If this is a SATA/SCSI/SAS controller, you have to re-enable the host cache (which is disabled by default for these controller types).
Could you post this note as a sticky thread, please?
Sasquatch
Volunteer
Posts: 17798
Joined: 17. Mar 2008, 13:41
Primary OS: Debian other
VBox Version: PUEL
Guest OSses: Windows XP, Windows 7, Linux
Location: /dev/random

Re: Image filesystem corruption with 3.2.0 on Linux amd64 host

Post by Sasquatch »

fixedwheel wrote:could you post this note as a sticky thread, please?
The recently released 3.2.6 beta 1 already has this bug fixed. It checks the file system and acts accordingly. As long as users upgrade to minor versions properly, there should soon be no need for it.
Jan Sloep
Posts: 2
Joined: 2. Dec 2008, 16:47

Re: Image filesystem corruption with 3.2.0 on Linux amd64 host

Post by Jan Sloep »

Frank Mehnert wrote: It's not dangerous to use VirtualBox 3.2.4 on an ext4 partition in general, but it is dangerous to have the virtual disk image reside on an ext4 partition AND the host cache DISABLED. This setting is the default for all non-IDE controllers. Therefore, if you are using ext4 partitions, ensure that the host cache is enabled. After creating a new virtual machine, go to the VM settings and select the controller to which your virtual disk image is connected. If this is a SATA/SCSI/SAS controller, you have to re-enable the host cache (which is disabled by default for these controller types).
I am sorry to say IT CAN BE DANGEROUS to have the virtual disk image reside on an ext4 partition even with the host cache ENABLED.
I have an Ubuntu 10.04 64-bit host with 6 GB RAM and two guest servers, an Ubuntu 9.10 64-bit and an Ubuntu 10.04 64-bit, with 3 GB RAM each. All VM systems are on ext4 with the host cache ENABLED.

Unfortunately, all the VMs' filesystems are completely corrupted now. I lost a whole day on this because the problems only started after some time, after I had installed a lot of software on my VirtualBox test servers.

At first I thought the newly installed software was causing the corruption. From time to time the Webmin filesystem lost contact with the virtual server, which I could generally solve by going back to the snapshots I made after every install step, but the total collapse of ext4 only started at the end, after the installation of Zendserver.

But now it is clear to me that the Webmin problems are also related to the filesystem corruption. I will revert to VirtualBox 3.1 until I am sure the problems with version 3.2 are 100% solved.
spinkham
Posts: 2
Joined: 21. May 2010, 20:09
Primary OS: Ubuntu other
VBox Version: PUEL
Guest OSses: Ubuntu, XP, 2000, freeBSD

Re: Image filesystem corruption with 3.2.0 on Linux amd64 host

Post by spinkham »

Can someone confirm this issue is fixed in the 3.2.6 release? I don't see it mentioned in the Changelog.
frank
Oracle Corporation
Posts: 3362
Joined: 7. Jun 2007, 09:11
Primary OS: Debian Sid
VBox Version: PUEL
Guest OSses: Linux, Windows
Location: Dresden, Germany

Re: Image filesystem corruption with 3.2.0 on Linux amd64 host

Post by frank »

I cannot prove that the file system corruption is gone, but I strongly assume that users who observed this problem with VBox 3.2 in the past had the host I/O cache disabled at least for some time, perhaps after they first upgraded to VBox 3.2.x. Disk corruption is sometimes difficult to detect, and it is not guaranteed that the corruption reveals itself immediately.

Even in VBox 3.2.6 Beta 2 the check was not 100% safe as it failed when the VM had never created a snapshot before. The check in the final 3.2.6 release should be safe.