Regular ext4 corruption

paul.dorman
Posts: 3
Joined: 26. Oct 2010, 01:43
Primary OS: Ubuntu other
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Linux, Windows

Regular ext4 corruption

Post by paul.dorman »

Hi all,

I'm experiencing regular ext4 file system corruption in my Ubuntu Maverick 32-bit VMs. The host is my i7 laptop (Ubuntu Maverick 64-bit, 8 GiB RAM, 500 GB BTRFS), and each VM is Ubuntu Maverick 32-bit with 2 VCPUs, 2 GiB RAM and an 8 GB ext4 disk. All machines run 2.6.36-020636-generic kernels, and the host runs VirtualBox 3.2.10 r66523. I'm using the AHCI host adapter with Host I/O caching off.

Switching the VMs to a single CPU and no I/O APIC seems to improve things, but I haven't been able to test this definitively. In addition to file system corruption, I get extremely high I/O and system loads, primarily from the ext4 jbd2 process.
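For anyone who wants to reproduce the settings described above from the command line rather than the GUI, here is a minimal sketch that only builds the `VBoxManage` command lines and prints them; it does not run anything. The VM name `maverick-vm` and the controller name `SATA Controller` are placeholders, not values from this thread. The `--cpus`, `--ioapic` and `--hostiocache` options are written as I understand them from the VBoxManage manual, so please check them against `VBoxManage modifyvm` and `VBoxManage storagectl` help output for your version. The VM normally has to be powered off for `modifyvm` to succeed.

```python
import shlex


def vbox_commands(vm, controller, cpus, ioapic, host_io_cache):
    """Build (but do not execute) VBoxManage argument lists.

    vm and controller are hypothetical names chosen by the caller.
    """
    def onoff(flag):
        return "on" if flag else "off"

    return [
        # CPU count and I/O APIC are per-VM settings.
        ["VBoxManage", "modifyvm", vm,
         "--cpus", str(cpus), "--ioapic", onoff(ioapic)],
        # Host I/O caching is a per-storage-controller setting.
        ["VBoxManage", "storagectl", vm,
         "--name", controller, "--hostiocache", onoff(host_io_cache)],
    ]


if __name__ == "__main__":
    # Single CPU, no I/O APIC, host I/O cache off (placeholder names).
    for argv in vbox_commands("maverick-vm", "SATA Controller",
                              cpus=1, ioapic=False, host_io_cache=False):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review them before pasting them into a shell, and flipping `host_io_cache` to `True` gives the variant with the host cache enabled.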

If you are running similar systems and have managed to avoid file system corruption and sluggish performance, what are your recommended settings? Are there bug reports, either with VirtualBox or with the Linux kernel, that track the underlying fault(s)?

I'm happy to provide a proper technical report if someone could tell me how to collect the right data for diagnosis.

- Paul
Perryg
Site Moderator
Posts: 34369
Joined: 6. Sep 2008, 22:55
Primary OS: Linux other
VBox Version: OSE self-compiled
Guest OSses: *NIX

Re: Regular ext4 corruption

Post by Perryg »

Known problem with certain kernels and Ext4. Make sure that the host IO cache is enabled.
paul.dorman
Posts: 3
Joined: 26. Oct 2010, 01:43
Primary OS: Ubuntu other
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Linux, Windows

Re: Regular ext4 corruption

Post by paul.dorman »

It would be great to know what the exact problem is. Enabling host IO cache does not fix it. For instance, I just collected this from a system that's just faulted with host IO cache enabled:


[ 2209.993822] ata3.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
[ 2209.993856] ata3.00: failed command: WRITE FPDMA QUEUED
[ 2209.993888] ata3.00: cmd 61/18:00:40:95:2a/00:00:00:00:00/40 tag 0 ncq 12288 out 
[ 2209.993910]          res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 2209.993933] ata3.00: status: { DRDY }
[ 2209.993954] ata3.00: failed command: WRITE FPDMA QUEUED
[ 2209.993978] ata3.00: cmd 61/08:08:a0:4d:5c/00:00:00:00:00/40 tag 1 ncq 4096 out 
[ 2209.993994]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 2209.994012] ata3.00: status: { DRDY }
[ 2209.994043] ata3: hard resetting link
[ 2210.323630] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 2215.321060] ata3.00: qc timeout (cmd 0xec)
[ 2215.321134] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 2215.321158] ata3.00: revalidation failed (errno=-5)
[ 2215.321193] ata3: hard resetting link
[ 2215.656577] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 2225.656640] ata3.00: qc timeout (cmd 0xec)
[ 2225.656670] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 2225.656673] ata3.00: revalidation failed (errno=-5)
[ 2225.656681] ata3: limiting SATA link speed to 1.5 Gbps
[ 2225.656689] ata3: hard resetting link
[ 2225.988292] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 2255.991484] ata3.00: qc timeout (cmd 0xec)
[ 2255.991541] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 2255.991548] ata3.00: revalidation failed (errno=-5)
[ 2255.991557] ata3.00: disabled
[ 2255.991574] ata3.00: device reported invalid CHS sector 0
[ 2255.991580] ata3.00: device reported invalid CHS sector 0
[ 2255.991609] ata3: hard resetting link
[ 2256.321651] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 2256.321695] ata3: EH complete
[ 2256.321718] sd 2:0:0:0: [sda] Unhandled error code
[ 2256.321721] sd 2:0:0:0: [sda]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 2256.321725] sd 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 5c 4d a0 00 00 08 00
[ 2256.321733] end_request: I/O error, dev sda, sector 6049184
[ 2256.321739] Buffer I/O error on device sda1, logical block 755892
[ 2256.321741] lost page write due to I/O error on sda1
[ 2256.321782] sd 2:0:0:0: [sda] Unhandled error code
[ 2256.321785] sd 2:0:0:0: [sda]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 2256.321788] sd 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 2a 95 40 00 00 18 00
[ 2256.321796] end_request: I/O error, dev sda, sector 2790720
[ 2256.321799] Buffer I/O error on device sda1, logical block 348584
[ 2256.321801] lost page write due to I/O error on sda1
[ 2256.321808] Buffer I/O error on device sda1, logical block 348585
[ 2256.321810] lost page write due to I/O error on sda1
[ 2256.321813] Buffer I/O error on device sda1, logical block 348586
[ 2256.321815] lost page write due to I/O error on sda1
[ 2256.321837] sd 2:0:0:0: [sda] Unhandled error code
[ 2256.321839] sd 2:0:0:0: [sda]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 2256.321842] sd 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 45 67 88 00 00 28 00
[ 2256.321872] JBD2: Detected IO errors while flushing file data on sda1-8
[ 2256.321872] 
[ 2256.321874] end_request: I/O error, dev sda, sector 4548488
[ 2256.321881] sd 2:0:0:0: [sda] Unhandled error code
[ 2256.321883] sd 2:0:0:0: [sda]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 2256.321886] sd 2:0:0:0: [sda] CDB: Write(10): 2a 00 00
[ 2256.321894] Aborting journal on device sda1-8.
[ 2256.321892]  84 0a 70 00 00 08 00
[ 2256.321897] end_request: I/O error, dev sda, sector 8653424
[ 2256.321900] Buffer I/O error on device sda1, logical block 1081422
[ 2256.321902] lost page write due to I/O error on sda1
[ 2256.321914] sd 2:0:0:0: [sda] Unhandled error code
[ 2256.321916] sd 2:0:0:0: [sda]  Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 2256.321919] sd 2:0:0:0: [sda] CDB: Write(10): 2a 00 00 84 3d 78 00 00 08 00
[ 2256.321927] end_request: I/O error, dev sda, sector 8666488
...
...
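
A note on reading the log above: each `CDB: Write(10)` line carries the target address of a failed write, and decoding it shows that the SCSI-level failures, the `end_request` sectors and the ext4 "logical block" numbers all describe the same writes. The sketch below is only an illustration. The function names are made up for this example, and the partition start (sector 2048) and the 4 KiB file system block size (8 sectors of 512 bytes) are assumptions inferred from the numbers in the log, not something stated in the post.

```python
import re

# Matches the ten hex bytes of a complete Write(10) CDB in a kernel log line.
CDB_RE = re.compile(r"CDB: Write\(10\):((?: [0-9a-f]{2}){10})")


def decode_write10(line):
    """Return (lba, sector_count) for a Write(10) log line, else None."""
    match = CDB_RE.search(line)
    if match is None:
        return None  # e.g. a CDB split across two interleaved log lines
    cdb = bytes.fromhex(match.group(1).replace(" ", ""))
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    sectors = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, sectors


def fs_block(lba, part_start=2048, sectors_per_block=8):
    """Map a disk sector to a block number inside the partition.

    part_start and sectors_per_block are assumed values (see the note above).
    """
    return (lba - part_start) // sectors_per_block


if __name__ == "__main__":
    line = ("[ 2256.321725] sd 2:0:0:0: [sda] CDB: Write(10): "
            "2a 00 00 5c 4d a0 00 00 08 00")
    lba, sectors = decode_write10(line)
    print(lba, sectors)   # 6049184 8
    print(fs_block(lba))  # 755892
```

Under those assumptions, the decoded values line up with "I/O error, dev sda, sector 6049184" and "logical block 755892" in the log, and the 8-sector length matches the "ncq 4096 out" of the timed-out command with tag 1.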
Perryg
Site Moderator
Posts: 34369
Joined: 6. Sep 2008, 22:55
Primary OS: Linux other
VBox Version: OSE self-compiled
Guest OSses: *NIX

Re: Regular ext4 corruption

Post by Perryg »

I don't know the exact problem with your install, but http://www.google.com/search?q=etx4+file+corruption shows some things that are interesting. I know that turning on the host IO cache solved my problem. From what I hear, the fix is in one of the newer kernels, but I don't remember offhand which version it is.
Sasquatch
Volunteer
Posts: 17798
Joined: 17. Mar 2008, 13:41
Primary OS: Debian other
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Windows XP, Windows 7, Linux
Location: /dev/random

Re: Regular ext4 corruption

Post by Sasquatch »

So where are you seeing this corruption? On the VM side, or on the Host side? If it's the Host, then it's no wonder. BTRFS isn't final, it's beta. There isn't an fsck utility for it either. Get a file system that is thoroughly tested, out of development and with proper file system integrity checks.
Read the Forum Posting Guide before opening a topic.
VirtualBox FAQ: Check this before asking questions.
Online User Manual: A must read if you want to know what we're talking about.
Howto: Install Linux Guest Additions
Howto: Use Shared Folders on Linux Guest
See the Tutorials and FAQ section at the top of the Forum for more guides.
Try searching the forums first with Google and add the site filter for this forum.
E.g. install guest additions site:forums.virtualbox.org

Retired from this Forum since OSSO introduction.
paul.dorman
Posts: 3
Joined: 26. Oct 2010, 01:43
Primary OS: Ubuntu other
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Linux, Windows

Re: Regular ext4 corruption

Post by paul.dorman »

I've had no issues with my btrfs partition on the host. The issue is with the VM ext4 file systems. Documented evidence or a diagnostic procedure would be more useful than conjecture for resolving the issue. Thanks for your input though. Very helpful.
Sasquatch
Volunteer
Posts: 17798
Joined: 17. Mar 2008, 13:41
Primary OS: Debian other
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Windows XP, Windows 7, Linux
Location: /dev/random

Re: Regular ext4 corruption

Post by Sasquatch »

Having file system corruption inside the VDI (i.e. guest) is very rare and AFAIK hasn't happened yet, unless the VDI itself got corrupted due to Host FS corruption. Can you please test this corruption case with a different Host FS, like ext3?
skestle
Posts: 1
Joined: 13. Feb 2011, 15:07
Primary OS: Ubuntu other
VBox Version: OSE other
Guest OSses: Ubuntu

Re: Regular ext4 corruption

Post by skestle »

I had these problems ('failed command: WRITE FPDMA QUEUED', boot failures) and saw corruption after any change I made to the hard drive. I tried all of the solutions above (different client FS, host I/O cache) and nothing really worked until I moved all VBox files (the .VirtualBox directory, which holds the hard drives, and the VirtualBox VMs folder, which holds the snapshots and also the config) to an old ext2 drive.

I then linked the folders back, and everything has worked seamlessly since. (Well, after I'd found out that 'sudo fsck -n' will always return errors, since your drive is changing while you're running fsck.)

BTW, my host VM drive is ext2, and I didn't dare install ext4 on the VM, opting for ext3 instead.
jigglywiggly
Posts: 29
Joined: 18. Aug 2009, 03:29
Primary OS: MS Windows 2008
VBox Version: OSE Debian
Guest OSses: windows 7

Re: Regular ext4 corruption

Post by jigglywiggly »

I am on 4.0.4 and also experiencing this, with host I/O cache enabled.

I'm having huge trouble installing Debian in a VM (Ubuntu 10.04 host, 2.6.32.30 kernel or something; I even tried the natty 2.6.38 kernel, to no avail).
I can't install Windows 7 x64 either, as it just BSODs in the middle of the install files, or complains about corrupt files.

The storage the VMs are on is under quite a bit of stress: 9x500GB HDs in RAID 5 with mdadm. But I didn't use to have this problem with older VirtualBox versions. (The HDs are fine: SMART reports are fine, I ran benchmarks on all of them with nothing unusual, and the system is perfectly stable.)

Adding more CPUs, like 4, makes it drastically less stable.
Server specs: Q6600 @ 3.5 GHz (stable OC, I ran Intel Burn Test for 5 hours straight)
8 GB of RAM at 1066
6200 LE
java_artisan
Posts: 2
Joined: 22. Feb 2011, 16:43
Primary OS: Ubuntu other
VBox Version: OSE Debian
Guest OSses: ubuntu

Re: Regular ext4 corruption

Post by java_artisan »

Did you guys succeed in eliminating the corruption problems? I'm asking because I'm having them too, with the MySQL data files. I'm using ext4 for both the host and the guest. VB 4.0.4, 64-bit, Ubuntu 10.10.

And can anyone tell whether they're experiencing something comparable to this ticket? http://www.virtualbox.org/ticket/8511 My VMs are locking up for an as yet unknown reason, but I suspect it's about file system corruption.

Thanks !

Jan
Sasquatch
Volunteer
Posts: 17798
Joined: 17. Mar 2008, 13:41
Primary OS: Debian other
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Windows XP, Windows 7, Linux
Location: /dev/random

Re: Regular ext4 corruption

Post by Sasquatch »

If you want to avoid this while still using EXT4, then don't use SATA for the hard drive controller in the VM settings. The other option is to use a different file system for the VM's storage (host side, of course). This corruption only occurs on the Host side in special cases where a lot of I/O is involved. A database could cause that, but it should also cache a lot in memory to minimise the I/O.
frank
Oracle Corporation
Posts: 3362
Joined: 7. Jun 2007, 09:11
Primary OS: Debian Sid
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Linux, Windows
Location: Dresden, Germany

Re: Regular ext4 corruption

Post by frank »

Note that we fixed this problem; see the public ticket 8773 and others. Affected are guests using SATA with guest RAM >= 2GB.
Sasquatch
Volunteer
Posts: 17798
Joined: 17. Mar 2008, 13:41
Primary OS: Debian other
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Windows XP, Windows 7, Linux
Location: /dev/random

Re: Regular ext4 corruption

Post by Sasquatch »

Thanks for the fix, Frank. That explains why I never got this issue: none of my VMs have 2 GB of RAM. Lucky me, I guess.