3.0.12->3.1.0 the SATA controller fails

Discussions about using Linux guests in VirtualBox.
philipcraig
Posts: 1
Joined: 12. Nov 2009, 19:56
Primary OS: MS Windows 7
VBox Version: OSE other
Guest OSses: Debian

3.0.12->3.1.0 the SATA controller fails

Post by philipcraig »

Hi,

I run a 64-bit Debian guest on a Windows 7 host. The guest has access to 3 raw PhysicalDevices (running as LVM on RAID) via createrawvmdk. This all worked fine under 3.0.12. The 3 physical drives are all attached to the SATA controller as devices 0 to 2.

After upgrading to 3.1.0, the guest boots into GRUB, but the Linux kernel fails to start after loading. It shows the errors in the attached screenshots:
[Attachment: Initial after choosing Rescue.png, the first batch of errors]
After 180 seconds we get more errors (see the attachment After 180 seconds.png below).
Similarly, the Debian squeeze installer CD boots, but its rescue mode will not run and gives the same errors. Again, this works fine in 3.0.12.

The workaround is either to downgrade to 3.0.12 or to attach the vmdks to the SCSI controller (LsiLogic) instead.

Is there any known issue with the SATA controller, or with the Linux modules that use it?
Attachments
[Attachment: After 180 seconds.png, the errors after 180 seconds]
mikevm
Posts: 6
Joined: 4. Dec 2009, 17:21
Primary OS: Debian Lenny
VBox Version: PUEL
Guest OSses: debian

Re: 3.0.12->3.1.0 the SATA controller fails

Post by mikevm »

I came to the forum to report exactly this problem, but for me it also occurs on earlier versions of VirtualBox. It has been ongoing for some years now, and I wanted to share my observations.

Firstly, VMware Server 1.0 had occasional disk and controller errors inside the virtual machines, across several different physical hosts. The hosts themselves never had any disk or controller trouble, and the vmdks were on RAID 5 on locally attached media. The issue is relevant because VirtualBox 3.x onwards has identical problems (on yet other host hardware), and your report exactly matches the symptoms. For example, I have a VirtualBox 3.0.8 guest producing the same errors:

[1690993.020333] ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
[1690993.022422] ata1.00: cmd 61/02:00:53:b2:11/00:00:00:00:00/40 tag 0 ncq 1024 out
[1690993.022423] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[1690993.546246] ata1.00: status: { DRDY }
(long list of repeating errors omitted)

At other times, however, the errors are simply disk read/write errors which, if they happen at the right time, result in an aborted journal under ext3. That forces the disk to be remounted read-only and effectively hoses the VM, since no work can get done with a read-only disk. This is the behavior I have seen from both VMware and VirtualBox for several years now. It is occasional enough that I had not reported it until now, thinking it would get addressed, but it is persistent across hardware (at least 3 physically different hosts) and software (every version of VirtualBox from 3.x onwards; I never tried versions before 3).

My feeling is that the root of the problem is load on the host. I have a big nightly job that backs up the servers, which causes high disk I/O and CPU use (compression), and the issues I have seen appear to occur only during the backup window. I can't yet reproduce the problem on demand, but it is there and ongoing, and I'd love to help find a fix.
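
One way to check whether the errors really cluster in the backup window is to pull the timestamps of the ata exception lines out of the guest's kernel log. The following is only a minimal Python sketch, assuming the log lines look like the excerpt quoted earlier in this post; the function name and the regular expression are illustrative and not part of any VirtualBox or kernel tooling.

```python
import re

# Assumed line format, taken from the excerpt in this post, e.g.:
# [1690993.020333] ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
EXCEPTION_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<dev>ata\d+(?:\.\d+)?): exception\b"
)


def find_ata_exceptions(log_text):
    """Return a list of (seconds_since_boot, device) for each ata exception line."""
    events = []
    for line in log_text.splitlines():
        match = EXCEPTION_RE.match(line.strip())
        if match:
            events.append((float(match.group("ts")), match.group("dev")))
    return events


if __name__ == "__main__":
    sample = (
        "[1690993.020333] ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen\n"
        "[1690993.022422] ata1.00: cmd 61/02:00:53:b2:11/00:00:00:00:00/40 tag 0 ncq 1024 out\n"
        "[1690993.546246] ata1.00: status: { DRDY }\n"
    )
    for seconds, device in find_ata_exceptions(sample):
        print(f"{device}: exception at {seconds:.6f}s since boot")
```

The bracketed values are seconds since the guest booted, so they have to be added to the guest's boot time before they can be compared against the backup schedule.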
lkraav
Posts: 13
Joined: 21. Feb 2008, 09:35
Primary OS: Mac OS X Leopard
VBox Version: PUEL
Guest OSses: Win7 RC (X86, X64), Gentoo Linux

Re: 3.0.12->3.1.0 the SATA controller fails

Post by lkraav »

Experiencing the same: Gentoo host, Gentoo guest, monolithic vmdk disk images.

3.1.0 - controller fails on kernel detection
3.0.12 - everything works just fine

Has anyone filed a ticket yet?

Edit: oh yeah, a 3.1.2 OS X host worked fine with the same images and configuration.

Edit 2: this seems to be fixed in 3.1.2. I updated the Linux host and have SATA working. There's a note in the changelog as well.