I am running OpenSolaris b134 as a VirtualBox host, with a Linux guest.
I have experienced 6-7 instances of my zpool getting corrupted. I am wondering if anyone else has ever seen this before.
This is on a mirrored zpool - using drives from two different manufacturers (i.e. it is very unlikely both drives would fail at the same time, with the same blocks going bad). I initially thought I might have a memory problem - which could explain the simultaneous disk failures. After running memory diagnostics for 24 hours with no errors reported, I am beginning to suspect it might be something else.
I am using shared folders from the guest - mounted at guest boot up time.
Is it possible that the Solaris vboxsf shared folder kernel driver is causing corruption? Being in the kernel, would it allow bypassing the normal ZFS integrity mechanisms? Or is it possible there is some locking issue or race condition that triggers the corruption?
Anecdotally, when I see the corruption the sequence of events seems to be:
- dmesg reports various vbox drivers being loaded (normal - just loading the drivers)
- Guest boots - gets just past the GRUB boot screen to the initial Red Hat boot screen.
- The Guest hangs and never boots.
- zpool status -v reports corrupted files. The files are on the zpool containing the shared folders and the VirtualBox images
Thoughts?
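For anyone who wants to check whether both sides of a mirror are reporting checksum errors at the same time (which points away from a single failing disk and toward something shared, such as memory or a controller), here is a rough sketch that reads the CKSUM column from the config table of `zpool status` output. The pool name `tank`, the device names, the file path, and the error counts in the sample are all invented for illustration and are not taken from the system described above. The helper names `parse_cksum_counts` and `devices_with_errors` are hypothetical as well.

```python
# Sketch only: parses the "NAME STATE READ WRITE CKSUM" table printed by
# `zpool status`. SAMPLE_STATUS is made-up text, not real output from this thread.
SAMPLE_STATUS = """\
  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     4
            c1t1d0  ONLINE       0     0     4

errors: Permanent errors have been detected in the following files:

        /tank/vm/guest.vdi
"""


def parse_cksum_counts(status_text):
    """Return {name: cksum_count} for each row of the config table."""
    counts = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_table = True
            continue
        if not in_table:
            continue
        if not fields:
            # A blank line ends the config table.
            in_table = False
            continue
        # Rows whose CKSUM value is not a plain integer (for example an
        # abbreviated count such as "1.2K") are skipped by this sketch.
        if len(fields) >= 5 and fields[4].isdigit():
            counts[fields[0]] = int(fields[4])
    return counts


def devices_with_errors(status_text):
    """Return the names of rows with a non-zero CKSUM count, in table order."""
    return [name for name, n in parse_cksum_counts(status_text).items() if n > 0]


if __name__ == "__main__":
    print(devices_with_errors(SAMPLE_STATUS))
```

On a real host the text would come from running `zpool status -v` and feeding its output to `parse_cksum_counts`. If every leaf device in the mirror shows up in the result, a fault common to both disks is worth ruling out before blaming either drive.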
Could shared folders cause ZFS zpool corruption?
-
- Posts: 18
- Joined: 20. Aug 2010, 19:36
- Primary OS: OpenSolaris 10
- VBox Version: OSE other
- Guest OSses: Win XP x64, Win7 x64, Win Server 2008 x64
Re: Could shared folders cause ZFS zpool corruption?
I must agree that vboxsrv is a bit quirky. It is run in a Windows guest and I occasionally get the error message "Parameter is Incorrect" when trying to access "shares" on vboxsrv. So far I have not experienced any corruption on the storage pools. I'm using the old implementation of ZFS that came with 2009.06 (and I will not upgrade the pool until there is an official stable release of OSOL/OpenIndiana/Illumos that features the newer version of ZFS). It would be interesting to know whether you get the same problems with other guest OSes and/or different hard drives / controllers (preferably something LSI based ...).
-
- Volunteer
- Posts: 1064
- Joined: 10. May 2007, 10:27
- Primary OS: MS Windows Vista
- VBox Version: PUEL
- Guest OSses: Windows, Linux, Solaris
Re: Could shared folders cause ZFS zpool corruption?
The shared folder service on your host is using simple file I/O operations on the application level. Only shared folders in guests are implemented as device/file system drivers.
-
- Volunteer
- Posts: 321
- Joined: 31. May 2008, 10:00
- Primary OS: OpenSolaris 11
- VBox Version: OSE other
- Guest OSses: WinXP, RedHat, Ubuntu
Re: Could shared folders cause ZFS zpool corruption?
I have never had any zpool corruption. I have used several versions of VirtualBox on several different builds of OpenSolaris, and I am now using b134 without any zpool corruption.
-
- Posts: 18
- Joined: 20. Aug 2010, 19:36
- Primary OS: OpenSolaris 10
- VBox Version: OSE other
- Guest OSses: Win XP x64, Win7 x64, Win Server 2008 x64
Re: Could shared folders cause ZFS zpool corruption?
My impression of zpool and ZFS is that corruption in the stack is only possible when something interferes with the stack's internal error handling, which is not likely to happen with applications such as VirtualBox. I suspect that's what you're trying to imply, sandervl, when you say that I/O operations only take place on the application level.
-
- Posts: 5
- Joined: 7. May 2009, 19:18
- Primary OS: Mac OS X other
- VBox Version: OSE other
- Guest OSses: OpenSolaris
Re: Could shared folders cause ZFS zpool corruption?
An update:
After seeing this error again I reran memory diagnostics and let them run overnight. This time I did see some memory errors - so it looks like this problem is due to faulty hardware.
Apologies for the false alarm
Thanks for all of your input
Warren