Could shared folders cause ZFS zpool corruption?

wstrange
Posts: 5
Joined: 7. May 2009, 19:18
Primary OS: Mac OS X other
VBox Version: OSE other
Guest OSses: OpenSolaris


Post by wstrange »

I am running OpenSolaris b134 as a VirtualBox host, with a Linux guest.

I have experienced 6-7 instances of my zpool getting corrupted. I am wondering if anyone else has ever seen this before.

This is on a mirrored zpool using drives from two different manufacturers, so it is very unlikely that both drives would fail at the same time with the same blocks going bad. I initially thought I might have a memory problem, which could explain the simultaneous disk failures. After running memory diagnostics for 24 hours with no errors reported, I am beginning to suspect it might be something else.
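
As an aside on the reasoning above, here is a minimal Python sketch of why bad RAM could make both halves of a mirror look corrupt at once. It is a toy model, not ZFS code: SHA-256 stands in for the block checksum, and `flip_bit`, `mirror_write`, `mirror_read` and `disk_failure` are hypothetical helpers invented for illustration. It assumes the bit flips in memory after the checksum has been computed but before the block is written, so both disks receive the same bad data.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Stand-in for a block checksum (an assumption, not the ZFS algorithm)."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def mirror_write(data: bytes, corrupt_in_ram: bool = False) -> dict:
    """Checksum the block, then write the same buffer to both mirror sides."""
    expected = checksum(data)
    if corrupt_in_ram:
        # Simulated faulty RAM: the buffer changes after checksumming.
        data = flip_bit(data, 3)
    return {"checksum": expected, "copies": [data, data]}


def disk_failure(block: dict, side: int) -> dict:
    """Simulate one disk returning a damaged copy of the block."""
    damaged = list(block["copies"])
    damaged[side] = flip_bit(damaged[side], 5)
    return {"checksum": block["checksum"], "copies": damaged}


def mirror_read(block: dict):
    """Return the first copy that matches the checksum, or None if none does."""
    for copy in block["copies"]:
        if checksum(copy) == block["checksum"]:
            return copy
    return None


payload = b"guest image block"
healthy = mirror_write(payload)
one_disk_bad = disk_failure(healthy, 0)
ram_bad = mirror_write(payload, corrupt_in_ram=True)
```

In this model a single bad disk is survivable because the other copy still verifies, while the RAM fault leaves no good copy to repair from, which would show up as the same blocks failing on both drives.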

I am using shared folders from the guest - mounted at guest boot up time.

Is it possible that the Solaris vboxsf shared folder kernel driver is causing the corruption? Being in the kernel, could it bypass the normal ZFS integrity mechanisms? Or is it possible there is some locking issue or race condition that triggers the corruption?

Anecdotally, when I see the corruption the sequence of events seems to be:

- dmesg reports various vbox drivers being loaded (normal - just loading the drivers)
- Guest boots - gets just past the GRUB boot screen to the initial Red Hat boot screen.
- The Guest hangs and never boots.
- zpool status -v reports corrupted files. The files are on the zpool containing the shared folders and the VirtualBox images
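
For anyone wanting to track the last step over time, the sketch below pulls the per-device error counters and the list of damaged files out of `zpool status -v` text. The `SAMPLE` output is made up for illustration (pool name, device names and file path are all hypothetical), and the parser assumes the usual NAME/STATE/READ/WRITE/CKSUM column layout with plain integer counters; output from other builds may differ.

```python
import re

# Hypothetical sample output; names and numbers are invented for illustration.
SAMPLE = """\
  pool: tank
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     6
          mirror-0  ONLINE       0     0    12
            c7t0d0  ONLINE       0     0    12
            c7t1d0  ONLINE       0     0    12

errors: Permanent errors have been detected in the following files:

        /tank/vbox/guest.vdi
"""

# One config row: name, state, then three integer counters.
ROW = re.compile(r"^\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_status(text):
    """Return (devices, files) parsed from zpool status -v style text."""
    devices = {}
    files = []
    in_errors = False
    for line in text.splitlines():
        if line.startswith("errors:"):
            in_errors = True
            continue
        if in_errors:
            # Everything after the "errors:" line is treated as a file path.
            if line.strip():
                files.append(line.strip())
            continue
        match = ROW.match(line)
        if match:
            name, state, read, write, cksum = match.groups()
            devices[name] = {
                "state": state,
                "read": int(read),
                "write": int(write),
                "cksum": int(cksum),
            }
    return devices, files


devices, files = parse_status(SAMPLE)
```

On a real host the text could come from `subprocess.run(["zpool", "status", "-v"], capture_output=True, text=True).stdout`. Equal checksum counts on both sides of the mirror, as in the sample, would point away from a single failing disk.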


Thoughts?
gu99roax
Posts: 18
Joined: 20. Aug 2010, 19:36
Primary OS: OpenSolaris 10
VBox Version: OSE other
Guest OSses: Win XP x64, Win7 x64, Win Server 2008 x64

Re: Could shared folders cause ZFS zpool corruption?

Post by gu99roax »

I must agree that vboxsrv is a bit quirky. I use it from a Windows guest and occasionally get the error message "Parameter is Incorrect" when trying to access "shares" on vboxsrv. So far I have not experienced any corruption on the storage pools. I'm using the old implementation of ZFS that came with 2009.06 (and I will not upgrade the pool until there is an official stable release of OSOL/OpenIndiana/Illumos that features the newer version of ZFS). It would be interesting to see whether you get the same problems with other guest OSes and/or different hard drives / controllers (preferably something LSI based ...).
sandervl
Volunteer
Posts: 1064
Joined: 10. May 2007, 10:27
Primary OS: MS Windows Vista
VBox Version: PUEL
Guest OSses: Windows, Linux, Solaris

Re: Could shared folders cause ZFS zpool corruption?

Post by sandervl »

The shared folder service on your host is using simple file I/O operations on the application level. Only shared folders in guests are implemented as device/file system drivers.
kebabbert
Volunteer
Posts: 321
Joined: 31. May 2008, 10:00
Primary OS: OpenSolaris 11
VBox Version: OSE other
Guest OSses: WinXP, RedHat, Ubuntu

Re: Could shared folders cause ZFS zpool corruption?

Post by kebabbert »

I have never had any zpool corruption. I have used several versions of VirtualBox on several different builds of OpenSolaris. I am now using b134 without any zpool corruption.
gu99roax
Posts: 18
Joined: 20. Aug 2010, 19:36
Primary OS: OpenSolaris 10
VBox Version: OSE other
Guest OSses: Win XP x64, Win7 x64, Win Server 2008 x64

Re: Could shared folders cause ZFS zpool corruption?

Post by gu99roax »

My impression of zpool and ZFS is that it is only possible to cause corruption in the stack when you interfere with the stack's internal error handling, which is not likely to happen with applications such as VirtualBox. I suspect that's what you're trying to imply, sandervl, when you say that I/O operations only take place on the application level.
wstrange
Posts: 5
Joined: 7. May 2009, 19:18
Primary OS: Mac OS X other
VBox Version: OSE other
Guest OSses: OpenSolaris

Re: Could shared folders cause ZFS zpool corruption?

Post by wstrange »

An update:


After seeing this error again, I reran the memory diagnostics and let them run overnight. This time I did see some memory errors, so it looks like this problem is due to faulty hardware.

Apologies for the false alarm :-)

Thanks for all of your input

Warren