
Random BSoDs - High I/O - XP SP3

Posted: 2. Aug 2012, 13:22
by danf84
Hello,

I recently upgraded my VirtualBox (from an older 4.x version), and yesterday ran a file-system-intensive operation (converting a SourceSafe repository with 32600 revisions into SVN). By day's end, most of my VMs had BSoD'ed.

The first BSoD was a KERNEL_STACK_INPAGE_ERROR/0x00000077 (arg1: 0xC0000056); see the following screenshot:
BSOD1.png: 0x00000077 BSOD1
Today, a new kind of BSoD: 0x000000F4 (arg1: 0x00000003); see the following screenshot:
BSOD2.png: 0x000000F4 BSOD2
The dump never completes (the guests are configured to produce a 64K minidump); the VirtualBox GUI shows continuous HDD access during the BSoD.

So to the suspects:

1) The VB version upgrade? - I doubt that it's the cause, but it's what has changed.
2) SATA:
a) This is the first time I tried SATA in an I/O intensive environment; perhaps switch back to IDE?
b) I always had the Host I/O cache setting enabled for both IDE (where it is on by default) and SATA (where the default is off) controllers. This time I actually RTFM and disabled Host I/O cache for SATA, since SATA is already multithreaded and the cache is apparently pretty useless there and can only cause problems. So perhaps I should revert to the former setting and *enable* Host I/O cache for the SATA controller?
c) Try older Intel SATA drivers, i.e. 7.8.0.1012; current version is 10.1.0.1008.
3) Long shot, but I assigned 4 processors to the guest; it used to be 1.
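
Suspects 2b and 3 can also be flipped from the host command line rather than the GUI. Below is a minimal sketch in Python, assuming `VBoxManage` is on the PATH. The VM name "XP-Guest" and the controller name "SATA Controller" are placeholders, not taken from my setup; `VBoxManage showvminfo` lists the real ones. By default it only prints the commands, so nothing changes until `dry_run=False` is passed.

```python
import subprocess


def hostiocache_cmd(vm, controller, enabled):
    """Build the VBoxManage call that toggles Host I/O cache on one controller."""
    return ["VBoxManage", "storagectl", vm,
            "--name", controller,
            "--hostiocache", "on" if enabled else "off"]


def cpus_cmd(vm, count):
    """Build the VBoxManage call that sets the guest's virtual CPU count."""
    if count < 1:
        raise ValueError("a guest needs at least one CPU")
    return ["VBoxManage", "modifyvm", vm, "--cpus", str(count)]


def apply(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder names: replace with the real VM and controller names.
    apply(hostiocache_cmd("XP-Guest", "SATA Controller", True))
    apply(cpus_cmd("XP-Guest", 1))
```

Changing one setting per run keeps the test meaningful: if both are changed at once and the crash goes away, it is still unclear which one mattered.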

At the moment a guest is running with the good old trusty PATA/IDE controller and has gone a little further than the last crash point. Having said that, I ran the thing this morning under SATA and it also got slightly further than the point where it crashed last time. Point is: it's not 100% consistent. What *is* 100% consistent is this: I have a Saved State taken about 30 seconds before the BSoD occurs. I saved it in the hope that someone might advise me how I can identify the problem, since I can reproduce it consistently and quickly from that particular snapshot.

Btw, the BSoD looks weird in VirtualBox. I get an odd thick red line horizontally across the top of the frozen desktop; to see the actual BSoD I have to go full screen for a second for VB to repaint the BSoD.
BSOD-OddLooking.png: VirtualBox not showing BSoD properly
Last week I replaced a hard drive on my host, so just in case I did a full HDD test on the new drive (using SeaTools). I also ran Memtest86+ for a few minutes this morning (I will run it overnight to test some more).
I have a sneaking feeling these problems are being caused by SATA (not sure which component: Intel's driver or the VBox controller). What I can't quite explain is how, after the BSoD on the main culprit guest, 2 other guests froze later on. On another guest I was copying 20000 little files from a Shared Folder, to see if it's the high I/O that's causing the crash; it got through only a few minutes of copying (10%) before the same BSoD (the first one) occurred. Another VM froze which wasn't copying or doing anything; it just had a big program open. It was a poorly configured VM: it was running a big development environment (Embarcadero Delphi XE) with only 192MB of RAM, so it must have been paging a LOT, which again would cause much I/O.

I thought I'd be good with my config this time: get the paravirtualised network drivers, get the Intel SATA drivers, and find what? Problems! It's always safer to go with the most standard config because that's what gets tested most. *eyeing up the old IDE controller*

Many thanks

Re: Random BSoDs - High I/O - XP SP3

Posted: 2. Aug 2012, 13:44
by michaln
Rule of thumb - use the disk controller that the OS supports natively. XP with IDE got a heck of a lot more testing (both in physical and in virtualized environments) than XP with AHCI.

Re: Random BSoDs - High I/O - XP SP3

Posted: 2. Aug 2012, 13:49
by danf84
It's just that there was so much "noise" about using SATA: getting the right Intel drivers, countless posts about SATA BSoDs, but no one in the know came out and recommended outright using IDE controllers instead of SATA ones (where Windows XP is concerned). Still, I'd like to confirm that it is SATA, so could anyone suggest what I could do in my snapshot (which, when resumed, causes the BSoD within 30 seconds) to determine what the problem is?

Re: Random BSoDs - High I/O - XP SP3

Posted: 2. Aug 2012, 13:59
by michaln
Looking at viewtopic.php?f=24&t=48476 should be a good start...

Re: Random BSoDs - High I/O - XP SP3

Posted: 2. Aug 2012, 15:31
by danf84
Attached some logs. All of them show a cancelled AHCI buffer, but I wonder if that's normal when the guest crashes. :/
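
For comparing several logs quickly, here is a rough Python sketch that pulls out the lines worth a second look. The keyword list is my own guess at useful search terms, not a list of official VirtualBox message names, and the log paths are whatever you pass on the command line (e.g. each VM's `VBox.log`).

```python
from pathlib import Path

# Assumed search terms; adjust to whatever actually appears in your logs.
KEYWORDS = ("ahci", "cancel", "timeout")


def find_matches(lines, keywords=KEYWORDS):
    """Return (line_number, text) for each line containing any keyword (case-insensitive)."""
    hits = []
    for number, text in enumerate(lines, start=1):
        lowered = text.lower()
        if any(word in lowered for word in keywords):
            hits.append((number, text.rstrip("\n")))
    return hits


def scan_log(path, keywords=KEYWORDS):
    """Scan one log file, tolerating odd bytes in the file."""
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return find_matches(handle, keywords)


if __name__ == "__main__":
    import sys
    for log in sys.argv[1:]:
        for number, text in scan_log(log):
            print(f"{log}:{number}: {text}")
```

Running it over a log from a crashed session and one from a clean session would show whether the cancelled-buffer lines only appear around a crash.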

Re: Random BSoDs - High I/O - XP SP3

Posted: 2. Aug 2012, 15:44
by Perryg
What happens if you set the guest's processor count to no more than 2?

Re: Random BSoDs - High I/O - XP SP3

Posted: 2. Aug 2012, 15:50
by danf84
Perryg: For that I'll need to lose the snapshot which reliably produces the BSoD.

I am currently trying to reproduce the problem by simply copying files (in a different VM from the one I attached logs for).

Edit: the snapshot that produces the BSoD is literally 20-30 seconds before crashing. If I pause the process that does a lot of I/O, I avoid the crash. If I start copying a bunch of files from Shared Folders, it crashes.

Re: Random BSoDs - High I/O - XP SP3

Posted: 3. Aug 2012, 14:03
by danf84
Two VMs BSoD'ed today: one is called "PARIS-XE2" and the other is called "VSS". The "VSS" VM is the one where lots of I/O was going on. But the "PARIS-XE2" VM, where I was doing some light text editing work, BSoD'ed first; it stopped responding when I double-clicked to launch an app that compares 2 text files, and after things seemed to grind to a halt, the expected crash occurred. It "felt" as though a timeout was caused by the other VM, which was hogging the I/O. A few minutes later the I/O-intensive VM ("VSS") crashed as well. That would explain the symptoms 2 days ago when it all started: 3 VMs crashed, but only 1 was really busy; the others crashed after my attempts to open even the smallest of files on them.
Both logs attached. Screenshot showing 2 different BSoDs also attached.