SBS2003 guest crashes post snapshot, assertion failed

UncleBen
Posts: 15
Joined: 11. Feb 2012, 14:29

SBS2003 guest crashes post snapshot, assertion failed

Post by UncleBen »

Hi All

SBS2003 guest crashed 40 mins after snapshot.

I had been compressing the base vmdk with pigz for the last 20 mins when the crash occurred.

64-bit Xubuntu, kernel 3.0.0-15 (kernel 3, really??!)


Code:

02:05:37.075 Changing the VM state from 'RUNNING' to 'SUSPENDING'.
02:05:37.076 AIOMgr: Endpoint for file '/mnt/ssd1/VMDisks/sbsw2k3live.vmdk' (flags 000c0781) created successfully
02:05:38.186 PDMR3Suspend: 1 111 267 977 ns run time
02:05:38.186 Changing the VM state from 'SUSPENDING' to 'SUSPENDED'.
02:05:47.803 Changing the VM state from 'SUSPENDED' to 'SAVING'.
02:06:16.878 SSM: Footer at 0x786fc0a1 (2020589729), 35 directory entries.
02:06:16.878 VUSB: attached 'HidMouse' to port 1
02:06:16.878 SSM: Successfully saved the VM state to '/mnt/satavms/VBox/sbsw2k3Live/Snapshots/2012-02-20T15-26-33-537189000Z.sav'
02:06:16.878 Changing the VM state from 'SAVING' to 'SUSPENDED'.
02:06:16.882 DrvBlock: Flushes will be ignored
02:06:16.882 DrvBlock: Async flushes will be passed to the disk
02:06:16.882 AIOMgr: Endpoint for file '/mnt/ssd1/VMDisks/sbsw2k3live.vmdk' (flags 000c0781) created successfully
02:06:18.006 AIOMgr: Endpoint for file '/mnt/satavms/VBox/sbsw2k3Live/Snapshots/{d7797e24-21fd-471c-bbc7-7693751c3f44}.vmdk' (flags 000c0723) created successfully
02:06:21.503 AHCI: LUN#0: disk, PCHS=16383/16/63, total number of sectors 284508160
02:06:21.503 AHCI ATA: LUN#0: disk, PCHS=16383/16/63, total number of sectors 284508160
02:06:21.504 ************************* CFGM dump *************************
02:06:21.504 [/Devices/ahci/0/] (level 0)
02:06:21.504   PCIBusNo      <integer> = 0x0000000000000000 (0)
02:06:21.504   PCIDeviceNo   <integer> = 0x000000000000000d (13)
02:06:21.504   PCIFunctionNo <integer> = 0x0000000000000000 (0)
02:06:21.504   Trusted       <integer> = 0x0000000000000001 (1)
02:06:21.504
02:06:21.504 [/Devices/ahci/0/Config/] (level 1) (restricted root)
02:06:21.504   Bootable        <integer> = 0x0000000000000001 (1)
02:06:21.504   PortCount       <integer> = 0x0000000000000001 (1)
02:06:21.504   PrimaryMaster   <integer> = 0x0000000000000000 (0)
02:06:21.504   PrimarySlave    <integer> = 0x0000000000000001 (1)
02:06:21.504   SecondaryMaster <integer> = 0x0000000000000002 (2)
02:06:21.504   SecondarySlave  <integer> = 0x0000000000000003 (3)
02:06:21.504
02:06:21.504 [/Devices/ahci/0/Config/Port0/] (level 2)
02:06:21.504   NonRotationalMedium <integer> = 0x0000000000000000 (0)
02:06:21.504
02:06:21.504 [/Devices/ahci/0/LUN#0/] (level 1)
02:06:21.504   Driver <string>  = "Block" (cb=6)
02:06:21.504
02:06:21.504 [/Devices/ahci/0/LUN#0/AttachedDriver/] (level 2)
02:06:21.504   Driver <string>  = "VD" (cb=3)
02:06:21.504
02:06:21.504 [/Devices/ahci/0/LUN#0/AttachedDriver/Config/] (level 3) (restricted root)
02:06:21.504   BlockCache <integer> = 0x0000000000000001 (1)
02:06:21.504   Format     <string>  = "VMDK" (cb=5)
02:06:21.504   Path       <string>  = "/mnt/satavms/VBox/sbsw2k3Live/Snapshots/{d7797e24-21fd-471c-bbc7-7693751c3f44}.vmdk" (cb=84)
02:06:21.504   Type       <string>  = "HardDisk" (cb=9)
02:06:21.504   UseNewIo   <integer> = 0x0000000000000001 (1)
02:06:21.504
02:06:21.504 [/Devices/ahci/0/LUN#0/AttachedDriver/Config/Parent/] (level 4)
02:06:21.504   Format <string>  = "VMDK" (cb=5)
02:06:21.504   Path   <string>  = "/mnt/ssd1/VMDisks/sbsw2k3live.vmdk" (cb=35)
02:06:21.504
02:06:21.504 [/Devices/ahci/0/LUN#0/Config/] (level 2) (restricted root)
02:06:21.504   Mountable <integer> = 0x0000000000000000 (0)
02:06:21.504   Type      <string>  = "HardDisk" (cb=9)
02:06:21.504
02:06:21.504 [/Devices/ahci/0/LUN#999/] (level 1)
02:06:21.504   Driver <string>  = "MainStatus" (cb=11)
02:06:21.504
02:06:21.504 [/Devices/ahci/0/LUN#999/Config/] (level 2) (restricted root)
02:06:21.504   DeviceInstance        <string>  = "ahci/0" (cb=7)
02:06:21.504   First                 <integer> = 0x0000000000000000 (0)
02:06:21.504   Last                  <integer> = 0x0000000000000000 (0)
02:06:21.504   pConsole              <integer> = 0x00007fa7a000d300 (140357920674560)
02:06:21.504   papLeds               <integer> = 0x00007fa7a000d600 (140357920675328)
02:06:21.504   pmapMediumAttachments <integer> = 0x00007fa7a000d808 (140357920675848)
02:06:21.504
02:06:21.504 ********************* End of CFGM dump **********************
02:06:22.711 Changing the VM state from 'SUSPENDED' to 'RESUMING'.
02:06:22.711 Changing the VM state from 'RESUMING' to 'RUNNING'.
02:07:26.275 AHCI#0: Canceled write at offset 10829262848 (65536 bytes left) returned rc=VINF_SUCCESS
02:07:26.467 AHCI#0: Canceled write at offset 10829393920 (65536 bytes left) returned rc=VINF_SUCCESS
02:07:26.722 AHCI#0: Canceled write at offset 1019305984 (16384 bytes left) returned rc=VINF_SUCCESS
02:36:42.043 RTC: period=0x20 (32) 1024 Hz
02:37:32.935 AHCI#0: Canceled read at offset 61412016128 (4096 bytes left) returned rc=VINF_SUCCESS
02:37:53.646 AHCI#0: Canceled write at offset 2055139328 (49152 bytes left) returned rc=VINF_SUCCESS
02:37:56.801 AHCI#0: Canceled read at offset 48932245504 (4096 bytes left) returned rc=VINF_SUCCESS
02:37:56.801 AHCI#0: Canceled read at offset 15944327168 (4096 bytes left) returned rc=VINF_SUCCESS
02:37:59.145 AHCI#0: Canceled write at offset 61337858048 (8192 bytes left) returned rc=VINF_SUCCESS
02:37:59.258 AHCI#0: Canceled write at offset 61337858048 (8192 bytes left) returned rc=VINF_SUCCESS
02:38:08.707 AHCI#0: Canceled write at offset 13769650176 (4096 bytes left) returned rc=VINF_SUCCESS
02:38:23.791 AHCI#0: Canceled write at offset 22092750848 (4096 bytes left) returned rc=VINF_SUCCESS
02:38:23.791 AHCI#0: Canceled write at offset 22092750848 (4096 bytes left) returned rc=VINF_SUCCESS
02:38:24.273 AHCI#0: Canceled write at offset 2467938304 (53248 bytes left) returned rc=VINF_SUCCESS
02:38:24.273 AHCI#0: Canceled write at offset 2467938304 (53248 bytes left) returned rc=VINF_SUCCESS
02:38:24.490 AHCI#0: Canceled write at offset 27260436480 (4096 bytes left) returned rc=VINF_SUCCESS
02:38:24.490 AHCI#0: Canceled write at offset 27260436480 (4096 bytes left) returned rc=VINF_SUCCESS
02:39:07.353 RTC: period=0x200 (512) 64 Hz
02:40:07.024
02:40:07.024 !!Assertion Failed!!
02:40:07.024 Expression: pSgBuf->cbSegLeft <= 5 * _1M && (uintptr_t)pSgBuf->pvSegCur >= (uintptr_t)pSgBuf->paSegs[pSgBuf->idxSeg].pvSeg && (uintptr_t)pSgBuf->pvSegCur + pSgBuf->cbSegLeft <= (uintptr_t)pSgBuf->paSegs[pSgBuf->idxSeg].pvSeg + pSgBuf->paSegs[pSgBuf->idxSeg].cbSeg
02:40:07.024 Location  : /home/vbox/vbox-4.1.8/src/VBox/Runtime/common/misc/sg.cpp(54) void* sgBufGet(PRTSGBUF, size_t*)
02:40:07.024 pSgBuf->idxSeg=0 pSgBuf->cSegs=1 pSgBuf->pvSegCur=00007fa6b530a000 pSgBuf->cbSegLeft=131072 pSgBuf->paSegs[0].pvSeg=00007fa7a001fd90 pSgBuf->paSegs[0].cbSeg=1
Any ideas??

I can't reliably snapshot my machine :( lol
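
For anyone reading the assertion: plugging the values from the last log line into the asserted expression shows which clause fails. This is a minimal sketch that assumes `_1M` means 1 MiB (2**20), as the name suggests. The variable names only mirror the fields printed in the log, and nothing here calls VirtualBox code.

```python
# Values copied from the assertion dump in the log (addresses are hex).
_1M = 1 << 20  # assumption: _1M is 1 MiB

pv_seg_cur = 0x00007FA6B530A000   # pSgBuf->pvSegCur
cb_seg_left = 131072              # pSgBuf->cbSegLeft
pv_seg = 0x00007FA7A001FD90       # pSgBuf->paSegs[0].pvSeg
cb_seg = 1                        # pSgBuf->paSegs[0].cbSeg

# The three clauses of the asserted expression, evaluated separately.
clauses = {
    "cbSegLeft <= 5 * _1M": cb_seg_left <= 5 * _1M,
    "pvSegCur >= pvSeg": pv_seg_cur >= pv_seg,
    "pvSegCur + cbSegLeft <= pvSeg + cbSeg": pv_seg_cur + cb_seg_left <= pv_seg + cb_seg,
}

for text, holds in clauses.items():
    print(f"{text}: {'ok' if holds else 'FAILS'}")
```

With these numbers only the middle clause fails: the current pointer lies below the start of the segment it is supposed to be inside, and 131072 bytes are reported as left in a segment recorded as 1 byte long. That describes the inconsistent state of the buffer, not how it got that way.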

Mic.
Perryg
Site Moderator
Posts: 34369
Joined: 6. Sep 2008, 22:55
Primary OS: Linux other
VBox Version: OSE self-compiled
Guest OSses: *NIX

Re: SBS2003 guest crashes post snapshot, assertion failed

Post by Perryg »

Are you really trying to make a compressed image of the guest while it is running?
UncleBen
Posts: 15
Joined: 11. Feb 2012, 14:29

Re: SBS2003 guest crashes post snapshot, assertion failed

Post by UncleBen »

Yes :)

Why not?

The machine has been snapshotted and all updates are written to the snapshot vdi.

Presumably therefore the base vdi is 'read-only'

Am I missing something??
UncleBen
Posts: 15
Joined: 11. Feb 2012, 14:29

Re: SBS2003 guest crashes post snapshot, assertion failed

Post by UncleBen »

Is this seriously a dumb idea?

I have the vdi on an SSD, so I presumed there would certainly be no I/O issues, especially given that the copy rate is limited by the destination drive to 37 MB/s.

I don't see why this shouldn't work. Can anyone shed some light?

Cheers
Michael
mpack
Site Moderator
Posts: 39134
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Mostly XP

Re: SBS2003 guest crashes post snapshot, assertion failed

Post by mpack »

I think there is no reason why it should not work in principle. However the details are everything: the archiving tool or whatever must not block access to the base VDI while this is going on, because VirtualBox needs read access to it. I can't tell if that is what is going on in your case without careful debugging, which of course I have no intention of doing until I have the need!
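
One way to probe the "must not block access" point from the host side is to test whether a shared lock can still be taken on the base image while the archiver runs. The following is only a sketch under stated assumptions: `can_share_lock` is a made-up helper name, it detects `flock()`-style locks only, and nothing in this thread establishes that either pigz or VirtualBox takes such a lock at all.

```python
import fcntl
import os

def can_share_lock(path):
    """Return True if a non-blocking shared flock() on path succeeds."""
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except BlockingIOError:
            # Another open file description holds an exclusive flock().
            return False
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)
```

Calling `can_share_lock("/mnt/ssd1/VMDisks/sbsw2k3live.vmdk")` while the compression is running would show whether an exclusive `flock()` is in the way. A `True` result does not rule out other kinds of interference, since these locks are advisory on Linux.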
UncleBen
Posts: 15
Joined: 11. Feb 2012, 14:29

Re: SBS2003 guest crashes post snapshot, assertion failed

Post by UncleBen »

Thanks mpack. That's what I thought... locking, blocking, etc. I couldn't imagine that this would be occurring, but who knows.

I was just checking my sanity. Perryg sounded very surprised about what I was doing.

I'm using pigz in order to keep up a good 30 MB/s, as I seem to only be able to pull 10 MB/s with 7z. Naturally these days, cores are a dime a dozen but GHz are harder to come by (that said, I do have an E3 1220).

God knows what pigz is doing with the file, only way to find out is to play. Maybe even a strace is in order. Can't say that I'll be debugging any code though :)
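
Short of a full strace, a lighter first look is to ask the kernel how a process has the file open. The sketch below is Linux-only and relies on `/proc/<pid>/fd` and `/proc/<pid>/fdinfo`. The helper name `open_modes` is invented for illustration, and it needs to run as the same user as the target process (or as root).

```python
import os

ACCESS_MODES = {os.O_RDONLY: "O_RDONLY", os.O_WRONLY: "O_WRONLY", os.O_RDWR: "O_RDWR"}

def open_modes(pid, path):
    """List the access modes with which process `pid` has `path` open."""
    target = os.path.realpath(path)
    modes = []
    fd_dir = f"/proc/{pid}/fd"
    for fd in os.listdir(fd_dir):
        try:
            # Each entry is a symlink to whatever the descriptor refers to.
            if os.readlink(os.path.join(fd_dir, fd)) != target:
                continue
            with open(f"/proc/{pid}/fdinfo/{fd}") as info:
                for line in info:
                    if line.startswith("flags:"):
                        flags = int(line.split()[1], 8)  # printed in octal
                        mode = flags & os.O_ACCMODE
                        modes.append(ACCESS_MODES.get(mode, "unknown"))
        except OSError:
            continue  # descriptor went away while we were looking
    return modes
```

For example, `open_modes(pigz_pid, "/mnt/ssd1/VMDisks/sbsw2k3live.vmdk")`, with `pigz_pid` taken from something like `pgrep pigz`, would be expected to report read-only access if pigz is merely reading the image. That expectation is itself an assumption to be checked, not a documented guarantee.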

I'll report back any findings.

Cheers

Edit: I don't actually have any faith even in my plain snapshots yet. The image is a vmdk converted from VMware, and my kernel is Linux 3. My initial test with pause actually crashed the machine (viewtopic.php?f=6&t=45224)
Perryg
Site Moderator
Posts: 34369
Joined: 6. Sep 2008, 22:55
Primary OS: Linux other
VBox Version: OSE self-compiled
Guest OSses: *NIX

Re: SBS2003 guest crashes post snapshot, assertion failed

Post by Perryg »

What surprises me is that anyone trying to take a reliable backup of a running anything can't be too concerned about their data.
Not to mention that pigz is gzip (compression program) on steroids and tries to utilise multiple processors for speed, which should not be confused with a backup program that actually takes into account that some things should not be touched on a running machine.

But that is just an opinion from an old hand.
UncleBen
Posts: 15
Joined: 11. Feb 2012, 14:29

Re: SBS2003 guest crashes post snapshot, assertion failed

Post by UncleBen »

That's the second time someone has said that! lol... Although, imo, the last person was rather rude and said "You must really HATE your data"..!

You probably won't believe me, but I know what I'm doing. I know the risks involved.
But I would suggest that the risk of actually screwing the vdi is probably higher due to vbox crashing whilst taking a snapshot or merging snapshots. As long as neither of those occurs, I can't see why my copy of the frozen-in-time vdi wouldn't be perfectly fine.

I explained a few of my thoughts, i.e. how it's similar to taking a shadow copy (without the obvious perks of the VSS service..), in this post: viewtopic.php?f=1&t=48075

At the end of the day, taking a copy of the frozen-in-time vdi is better than nothing. It's not like a large amount of data is going to be mixed up or out of order... maybe a couple of K from the last 5 minutes at worst...

I have real/normal sql and exchange backups running from inside the guest :) I'm not completely reckless :)
Perryg
Site Moderator
Posts: 34369
Joined: 6. Sep 2008, 22:55
Primary OS: Linux other
VBox Version: OSE self-compiled
Guest OSses: *NIX

Re: SBS2003 guest crashes post snapshot, assertion failed

Post by Perryg »

Like I said, it is just my opinion. It's not my intent to be rude, just honest in my belief.
But I am fairly familiar with virtualization and some of the things that you should and should not do.
One of the things you should not do is use the host to compress the guest's file structure while it is hot. You have to realise that as far as the host is concerned the guest is nothing but a huge file, and pigz will have no checks to protect against damage. That said, it is yours and you should do what you want.
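
On the "no checks" point, the cheapest check a host-side copy can add is to confirm the source file did not change while it was being read. This is a rough sketch only: it uses Python's single-threaded `gzip` module as a stand-in for pigz, `compress_if_unchanged` is an invented name, and comparing size and mtime is a heuristic that says nothing about whether the guest filesystem inside the image is consistent.

```python
import gzip
import os
import shutil

def compress_if_unchanged(src, dst):
    """Gzip src to dst; return True only if src's size and mtime stayed the same."""
    before = os.stat(src)
    # Open the source read-only and stream it in 1 MiB chunks.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, length=1024 * 1024)
    after = os.stat(src)
    return (before.st_size, before.st_mtime_ns) == (after.st_size, after.st_mtime_ns)
```

If this returns `False` for the base image, something wrote to it during the copy and the archive should be treated as suspect.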
UncleBen
Posts: 15
Joined: 11. Feb 2012, 14:29

Re: SBS2003 guest crashes post snapshot, assertion failed

Post by UncleBen »

No no sorry, I didn't mean to imply that you were rude at all.

Mic.