Kernel panic on GCP Ubuntu 16.04.6 + VBox 6.1.4r136177 + Win10

Jankins
Posts: 3
Joined: 29. Mar 2020, 06:54
Primary OS: Mac OS X other
VBox Version: OSE other
Guest OSses: Ubunt/Windows
Location: CA, USA

Post by Jankins »

Host OS:
Google cloud nested virtualization enabled
Ubuntu 16.04.6
VirtualBox 6.1.4r136177
Kernel 4.15.0-1058-gcp

Guest OS:
Windows 10
Guest Additions installed
Kaspersky Anti-Virus V20 (maybe worth mentioning)

Symptom:
I start the VM in headless mode from the command line and it runs fine for about 22 minutes. Then, all of a sudden, the host OS crashes.

I enabled kernel crash dumps and got the following information. I checked multiple kernel dumps, and the stack traces are almost identical.

From the kernel dump, the problem seems to be in the vboxdrv driver:
WARNING: kernel relocated [600MB]: patching 99809 gdb minimal_symbol values

KERNEL: /usr/lib/debug/boot/vmlinux-4.15.0-1058-gcp
DUMPFILE: dump.202003290103 [PARTIAL DUMP]
CPUS: 6
DATE: Sun Mar 29 01:03:18 2020
UPTIME: 00:11:58
LOAD AVERAGE: 1.05, 0.96, 0.81
TASKS: 289
NODENAME: my-debug-host
RELEASE: 4.15.0-1058-gcp
VERSION: #62-Ubuntu SMP Mon Mar 2 05:29:31 UTC 2020
MACHINE: x86_64 (2200 Mhz)
MEMORY: 9.7 GB
PANIC: ""
PID: 5198
COMMAND: "EMT-0"
TASK: ffff9d51e5cb1740 [THREAD_INFO: ffff9d51e5cb1740]
CPU: 1
STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 5198 TASK: ffff9d51e5cb1740 CPU: 1 COMMAND: "EMT-0"
#0 [ffffad0f04cf76f0] machine_kexec at ffffffffa6866f0e
#1 [ffffad0f04cf7750] __crash_kexec at ffffffffa6933d19
#2 [ffffad0f04cf7818] crash_kexec at ffffffffa6934b4b
#3 [ffffad0f04cf7838] oops_end at ffffffffa683232b
#4 [ffffad0f04cf7860] die at ffffffffa6832a12
#5 [ffffad0f04cf7890] do_trap at ffffffffa682e819
#6 [ffffad0f04cf78d8] do_error_trap at ffffffffa682ec71
#7 [ffffad0f04cf7998] do_error_trap at ffffffffa682ed27
#8 [ffffad0f04cf79d0] do_invalid_op at ffffffffa682f310
#9 [ffffad0f04cf79e0] invalid_op at ffffffffa7200f5b
#10 [ffffad0f04cf7af8] hrtimer_try_to_cancel at ffffffffa69145e9
#11 [ffffad0f04cf7b30] schedule_hrtimeout_range_clock at ffffffffa71bbcd8
#12 [ffffad0f04cf7bd0] rtR0SemEventMultiLnxWait at ffffffffc04ba849 [vboxdrv]
#13 [ffffad0f04cf7c70] VBoxHost_RTSemEventMultiWaitEx at ffffffffc04ba89e [vboxdrv]
#14 [ffffad0f04cf7cc8] SUPR0GetCurrentGdtRw at ffffffffc04a533e [vboxdrv]
#15 [ffffad0f04cf7d30] VBoxHost_RTThreadCtxHookEnable at ffffffffc04bb9d5 [vboxdrv]
#16 [ffffad0f04cf7d58] rtR0MemAllocEx at ffffffffc04b779f [vboxdrv]
#17 [ffffad0f04cf7da0] __check_object_size at ffffffffa6a801d4
#18 [ffffad0f04cf7de8] supdrvIOCtlFast at ffffffffc04a7a15 [vboxdrv]
#19 [ffffad0f04cf7df8] VBoxDrvLinuxIOCtl_6_1_4 at ffffffffc04a3537 [vboxdrv]
#20 [ffffad0f04cf7e60] do_vfs_ioctl at ffffffffa6a9ce64
#21 [ffffad0f04cf7ee8] sys_ioctl at ffffffffa6a9d439
#22 [ffffad0f04cf7f28] do_syscall_64 at ffffffffa6803c5b
#23 [ffffad0f04cf7f50] entry_SYSCALL_64_after_hwframe at ffffffffa7200086
RIP: 00007f3f6388df47 RSP: 00007f3f4b7a2d58 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 00007f3f63fd3000 RCX: 00007f3f6388df47
RDX: 0000000000000000 RSI: 00000000000056c0 RDI: 0000000000000007
RBP: 00007f3f4b7a2d60 R8: 0000000000000001 R9: 00007f3f58037000
R10: 000000000000002f R11: 0000000000000246 R12: 00007f3f63fd3000
R13: 00007f3f63fe8000 R14: ffffffffefffffff R15: 00007f3f4b7a2e4f
ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b
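To separate the frames that belong to out-of-tree modules from the core kernel frames in a `bt` listing like the one above, the lines can be filtered by the trailing `[module]` tag. This is only a minimal sketch: `BT_SAMPLE` holds a few frames copied from the backtrace, and the regular expression and the `module_frames` helper are illustrative names, not part of the `crash` utility. Note that a module appearing in the trace does not by itself prove the module is at fault.

```python
import re

# A few frames copied from the `bt` output above (sample input only).
BT_SAMPLE = """\
#10 [ffffad0f04cf7af8] hrtimer_try_to_cancel at ffffffffa69145e9
#11 [ffffad0f04cf7b30] schedule_hrtimeout_range_clock at ffffffffa71bbcd8
#12 [ffffad0f04cf7bd0] rtR0SemEventMultiLnxWait at ffffffffc04ba849 [vboxdrv]
#18 [ffffad0f04cf7de8] supdrvIOCtlFast at ffffffffc04a7a15 [vboxdrv]
#20 [ffffad0f04cf7e60] do_vfs_ioctl at ffffffffa6a9ce64
"""

# Matches "#N [stack address] symbol at address [optional module]".
FRAME_RE = re.compile(
    r"^#(?P<idx>\d+)\s+\[(?P<sp>[0-9a-f]+)\]\s+(?P<sym>\S+)\s+at\s+(?P<addr>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?\s*$"
)


def module_frames(bt_text):
    """Return (frame index, symbol, module) for frames tagged with a module name."""
    frames = []
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m and m.group("module"):
            frames.append((int(m.group("idx")), m.group("sym"), m.group("module")))
    return frames


if __name__ == "__main__":
    for idx, sym, mod in module_frames(BT_SAMPLE):
        print(f"#{idx} {sym} [{mod}]")
```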
The panic happens quite frequently, as the reboot history shows:
reboot system boot 4.15.0-1058-gcp Sun Mar 29 01:05 still running
reboot system boot 4.15.0-1058-gcp Sun Mar 29 01:03 still running
reboot system boot 4.15.0-1058-gcp Sun Mar 29 00:30 still running
reboot system boot 4.15.0-1058-gcp Sun Mar 29 00:28 still running
reboot system boot 4.15.0-1058-gcp Sun Mar 29 00:16 still running
reboot system boot 4.15.0-1058-gcp Sun Mar 29 00:15 still running
reboot system boot 4.15.0-1058-gcp Sat Mar 28 23:10 still running
reboot system boot 4.15.0-1058-gcp Sat Mar 28 23:08 still running
reboot system boot 4.15.0-1058-gcp Sat Mar 28 19:39 still running
reboot system boot 4.15.0-1058-gcp Sat Mar 28 19:37 still running
reboot system boot 4.15.0-1058-gcp Sat Mar 28 18:10 still running
reboot system boot 4.15.0-1058-gcp Sat Mar 28 06:10 still running
The kernel dump is from the boot at Mar 29 00:30, which panicked at 01:03. The tail of the `dmesg` output is:
[ 539.640842] vboxdrv: 0000000000000000 VBoxEhciR0.r0
[ 539.642608] VMMR0InitVM: eflags=246 fKernelFeatures=0x0 (SUPKERNELFEATURES_SMAP=0)
[ 539.653535] device vboxnet1 entered promiscuous mode
[ 1975.240598] invalid opcode: 0000 [#1] SMP PTI
[ 1975.245255] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype xt_conntrack nf_nat br_netfilter bridge stp llc aufs ip6table_filter ip6_tables vboxnetadp(OE) vboxnetflt(OE) iptable_filter ip_tables x_tables vboxdrv(OE) overlay kvm_intel kvm irqbypass virtio_rng input_leds pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse virtio_net
[ 1975.311228] CPU: 1 PID: 5198 Comm: EMT-0 Tainted: G OE 4.15.0-1058-gcp #62-Ubuntu
[ 1975.319965] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 1975.329305] RIP: 0010:0xffffffffc060cf70
[ 1975.333347] RSP: 0018:ffffad0f04cf7a98 EFLAGS: 00010002
[ 1975.338849] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000074dc8358
[ 1975.346088] RDX: 00000000012196b8 RSI: 0000000074dc8358 RDI: 00000000033321e8
[ 1975.353326] RBP: 0000000000efe940 R08: 000000000000002b R09: 00000000778b05fc
[ 1975.360563] R10: 00000000ffffffff R11: 0000000000bfdec0 R12: 0000000000de4000
[ 1975.367811] R13: 0000000000bffda0 R14: 0000000000bfe7a0 R15: 0000000077833600
[ 1975.375052] FS: 00007f3f4b7a3700(0000) GS:ffff9d522fc40000(0000) knlGS:fffff8016ad86000
[ 1975.383247] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1975.389120] CR2: 0000000074d51df9 CR3: 0000000284960005 CR4: 00000000003626e0
[ 1975.396360] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1975.403598] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1975.410841] Call Trace:
[ 1975.413434] ? hrtimer_try_to_cancel+0xc9/0x120
[ 1975.418086] ? schedule_hrtimeout_range_clock+0xc8/0x1b0
[ 1975.423595] ? rtR0SemEventMultiLnxWait.isra.3+0x329/0x370 [vboxdrv]
[ 1975.430349] ? VBoxHost_RTSemEventMultiWaitEx+0xe/0x10 [vboxdrv]
[ 1975.436487] ? SUPR0GetCurrentGdtRw+0xe/0x10 [vboxdrv]
[ 1975.441744] ? VBoxHost_RTThreadCtxHookEnable+0x35/0x40 [vboxdrv]
[ 1975.448050] ? rtR0MemAllocEx+0x17f/0x250 [vboxdrv]
[ 1975.453039] ? rtR0MemAllocEx+0x17f/0x250 [vboxdrv]
[ 1975.458023] ? __check_object_size+0x114/0x1a0
[ 1975.462585] ? supdrvIOCtlFast+0x65/0xb0 [vboxdrv]
[ 1975.467485] ? VBoxDrvLinuxIOCtl_6_1_4+0x57/0x230 [vboxdrv]
[ 1975.473163] ? do_vfs_ioctl+0xa4/0x600
[ 1975.477017] ? SyS_futex+0x7f/0x180
[ 1975.480635] ? SyS_ioctl+0x79/0x90
[ 1975.484163] ? do_syscall_64+0x7b/0x150
[ 1975.488117] ? entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 1975.493449] Code: 76 30 74 11 0f 01 c3 0f 82 d7 00 00 00 0f 84 8c 01 00 00 eb 16 0f 01 c2 0f 82 c6 00 00 00 0f 84 7b 01 00 00 eb 05 cc cc cc cc cc <57> 48 8b 7c 24 10 48 89 07 48 b8 ff ff 7f 20 04 42 20 02 48 89
[ 1975.512443] RIP: 0xffffffffc060cf70 RSP: ffffad0f04cf7a98
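The `Code:` line above can be inspected by hand. A minimal sketch, assuming only the three-byte VMX instruction encodings from the Intel SDM (`0f 01 c2` is `vmlaunch`, `0f 01 c3` is `vmresume`): it scans the dumped bytes for those patterns and reports their offset relative to the faulting RIP, which the kernel marks with `<..>`. The helper names are made up for illustration, and a plain byte-pattern scan is not a real disassembly, so a hit is a hint rather than proof.

```python
# Code: bytes copied from the dmesg output above; <57> marks the byte at the faulting RIP.
CODE_LINE = ("76 30 74 11 0f 01 c3 0f 82 d7 00 00 00 0f 84 8c 01 00 00 eb 16 "
             "0f 01 c2 0f 82 c6 00 00 00 0f 84 7b 01 00 00 eb 05 cc cc cc cc cc "
             "<57> 48 8b 7c 24 10 48 89 07 48 b8 ff ff 7f 20 04 42 20 02 48 89")

# Three-byte VMX instruction encodings (Intel SDM).
VMX_OPCODES = {
    (0x0F, 0x01, 0xC1): "vmcall",
    (0x0F, 0x01, 0xC2): "vmlaunch",
    (0x0F, 0x01, 0xC3): "vmresume",
    (0x0F, 0x01, 0xC4): "vmxoff",
}


def parse_code_line(text):
    """Return (list of byte values, index of the byte at RIP) from an oops 'Code:' line."""
    data, rip_index = [], None
    for token in text.split():
        if token.startswith("<") and token.endswith(">"):
            rip_index = len(data)
            token = token[1:-1]
        data.append(int(token, 16))
    return data, rip_index


def find_vmx(data, rip_index):
    """Naive pattern scan; returns (offset relative to RIP, mnemonic) pairs."""
    hits = []
    for i in range(len(data) - 2):
        name = VMX_OPCODES.get(tuple(data[i:i + 3]))
        if name:
            hits.append((i - rip_index, name))
    return hits


if __name__ == "__main__":
    data, rip_index = parse_code_line(CODE_LINE)
    for offset, name in find_vmx(data, rip_index):
        print(f"{name} at RIP{offset:+d}")
```

For the bytes above, the scan finds `vmresume` 39 bytes and `vmlaunch` 22 bytes before the faulting RIP. That would be consistent with the fault occurring in VT-x guest entry/exit code, but since the RIP (0xffffffffc060cf70) is not symbolized in the oops, this remains an interpretation.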
Thanks.
Attachments
VBoxSVC.log (2.51 KiB)
VBox.log (80.3 KiB)
fth0
Volunteer
Posts: 5690
Joined: 14. Feb 2019, 03:06
Primary OS: Mac OS X other
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Linux, Windows 10, ...
Location: Germany

Re: Kernel panic on GCP Ubuntu 16.04.6 + VBox 6.1.4r136177 + Win10

Post by fth0 »

Jankins wrote:Google cloud nested virtualization
AFAIK, nested virtualization is generally only supported when using the same hypervisor inside and outside, e.g. VirtualBox inside VirtualBox, Hyper-V inside Hyper-V, KVM inside KVM. Google Cloud Compute Engine uses KVM and expects a KVM-based hypervisor (e.g. QEMU) for nested virtualization.
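As a quick sanity check on the Compute Engine instance itself, one can at least confirm that VT-x is exposed to the instance by looking for the `vmx` CPU flag in `/proc/cpuinfo`. The sketch below is illustrative only (the helper names are made up here). It shows whether hardware virtualization is visible to the OS, and says nothing about whether a particular hypervisor is supported on top of it.

```python
from pathlib import Path


def has_vmx(cpuinfo_text):
    """True if any 'flags' line in /proc/cpuinfo-style text lists the vmx feature."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "vmx" in value.split():
            return True
    return False


def read_if_exists(path):
    """Return the file's text, or None if the path does not exist."""
    p = Path(path)
    return p.read_text() if p.exists() else None


if __name__ == "__main__":
    cpuinfo = read_if_exists("/proc/cpuinfo") or ""
    print("VT-x (vmx) visible to this OS:", has_vmx(cpuinfo))
```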

Re: Kernel panic on GCP Ubuntu 16.04.6 + VBox 6.1.4r136177 + Win10

Post by Jankins »

fth0 wrote:AFAIK, nested virtualization is generally only supported when using the same hypervisor inside and outside, e.g. VirtualBox inside VirtualBox, Hyper-V inside Hyper-V, KVM inside KVM. Google Cloud Compute Engine uses KVM and expects a KVM-based hypervisor (e.g. QEMU) for nested virtualization.
Thanks. The Google manual does not say that VirtualBox cannot be used. I talked to my colleagues, and it seems they have been using the same configuration as mine (Ubuntu 16.04.6 + VBox + Win10) for a long time and it works well.
Maybe the VirtualBox version matters.

Another interesting fact is that their Win10 guests don't have Kaspersky AV.

Re: Kernel panic on GCP Ubuntu 16.04.6 + VBox 6.1.4r136177 + Win10

Post by fth0 »

Jankins wrote:The Google manual does not say that VirtualBox cannot be used.
I beg to differ: Enabling Nested Virtualization for VM Instances > Restrictions states the KVM-based hypervisor requirement in the 2nd bullet point, and enumerates some well-known hypervisors that don't fulfill it. It seems that VirtualBox is not well-known enough to be part of this enumeration. ;)

AFAIK, the manufacturers of several well-known hypervisors have so far not been interested in actively supporting mixed environments (or in fixing bugs arising from them). But it is possible that it works nonetheless.

PS: Like most volunteers or moderators on this forum, I do not work for Oracle and cannot speak for them.