crash when resuming from "savestate" (invalid opcode: 0000)

Discussions related to using VirtualBox on Linux hosts.
Guideloom
Posts: 29
Joined: 5. Sep 2018, 18:42

Post by Guideloom »

We are seeing occasional crashes on several Linux guests that we put into "savestate" in order to back them up. We are running VirtualBox 5.2.42.
Commands used:

Code: Select all

vboxmanage controlvm test01 savestate
vboxmanage clonevm test01 --mode all --basefolder /mnt/lv001-r0/backup/vms --name test01-20200605-153145-vbbu
vboxmanage startvm test01 --type headless
When the guest resumes, it will sometimes crash immediately with the following in the console file:

Code: Select all

[557808.144173] invalid opcode: 0000 [#1] SMP 
[557808.152244] Modules linked in: vboxvideo ttm drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect input_leds sysimgblt vboxguest i2c_piix4 8250_fintek serio_raw mac_hid ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast ib_iser nf_nat_ftp nf_nat rdma_cm iw_cm nf_conntrack_ftp ib_cm nf_conntrack ib_sa ib_mad iptable_filter ib_core ib_addr ip_tables iscsi_tcp x_tables libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ahci ablk_helper cryptd e1000 psmouse libahci pata_acpi video
[557808.154748] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.0-178-generic #208-Ubuntu
[557808.154748] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[557808.154748] task: ffff88003e358e00 ti: ffff88003e360000 task.ti: ffff88003e360000
[557808.154748] RIP: 0010:[<ffffffff8154116b>]  [<ffffffff8154116b>] add_interrupt_randomness+0x14b/0x1e0
[557808.154748] RSP: 0018:ffff88003fc83e90  EFLAGS: 00010082
[557808.154748] RAX: 0000000000000000 RBX: ffff88003fc94f60 RCX: 0000000000000003
[557808.154748] RDX: 000000004c380353 RSI: ffff88003fc94f70 RDI: ffffffff81eca6a0
[557808.154748] RBP: ffff88003fc83ec0 R08: ffffffff82201c78 R09: 0000000000000036
[557808.154748] R10: 000000000000007f R11: 0000000000000002 R12: 00000001084eb98c
[557808.154748] R13: ffffffff81eca6a0 R14: ffffffff81eca6e8 R15: 0000000000000000
[557808.154748] FS:  00007f2b6609f700(0000) GS:ffff88003fc80000(0000) knlGS:0000000000000000
[557808.154748] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[557808.154748] CR2: 00007f4907cf1a9f CR3: 000000003a926000 CR4: 00000000000406f0
[557808.154748] Stack:
[557808.154748]  ffff88003fc83ea8 d8e8a45fa1ef415a 0000000000000000 0000000000000011
[557808.154748]  0000000000000001 0000000000000001 ffff88003fc83f10 ffffffff810e307d
[557808.154748]  ffff880035791c00 ffffffff81f43800 0000008081038119 ffff880035791c00
[557808.154748] Call Trace:
[557808.154748]  <IRQ> 
[557808.154748]  [<ffffffff810e307d>] handle_irq_event_percpu+0x15d/0x1e0
[557808.154748]  [<ffffffff810e313c>] handle_irq_event+0x3c/0x60
[557808.154748]  [<ffffffff810e6635>] handle_fasteoi_irq+0xa5/0x190
[557808.154748]  [<ffffffff81031173>] handle_irq+0x23/0x30
[557808.154748]  [<ffffffff8186aba3>] do_IRQ+0x53/0xf0
[557808.154748]  [<ffffffff81867fd4>] common_interrupt+0xd4/0xd4
[557808.154748]  <EOI> 
[557808.154748]  [<ffffffff81039130>] ? speculation_ctrl_update_tif+0x80/0x80
[557809.600723]  [<ffffffff81067af2>] ? native_safe_halt+0x12/0x20
[557809.600723]  [<ffffffff8103914e>] default_idle+0x1e/0xe0
[557809.600723]  [<ffffffff81039ff5>] arch_cpu_idle+0x15/0x20
[557809.600723]  [<ffffffff810cc02a>] default_idle_call+0x2a/0x40
[557809.600723]  [<ffffffff810cc3a3>] cpu_startup_entry+0x303/0x360
[557809.600723]  [<ffffffff81053ce7>] start_secondary+0x177/0x1b0
[557809.600723] Code: a5 ec 81 4c 0f 45 e8 4d 8d 75 48 4c 89 f7 e8 dd 59 32 00 85 c0 74 b7 4c 89 63 10 ba 10 00 00 00 48 89 de 4c 89 ef e8 a5 f0 ff ff <48> 0f c7 f8 0f 92 c2 84 d2 48 89 45 d0 41 bc 01 00 00 00 74 17 
[557809.600723] RIP  [<ffffffff8154116b>] add_interrupt_randomness+0x14b/0x1e0
[557809.600723]  RSP <ffff88003fc83e90>
[557809.600723] fbcon_switch: detected unhandled fb_set_par error, error code -16
[557809.600723] fbcon_switch: detected unhandled fb_set_par error, error code -16
[557809.600723] ---[ end trace 1298018cdcbef030 ]---
[557809.600723] Kernel panic - not syncing: Fatal exception in interrupt
[557809.600723] Shutting down cpus with NMI
[557809.600723] Kernel Offset: disabled
[557809.600723] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
From a different guest, but same error:

Code: Select all

[480894.042449] invalid opcode: 0000 [#1] SMP 
[480894.042449] Modules linked in: veth xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 br_netfilter bridge stp llc aufs overlay vboxvideo ttm drm_kms_helper drm fb_sys_fops input_leds syscopyarea sysfillrect ip6t_REJECT nf_reject_ipv6 vboxguest sysimgblt serio_raw 8250_fintek mac_hid i2c_piix4 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack ib_iser iptable_filter ip_tables x_tables rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper psmouse ahci libahci pata_acpi cryptd e1000 video
[480894.042449] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.4.0-178-generic #208-Ubuntu
[480894.042449] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[480894.042449] task: ffff88003e359c00 ti: ffff88003e364000 task.ti: ffff88003e364000
[480894.042449] RIP: 0010:[<ffffffff8154116b>]  [<ffffffff8154116b>] add_interrupt_randomness+0x14b/0x1e0
[480894.042449] RSP: 0018:ffff88003fd03e90  EFLAGS: 00010082
[480894.042449] RAX: 0000000000000000 RBX: ffff88003fd14f60 RCX: 0000000000000001
[480894.042449] RDX: 000000002b5cfb94 RSI: ffff88003fd14f70 RDI: ffffffff81eca6a0
[480894.042449] RBP: ffff88003fd03ec0 R08: ffffffff82201d70 R09: 0000000000000074
[480894.042449] R10: 000000000000007f R11: 000000000000000b R12: 000000010729521c
[480894.042449] R13: ffffffff81eca6a0 R14: ffffffff81eca6e8 R15: 0000000000000000
[480894.042449] FS:  00007f07b1a8bd48(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
[480894.042449] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[480894.042449] CR2: 000000c000197010 CR3: 0000000038d22000 CR4: 00000000000406f0
[480894.042449] Stack:
[480894.042449]  ffff88003fd03ea8 46452de69ea26434 0000000000000000 0000000000000011
[480894.042449]  0000000000000001 0000000000000001 ffff88003fd03f10 ffffffff810e307d
[480894.042449]  ffff8800351ab800 ffffffff81f43800 0000008081038119 ffff8800351ab800
[480894.042449] Call Trace:
[480894.042449]  <IRQ> 
[480894.042449]  [<ffffffff810e307d>] handle_irq_event_percpu+0x15d/0x1e0
[480894.042449]  [<ffffffff810e313c>] handle_irq_event+0x3c/0x60
[480894.042449]  [<ffffffff810e6635>] handle_fasteoi_irq+0xa5/0x190
[480894.042449]  [<ffffffff81031173>] handle_irq+0x23/0x30
[480894.042449]  [<ffffffff8186aba3>] do_IRQ+0x53/0xf0
[480894.042449]  [<ffffffff81867fd4>] common_interrupt+0xd4/0xd4
[480894.042449]  <EOI> 
[480894.042449]  [<ffffffff81039130>] ? speculation_ctrl_update_tif+0x80/0x80
[480894.042449]  [<ffffffff81067af2>] ? native_safe_halt+0x12/0x20
[480894.042449]  [<ffffffff8103914e>] default_idle+0x1e/0xe0
[480894.042449]  [<ffffffff81039ff5>] arch_cpu_idle+0x15/0x20
[480894.042449]  [<ffffffff810cc02a>] default_idle_call+0x2a/0x40
[480894.042449]  [<ffffffff810cc3a3>] cpu_startup_entry+0x303/0x360
[480894.042449]  [<ffffffff81053ce7>] start_secondary+0x177/0x1b0
[480894.042449] Code: a5 ec 81 4c 0f 45 e8 4d 8d 75 48 4c 89 f7 e8 dd 59 32 00 85 c0 74 b7 4c 89 63 10 ba 10 00 00 00 48 89 de 4c 89 ef e8 a5 f0 ff ff <48> 0f c7 f8 0f 92 c2 84 d2 48 89 45 d0 41 bc 01 00 00 00 74 17 
[480894.042449] RIP  [<ffffffff8154116b>] add_interrupt_randomness+0x14b/0x1e0
[480894.042449]  RSP <ffff88003fd03e90>
[480894.042449] fbcon_switch: detected unhandled fb_set_par error, error code -16
[480894.042449] fbcon_switch: detected unhandled fb_set_par error, error code -16
[480894.042449] ---[ end trace d2ef6a5533c3760a ]---
[480894.042449] Kernel panic - not syncing: Fatal exception in interrupt
[480894.042449] Kernel Offset: disabled
[480894.042449] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
The same thing happens on many of the guests that crash: the first error in each crash is "invalid opcode: 0000 [#1] SMP".
Again, this only happens when resuming a guest from "savestate". The only way to get a crashed guest going again is to force a power off, followed by a power on.
Note: All guests are on local ext4 disks.
I've attached the .vbox file and the VBox.log file (split in two, as the forum has an attachment size limit).
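For anyone comparing the traces: the kernel marks the first byte of the faulting instruction with angle brackets on the `Code:` line. The following is a minimal Python sketch, assuming the oops text has been saved from the guest console file, that pulls those bytes out so traces from different guests can be compared. The function name `faulting_bytes` and the four-byte window are illustrative choices, not anything provided by VirtualBox or the kernel.

```python
import re

def faulting_bytes(oops_text, count=4):
    """Return `count` bytes, starting at the <xx> marker, from the first Code: line.

    Returns None when no Code: line with a marker is found.
    """
    for line in oops_text.splitlines():
        if "Code:" not in line:
            continue
        tokens = line.split("Code:", 1)[1].split()
        for i, tok in enumerate(tokens):
            # The faulting byte is printed as e.g. "<48>".
            if re.fullmatch(r"<[0-9a-fA-F]{2}>", tok):
                window = [tok.strip("<>")] + tokens[i + 1:i + count]
                return " ".join(window)
    return None

# Code: line copied from the first trace above (4.4.0-178 guest).
sample = (
    "[557809.600723] Code: a5 ec 81 4c 0f 45 e8 4d 8d 75 48 4c 89 f7 e8 dd 59 32 00 "
    "85 c0 74 b7 4c 89 63 10 ba 10 00 00 00 48 89 de 4c 89 ef e8 a5 f0 ff ff "
    "<48> 0f c7 f8 0f 92 c2 84 d2 48 89 45 d0 41 bc 01 00 00 00 74 17"
)
print(faulting_bytes(sample))
```

As far as the x86 opcode tables go, `48 0f c7 f8` encodes `rdseed %rax`, which may be a useful detail to include in a bug report. Whether it points at CPU feature state changing across the save and restore is a guess, not something established here.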
Attachments
VBox.log.2.txt (96.2 KiB)
VBox.log.1.txt (80.36 KiB)
guest.vbox.txt (3.15 KiB)

Post by Guideloom »

It happened again last night. A different server crashed on resuming from savestate, with the exact same error:

Code: Select all

[740062.732744] invalid opcode: 0000 [#1] SMP NOPTI
[740063.457405] Modules linked in: veth xt_nat xt_tcpudp xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack br_netfilter bridge stp llc aufs overlay vboxvideo(CE) ttm drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect sysimgblt input_leds mac_hid vboxguest serio_raw sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd ahci
[740077.057614]  psmouse libahci e1000 i2c_piix4 pata_acpi video
[740078.250569] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C  E    4.15.0-99-generic #100-Ubuntu
[740080.077507] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[740081.871934] RIP: 0010:kvm_kick_cpu+0x27/0x30
[740081.939428] RSP: 0018:ffff946e9fc03e50 EFLAGS: 00010046
[740081.940509] RAX: 0000000000000005 RBX: 0000000000000000 RCX: 0000000000000001
[740081.941666] RDX: ffff946e9fd00000 RSI: 0000000000000100 RDI: 0000000000000001
[740081.942833] RBP: ffff946e9fc03e58 R08: 0000000000000100 R09: ffff946e9ff48f00
[740081.949103] R10: 00000000000000d4 R11: 001dcd6500000000 R12: 0000000000000000
[740082.939326] R13: 00000000003885c5 R14: 0000000000000000 R15: 00000000007fffa5
[740083.940386] FS:  0000000000000000(0000) GS:ffff946e9fc00000(0000) knlGS:0000000000000000
[740083.949503] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[740084.940333] CR2: 00007f93162ce0d0 CR3: 00000000cf69c000 CR4: 00000000000406f0
[740084.941527] Call Trace:
[740084.942244]  <IRQ>
[740084.942910]  __pv_queued_spin_unlock_slowpath+0xa4/0xd0
[740084.943800]  __raw_callee_save___pv_queued_spin_unlock_slowpath+0x15/0x24
[740084.949377]  .slowpath+0x9/0x15
[740085.653323]  _raw_spin_unlock_irqrestore+0xe/0x20
[740085.939494]  update_wall_time+0x474/0x6f0
[740085.940382]  tick_do_update_jiffies64.part.11+0x8e/0xf0
[740085.941324]  tick_irq_enter+0xc6/0xd0
[740085.942083]  irq_enter+0x48/0x50
[740085.942772]  do_IRQ+0x2e/0xe0
[740085.943415]  common_interrupt+0x8c/0x8c
[740085.944133]  </IRQ>
[740085.944693] RIP: 0010:native_safe_halt+0x12/0x20
[740085.949161] RSP: 0018:ffffffffa1e03e28 EFLAGS: 00010246 ORIG_RAX: ffffffffffffffd9
[740086.474009] RAX: ffffffffa13c1500 RBX: 0000000000000000 RCX: 0000000000000000
[740086.892100] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[740088.316432] RBP: ffffffffa1e03e28 R08: 000499de14f26763 R09: ffff946e9533a100
[740089.870126] R10: 0000000000000000 R11: 0002a1e09bb66cfb R12: 0000000000000000
[740091.311151] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[740092.240506]  ? __sched_text_end+0x1/0x1
[740093.037966]  default_idle+0x20/0x100
[740093.883282]  arch_cpu_idle+0x15/0x20
[740094.606063]  default_idle_call+0x23/0x30
[740095.429715]  do_idle+0x172/0x1f0
[740095.938835]  cpu_startup_entry+0x73/0x80
[740096.349147]  rest_init+0xae/0xb0
[740096.938804]  start_kernel+0x4dc/0x4fd
[740097.733288]  x86_64_start_reservations+0x24/0x26
[740097.939097]  x86_64_start_kernel+0x74/0x77
[740097.939884]  secondary_startup_64+0xa5/0xb0
[740097.940573] Code: 5d c3 66 90 0f 1f 44 00 00 48 63 ff 55 48 c7 c0 44 f1 00 00 48 8b 14 fd e0 d6 ba a1 48 89 e5 53 31 db 0f b7 0c 02 b8 05 00 00 00 <0f> 01 d9 5b 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 57 41
[740097.949259] RIP: kvm_kick_cpu+0x27/0x30 RSP: ffff946e9fc03e50
[740097.950110] ---[ end trace 554b7b9e8cd8bbfb ]---
[740097.950867] Kernel panic - not syncing: Fatal exception in interrupt
[740099.018462] Shutting down cpus with NMI
[740099.020473] Kernel Offset: 0x1fa00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[740099.939039] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

Post by Guideloom »

And again, on yet another server. VirtualBox folks, any help on this would be great.
Same issue: when resuming from savestate, the guest kernel panics with this in the guest console log:

Code: Select all

[232956.782180] invalid opcode: 0000 [#1] SMP 
[232956.782180] Modules linked in: xt_nat veth ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 br_netfilter bridge stp llc aufs overlay vboxvideo ttm drm_kms_helper drm fb_sys_fops input_leds serio_raw syscopyarea sysfillrect sysimgblt vboxguest i2c_piix4 8250_fintek mac_hid ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns ib_iser nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse ahci e1000 libahci pata_acpi video
[232956.782180] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.0-184-generic #214-Ubuntu
[232956.782180] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[232956.782180] task: ffff88007c8e1c80 ti: ffff88007c8ec000 task.ti: ffff88007c8ec000
[232956.782180] RIP: 0010:[<ffffffff815419db>]  [<ffffffff815419db>] add_interrupt_randomness+0x14b/0x1e0
[232956.782180] RSP: 0018:ffff88007fd03e90  EFLAGS: 00010082
[232956.782180] RAX: 0000000000000000 RBX: ffff88007fd14f60 RCX: 0000000000000002
[232956.782180] RDX: 0000000077843068 RSI: ffff88007fd14f70 RDI: ffffffff81eca6e0
[232956.782180] RBP: ffff88007fd03ec0 R08: ffffffff82202cc0 R09: 0000000000000048
[232956.782180] R10: 000000000000007f R11: 0000000000000005 R12: 00000001037783e0
[232956.782180] R13: ffffffff81eca6e0 R14: ffffffff81eca728 R15: 0000000000000000
[232956.782180] FS:  00007f86df158700(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[232956.782180] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[232956.782180] CR2: 000000c000187010 CR3: 000000007acfc000 CR4: 00000000000406f0
[232956.782180] Stack:
[232956.782180]  ffff88007fd03ea8 bdea6c4a5126d8aa 0000000000000000 0000000000000011
[232956.782180]  0000000000000001 0000000000000001 ffff88007fd03f10 ffffffff810e308d
[232956.782180]  ffff880075bf0200 ffffffff81f43840 0000008081038119 ffff880075bf0200
[232956.782180] Call Trace:
[232956.782180]  <IRQ> 
[232956.782180]  [<ffffffff810e308d>] handle_irq_event_percpu+0x15d/0x1e0
[232956.782180]  [<ffffffff810e314c>] handle_irq_event+0x3c/0x60
[232956.782180]  [<ffffffff810e66c5>] handle_fasteoi_irq+0xa5/0x190
[232956.782180]  [<ffffffff81031183>] handle_irq+0x23/0x30
[232956.782180]  [<ffffffff8186c163>] do_IRQ+0x53/0xf0
[232956.782180]  [<ffffffff81869594>] common_interrupt+0xd4/0xd4
[232956.782180]  <EOI> 
[232956.782180]  [<ffffffff81039130>] ? speculation_ctrl_update_tif+0x80/0x80
[232956.782180]  [<ffffffff81067af2>] ? native_safe_halt+0x12/0x20
[232956.782180]  [<ffffffff8103914e>] default_idle+0x1e/0xe0
[232956.782180]  [<ffffffff81039ff5>] arch_cpu_idle+0x15/0x20
[232956.782180]  [<ffffffff810cc03a>] default_idle_call+0x2a/0x40
[232956.782180]  [<ffffffff810cc3b3>] cpu_startup_entry+0x303/0x360
[232956.782180]  [<ffffffff81053e67>] start_secondary+0x177/0x1b0
[232956.782180] Code: a5 ec 81 4c 0f 45 e8 4d 8d 75 48 4c 89 f7 e8 3d 67 32 00 85 c0 74 b7 4c 89 63 10 ba 10 00 00 00 48 89 de 4c 89 ef e8 65 ef ff ff <48> 0f c7 f8 0f 92 c2 84 d2 48 89 45 d0 41 bc 01 00 00 00 74 17
[232956.782180] RIP  [<ffffffff815419db>] add_interrupt_randomness+0x14b/0x1e0
[232956.782180]  RSP <ffff88007fd03e90>
[232956.782180] fbcon_switch: detected unhandled fb_set_par error, error code -16
[232956.782180] fbcon_switch: detected unhandled fb_set_par error, error code -16
[232957.000027] ---[ end trace 1f643a23606330eb ]---
[232957.008159] e1000: enp0s17 NIC Link is Down
[232957.000027] Kernel panic - not syncing: Fatal exception in interrupt
[232957.132713] Kernel Offset: disabled
[232957.132713] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
scottgus1
Site Moderator
Posts: 20945
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Post by scottgus1 »

The Linux Hosts forum has had some reports of failures resuming from a saved state before. Being a Windows guy, I am not certain what the solution was.

The best workaround is to not save state.

And saving state to do a backup is not a good idea. A saved state is extremely tied to the host computer hardware and the Virtualbox version. You must have the same host PC, the same hardware setup and the same Virtualbox version installed to have the slightest chance to restore the backup. (Unless you plan to discard the saved state, which will leave all of your databases and guest OS in a dirty state consistent with an accidental power-plug-pull.) Bad things happen when a modern OS has the power fail.

Cloning as a backup routine is also not good. You cannot confirm the integrity of the backup copy using FC or hashing.

The best backup is a file and folder copy of the guest, made after the guest has been fully shut down from within the guest OS using its Shut Down command (or an ACPI power button 'push' from VBoxManage, which the guest OS interprets as a shutdown command). Then you can FC the copy to confirm it has no glitches, and/or hash the originals so you can restart the guest, then copy the backup to offsite storage and hash the offsite copy to confirm its integrity.
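The hash comparison described above can be sketched in Python. This is only an illustration, assuming the guest has already been shut down and its folder copied. The function names `sha256_of` and `verify_copy` and the two folder arguments are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large disk images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_copy(source_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(rel.as_posix())
    return problems
```

An empty list from `verify_copy(vm_folder, backup_folder)` means every file in the guest folder has a byte-identical counterpart in the backup. The same check can be repeated against the offsite copy.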

So, I unfortunately do not know the solution to the save-state problem. But you really need to rethink your backup routine - it is not good in its present form.

Post by Guideloom »

Thanks for the reply. Good to know at least someone (even if not from Virtualbox) reads this..
Perhaps I need to clarify that this is for point in time restore, not actual "backups". Yes, proper backups mean running a client for the individual files/folders on the guest.

That aside.. the issue still remains that sometimes when resuming a guest from savestate, the guest kernel dumps.
I'm going to try upgrading to 6.1.x and see if that stabilizes things.

Post by Guideloom »

And another...

Code: Select all

[577884.681727] invalid opcode: 0000 [#1] SMP NOPTI
[577884.685055] Modules linked in: xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack br_netfilter bridge stp llc aufs overlay binfmt_misc vboxvideo(CE) input_leds ttm serio_raw drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect sysimgblt mac_hid vboxguest sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 ahci crypto_simd glue_helper cryptd psmouse
[577885.176778]  libahci i2c_piix4 e1000 video pata_acpi
[577885.219993] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G         C  E    4.15.0-106-generic #107-Ubuntu
[577885.301057] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[577885.376736] RIP: 0010:kvm_kick_cpu+0x27/0x30
[577885.420617] RSP: 0018:ffffb8cc00697df0 EFLAGS: 00010046
[577885.448756] RAX: 0000000000000005 RBX: 0000000000000000 RCX: 0000000000000001
[577885.512543] RDX: ffff8e079fc80000 RSI: 0000000000000100 RDI: 0000000000000001
[577885.585888] RBP: ffffb8cc00697df8 R08: 0000000000000100 R09: ffff8e079ff48f00
[577885.656576] R10: 00000000000000dc R11: 0000000000000000 R12: 00000001089b4913
[577885.700621] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[577885.768752] FS:  0000000000000000(0000) GS:ffff8e079fd00000(0000) knlGS:0000000000000000
[577885.798576] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[577885.841927] CR2: 0000563591d017b8 CR3: 00000001165cc000 CR4: 00000000000406e0
[577885.846674] Call Trace:
[577885.848797]  __pv_queued_spin_unlock_slowpath+0xa4/0xd0
[577885.876636]  __raw_callee_save___pv_queued_spin_unlock_slowpath+0x15/0x24
[577885.930912]  .slowpath+0x9/0x15
[577885.961823]  cpu_load_update_nohz_stop+0x95/0xa0
[577885.987038] e1000: enp0s3 NIC Link is Down
[577885.995832]  tick_nohz_idle_exit+0x8d/0x180
[577885.995834]  do_idle+0x13e/0x1f0
[577885.995835]  cpu_startup_entry+0x73/0x80
[577885.995837]  start_secondary+0x1ab/0x200
[577885.995839]  secondary_startup_64+0xa5/0xb0
[577885.995840] Code: 5d c3 66 90 0f 1f 44 00 00 48 63 ff 55 48 c7 c0 44 f1 00 00 48 8b 14 fd e0 e6 ba 86 48 89 e5 53 31 db 0f b7 0c 02 b8 05 00 00 00 <0f> 01 d9 5b 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 57 41
[577886.264078] RIP: kvm_kick_cpu+0x27/0x30 RSP: ffffb8cc00697df0
[577886.291873] ---[ end trace 37598eee6c50b825 ]---
[577886.314093] Kernel panic - not syncing: Attempted to kill the idle task!
[577887.407677] Shutting down cpus with NMI
[577887.902949] Kernel Offset: 0x4a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[577890.070398] ---[ end Kernel panic - not syncing: Attempted to kill the idle task!

Post by scottgus1 »

There are a couple of things to keep in mind:
Guideloom wrote: at least someone (even if not from Virtualbox) reads this..
Just about everyone here is not from Virtualbox. Only the red posters are. And they are few and far between, because free Virtualbox doesn't have any support channels. The rest of us are anonymous internet users who for various reasons like helping other anonymous internet users get their Virtualbox working.

Also, I recognize the posted error message lists as text output from the guest OS. If the guest Virtualbox window stays open when these kernel panics happen, then it is highly unlikely that Virtualbox is aware of the guest OS kernel panic. This log section seems to imply that the guest Virtualbox window does stay open:
00:00:01.437456 Console: Machine state changed to 'Running'
00:00:04.612232 VMMDev: vmmDevHeartbeatFlatlinedTimer: Guest seems to be unresponsive. Last heartbeat received 4 seconds ago
11:18:40.410589 VRDP: New connection:
It could be that Virtualbox is not setting up the guest environment correctly. It could also be that the guest OS is not setting itself up correctly. Have you tried troubleshooting directly in the guest OS too?
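Since VirtualBox itself keeps reporting the machine as running, one host-side way to notice a hung guest is to look for the heartbeat message quoted above in VBox.log. Here is a rough Python sketch, assuming the log line format shown in that excerpt. The function name `heartbeat_flatlines` is hypothetical.

```python
import re

# Pattern built from the VBox.log line quoted above; other VirtualBox
# versions may word the message differently.
FLATLINE = re.compile(
    r"^(?P<uptime>\d+:\d{2}:\d{2}\.\d+) VMMDev: vmmDevHeartbeatFlatlinedTimer: "
    r"Guest seems to be unresponsive\. "
    r"Last heartbeat received (?P<secs>\d+) seconds ago"
)

def heartbeat_flatlines(log_text):
    """Return (uptime, seconds_since_heartbeat) for each flatline message."""
    hits = []
    for line in log_text.splitlines():
        match = FLATLINE.match(line.strip())
        if match:
            hits.append((match.group("uptime"), int(match.group("secs"))))
    return hits

excerpt = """\
00:00:01.437456 Console: Machine state changed to 'Running'
00:00:04.612232 VMMDev: vmmDevHeartbeatFlatlinedTimer: Guest seems to be unresponsive. Last heartbeat received 4 seconds ago
11:18:40.410589 VRDP: New connection:
"""
print(heartbeat_flatlines(excerpt))
```

A single hit shortly after a resume is a hint that the guest has hung, not proof. It is best checked against the guest console log.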

Has this save-state-crash thing been happening from the beginning of your using Virtualbox for this project? Or was the setup stable for a while then something changed?

One other thing, logs fit in toto when zipped.

Post by Guideloom »

You are right: VirtualBox thinks the machine is still running, but the guest has actually crashed.
I can't reproduce this on demand with 5.2.x. Sometimes it happens, sometimes it doesn't.

However, I just upgraded to 6.1.10 and everything is even worse. I'll have to open a new thread on that as it's a different error.
New Thread: VBOX_E_IPRT_ERROR 0x80BB0005 when cloning a vm that is in savestate