[Solved] RHEL 8.6 Stall on VirtualBox6.1.34: RCU_SCHED detected Stalls

Discussions related to using VirtualBox on Windows hosts.
VoltHertz
Posts: 3
Joined: 21. May 2022, 19:56

Post by VoltHertz »

Hi, I'm using RHEL 8.6 on VirtualBox 6.1.34 on my Windows 10 host to start studying for my RHCSA certification. Since the beginning I have had an issue where RHEL 8 hangs (or stalls, or has a thread stuck in a loop) and nothing works anymore. I can't control the system in the VirtualBox window, SSH stops working, and the RHEL Cockpit stops responding. VirtualBox itself stays responsive.

I need help solving this. I followed many sets of instructions for installing RHEL 8 on Oracle VirtualBox, but nothing fixed it. I even installed without the "Server with GUI" option. I'm very new to VirtualBox and to Linux administration, although I have a lot of experience with systems in general.


It happens for no apparent reason, at any given moment. Sometimes the system recovers by itself, and on one of those occasions I was able to capture this journalctl output of the problem:

RHEL 8 journalctl:

May 21 14:18:46 rhel-serverxbr kernel: rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
May 21 14:18:46 rhel-serverxbr kernel:         (detected by 0, t=556708 jiffies, g=47497, q=219)
May 21 14:18:46 rhel-serverxbr kernel: rcu: All QSes seen, last rcu_sched kthread activity 556708 (4296124962-4295568254), jiffies_till_next_fq>
May 21 14:18:46 rhel-serverxbr kernel: rcu: rcu_sched kthread starved for 556708 jiffies! g47497 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
May 21 14:18:46 rhel-serverxbr kernel: rcu:         Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
May 21 14:18:46 rhel-serverxbr kernel: rcu: RCU grace-period kthread stack dump:
May 21 14:18:46 rhel-serverxbr kernel: task:rcu_sched       state:R  running task     stack:    0 pid:   12 ppid:     2 flags:0x80004000
May 21 14:18:46 rhel-serverxbr kernel: Call Trace:
May 21 14:18:46 rhel-serverxbr kernel:  __schedule+0x2d1/0x830
May 21 14:18:46 rhel-serverxbr kernel:  schedule+0x35/0xa0
May 21 14:18:46 rhel-serverxbr kernel:  schedule_timeout+0x197/0x300
May 21 14:18:46 rhel-serverxbr kernel:  ? __next_timer_interrupt+0xf0/0xf0
May 21 14:18:46 rhel-serverxbr kernel:  ? __prepare_to_swait+0x4b/0x70
May 21 14:18:46 rhel-serverxbr kernel:  rcu_gp_kthread+0x4e5/0xab0
May 21 14:18:46 rhel-serverxbr kernel:  ? rcu_accelerate_cbs_unlocked+0x80/0x80
May 21 14:18:46 rhel-serverxbr kernel:  kthread+0x10a/0x120
May 21 14:18:46 rhel-serverxbr kernel:  ? set_kthread_struct+0x40/0x40
May 21 14:18:46 rhel-serverxbr kernel:  ret_from_fork+0x35/0x40
May 21 14:18:46 rhel-serverxbr kernel: rcu: Stack dump where RCU GP kthread last ran:
May 21 14:18:46 rhel-serverxbr kernel: NMI backtrace for cpu 0
May 21 14:18:46 rhel-serverxbr kernel: CPU: 0 PID: 1210 Comm: gmain Not tainted 4.18.0-372.9.1.el8.x86_64 #1
May 21 14:18:46 rhel-serverxbr kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
May 21 14:18:46 rhel-serverxbr kernel: Call Trace:
May 21 14:18:46 rhel-serverxbr kernel:  <IRQ>
May 21 14:18:46 rhel-serverxbr kernel:  dump_stack+0x41/0x60
May 21 14:18:46 rhel-serverxbr kernel:  nmi_cpu_backtrace.cold.8+0x13/0x4f
May 21 14:18:46 rhel-serverxbr kernel:  ? lapic_can_unplug_cpu.cold.30+0x37/0x37
May 21 14:18:46 rhel-serverxbr kernel:  nmi_trigger_cpumask_backtrace+0xde/0xe0
May 21 14:18:46 rhel-serverxbr kernel:  rcu_check_gp_kthread_starvation+0x106/0x113
May 21 14:18:46 rhel-serverxbr kernel:  rcu_sched_clock_irq.cold.99+0x2c1/0x39d
May 21 14:18:46 rhel-serverxbr kernel:  ? tick_sched_do_timer+0x50/0x50
May 21 14:18:46 rhel-serverxbr kernel:  ? tick_sched_do_timer+0x50/0x50
May 21 14:18:46 rhel-serverxbr kernel:  update_process_times+0x55/0x80
May 21 14:18:46 rhel-serverxbr kernel:  tick_sched_handle+0x22/0x60
May 21 14:18:46 rhel-serverxbr kernel:  tick_sched_timer+0x37/0x70
May 21 14:18:46 rhel-serverxbr kernel:  __hrtimer_run_queues+0x100/0x280
May 21 14:18:46 rhel-serverxbr kernel:  hrtimer_interrupt+0x100/0x220
May 21 14:18:46 rhel-serverxbr kernel:  smp_apic_timer_interrupt+0x6a/0x130
May 21 14:18:46 rhel-serverxbr kernel:  apic_timer_interrupt+0xf/0x20
May 21 14:18:46 rhel-serverxbr kernel:  </IRQ>
May 21 14:18:46 rhel-serverxbr kernel: RIP: 0010:smp_call_function_single+0xce/0xf0
May 21 14:18:46 rhel-serverxbr kernel: Code: 8b 4c 24 38 65 48 33 0c 25 28 00 00 00 75 34 c9 c3 48 89 d1 48 89 f2 48 89 e6 e8 7d fe ff ff 8b 54>
May 21 14:18:46 rhel-serverxbr kernel: RSP: 0000:ffffa87e8125fc80 EFLAGS: 00010202 ORIG_RAX: ffffffffffffff13
May 21 14:18:46 rhel-serverxbr kernel: RAX: 0000000000000000 RBX: ffff9a82c5fea858 RCX: 0000000000000000
May 21 14:18:46 rhel-serverxbr kernel: RDX: 0000000000000001 RSI: 00000000000000fb RDI: 0000000000000206
May 21 14:18:46 rhel-serverxbr kernel: RBP: ffffa87e8125fcc0 R08: 0000000000000001 R09: 000000000005ebe6
May 21 14:18:46 rhel-serverxbr kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9a82c5fea400
May 21 14:18:46 rhel-serverxbr kernel: R13: ffffd04cc0e85840 R14: ffffd04cc0eb96c0 R15: 800000003ae5b867
May 21 14:18:46 rhel-serverxbr kernel:  ? flush_tlb_func_common.constprop.9+0x220/0x220
May 21 14:18:46 rhel-serverxbr kernel:  flush_tlb_mm_range+0xda/0x110
May 21 14:18:46 rhel-serverxbr kernel:  ptep_clear_flush+0x54/0x60
May 21 14:18:46 rhel-serverxbr kernel:  wp_page_copy+0x1f6/0x4d0
May 21 14:18:46 rhel-serverxbr kernel:  do_wp_page+0xef/0x400
May 21 14:18:46 rhel-serverxbr kernel:  __handle_mm_fault+0x7c4/0x7e0
May 21 14:18:46 rhel-serverxbr kernel:  ? kernel_wait4+0xb1/0x140
May 21 14:18:46 rhel-serverxbr kernel:  handle_mm_fault+0xc1/0x1e0
May 21 14:18:46 rhel-serverxbr kernel:  do_user_addr_fault+0x1b5/0x440
May 21 14:18:46 rhel-serverxbr kernel:  do_page_fault+0x37/0x130
May 21 14:18:46 rhel-serverxbr kernel:  ? page_fault+0x8/0x30
May 21 14:18:46 rhel-serverxbr kernel:  page_fault+0x1e/0x30
May 21 14:18:46 rhel-serverxbr kernel: RIP: 0033:0x7f5b57d5dbd9
May 21 14:18:46 rhel-serverxbr kernel: Code: 00 5b e9 7a 6b fd ff 66 2e 0f 1f 84 00 00 00 00 00 48 8b 7f 30 e8 67 71 f8 ff eb d4 0f 1f 44 00 00>
May 21 14:18:46 rhel-serverxbr kernel: RSP: 002b:00007f5b51e859c8 EFLAGS: 00010246
May 21 14:18:46 rhel-serverxbr kernel: RAX: 0000000000000001 RBX: 0000563e94fad560 RCX: 0000000000000002
May 21 14:18:46 rhel-serverxbr kernel: RDX: 0000563e94f89c70 RSI: 000000007fffffff RDI: 0000563e94fad560
May 21 14:18:46 rhel-serverxbr kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000563e9500d8f0
May 21 14:18:46 rhel-serverxbr kernel: R10: 0000563e9500b300 R11: 0000000000000000 R12: 000000007fffffff
May 21 14:18:46 rhel-serverxbr kernel: R13: 0000563e94f89c70 R14: 0000000000000002 R15: 0000000000000002
May 21 14:18:46 rhel-serverxbr systemd[1]: systemd-udevd.service: Watchdog timeout (limit 3min)!
May 21 14:18:46 rhel-serverxbr systemd[1]: systemd-udevd.service: Killing process 671 (systemd-udevd) with signal SIGABRT.
May 21 14:18:46 rhel-serverxbr systemd[1]: timedatex.service: Succeeded.
May 21 14:18:46 rhel-serverxbr systemd[1]: Created slice system-systemd\x2dcoredump.slice.
May 21 14:18:46 rhel-serverxbr systemd[1]: Started Process Core Dump (PID 3160/UID 0).
May 21 14:18:46 rhel-serverxbr systemd-coredump[3174]: Resource limits disable core dumping for process 671 (systemd-udevd).
May 21 14:18:46 rhel-serverxbr systemd-coredump[3174]: Process 671 (systemd-udevd) of user 0 dumped core.
May 21 14:18:46 rhel-serverxbr systemd[1]: systemd-coredump@0-3160-0.service: Succeeded.
May 21 14:18:46 rhel-serverxbr systemd[1]: systemd-udevd.service: Main process exited, code=dumped, status=6/ABRT
May 21 14:18:46 rhel-serverxbr systemd[1]: systemd-udevd.service: Failed with result 'watchdog'.
May 21 14:18:46 rhel-serverxbr systemd[1]: systemd-udevd.service: Service has no hold-off time (RestartSec=0), scheduling restart.
May 21 14:18:46 rhel-serverxbr systemd[1]: systemd-udevd.service: Scheduled restart job, restart counter is at 1.
May 21 14:18:46 rhel-serverxbr systemd[1]: Stopped udev Kernel Device Manager.
May 21 14:18:46 rhel-serverxbr systemd[1]: Starting udev Kernel Device Manager...
May 21 14:18:46 rhel-serverxbr systemd[1]: Started udev Kernel Device Manager.
May 21 14:22:01 rhel-serverxbr anacron[2951]: Job `cron.daily' started
May 21 14:22:01 rhel-serverxbr run-parts[3188]: (/etc/cron.daily) starting logrotate
May 21 14:22:01 rhel-serverxbr run-parts[3193]: (/etc/cron.daily) finished logrotate
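
A note for readers decoding the stall report above: the `t=556708 jiffies` field is the stall duration in kernel timer ticks. The sketch below converts it to seconds. It assumes the guest kernel ticks at HZ=1000, which is an assumption and not something the log states; it can be checked with `grep CONFIG_HZ= /boot/config-$(uname -r)`. The function name `stall_seconds` is hypothetical.

```python
import re

# Assumed tick rate (ticks per second); NOT taken from the log above.
ASSUMED_HZ = 1000

# Matches the "(detected by 0, t=556708 jiffies, ...)" part of the report.
STALL_RE = re.compile(r"\(detected by (\d+), t=(\d+) jiffies")

def stall_seconds(journal_line, hz=ASSUMED_HZ):
    """Return the stall duration in seconds, or None if the line has no t= field."""
    match = STALL_RE.search(journal_line)
    if match is None:
        return None
    return int(match.group(2)) / hz

line = ("May 21 14:18:46 rhel-serverxbr kernel:         "
        "(detected by 0, t=556708 jiffies, g=47497, q=219)")
print(stall_seconds(line))
```

Under that assumption the stall lasted a little over nine minutes, which fits the 3-minute systemd-udevd watchdog timeout logged right after the trace.
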
VirtualBox version:
6.1.34r150636

Guest: Red Hat Enterprise Linux 8.6, 64-bit, 2 CPUs, 2.5 GB RAM
[volt@rhel-server ~]$ hostnamectl
Static hostname: rhel-serverxbr
Icon name: computer-vm
Chassis: vm
Machine ID: xxxxxxxxx
Boot ID: xxxxxxxx
Virtualization: oracle
Operating System: Red Hat Enterprise Linux 8.6 (Ootpa)
CPE OS Name: cpe:/o:redhat:enterprise_linux:8::baseos
Kernel: Linux 4.18.0-372.9.1.el8.x86_64
Architecture: x86-64
[volt@rhel-server ~]$

Host: Windows 10 Pro, version 21H2, 64-bit, 16 GB RAM, 8 CPUs (i7-9700K)

VM info:

PS C:\Program Files\Oracle\VirtualBox> .\VBoxManage.exe showvminfo rhel-server --details
Name:                        rhel-server
Groups:                      /
Guest OS:                    Red Hat (64-bit)
UUID:                        xxxxxxx
Config file:                 C:\Users\T-GAMER\VirtualBox VMs\rhel-server\rhel-server.vbox
Snapshot folder:             C:\Users\T-GAMER\VirtualBox VMs\rhel-server\Snapshots
Log folder:                  C:\Users\T-GAMER\VirtualBox VMs\rhel-server\Logs
Hardware UUID:               6fe14663-08e7-4d1d-9680-63ec3dbfc92f
Memory size:                 2560MB
Page Fusion:                 disabled
VRAM size:                   32MB
CPU exec cap:                100%
HPET:                        disabled
CPUProfile:                  host
Chipset:                     piix3
Firmware:                    BIOS
Number of CPUs:              2
PAE:                         enabled
Long Mode:                   enabled
Triple Fault Reset:          disabled
APIC:                        enabled
X2APIC:                      enabled
Nested VT-x/AMD-V:           disabled
CPUID Portability Level:     0
CPUID overrides:             None
Boot menu mode:              message and menu
Boot Device 1:               HardDisk
Boot Device 2:               DVD
Boot Device 3:               Floppy
Boot Device 4:               Not Assigned
ACPI:                        enabled
IOAPIC:                      enabled
BIOS APIC mode:              APIC
Time offset:                 0ms
RTC:                         UTC
Hardware Virtualization:     enabled
Nested Paging:               enabled
Large Pages:                 enabled
VT-x VPID:                   enabled
VT-x Unrestricted Exec.:     enabled
Paravirt. Provider:          Default
Effective Paravirt. Prov.:   KVM
State:                       running (since 2022-05-21T18:51:01.050000000)
Graphics Controller:         VMSVGA
Monitor count:               1
3D Acceleration:             disabled
2D Video Acceleration:       disabled
Teleporter Enabled:          disabled
Teleporter Port:             0
Teleporter Address:
Teleporter Password:
Tracing Enabled:             disabled
Allow Tracing to Access VM:  disabled
Tracing Configuration:
Autostart Enabled:           disabled
Autostart Delay:             0
Default Frontend:
VM process priority:         default
Storage Controller Name (0):            IDE
Storage Controller Type (0):            PIIX4
Storage Controller Instance Number (0): 0
Storage Controller Max Port Count (0):  2
Storage Controller Port Count (0):      2
Storage Controller Bootable (0):        on
Storage Controller Name (1):            SATA
Storage Controller Type (1):            IntelAhci
Storage Controller Instance Number (1): 0
Storage Controller Max Port Count (1):  30
Storage Controller Port Count (1):      1
Storage Controller Bootable (1):        on
IDE (1, 0): Empty
SATA (0, 0): C:\Users\T-GAMER\VirtualBox VMs\rhel-server\rhel-server.vdi (UUID: 6b82b2f4-7338-4ad2-8d88-25f3b5aba8a2)
NIC 1:                       MAC: 0800279BF0FE, Attachment: NAT, Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 1 Settings:  MTU: 0, Socket (send: 64, receive: 64), TCP Window (send:64, receive: 64)
NIC 2:                       MAC: 0800279E6C6A, Attachment: Host-only Interface 'VirtualBox Host-Only Ethernet Adapter', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 3:                       disabled
NIC 4:                       disabled
NIC 5:                       disabled
NIC 6:                       disabled
NIC 7:                       disabled
NIC 8:                       disabled
Pointing Device:             PS/2 Mouse
Keyboard Device:             PS/2 Keyboard
UART 1:                      disabled
UART 2:                      disabled
UART 3:                      disabled
UART 4:                      disabled
LPT 1:                       disabled
LPT 2:                       disabled
Audio:                       enabled (Driver: DSOUND, Controller: AC97, Codec: AD1980)
Audio playback:              enabled
Audio capture:               disabled
Clipboard Mode:              disabled
Drag and drop Mode:          disabled
Session name:                GUI/Qt
Video mode:                  800x600x32 at 0,0 enabled
VRDE:                        disabled
OHCI USB:                    enabled
EHCI USB:                    disabled
xHCI USB:                    disabled

USB Device Filters:

<none>

Available remote USB devices:

<none>

Currently Attached USB Devices:

<none>

Bandwidth groups:  <none>

Shared folders:<none>

VRDE Connection:             not active
Clients so far:              0

Capturing:                   active
Capture audio:               active
Capture screens:             0
Capture file:                C:\Users\T-GAMER\VirtualBox VMs\rhel-server\rhel-server.webm
Capture dimensions:          1024x768
Capture rate:                512kbps
Capture FPS:                 25kbps
Capture options:             vc_enabled=true,ac_enabled=true,ac_profile=med

Guest:

Configured memory balloon size: 0MB
OS type:                     RedHat_64
Additions run level:         0

Guest Facilities:

No active facilities.


PS C:\Program Files\Oracle\VirtualBox>

Here are the VirtualBox logs I uploaded:
VirtualBox_logs.rar

I also posted a thread on the Red Hat Customer Portal:
access redhat com/discussions/6960285?tour=8

From a very concerned person,

Volt
Attachments
VirtualBox_logs.rar (39.52 KiB)
Last edited by VoltHertz on 21. May 2022, 21:26, edited 1 time in total.
fth0
Volunteer
Posts: 5668
Joined: 14. Feb 2019, 03:06
Primary OS: Mac OS X other
VBox Version: PUEL
Guest OSses: Linux, Windows 10, ...
Location: Germany

Re: RHEL 8.6 Stall on VirtualBox6.1.34: RCU_SCHED detected Stalls

Post by fth0 »

Please try the VirtualBox test build 6.1.35r150972 (or newer), which includes a potential bugfix.

Alternatively, prevent Hyper-V from running on your Windows host. Your log contains this line:

HMR3Init: Attempting fall back to NEM (Hyper-V is active).
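
The `HMR3Init` line quoted above can also be searched for mechanically, for example to confirm after a reboot that Hyper-V is really out of the picture. This is only a minimal sketch: the helper names are hypothetical, and it assumes the VM log is a plain-text file that you pass in yourself (typically from the VM's Logs folder).

```python
from pathlib import Path

# Substring of the message quoted in this thread.
NEM_MARKER = "Attempting fall back to NEM"

def find_nem_fallback(log_text):
    """Return every line reporting a fall back to NEM (empty list if none)."""
    return [line.strip() for line in log_text.splitlines() if NEM_MARKER in line]

def check_log_file(path):
    """Scan a log file on disk; the caller supplies the path."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return find_nem_fallback(text)

# Demo on the line quoted in this thread (no file access needed).
quoted = "HMR3Init: Attempting fall back to NEM (Hyper-V is active)."
hits = find_nem_fallback(quoted)
print("Hyper-V fallback reported" if hits else "no NEM fallback line found")
```

If `check_log_file` still returns a match for a fresh log after Hyper-V was supposedly removed, the hypervisor is still active on the host.
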
VoltHertz

Re: RHEL 8.6 Stall on VirtualBox6.1.34: RCU_SCHED detected Stalls

Post by VoltHertz »

Wow! I just followed these instructions and the turtle disappeared!

I don't know yet whether the hangs/stalls are over. I will work on the system for a few hours and leave it running all night; tomorrow I will see if it is solved. If not, I will try the newer test build.

Thanks.
VoltHertz

Re: RHEL 8.6 Stall on VirtualBox6.1.34: RCU_SCHED detected Stalls

Post by VoltHertz »

It's all working fine so far; it seems to have worked really well. I thought I had already removed Hyper-V, but it only worked properly after following these instructions.
fth0

Re: RHEL 8.6 Stall on VirtualBox6.1.34: RCU_SCHED detected Stalls

Post by fth0 »

Thanks for reporting back. :)