Version 6.1.2 r135662 (Qt5.9.5)
on Kubuntu 18.04:
Linux boulez 5.3.0-53-generic #47~18.04.1-Ubuntu SMP Thu May 7 13:10:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
I have a Windows 10 guest. It works fine (100% stable) except for reboots, especially during a Windows update. It just stops: black screen. Eventually I close the window and get the option to terminate it. At that point the guest shows as Aborted, but the RAM used by the VM isn't returned to Linux, and I have to reboot. If I try to run it or another VM, the OOM killer kicks in and I typically lose the desktop environment. My machine is in this state now.
htop shows 25.4G/31.4G of memory in use.
~$ free
            total        used        free      shared  buff/cache   available
Mem:     32896520    26533640     3028276       55216     3334604     5854604
Swap:     2097148      513280     1583868
~$ vmstat -w
procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu--------
 r  b         swpd         free         buff        cache   si   so    bi    bo   in   cs  us  sy  id  wa  st
 1  3       512768      2985668      1111252      2374596    0    0  1678  1678    0    3   1   2  88  10   0
~$ cat /proc/meminfo
MemTotal: 32896520 kB
MemFree: 2950448 kB
MemAvailable: 5794756 kB
Buffers: 1086776 kB
Cached: 2095248 kB
SwapCached: 12464 kB
Active: 1124620 kB
Inactive: 2949244 kB
Active(anon): 470860 kB
Inactive(anon): 485580 kB
Active(file): 653760 kB
Inactive(file): 2463664 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 2097148 kB
SwapFree: 1579260 kB
Dirty: 1012708 kB
Writeback: 5124 kB
AnonPages: 886096 kB
Mapped: 550488 kB
Shmem: 64932 kB
KReclaimable: 180748 kB
Slab: 535740 kB
SReclaimable: 180748 kB
SUnreclaim: 354992 kB
KernelStack: 10928 kB
PageTables: 64928 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 18545408 kB
Committed_AS: 5679928 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 45756 kB
VmallocChunk: 0 kB
Percpu: 15360 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 30338756 kB
DirectMap2M: 3164160 kB
DirectMap1G: 1048576 kB
Ideally I'd fix the crash in the first place, but I'd settle for not having to reboot the PC to load this or another VM!
What further information would be useful in diagnosing this problem?
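One rough way to put a number on the "missing" memory is to subtract the usual /proc/meminfo categories from MemTotal. This is only a sketch: the function name `unaccounted_kb` is made up for illustration, and the list of categories is an approximation (some of them overlap slightly, and the kernel does not promise that they sum to the total).

```shell
# Rough estimate (in kB) of memory that the common /proc/meminfo
# categories do not account for.
# Usage: unaccounted_kb [meminfo-file]   (defaults to /proc/meminfo)
# NOTE: hypothetical helper; the category list is an assumption.
unaccounted_kb() {
    awk '
        /^MemTotal:/ { total = $2 }
        # Free memory, page cache, slab, anonymous pages and small kernel structures
        /^(MemFree|Buffers|Cached|SwapCached|Slab|AnonPages|PageTables|KernelStack):/ { known += $2 }
        END { print total - known }
    ' "${1:-/proc/meminfo}"
}
```

Fed the /proc/meminfo values above, this prints 25253892, i.e. roughly 24 GiB that no listed category claims. That would be consistent with memory still held inside the kernel (for example by a driver) rather than by any process, though that is an inference and not something the numbers prove.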
Edit: I've kept digging. Could this be it:
~$ lsmod | grep -i nvidia
nvidia_uvm 942080 0
nvidia_drm 49152 6
nvidia_modeset 1114112 26 nvidia_drm
nvidia 20463616 1438 nvidia_uvm,nvidia_modeset
drm_kms_helper 180224 1 nvidia_drm
drm 491520 9 drm_kms_helper,nvidia_drm
ipmi_msghandler 102400 2 ipmi_devintf,nvidia
i2c_nvidia_gpu 16384 0
But:
~$ nvidia-smi
Fri May 29 21:26:24 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 166...  Off  | 00000000:08:00.0  On |                  N/A |
|  0%   49C    P8    12W / 140W |    513MiB /  5943MiB |     19%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1175      G   /usr/lib/xorg/Xorg                           177MiB |
|    0      1482      G   kwin_x11                                      59MiB |
|    0      1484      G   /usr/bin/krunner                               2MiB |
|    0      1486      G   /usr/bin/plasmashell                         129MiB |
|    0      5345      G   ...AAAAAAAAAAAACAAAAAAAAAA= --shared-files   134MiB |
+-----------------------------------------------------------------------------+
I did:
$ sudo modprobe -r nvidia_uvm
~$ lsmod | grep -i nvidia
nvidia_drm 49152 6
nvidia_modeset 1114112 26 nvidia_drm
nvidia 20463616 1437 nvidia_modeset
drm_kms_helper 180224 1 nvidia_drm
drm 491520 9 drm_kms_helper,nvidia_drm
ipmi_msghandler 102400 2 ipmi_devintf,nvidia
i2c_nvidia_gpu 16384 0
Trying to unload the remaining NVIDIA modules fails:
modprobe: FATAL: Module nvidia_modeset is in use.
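Before trying to unload a module it can help to check its use count and the "used by" column of lsmod. A minimal sketch follows; `module_users` is a made-up helper name, and it only reformats what lsmod already reports.

```shell
# Print "<use count> <used-by list>" for one module, reading
# lsmod-style lines (name size count [used-by]) on standard input.
# Prints "-" in place of the list when no other module depends on it.
# NOTE: hypothetical helper, shown for illustration only.
module_users() {
    awk -v mod="$1" '$1 == mod { print $3, ($4 == "" ? "-" : $4) }'
}
```

For example, `lsmod | module_users nvidia_modeset` run against the listing above would print `26 nvidia_drm`: nvidia_drm depends on nvidia_modeset, and nvidia_drm itself has a use count of 6, so neither can be removed while the desktop session is using the GPU.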