Periodic catastrophic failures of VB under macOS Big Sur
Posted: 15. Oct 2021, 17:18
Hello.
The problem is so strange that I'm not sure I can even describe it, but I'll try.
I've been using VirtualBox on macOS for about a decade, I think, without ever seeing anything like this. My latest laptop is a 16" MacBook Pro, bought in February 2021. I had no problems for months, even though I didn't use VB extensively (but I did use it). Everything started three weeks ago, when all my VMs (all Ubuntu 20) started having trouble: they booted, but soon showed failures such as crashes. Reinstalling VB from scratch didn't help. Since I also have VMware Fusion and had a tight deadline, I used Fusion to deliver my work on time and postponed diagnosing the problem.
Then I started working on a Vagrant project that requires VB. I expected to run into the problem again, but strangely enough everything was fine, and even the VMs that had previously failed were OK.
I worked extensively (several hours per day) on the new project for ten days with no problems at all. Then, all of a sudden, the bug showed up again. This time VB crashes immediately (with different kinds of core dumps, from the classic segmentation fault to failing code signatures), even before the guest boots (with a few exceptions, which don't complete the boot anyway).
That's why I wrote "periodic" rather than "intermittent": it seems to have good and bad periods that last for days.
Honestly, the first thing I'd suspect with such a problem is a hardware failure (the failing code signatures especially smell like it), possibly the disk. But:
+ no other application is failing (in general: LibreOffice, photo editing tools, Java stuff, IDEs, Firefox... Even VMware Fusion can boot a number of Linux guests without problems);
+ I have about 1 TB of RAW photos on the same SSD. They only undergo non-destructive processing, so they never change, and I have a tool that periodically checks the MD5 of the files, so their integrity is thoroughly tested. No problems detected. An SSD problem would likely show up in other parts of the file system too. Also consider that with Vagrant I'm constantly destroying and recreating VMs, so they shouldn't keep hitting the same SSD area;
+ a tool that checks S.M.A.R.T. status shows no faults.
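To make the second point concrete, the kind of MD5 check I mean is roughly the following. This is only a minimal sketch in Python, not the actual tool I use: the function names (md5_of, build_manifest, verify) and the idea of keeping the digests in a dictionary are made up for illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): md5_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or whose digest no longer matches."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or md5_of(p) != expected:
            bad.append(rel)
    return bad
```

The manifest is built once, when the files are known to be good, and verify is run periodically afterwards; an empty result means every file read back with the same digest. Note that this only exercises the blocks holding those files, which is why I treat it as evidence rather than proof that the SSD is fine.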
macOS Big Sur 11.5.1 + VirtualBox 6.1.26.