autostop savestate VBoxSVC hung on reboot of host

Discussions related to using VirtualBox on Mac OS X hosts.
briend
Posts: 2
Joined: 29. Jan 2015, 19:32

Post by briend »

VirtualBox-4.3.21-97963
Host: OSX 10.10.2 on Xserve late 2009. 6GB RAM
Guest: OSX 10.10.2 1proc, 2GB RAM
no console session logged-on

This is a bit of a rabbit hole. I wanted to set up the VMs to start automatically at boot, and to savestate automatically on shutdown or reboot of the host.
I started here /manual/ch09.html#autostart-osx
and I got the autostart working just fine by setting up /etc/vbox/autostart.cfg and changing the VM settings:
modifyvm --autostart-enabled on
modifyvm --autostart-delay 30
modifyvm --autostop-type savestate

Autostart works fine, but I noticed the autostop was simply not happening. It finally dawned on me that the LaunchDaemon was not set up properly, or rather, the shell script it executes is not a "daemon". If you look at the file VBoxAutostartDarwin.sh that the LaunchDaemon runs, you'll notice at the end there is a (possibly vestigial) trap:

trap vboxStopAllUserVms HUP KILL TERM

So, clearly, the intent is that if the shell script receives the TERM signal it will stop all the VMs according to the autostop-type, but this can't happen since the script exits immediately after starting the VMs. I modified this script to include a sleep loop, to make it a proper daemon:

trap vboxStopAllUserVms HUP KILL TERM
while true; do
sleep 10
done

So with this change you can now launchctl stop org.virtualbox.vboxautostart and it will actually savestate your VMs. So now when I reboot, launchd will send that signal to the script and shutdown/savestate all the VMs!! Not quite. It still wasn't happening. So, I thought maybe launchd wasn't waiting long enough. I added a key to: /Library/LaunchDaemons/org.virtualbox.vboxautostart.plist

<key>ExitTimeOut</key>
<integer>600</integer>

So now launchd will wait a full 10 minutes after sending SIGTERM before forcibly killing VBoxAutostartDarwin.sh. When I do a reboot now, it takes 10 minutes to reboot and it still doesn't savestate my VMs! VirtualBox is getting stuck.
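
A side note on the sleep loop shown earlier: bash only runs a trap handler once the foreground command has returned, so with a plain "sleep 10" the handler can be delayed by up to ten seconds. Below is a minimal sketch of a more responsive variant. It assumes a stand-in handler, on_stop, in place of vboxStopAllUserVms; the names on_stop, wait_for_signal and STOP_MARKER are made up for illustration. KILL is left out of the trap list because SIGKILL cannot be caught.

```shell
#!/bin/sh
# Sketch only: on_stop is a hypothetical stand-in for vboxStopAllUserVms.
on_stop() {
    echo "stopping" > "${STOP_MARKER:-/tmp/autostart-stop.marker}"
}

wait_for_signal() {
    # SIGKILL cannot be trapped, so only catchable signals are listed.
    trap 'on_stop; kill "$sleep_pid" 2>/dev/null; exit 0' HUP INT TERM
    while true; do
        # Sleep in the background and wait on it: "wait" returns as soon
        # as a trapped signal arrives, a foreground sleep does not.
        sleep 10 &
        sleep_pid=$!
        wait "$sleep_pid"
    done
}
```

Calling wait_for_signal as the last line of the script would keep it alive until launchd signals it. This does not by itself address the hang during a real reboot described in this thread.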

So, I start another sshd on another port so I can remain logged in during all of this badness:
/usr/sbin/sshd -p 23

and now I reboot again and watch what happens:

USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
501 4448 100.0 0.2 2525860 14152 ?? R 2:46PM 0:18.75 /Applications/VirtualBox.app/Contents/MacOS/VBoxSVC --auto-shutdown
501 4470 8.5 24.4 4093064 1538088 ?? S 2:47PM 0:17.87 /Applications/VirtualBox.app/Contents/MacOS/VBoxHeadless --comment test --startvm 1a806c85-cff4-46d8-aef8-0f0a090a4103 --vrde config
501 4446 0.0 0.1 2452220 7112 ?? S 2:46PM 0:00.05 /Applications/VirtualBox.app/Contents/MacOS/VBoxXPCOMIPCD
root 3781 0.0 0.0 2445660 1052 ?? Ss 2:46PM 0:00.06 /bin/sh /Applications/VirtualBox.app/Contents/MacOS/VBoxAutostartDarwin.sh /etc/vbox/autostart.cfg
root 2332 0.0 0.0 2445660 1172 s001 S 2:29PM 0:00.01 bash
root 2331 0.0 0.0 2444420 2508 s001 S 2:29PM 0:00.01 sudo bash
501 2328 0.0 0.0 2452828 1148 s001 Ss 2:29PM 0:00.01 -bash
501 2327 0.0 0.0 2482656 1112 ?? S 2:29PM 0:00.02 sshd: tss@ttys001
root 2324 0.0 0.1 2481608 3976 ?? Ss 2:29PM 0:00.05 sshd: tss [priv]
root 2323 0.0 0.0 2443888 612 ?? Ss 2:29PM 0:00.00 /usr/sbin/sshd -p 23
root 935 0.0 0.2 2558304 12960 ?? S 2:25PM 0:00.11 /System/Library/CoreServices/ManagedClient.app/Contents/MacOS/ManagedClient -s
root 4488 0.0 0.0 2432788 584 s001 R+ 2:48PM 0:00.00 ps auwxww
root 1 0.0 0.1 2535612 6152 ?? Us 2:24PM 0:01.67 /sbin/launchd
root 4486 0.0 0.0 2432752 536 ?? S 2:48PM 0:00.00 sleep 10

These are ALL the processes alive on the box after the shutdown -r command is given. This entire time the VM guest is accessible and running normally, including network access, until the 10 minutes are up and it is abruptly killed.

Nothing indicates a savestate was initiated at all:
bash-3.2# tail VBox.log
00:00:06.650788 VUSB: attached 'HidKeyboard' to port 1
00:00:06.650807 VUSB: attached 'HidMouse' to port 2
00:00:09.805507 AIOMgr: Host limits number of active IO requests to 16. Expect a performance impact.
00:00:11.402026 NAT: link up
00:00:11.402332 NAT: DNS#0: 140.198.128.198
00:00:11.402341 NAT: DNS#1: 140.198.128.199
00:00:11.423178 NAT: IPv6 not supported
00:00:11.424196 NAT: DNS#0: 140.198.128.198
00:00:11.424208 NAT: DNS#1: 140.198.128.199
00:00:11.424215 NAT: DHCP offered IP address 10.0.2.15


So, I think the problem is that VBoxSVC gets hung up at 100% CPU due to a dependency being yanked out. Since `launchctl stop org.virtualbox.vboxautostart` works perfectly (without rebooting), I have to believe it has something to do with another service that VirtualBox depends on that is being killed too early by launchd. As you can see in the table, nearly ALL the processes on the box are killed almost immediately, presumably before VBox gets to do its thing. There doesn't seem to be any easy way to create dependencies or set any kind of shutdown "order" with launchd. Ideally you could force a script to run "last" on boot and shut down "first" on shutdown. . . alas, launchd... perhaps someone has written a C application that hooks into every dependency of OS X and wedges good old init.d scripts back into the system. . . I can dream, can't I? Any other ideas out there? Thanks!
socratis
Site Moderator
Posts: 27329
Joined: 22. Oct 2010, 11:03
Primary OS: Mac OS X other
VBox Version: PUEL
Guest OSses: Win(*>98), Linux*, OSX>10.5
Location: Greece

Re: autostop savestate VBoxSVC hung on reboot of host

Post by socratis »

Not sure if this is going to help: viewtopic.php?f=7&t=63537#p298423
briend
Posts: 2
Joined: 29. Jan 2015, 19:32

Re: autostop savestate VBoxSVC hung on reboot of host

Post by briend »

socratis wrote: Not sure if this is going to help: viewtopic.php?f=7&t=63537#p298423
Thanks, but that doesn't seem to be it. Start and stop (savestate) of the VMs is configured and works just fine; I can start and stop the service (launchctl load and launchctl stop). It's just that actually rebooting breaks something else in the system. If it were an init system it would probably be an ordering issue. I'm going to try to turn on DEBUG and see if VBoxSVC is telling me anything. Networking is still functional, so it must be some other dependency. . .
Luicas
Posts: 1
Joined: 17. Feb 2015, 20:32

Re: autostop savestate VBoxSVC hung on reboot of host

Post by Luicas »

Hi, I'm in the same boat, running Yosemite 10.10.2. After fiddling a bit with VBoxAutostartDarwin.sh and adding more signals to the trap, I can finally load and unload the plist and it all works as it should, except during reboot. I also added an exit timeout to the launchd .plist. launchd waits until the timeout expires, then sends the kill signal to the process, and the OS immediately shuts down. So, no time to save the VMs' states.

Code:

trap vboxStopAllUserVms SIGKILL SIGTERM SIGINT SIGHUP HUP KILL TERM EXIT
while true; do
sleep 10
done
Any new insight on this topic?

Regards,
Luis
alextunick
Posts: 1
Joined: 22. Mar 2015, 22:04

Re: autostop savestate VBoxSVC hung on reboot of host

Post by alextunick »

For some reason this forum does not allow me to make a full post here.
So I'm going to give hints only, and later I'll try to post a complete working example.
Everything you have done so far is right. The only missing piece: the trap is triggered too late. As you mentioned, "it has something to do with another service that VirtualBox depends on", and that's true: the user shell has already been dropped. If you try to run a command on behalf of your user (using su, which is exactly what VBoxAutostartDarwin.sh does), it will say that the user does not exist. That is why VBox cannot shut the VMs down (put more logging into the shell script you are running; I used echo to a file to capture it).
The important thing is to put the org.virtualbox.vboxautostart.plist file in ~/Library/LaunchAgents/ instead of /Library/LaunchDaemons/.
This makes the trap fire while we still have a user shell.
launchd has the following levels:
- ~/Library/LaunchAgents Per-user agents provided by the user.
- /Library/LaunchAgents Per-user agents provided by the administrator.
- /Library/LaunchDaemons System wide daemons provided by the administrator.
- /System/Library/LaunchAgents Mac OS X Per-user agents.
- /System/Library/LaunchDaemons Mac OS X System wide daemons.

Putting the plist file in ~/Library/LaunchAgents makes the shell run as the currently logged-in user; when it is under /Library/LaunchDaemons it runs as root.
The issue with VBox: when vbox commands run as root (and it is root when using /Library/LaunchDaemons/*), they do not see the VMs. So you need to somehow tell VBox where those VMs are (but remember that the user is already gone at this point, so you cannot run anything on behalf of that user). Another solution is to register the VMs under root; then it should work (I didn't try it myself, please let me know if you do it that way successfully).
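
The "echo to a file" logging mentioned in this post, combined with a check for whether the account still resolves, could look roughly like the sketch below. LOGFILE, log_msg, user_resolves and check_user are names made up for illustration and are not part of VBoxAutostartDarwin.sh; the assumption is that a failing user lookup shows up as a non-zero exit from id.

```shell
#!/bin/sh
# Hypothetical helpers for tracing what the autostart script sees at shutdown.
LOGFILE=${LOGFILE:-/tmp/vboxautostart-debug.log}

log_msg() {
    # Append a timestamped line to the log file.
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$LOGFILE"
}

user_resolves() {
    # id exits non-zero when the user cannot be looked up.
    id -u "$1" >/dev/null 2>&1
}

check_user() {
    if user_resolves "$1"; then
        log_msg "user $1 resolves (uid $(id -u "$1"))"
    else
        log_msg "user $1 does NOT resolve; su would likely fail here"
    fi
}
```

Calling check_user with the VM owner's name at the top of the trap handler would record whether the lookup still works at the moment the signal arrives.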
BillGates
Posts: 4
Joined: 21. Sep 2014, 20:40

Re: autostop savestate VBoxSVC hung on reboot of host

Post by BillGates »

I wanted VirtualBox and its startup and shutdown procedure to be independent of a user login, so for me LaunchAgents wasn't a solution. I spent some hours on this; it's maybe not the fanciest way of doing it, but it will at least serve as a proof of concept.

For startup I followed the instructions given here:
https://gist.github.com/reidransom/6042016

I skipped the part about changing "KeepAlive" to "true" since that change won't make the autosave on shutdown function work anyway.

So my contribution only takes care of saving on shutdown (or ACPI shutdown if that is what you want, just change the code). This works and has been tested with OS X 10.10.4. Also, my solution is separate from "/etc/vbox/autostart.cfg", so you would have to edit the code below to have it save your VMs (unless your user ID is 501, which is hard-coded in vbox-shutdownd :) )

/Library/LaunchDaemons/no.ifixit.vbox-shutdown.plist

Code:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Disabled</key>
    <false/>
  <key>Label</key>
    <string>no.ifixit.vbox-shutdown</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/local/bin/vbox-shutdownd</string>
  </array>
  <key>RunAtLoad</key>
    <true/>
  <key>LaunchOnlyOnce</key>
    <true/>
  <key>KeepAlive</key>
    <false/>
  <key>ExitTimeOut</key>
    <integer>3600</integer>
</dict>
</plist>
ExitTimeOut is 3600 seconds, or one hour. You might lower this: the script exits by itself when finished, but if something goes wrong you'll have to wait the full hour before the reboot proceeds.

/usr/local/bin/vbox-shutdownd

Code:

#!/bin/sh
export PATH=/bin:/usr/bin:/usr/local/bin

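# Resume the paused savestate jobs as user 501, then terminate this script.
poweroff() {
	sudo -u \#501 -i vbox-shutdown 501
	kill -9 $$
}

# Launch the paused savestate jobs in the background at load time.
sudo -u \#501 -i vbox-startup &
# SIGKILL cannot be trapped, so it is not listed here.
trap poweroff HUP TERM INT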

while true; do
	sleep 86400 &
	wait $!
done

exit 0
This starts two other scripts, one that is launched when the machine starts up and one on shutdown. Notice "sudo -u \#501", which starts the script as the user with ID 501, since the username lookup (dscl) doesn't work too well under launchd (as stated in the posts above). Also notice "kill -9 $$", which kills this script so it doesn't take an hour to exit because of the while true loop, which is needed for the trap.

/usr/local/bin/vbox-startup

Code:

#!/bin/bash
export PATH=/bin:/usr/bin:/usr/local/bin

for vm in `VBoxManage list vms | cut -f2 -d\{ | cut -f1 -d\}`; do
	vbox-halt-save VBoxManage controlvm "$vm"  savestate &
done

exit 0
I had trouble saving the VMs at shutdown. I don't really know why, but maybe it is because dscl is down and VirtualBox can't figure out its environment? What I did to counter that is to launch the command and immediately pause it (done in vbox-halt-save); then at shutdown I can resume the job, which seems to settle the environment issues.
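
For reference, the UUID extraction in vbox-startup relies on "VBoxManage list vms" printing one line per VM in the form "name" {uuid}. A small sketch of the same extraction with sed is shown here; extract_vm_uuids is a made-up helper name. Anchoring on the trailing braces is an assumption about the output format, chosen so that a VM name containing braces does not confuse the parsing the way splitting on the first "{" could.

```shell
#!/bin/sh
# Hypothetical helper: read `VBoxManage list vms` style lines on stdin
# and print only the UUID found in the trailing {...} of each line.
extract_vm_uuids() {
    sed -n 's/^.*{\([0-9a-fA-F-]*\)}[[:space:]]*$/\1/p'
}
```

It would be used as: VBoxManage list vms | extract_vm_uuids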

/usr/local/bin/vbox-halt-save

Code:

#!/bin/bash
export PATH=/bin:/usr/bin:/usr/local/bin

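# Start the given command in the background, then suspend it immediately.
"$@" &
PID=$!
kill -STOP $PID
# Block until the command has been resumed (SIGCONT) and has finished.
wait $PID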
exit 0
/usr/local/bin/vbox-shutdown

Code:

#!/bin/bash
export PATH=/bin:/usr/bin:/usr/local/bin
IFS=$' \t\n'
# UID is a read-only variable in bash, so keep the argument under another name.
TARGET_UID=$1

PIDS=$(ps -ww -axo pid,uid,xstat,command | egrep "^\s*\d+\s+$TARGET_UID\s+11\s+.*vbox-halt-save" | awk '{print $1}')

if [ -z "$PIDS" ]
then
exit 0
fi

kill -CONT $PIDS

REGLIST=$(echo $PIDS | perl -pe 's/([0-9]+)(\s+)/^\\s\*$1\$\|/g' | perl -pe 'chop()' )

while [ $(ps -ww -axo pid | egrep -c "$REGLIST") -gt 0 ]
do
	sleep 1
done

exit 0
This script first resumes all the jobs that were paused at launch, then it monitors the jobs until they have finished.
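
The pause-then-resume idea behind vbox-halt-save and vbox-shutdown can be tried in isolation with a harmless command. The sketch below is only an illustration: start_paused, resume_and_wait and PAUSED_PID are made-up names, and there is an inherent race in that the command may get some work done before SIGSTOP reaches it.

```shell
#!/bin/sh
# Hypothetical helpers demonstrating "start now, run later".
start_paused() {
    # Launch the command in the background and suspend it right away.
    "$@" &
    PAUSED_PID=$!
    kill -STOP "$PAUSED_PID"
}

resume_and_wait() {
    # Let the suspended process continue, then block until it exits.
    # "wait" only works here because this shell is the parent.
    kill -CONT "$1"
    wait "$1"
}
```

For example, start_paused sh -c 'sleep 1; echo resumed' prints nothing until resume_and_wait "$PAUSED_PID" is called. In the scripts above the resuming script is not the parent of the paused jobs, which is presumably why it polls ps instead of using wait.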

One last note. /sbin/shutdown seems to run before launchd begins shutting down its daemons (the traps). So you don't need all this if you use that. Something like this will work.

Code:

mv /sbin/shutdown /sbin/shutdown.org
Then create a script at /sbin/shutdown that runs all your commands; at the end it also needs to launch the original (exec /sbin/shutdown.org "$@"). Your commands could be as simple as "VBoxManage controlvm foo savestate", repeated for every virtual machine. This approach will lead to trouble with El Capitan, though, since it doesn't take lightly to editing /sbin/shutdown (System Integrity Protection guards /sbin). So that is one of the reasons I did it the "right" way with launchd.
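
A rough sketch of such a wrapper is given here, under the assumption that the original binary was renamed to /sbin/shutdown.org as described. VBOXMANAGE, REAL_SHUTDOWN, save_running_vms and wrapper_main are illustrative names; the two variables can be overridden so the logic can be dry-run without touching real VMs.

```shell
#!/bin/sh
# Sketch of a /sbin/shutdown wrapper; all names here are illustrative.
VBOXMANAGE=${VBOXMANAGE:-VBoxManage}
REAL_SHUTDOWN=${REAL_SHUTDOWN:-/sbin/shutdown.org}

save_running_vms() {
    # "list runningvms" prints one line per VM: "name" {uuid}
    "$VBOXMANAGE" list runningvms |
        sed -n 's/^.*{\([0-9a-fA-F-]*\)}[[:space:]]*$/\1/p' |
        while read -r uuid; do
            "$VBOXMANAGE" controlvm "$uuid" savestate ||
                echo "savestate failed for $uuid" >&2
        done
}

wrapper_main() {
    save_running_vms
    # Hand over to the renamed original with the caller's arguments.
    exec "$REAL_SHUTDOWN" "$@"
}
```

The last line of the wrapper script would then be wrapper_main "$@". The VMs are saved one after another here, so shutdown is delayed by the sum of the individual save times.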