nanosleep() works until it doesn't
Posted: 4. Apr 2013, 03:15
I am developing an audio application, and one of my machines is a quad-core Trinity 4600M with 8 GB of dual-channel memory, running vbox 4.2.10 (Host: Win7 SP1 x64, Guest: openSUSE 12.2 x64). It runs on wall power for maximum performance, and all "power saving" (performance-robbing) features are disabled. The host is not running any notable foreground or background processes other than vbox.
I have a small test program that reads the time, nanosleep()s for 10 ms { .tv_sec=0, .tv_nsec=10000000 }, and then reads the time again. Generally there is about 1 ms of delay beyond the requested 10 ms, and that can wander up to 4 ms occasionally (no big deal for devel work in a virtualized environment). But after about 30 to 120 seconds of these continuous 10 ms nanosleeps, one call suddenly doesn't return for 200 to 3800 ms! This does NOT happen on the bare metal (typically just 80-110 us of extra delay beyond the 10 ms, max observed 1.4 ms).
I tried backing up to vbox 4.2.6 since I saw a nanosleep/SIGALRM fix listed for 4.2.8, but that didn't help. VirtualBox itself does not freeze during this period; it's doing fine. The cores are only 10 to 20% utilized when this happens. The stall doesn't last forever, so I can't get a kernel debugger on it.
I tried nanosleep, clock_nanosleep with CLOCK_MONOTONIC both with and without TIMER_ABSTIME, and the portable select(0, NULL, NULL, NULL, &tv) method. All show the same issue. If I use an 80 ms nanosleep instead, the issue doesn't seem to happen. I can also omit the nanosleep and let the thread burn in a hot loop, and all is well.
Possible vbox bug around small timers?