Performance decrease when running 30+ VMs simultaneously
Posted: 31. Jan 2017, 19:12
Hi there,
Let me describe the host machine first, so you don't think I'm trying some nonsense on a low-cost laptop.
It is a blade server with:
- 2x Xeon E5-2699 v3
- 630 GB RAM (2133 MHz)
Part of the RAM (200 GB) is used as a RamDisk (tmpfs).
I'm currently using VirtualBox 5.1.8 (r111374).
I have created 2 virtual machines, Win7x86 & Win7x64. They have the same settings:
- 2 CPU cores
- 4 GB RAM
- "paravirtualization interface" set to "legacy", because I originally created them in VBox 4.x
- host-only net adapter
Everything else is left as default.
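For reference, the settings listed above can be expressed as a single VBoxManage call. This is only a sketch: the helper name and the host-only interface name "vboxnet0" are assumptions for illustration, and the VMs in question were configured through the GUI rather than with this script.

```python
def modifyvm_command(vm_name, hostonly_if="vboxnet0"):
    """Build a VBoxManage call matching the settings listed in the post.

    `hostonly_if` is a hypothetical interface name; adjust it to the host.
    """
    return [
        "VBoxManage", "modifyvm", vm_name,
        "--cpus", "2",                     # 2 CPU cores
        "--memory", "4096",                # 4 GB RAM, given in MB
        "--paravirtprovider", "legacy",    # "paravirtualization interface"
        "--nic1", "hostonly",              # host-only net adapter
        "--hostonlyadapter1", hostonly_if,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is changed by accident.
    print(" ".join(modifyvm_command("Win7x64")))
```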
Each of the two machines is link-cloned 40 times. All 82 machines reside on the RamDisk.
I run and use only the cloned machines.
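A rough sketch of how such linked clones can be created from the command line. The clone naming scheme, the snapshot name and the RamDisk path are assumptions made up for the example; a linked clone needs an existing snapshot of the base machine to hang off.

```python
def clone_commands(base_vm, snapshot, count, basefolder):
    """Build VBoxManage calls that create `count` linked clones of `base_vm`.

    `snapshot` must name an existing snapshot of the base VM.
    `basefolder` is where the clone files go (here: the tmpfs mount).
    """
    commands = []
    for i in range(1, count + 1):
        clone_name = f"{base_vm}-clone{i:02d}"  # hypothetical naming scheme
        commands.append([
            "VBoxManage", "clonevm", base_vm,
            "--snapshot", snapshot,
            "--options", "link",        # linked clone instead of a full copy
            "--name", clone_name,
            "--basefolder", basefolder,
            "--register",
        ])
    return commands


if __name__ == "__main__":
    # "/mnt/ramdisk" and "base" are placeholders, not the real setup.
    for cmd in clone_commands("Win7x64", "base", 3, "/mnt/ramdisk"):
        print(" ".join(cmd))
```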
They are part of a dynamic malware analysis framework called Cuckoo ( cuckoosandbox org ), which you might have heard of.
Because of that, the machines constantly cycle through start, power-off, and revert.
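To make the cycle concrete, here is a sketch of the three VBoxManage calls one pass through it roughly corresponds to. The headless start type and the use of "restorecurrent" are assumptions about how the framework drives VirtualBox, not something taken from its source.

```python
def cycle_commands(vm_name):
    """Return the VBoxManage calls for one start / poweroff / revert pass."""
    return [
        # Start the VM without a GUI window (assumed start type).
        ["VBoxManage", "startvm", vm_name, "--type", "headless"],
        # Hard power-off, like pulling the plug.
        ["VBoxManage", "controlvm", vm_name, "poweroff"],
        # Revert to the current snapshot so the next run starts clean.
        ["VBoxManage", "snapshot", vm_name, "restorecurrent"],
    ]


if __name__ == "__main__":
    # "Win7x86-clone01" is a made-up VM name.
    for cmd in cycle_commands("Win7x86-clone01"):
        print(" ".join(cmd))
```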
Now to the problem: I'm experiencing a significant performance decrease that develops within a few minutes.
At first, when every VM is powered off and the framework starts them all at once, there is no apparent problem. They run smoothly.
The average duration of the "startvm" command is about 7-10 seconds.
A few minutes and a few reverts/starts later, the "startvm" command starts to struggle and its duration goes up to minutes; 6-10 minutes is not rare.
The number of concurrently running machines drops from 80 to 15 (sometimes even fewer, like 6).
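The "startvm" durations mentioned above can be measured with something along these lines. It is a minimal sketch, not the framework's own timing code: the function names and the 60-second threshold are made up, and the `run` and `clock` parameters exist only so the timing logic can be exercised without VirtualBox.

```python
import subprocess
import time


def timed_startvm(vm_name, run=subprocess.run, clock=time.monotonic):
    """Run `VBoxManage startvm` and return (elapsed seconds, exit code)."""
    cmd = ["VBoxManage", "startvm", vm_name, "--type", "headless"]
    started = clock()
    result = run(cmd, capture_output=True, text=True)
    elapsed = clock() - started
    return elapsed, result.returncode


def slow_starts(durations, threshold_s=60.0):
    """Pick the VMs whose start took longer than `threshold_s` seconds.

    `durations` maps VM name -> measured start duration in seconds.
    """
    return {vm: d for vm, d in durations.items() if d > threshold_s}
```

Feeding the measured values into `slow_starts` separates the normal 7-10 second starts from the multi-minute ones.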
Stranger still, the CPU is practically idle during those long-running "startvm" commands: the load is only ~12.
Note that no disk I/O is performed (I'm monitoring it); everything happens in RAM.
Another observation: when the "startvm" duration climbs to several minutes, the waiting machines do not start one by one. They all start at once.
For example: 10 machines are running and 50 machines are stuck in the "startvm" state; after 6 minutes, 30 of them suddenly start.
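One way to put numbers on the "all at once" behaviour, assuming the moment each "startvm" returns has been logged as seconds since the test began. The function name and the 5-second gap are arbitrary choices for this sketch.

```python
def completion_bursts(finish_times, gap_s=5.0):
    """Group start-completion timestamps into bursts.

    Two completions belong to the same burst when they are at most
    `gap_s` seconds apart. Returns a list of bursts, each a sorted list.
    """
    bursts = []
    for t in sorted(finish_times):
        if bursts and t - bursts[-1][-1] <= gap_s:
            bursts[-1].append(t)   # close enough to the previous completion
        else:
            bursts.append([t])     # gap too large: open a new burst
    return bursts


if __name__ == "__main__":
    # Made-up timestamps: two quick starts, then three that finish together.
    for burst in completion_bursts([10.0, 11.0, 360.0, 361.0, 362.5]):
        print(len(burst), "machines finished around", burst[0], "s")
```

A large burst after a long quiet period would fit the idea that the starts are released together rather than completing independently.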
I have no idea why this is happening. It looks to me like some mutex/lock (in VBoxSVC?) is preventing them from starting.
Has anyone else experienced - or, even better, solved - this problem? Is it a limitation of VirtualBox, or a bug in it?
If anyone is interested in more details, I'd be pleased to provide them.