virtualbox.org

I did a bit of benchmarking, testing and tuning of VirtualBox and how it interacts with Intel Nehalem and Westmere + Intel SMT (Simultaneous multithreading EG: HyperThreading).
I did so on the following Hardware + OS(s):
Sun/Oracle x6270 Blades - 2x Intel Nehalem X5570 = 8 Physical, 16 Virtual Cores + OpenSUSE 11.1/11.2, SLES 11, Oracle Linux 5.5, CentOS 5.4/5.5, Windows Server 2008 R2 x64
Sun/Oracle x4170 Servers - 2x Intel Nehalem X5570 = 8 Physical, 16 Virtual Cores + OpenSUSE 11.1/11.2, SLES 11, Oracle Linux 5.5, CentOS 5.4/5.5, Windows Server 2008 R2 x64
HP Z800 Workstations - 2x Intel Nehalm W5580/X5570 = 8 Physical, 16 Virtual Cores + OpenSUSE 11.1/11.2, SLES 11, Oracle Linux 5.5, CentOS 5.4/5.5, Windows Server 2008 R2 x64
HP Z800 Workstations - 2x Intel Westmere = 12 Physical, 24 Virtual Cores + OpenSUSE 11.1/11.2, SLES 11, Oracle Linux 5.5, CentOS 5.4/5.5, Windows Server 2008 R2 x64

One common thread between all of these HW configurations is that the "Optimal Defaults" in the system BIOS is to have SMT enabled. Benchmarking and testing outside of VirtualBox yielded an overall observation that unless an application was heavily threaded or otherwise written to take advantage of a multi-core architecture there was no benefits to be had. However having Intel SMT "On" did not hurt actual system performance baseline or existing applications that that were single-threaded or non-multi-core aware. Sparing details actual performance differences across the board really boiled down to the efficiency of the OS scheduler itself.

Taking this data into account much the same behavior transgressed to a muli-core guests. As expected I found that much like other hypervisors (eg: KVM/VMware) one can always "over-subscribe" cores in this case real or virtual. Just bear in mind that that in doing so your mileage may vary on actual shared CPU cycles between the guests based mostly on the ability of your OS scheduler. The schedulers overall awareness of the base NUMA architecture and it's ability to context switch and efficiently schedule between cores will be your main focus when over-subscribing cores even in the case of Intel SMT.

A side note here somewhat but not directly related is that on the newer Intel Nehalem/Westmere/Core architectures lies an interesting animal known as "Turbo Boost". Turbo boost compliments Intel SMT. In general Intel Turbo boost will up the clock of a particular core(s) when the OS requests a P0 power state of the associated core(s) to which the process thread has been scheduled. Again efficiency of the OS scheduler plays into when a P0 state is kicked in for a particular process or core. In general VirtualBox guest processes would only kick the associated virtual/real OR real core into a P0 state when the guest core count was allocated the optimum for applications needs within the guest. Basically this says that if you have processors sitting 50% or so idle in a guest, it may be prudent to trim back on your core count to the guest. You mileage may vary on this based on the actual application load or purpose within the guest. It will take some benchmarking, testing and tuning with your particular guest(s) and associated workloads on your target HW/Base OS configuration to capitalize on and get the most out of it.

Hi Crash,

Thank you for pointing out what I had suspected for a while.

I have two x5650 (6 cores + 12 Hyperthreads) servers running on Supermicro X8SAX motherboards. host: CentOS 5.5 X64. Server 1: Guest: Windows Server 2008 x32. Server 2: Windows Server 2003 x32. Both are used as Terminal Servers. Everyone noticed no faster on four cores than two.

What I did notice under 3.1.6 and 3.1.8 was that more than two cores caused a CPU race condition that took the guest to their knees. I had to knock them both back to two cores to get the system to stabilize. I also noticed that if I wanted any stability at all, the guest machines had to be configured based on physical cores and not virtual cores. 3.2.0 did not work at all due to a bug in Nested Paging. But, 3.2.2 seems to have done the trick. I now have the 2003 server running four cores with no CPU storms. I do peg 100% for about 5 seconds when ever someone logs in, but it goes back down to 5 to 15% afterward. (Under 3.1.6/8 it would just stay at 100%.) I am about to put the Windows Server 2008 back up to four cores (it has been on 3.2.2 since last week on two cores).

The WS2008 server, running two cores, pegs at 100% for about 45 seconds whenever someone log in, but does go back to normal.

I also noticed that an odd number of multiprocessors (3, 5) gave certain M$ software all kinds of trouble (program crashes, Terminal Server screen freezes, CPU storms), especially M$SQL server. With 3 or 5 processors, I got to kiss stability goodbye.

And as an aside, I have to constantly tell my Terminal Services folks that TS does not mean an unlimited amount of users running unlimited numbers of applications on a single, powerful computer. I ask them just how many instances of a particular program can they run on their local computer before it bogs down. They don't seem to get the point.

So the moral of the story is that if your program is only using one core, having extra cores on standby does not help performance.

I really appreciate you sharing your results.

-T

I spoke too soon. My Windows Server 2003 guest used as a Terminal Server CPU stormed out today and I had to set it from four cores to two.
http://www.virtualbox.org/ticket/6928

Rats!

-T

AFAIK, MS advises to use 2 CPUs for Terminal Services, more isn't recommended anyway.

Well you will not see much improvements until you are able to set affinity and select which (2) can go to which guest. And Sas is correct as I was told to use no more then (2) on terminal services as well. Seems to be the suggestion at MS. But I wonder why you are asking these questions here Todd. It is clear that you are using this in a commercial environment. You do get service with your purchased license.

Perryg wrote:Well you will not see much improvements until you are able to set affinity and select which (2) can go to which guest. And Sas is correct as I was told to use no more then (2) on terminal services as well. Seems to be the suggestion at MS. But I wonder why you are asking these questions here Todd. It is clear that you are using this in a commercial environment. You do get service with your purchased license.

Thank you!

Have not been able to find where to sign up for commercial tech support.

Can you recall where you saw the two core Terminal Server recommendation?

-T

Very interesting.

crash0veride : Please tell us which Host OS you use...

ToddAndMargo wrote:
Perryg wrote:Well you will not see much improvements until you are able to set affinity and select which (2) can go to which guest. And Sas is correct as I was told to use no more then (2) on terminal services as well. Seems to be the suggestion at MS. But I wonder why you are asking these questions here Todd. It is clear that you are using this in a commercial environment. You do get service with your purchased license.
Thank you!

Have not been able to find where to sign up for commercial tech support.

Can you recall where you saw the two core Terminal Server recommendation?

-T

Commercial licenses are available from Oracle, although at the moment you have to purchase from Oracle sales or an Oracle partner.
The part number to order is: XVBII-LCO-9929 and is called a VirtualBox per User Perpetual License Entitlement. (Roughly $50 list and you can ask for Premier Support with that)
1-800-633-1058

I don't recall the name but it was MS tech support that told me to use (2) CPU's

A few colleagues at work also said this about TS. They got the answer from both MS and VMWare (we use ESX at work). They were seeing some weird behaviour, lots of kernel times and once it got back to 2 CPU's, it was all gone.

crash0veride : Please tell us which Host OS you use...

For me this is a rather loaded question

I run a variety of Operating Systems and VirtualBox combinations. I change things up based mainly on what fully supports and performs the best with the current HW technology I have brought in and am working with at the time. I also very dilligent on keeping base OS up to date.
Fully supports means that the Operating System and it's core kernel and supporting functions fully understands and knows how to deal with and maximize on the underlying hardware architecture.
I have a mantra: "Just because the OS loads and runs on the HW does not mean it everything is fully supported nor optimal."
What I mean is optimally a 2.6.32+ kernel would be best suited for Westmere architecture. A 2.6.27 kernel would load run and work but is not optimal, however if it was a Nehalem architecture then 1.5 years ago this was an optimal kernel for a for that architecture. (This I refer to mostly in the context of the newest OpenSUSE/SLES Kernels) A 2.6.18 kernel with tons of back ported functionality (EG: RHEL 5.4/5.5, OEL 5.4/5.5, CentOS 5.4/5.5) would also be a good contender however with the level patches and back ports it is really not a 2.6.18 kernel anymore more like a 2.6.30+ kernel with a 2.6.18 kernel ABI. I will say that when it comes to the current OpenSUSE 11.2 or SLES 11 SP1 vs RHEL/OEL/CentOS 5.5 that the performance differences between SUSE and Redhat as pertains to VirtualBox is a wash. Additionally at their core the current Microsoft windows kernels are actually very well optimized and hold their own (Win7 x64 and Win2k8R2 x64). It will really boil down to two things:
1) Personal preference on what OS you feel the most comfortable working with and supporting on the HW technology you have chosen.
2) The OS that best fully supports and is optimized for your chosen HW technology in combination with the intended use case, in my case for example a VirtualBox host/server.

Here are a few HW configurations and OS combination examples:
----
- Sun/Oracle Blade 6000
- Sun/Oracle x6270 blades
- Sun/Oracle Blade 6000 Virtualized NEM
- Sun/Oracle Amber Road (Storage 7000)
- 10 GBE and 1 GBE + 802.3ad
Current Primary Operating System(s): CentOS 5.5 x64, OpenSUSE 11.2 x64
----
- Sun/Oracle x4170
- Internal Storage (SSD)
- Sun/Oracle J4200 (DAS)
- 1 GBE and 1 GBE + 802.3ad
Current Primary Operating System(s): OpenSUSE 11.2 x64
----
- Sun/Oracle x4140
- Internal Storage
- 1 GBE and 1 GBE + 802.3ad
Current Primary Operating System(s): OpenSUSE 11.2 x64 or CentOS 5.5 x64
----
- Sun/Oracle x4100m2
- Fibre Channel SAN (EMC)
- 1 GBE and 1 GBE + 802.3ad
Current Primary Operating System(s): CentOS 5.5 x64, SLES 11 SP1 x64
----
- Sun/Oracle x4450
- Internal Storage
- 1 GBE and 1 GBE + 802.3ad
Current Primary Operating System(s): Centos 5.5 x64, Win2k8R2 x64
----
- HP Z800 (Nehalem, Tylersburg B3)
- Internal Storage
- 1 GBE
Current Primary Operating System(s): OpenSUSE 11.2, Win 7 x64
----
- HP Z800 (Westmere, Tylersburg C2)
- Internal Storage
- 1 GBE
Current Primary Operating System(s): OpenSUSE 11.2, Win 7 x64
----
- Sun/Oracle Ultra 27
- Internal Storage
- 1 GBE
Current Primary Operating System(s): OpenSUSE 11.2, OpenSolaris b134
----
- ASUS G73JH
- Internal Storage (SSD/SATA)
- 1 GBE
Current Primary Operating System(s): OpenSUSE 11.2, Win 7 x64
----

virtualbox.org

VirtualBox and Intel Nehalem/Westmere + Intel SMT

VirtualBox and Intel Nehalem/Westmere + Intel SMT

Re: VirtualBox and Intel Nehalem/Westmere + Intel SMT

Re: VirtualBox and Intel Nehalem/Westmere + Intel SMT

Re: VirtualBox and Intel Nehalem/Westmere + Intel SMT

Re: VirtualBox and Intel Nehalem/Westmere + Intel SMT

Re: VirtualBox and Intel Nehalem/Westmere + Intel SMT

Re: VirtualBox and Intel Nehalem/Westmere + Intel SMT

Re: VirtualBox and Intel Nehalem/Westmere + Intel SMT

Re: VirtualBox and Intel Nehalem/Westmere + Intel SMT

Re: VirtualBox and Intel Nehalem/Westmere + Intel SMT