Is VirtualBox good for my application - multiple instances .

Posted: 22. May 2009, 00:20
by mikemikemike
Hello,

Is VirtualBox good for my application - multiple instances of a large model running in parallel?

I'm enticed by VirtualBox, but unfortunately have little background on the IT side. I wonder if anyone on this forum can help me out, including by helping me figure out the right questions to ask and how to ask them. My situation follows (hopefully not in too much detail).

I want to invest some (unfortunately very limited) grant money in hardware ($10,000 or so), and I’m trying to figure out whether I can extend my effective computational power (model runs/dollar) using VirtualBox.

I am running a hydrological model (called WEAP) that runs on Windows XP. I want to use it in a Monte Carlo analysis, which means I’ll need to run it thousands of times, collecting the data from each run. I have a program to call my model and harvest the data. The model is quite large (it takes 1 hr to run on my current workstation, a new Dell T3400 with a 2-core 3.XX GHz processor, RAID 5, 8 GB RAM).
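
For scale, here is a back-of-envelope sketch of how long such a batch takes in wall-clock time, assuming the 1 hour per run quoted above. The run count of 2,000 and the function name `batch_days` are illustrative assumptions, not figures from the project, and the sketch assumes every parallel instance runs at the same speed.

```python
def batch_days(n_runs, hours_per_run=1.0, parallel_instances=1):
    """Wall-clock days to finish n_runs when several instances run side by side.

    Assumes independent runs and identical speed for every instance.
    """
    if parallel_instances < 1:
        raise ValueError("need at least one instance")
    # Runs are processed in waves of `parallel_instances` at a time.
    waves = -(-n_runs // parallel_instances)  # ceiling division
    return waves * hours_per_run / 24.0


if __name__ == "__main__":
    # HYPOTHETICAL batch size of 2,000 runs at 1 hour each.
    for instances in (1, 4, 8):
        days = batch_days(2000, 1.0, instances)
        print(instances, "instance(s):", round(days, 1), "days")
```

Plugging in a measured per-run time for a virtualized instance (instead of the native 1 hour) shows how much slowdown per instance the extra parallelism can absorb.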

I have not formally evaluated the current computational bottleneck. I don’t have the skills to even know where to begin on really doing that. But below is what information I have, some of which comes from the developer of the modeling software:
-The model code is not parallelized, so any one instance of the model can use only a single processor core. Dual-core machines may help because the second core can handle processes other than the model. Beyond that, extra cores are probably not helpful for a single instance.
-One can only run a single instance of the model on each instance of Windows at any one time.
-The model appears to be RAM-limited. To be more specific, using a RAMDisk of 3-6 GB speeds it up tremendously, and it is effectively unrunnable without a ramdisk (set up as a virtual hard drive). So I guess it’s the reading and writing to the model database that is the limiting factor, as opposed to the mathematical calculations, but I’m not sure.
-I’m using Superspeed RamDisk Plus software now, which enables me to make a >4GB RamDisk on a 32-bit Windows XP system.
-Using a RAID also speeds the model up.
-I’ve been told that processor speed is helpful but not the critical bottleneck; I have not verified that myself.

Given all that, how do I figure out if I’d be better off buying fewer multi-core workstations and running them with several Virtualbox machines on each, or more (cheaper) 2 core machines and running Windows XP natively on them? I don’t have access to a multi-core machine to try this out beforehand (unless I can get useful information from experiments on one dual-core machine?).
My concern is that if I put all my eggs in the basket of one big machine, and this doesn’t work out, then I’m stuck with just one expensive machine that doesn’t run the model any faster, and my modeling work will suddenly be intractable. I’m looking at months of computing time, so it’s a significant question for me.

Any thoughts would be much appreciated, including suggestions of specific questions I am not asking, but should be.

Thanks in advance,
Mike

Re: Is VirtualBox good for my application - multiple instances .

Posted: 22. May 2009, 16:49
by vbox4me2
To some degree, yes, VBox would be beneficial, but a multitude of conditions determine whether it is useful or not. For example, if the app pushes the CPU to near 100%, the VBox managing core will have difficulty passing CPU power to the other VMs; if the app stays near 90% CPU, the VBox core does have enough control. On another note, if you manually assign VMs to CPUs (on the host), you can push more than 90% CPU per VM, but it should still stay below 100% to remain effective (the VMs will run at 100%, but not as effectively as at anything less than 100%).

A well-behaved app that stays below 90% CPU will get you what you want, by 50% at most (a quad-core = 4/2 max power shared among 4 VMs).

A no-name quad-core with 16 GB RAM and 1 TB storage, assembled, doesn't have to cost more than $2,000.
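
As a rough way to compare options in the "model runs per dollar" terms of the original question, the sketch below turns these figures into aggregate throughput for a fixed budget. The 50% per-VM efficiency and the quad-core price are the estimates given above; the $800 dual-core price is a HYPOTHETICAL placeholder, and the function name `runs_per_hour` is made up for illustration. Licences, RAM disk software and per-machine RAM limits are ignored.

```python
def runs_per_hour(budget, machine_cost, instances_per_machine, efficiency,
                  hours_per_run=1.0):
    """Aggregate model runs per hour that a budget buys.

    efficiency: speed of each instance relative to a native run (1.0 = native).
    """
    machines = int(budget // machine_cost)  # whole machines only
    return machines * instances_per_machine * efficiency / hours_per_run


if __name__ == "__main__":
    budget = 10_000  # budget stated in the original post
    # Quad-core host, 4 VMs, each VM at about 50% of native speed.
    quad = runs_per_hour(budget, 2_000, 4, 0.5)
    # HYPOTHETICAL: cheap dual-core box running one native instance.
    dual = runs_per_hour(budget, 800, 1, 1.0)
    print("quad-core hosts with VMs:", quad, "runs/hour")
    print("native dual-core boxes: ", dual, "runs/hour")
```

The result is very sensitive to the efficiency factor, so it is worth replacing 0.5 with a number measured on the actual model before buying anything.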

Re: Is VirtualBox good for my application - multiple instances .

Posted: 22. May 2009, 21:55
by fixedwheel
Hi,
mikemikemike wrote: Given all that, how do I figure out if I’d be better off buying fewer multi-core workstations and running them with several Virtualbox machines on each, or more (cheaper) 2 core machines and running Windows XP natively on them?
For calculation purposes: some time ago I ran a single-threaded benchmark on my single-core, non-hyperthreading host and then on my VBox guest, both 32-bit, no VT-x. I got 72% of native memory and floating-point performance and 74% of native integer performance, but your mileage may vary.
mikemikemike wrote: I don’t have access to a multi-core machine to try this out beforehand (unless I can get useful information from experiments on one dual-core machine?).
You could try running one instance in a VBox guest and one natively on the host, or two guest instances on an otherwise "idle" host, to get some experience.
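
A minimal sketch of how such an experiment could be timed, assuming Python 3 is available on both host and guest. The command in `cmd` is a HYPOTHETICAL placeholder workload, to be replaced by whatever launches one model run; `time_command` and `relative_speed` are names made up for this example.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        best = min(best, time.perf_counter() - start)
    return best


def relative_speed(native_seconds, guest_seconds):
    """Guest speed as a fraction of native (0.72 means 72% of native)."""
    return native_seconds / guest_seconds


if __name__ == "__main__":
    # HYPOTHETICAL placeholder workload; swap in the real model launcher.
    cmd = [sys.executable, "-c", "sum(i * i for i in range(10**6))"]
    print("best of 3: %.3f s" % time_command(cmd))
```

Run the same script once natively and once inside the guest, then pass the two timings to `relative_speed`. For a model that takes an hour per run, `repeats=1` is probably enough.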

Re: Is VirtualBox good for my application - multiple instances .

Posted: 22. May 2009, 22:43
by HolgerB
Hey Mike,

That's indeed a very interesting topic!
From my experience, most VMs have performance issues (as vbox4me2 already stated) if the running apps push the CPU to the limit and/or cause heavy I/O. Since you usually do not access a physical hard disk or a physical network card, the virtualisation solution always has to do some "emulation", which can easily eat up available resources.
mikemikemike wrote: -The model appears to be RAM-limited. To be more specific, using a RAMDisk of 3-6 GB speeds it up tremendously, and it is effectively unrunnable without a ramdisk (set up as a virtual hard drive). So I guess it’s the reading and writing to the model database that is the limiting factor, as opposed to the mathematical calculations, but I’m not sure.
Hm, no offence, but the model itself seems to be more I/O-limited. If using a RAM disk speeds things up tremendously, that indicates to me that another approach might help:
Use a host system with a lot of RAM and maybe 64-bit Linux as the operating system. Then you could set up two RAM disks (depending on how much disk space you need for your model database) and simply run the VBox guest HDDs from the RAM disks. All this is theory, of course, and you might get away cheaper by buying 4 fast single-core machines with 8 GB each and simply running WinXP physically on them. It is hard to judge without more knowledge of the structure of your application and further analysis.
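
To see how far a given amount of host RAM would stretch under this approach, here is a rough sizing sketch. It assumes each guest needs its own RAM plus a virtual disk image held entirely on a host RAM disk; the 2 GB host overhead, the example guest sizes and the function names are all HYPOTHETICAL and should be replaced with measured values.

```python
def host_ram_needed_gb(n_guests, guest_ram_gb, guest_disk_gb,
                       host_overhead_gb=2.0):
    """RAM the host needs if every guest's virtual disk lives on a host RAM disk."""
    return n_guests * (guest_ram_gb + guest_disk_gb) + host_overhead_gb


def max_guests(host_ram_gb, guest_ram_gb, guest_disk_gb, host_overhead_gb=2.0):
    """Largest number of guests that fits in host_ram_gb under the same assumptions."""
    usable = host_ram_gb - host_overhead_gb
    if usable <= 0:
        return 0
    return int(usable // (guest_ram_gb + guest_disk_gb))


if __name__ == "__main__":
    # HYPOTHETICAL guest: 1 GB of RAM plus a 6 GB disk image (OS + model database).
    print("guests on a 16 GB host:", max_guests(16, 1, 6))
    print("RAM for 4 such guests: ", host_ram_needed_gb(4, 1, 6), "GB")
```

If the number of guests that fits in RAM is smaller than the number of cores, memory rather than CPU is what limits how many instances one machine can run.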

At work, I know that we run several web servers on one physical host, since it's easier to get the most out of a powerful PC that way.

Hope this helps,
Holger

Re: Is VirtualBox good for my application - multiple instances .

Posted: 12. Jun 2009, 01:16
by mikemikemike
Thanks for all of your thoughts.

I have to confess I'm still not clear on this - I'm going to search around to see if I can find a multi-core machine to 'borrow', and see if I can get this working, or possibly 'gamble' on a quad core machine myself.

If I'm able to get it set up, and manage to learn anything in the process, I'll report back.

Best,
Mike