Virtual partition vs. raw harddisk access and effects on performance
On my previous Linux 12.04 host I ran Windows XP as a guest OS, using VB's standard virtual harddisk mechanism (i.e. the virtualized harddisk lives in a file on the host OS filesystem). On my new Linux 18.04 host I'm now running Windows 10 as a guest OS, using raw harddisk access for the guest. In other words, I have a separate NTFS partition on my harddisk (separate from the ext4 partitions used by the Linux host OS) which W10 accesses directly rather than as a virtualized disk living in a file on the Linux file system. In the Windows Task Manager I see that Disk 0 (C:) is labelled "VBOX HARDDISK".
I had expected harddisk performance to be better this way than what I had seen previously. It's not.
Instead I see that the significant amount of harddisk activity generated by Windows 10 significantly slows down the host OS as soon as the latter attempts to perform harddisk I/O. I suspect this is because I have two different OSes trying to access the same physical disk, both unaware of the other.
I now realize that, when using standard virtualized harddisks, the physical I/O applies to a file on the host OS (in this case Linux), which means that Linux is the only OS involved in accessing the physical harddisk and is therefore able to optimize the various I/O operations for optimum performance and minimal wasted head motion.
My question: when using raw harddisk access for the guest OS, is the host OS still aware of the guest OS harddisk I/O, and is it still capable of optimizing it? In other words, does "raw" access from the guest environment still go through an optimizing abstraction layer in the host OS, so that the host OS is aware of the guest's physical harddisk I/O and puts it through the same optimization mechanisms as used by the host for regular harddisk access? The fact that in the Windows environment the harddisk is labelled "VBOX HARDDISK" suggests this might be the case; however, this type of harddisk access being known as "raw" suggests the opposite.
Or have I shot myself in the foot from a performance standpoint by having the guest OS perform raw disk access on the same physical device as the host OS is using, rather than allowing the host OS to optimize the guest's I/O by treating it as regular file I/O on the host filesystem?
-
- Site Moderator
- Posts: 27329
- Joined: 22. Oct 2010, 11:03
- Primary OS: Mac OS X other
- VBox Version: PUEL
- Guest OSses: Win(*>98), Linux*, OSX>10.5
- Location: Greece
Re: Virtual partition vs. raw harddisk access and effects on performance
frankvw wrote: I had expected harddisk performance to be better this way than what I had seen previously
Why? You're missing the host's I/O cache...
frankvw wrote: Instead I see that the significant amount of harddisk activity generated by Windows 10
That's what Win10 does, even in real life. My physical Win10 can take up to an hour with the disk activity pinned at 100%. Then it settles. Usually it's the Defender, the Optimization, the .NET, the Installer, everyone and their cousin. Check what's going on inside your guest, and who's doing what...
frankvw wrote: In other words, does "raw" access from the guest environment still go through an optimizing abstraction layer in the host OS in order to make the host OS aware of the guest's physical harddisk I/O and put it through the same optimization mechanisms as use by the host I/O for regular harddisk access?
My understanding is that "No, it doesn't". It has free rein over the hard disk and no optimization/cache options. There may be some low-level calls involved where the host OS gets called, but that would be at a disk/sector level, not at a file optimized-reading level.
frankvw wrote: Or have I shot myself in the foot from a performance standpoint by having the guest OS performing raw disk access on the same physical device as the host OS is using, rather than allow the host OS to optimize the guest's I/O by treating it as regular file I/O on the host filesystem?
Could be. I haven't done any benchmarks, nor do I know or remember anyone who has.
Do NOT send me Personal Messages (PMs) for troubleshooting, they are simply deleted.
Do NOT reply with the "QUOTE" button, please use the "POST REPLY", at the bottom of the form.
If you obfuscate any information requested, I will obfuscate my response. These are virtual UUIDs, not real ones.
Re: Virtual partition vs. raw harddisk access and effects on performance
frankvw wrote: I had expected harddisk performance to be better this way than what I had seen previously
socratis wrote: Why? You're missing the host's I/O cache...
But instead of going through an additional virtualization layer and having to deal with two file systems, the I/O now goes straight to and from the physical partition. Or at least that was my thought.
Windows 10's atrocious disk I/O load aside (on which I agree with you), I came across a section in the user guide that says:
"With VirtualBox, this type of access is called “raw hard disk access”; it allows a guest operating system to access its virtual hard disk much more quickly than with disk images, since data does not have to pass through two file systems (the one in the guest and the one on the host)."
However, I have now come across several benchmarks that indicate that isn't true.
So it looks like I have indeed shot myself in the foot. I'll see if there is a quick and relatively painless way to migrate back to the standard VDI (file based) storage.
-
- Site Moderator
- Posts: 39134
- Joined: 4. Sep 2008, 17:09
- Primary OS: MS Windows 10
- VBox Version: PUEL
- Guest OSses: Mostly XP
Re: Virtual partition vs. raw harddisk access and effects on performance
frankvw wrote: With VirtualBox, this type of access is called “raw hard disk access”; it allows a guest operating system to access its virtual hard disk much more quickly than with disk images, since data does not have to pass through two file systems (the one in the guest and the one on the host).
Maybe on some obscure system that might once have been true. I don't understand how, though. No benchmark citations, I'll bet.
In answer to your question: "raw disk" access is still just as virtual as ever. It is raw in the sense of bypassing the host filesystem and host partitions. Put simply it calls the IOCTL level functions instead of the filesystem API. It does NOT mean direct hardware access (how could it??), and it does NOT equate to fast. On the contrary, on any modern OS that has a DMA driven caching scheme supporting the main filesystem, you can expect raw disk access (i.e. sector level basic CPU controlled access) to be substantially slower.
The fact that the guest OS might implement its own DMA-based caching does not mean that you have no problem: all I/O in a VM is (by definition) simulated using the host CPU, so DMA is not real DMA.
-
- Site Moderator
- Posts: 27329
- Joined: 22. Oct 2010, 11:03
- Primary OS: Mac OS X other
- VBox Version: PUEL
- Guest OSses: Win(*>98), Linux*, OSX>10.5
- Location: Greece
Re: Virtual partition vs. raw harddisk access and effects on performance
frankvw wrote: I'll see if there is a quick and relatively painless way to migrate back to the standard VDI (file based) storage.
- Use Disk2VHD, a Microsoft tool, to save your physical disk in the VHD format. Save it on an external drive that's big enough, or on your Linux partition if you have the space.
- Convert the VHD to VDI, which is native to VirtualBox. You can either read ch. 8.24, "VBoxManage clonemedium", or, even easier, use the nice CloneVDI utility by 'mpack' to do so.
- Create a new VM as close to the original as possible. When prompted for a disk, choose the VDI you just created.
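The conversion step in the list above could look roughly like this from a host shell. It is only a minimal sketch: the file names win10.vhd and win10.vdi are hypothetical, and the convert_to_vdi wrapper and the DRY_RUN variable are just for illustration, not part of VirtualBox. With DRY_RUN left at 1 it only prints the command so you can check it before running anything:

```shell
#!/bin/sh
# Sketch only: file names are hypothetical placeholders.
# DRY_RUN=1 (the default) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

convert_to_vdi() {
    src="$1"   # image produced by Disk2VHD, e.g. win10.vhd
    dst="$2"   # VDI file to create
    if [ "$DRY_RUN" = "1" ]; then
        printf 'VBoxManage clonemedium disk "%s" "%s" --format VDI\n' "$src" "$dst"
    else
        # Real conversion; requires VirtualBox on the host and enough free space.
        VBoxManage clonemedium disk "$src" "$dst" --format VDI
    fi
}

convert_to_vdi "win10.vhd" "win10.vdi"
```

Run it with DRY_RUN=0 once the printed command looks right. The resulting VDI is the one to pick when the new VM asks for a disk.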
mpack wrote: On the contrary, on any modern OS that has a DMA driven caching scheme supporting the main filesystem, you can expect raw disk access (i.e. sector level basic CPU controlled access) to be substantially slower.
Thanks Don for putting in "scientific terms" what I was 80% sure about...
-
- Site Moderator
- Posts: 27329
- Joined: 22. Oct 2010, 11:03
- Primary OS: Mac OS X other
- VBox Version: PUEL
- Guest OSses: Win(*>98), Linux*, OSX>10.5
- Location: Greece
Re: Virtual partition vs. raw harddisk access and effects on performance
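frankvw wrote: I'll see if there is a quick and relatively painless way to migrate back to the standard VDI (file based) storage.
Another way, which I was just told about by a developer and was not aware of, is that cloning a rawdisk VMDK would also work! That is, use:
VBoxManage clonemedium "<YourVMDK>" "<YourNewVDI>" --format VDI
That should create a VDI that reflects the contents of your partition. I've not tried it myself, I was just told that this is a possibility, which actually makes a lot of sense...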
Re: Virtual partition vs. raw harddisk access and effects on performance
mpack wrote: "raw disk" access is still just as virtual as ever. It is raw in the sense of bypassing the host filesystem and host partitions. Put simply it calls the IOCTL level functions instead of the filesystem API. It does NOT mean direct hardware access (how could it??), and it does NOT equate to fast. On the contrary, on any modern OS that has a DMA driven caching scheme supporting the main filesystem, you can expect raw disk access (i.e. sector level basic CPU controlled access) to be substantially slower. The fact that the guest OS might implement its own DMA based caching does not mean that you have no problem: all I/O in a VM is (by definition) simulated using the host CPU, so DMA is not real DMA.
Thank you for clearing that up! This is exactly what I wanted to know, and it is consistent with my perception (not having run any decent benchmarks) on my current system. I'll migrate back to VDI.
Re: Virtual partition vs. raw harddisk access and effects on performance
socratis wrote: Another way that I was just told by a developer, and I was not aware of, is that cloning of a rawdisk VMDK would also work! That is use the:
VBoxManage clonemedium "<YourVMDK>" "<YourNewVDI>" --format VDI
That should create a VDI that reflects the contents of your partition. I've not tried it myself, I was just told that this is a possibility, which actually makes a lot of sense...
I'm going to give that a try tonight. I've got enough room on my host FS so this should be fairly risk-free. Thank you, sir!
-
- Site Moderator
- Posts: 39134
- Joined: 4. Sep 2008, 17:09
- Primary OS: MS Windows 10
- VBox Version: PUEL
- Guest OSses: Mostly XP
Re: Virtual partition vs. raw harddisk access and effects on performance
Well, it would certainly work, but it's a pretty horrible way to clone a Windows drive. It would create a fully "dumb" image, where a 500GB drive creates a 500GB image - no smart discarding of unused sectors or pagefile.
I would rather run Disk2VHD from inside the VM.
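If a "dumb" clone does end up as large as the whole partition, one commonly used way to shrink it afterwards is to zero the free space inside the guest and then compact the image on the host. A hedged sketch, assuming a dynamically allocated VDI called win10.vdi (a hypothetical name) and that free space was already zeroed inside the Windows guest, for example with Sysinternals SDelete (`sdelete -z C:`). The compact_vdi wrapper and DRY_RUN variable are illustration only; by default the command is just printed:

```shell
#!/bin/sh
# Sketch only: win10.vdi is a hypothetical file name.
# Precondition (inside the Windows guest, not on the host):
#   sdelete -z C:     zero the free space so compaction can discard it
# DRY_RUN=1 (the default) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

compact_vdi() {
    vdi="$1"   # dynamically allocated VDI; the VM must be powered off
    if [ "$DRY_RUN" = "1" ]; then
        printf 'VBoxManage modifymedium disk "%s" --compact\n' "$vdi"
    else
        VBoxManage modifymedium disk "$vdi" --compact
    fi
}

compact_vdi "win10.vdi"
```

This does not make the copy any smarter about the pagefile or similar; it only lets VirtualBox drop blocks that are entirely zero.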
-
- Site Moderator
- Posts: 27329
- Joined: 22. Oct 2010, 11:03
- Primary OS: Mac OS X other
- VBox Version: PUEL
- Guest OSses: Win(*>98), Linux*, OSX>10.5
- Location: Greece
Re: Virtual partition vs. raw harddisk access and effects on performance
Sure, for Windows systems. But it's certainly a "trick" that I didn't know was available for a rawdisk VMDK. Plus, it's cross-platform, so it may have its uses for other OSes that don't speak Windoze...
-
- Site Moderator
- Posts: 39134
- Joined: 4. Sep 2008, 17:09
- Primary OS: MS Windows 10
- VBox Version: PUEL
- Guest OSses: Mostly XP
Re: Virtual partition vs. raw harddisk access and effects on performance
There are bound to be smart equivalents of Disk2VHD for other OSes; I expect even CloneZilla has that feature these days. It shouldn't ever be necessary to fall back on dumb copiers.