
Shrinking dynamic disks - why/how?

Posted: 23. Jun 2019, 15:32
by mynick
Hi all,

I've found a few guides online outlining the following process for reducing a dynamic disk image size:
  1. Delete anything you don't need in guest (disk cleanup, uninstall programs, etc)
  2. DEFRAGMENT the disk
  3. Use the SDELETE utility to fill up free space with zeroes
So... a couple of questions here:

1. How does this work?
Does VirtualBox look at the actual data being written to determine if it is all zeroes and handle this case by shrinking the file? Wouldn't that make things too slow?

2. Why defragment?
Does VirtualBox only shrink the file from the end towards the beginning of the disk? Would that mean that if there was an allocated block at the very end of the disk image then no free space before it can be reclaimed (even if you run SDELETE to zero it out)?

Re: Shrinking dynamic disks - why/how?

Posted: 23. Jun 2019, 17:32
by socratis
mynick wrote:Does VirtualBox look at the actual data being written to determine if it is all zeroes and handle this case by shrinking the file?
Yes, whatever is not used (0-ed out sectors) can be safely discarded, which in a dynamic disk reduces its size.
mynick wrote:Wouldn't that make things too slow?
No, why would you think that?
mynick wrote:Does VirtualBox only shrink the file from the end towards the beginning of the disk?
Nope, it discards the empty sectors, they could be anywhere on the disk's point of view; begin, middle or end, it doesn't matter.
mynick wrote:Would that mean that if there was an allocated block at the very end of the disk image then no free space before it can be reclaimed (even if you run SDELETE to zero it out)?
No, see above.
mynick wrote:2. Why defragment?
Because if a file that could fit in N sectors occupies M sectors due to fragmentation (with M>N), then by defragmenting you lower the number of used sectors by M-N.
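One way to picture the effect of data layout on a dynamic image is to count how many allocation blocks a set of used sectors touches. The sketch below is only an illustration under assumptions: 512-byte sectors, the 1MB block size mentioned later in this thread, and a made-up helper named `blocks_touched`. It is not a description of VirtualBox internals.

```python
SECTOR_SIZE = 512                  # assumed sector size
BLOCK_SIZE = 1024 * 1024           # assumed allocation block size (1MB)
SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE


def blocks_touched(sector_numbers):
    """Count the distinct allocation blocks that contain at least one used sector."""
    return len({sector // SECTORS_PER_BLOCK for sector in sector_numbers})


# The same 4096 used sectors (2MB of data), laid out in two different ways.
contiguous = range(0, 4096)            # stored back to back
scattered = range(0, 4096 * 100, 100)  # one used sector every 100 sectors

contiguous_blocks = blocks_touched(contiguous)
scattered_blocks = blocks_touched(scattered)
print(contiguous_blocks, scattered_blocks)
```

In this toy layout the same amount of data keeps far more blocks "in use" when it is scattered, which is the situation defragmenting is meant to improve before zeroing free space.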

Re: Shrinking dynamic disks - why/how?

Posted: 23. Jun 2019, 20:43
by mynick
Thanks for shedding light on this, Socratis. It all makes sense.

My speed concern, now that I think about it, really only applies to things like SDELETE. When writing any other file, chances are that you will meet something non-zero in the first few bytes of the data. It is only when something is writing all zeroes (an exceptional case) that you'd need to check the entire sector to make that decision.

Re: Shrinking dynamic disks - why/how?

Posted: 23. Jun 2019, 20:51
by socratis
mynick wrote:My speed concern, now that I think about it, really only applies to things like SDELETE
I'm sorry, what speed concerns are you referring to?
mynick wrote:When writing any other file, chances are that you will meet something non-zero in the first few bytes of the data
Again, I'm not quite sure what that means...
mynick wrote:It is only when something is writing all zeroes (an exceptional case) that you'd need to check the entire sector to make that decision.
And again I'm afraid... ;)

Could you rephrase your whole last post, I can't really make heads or tails out of it...

Re: Shrinking dynamic disks - why/how?

Posted: 23. Jun 2019, 23:40
by mynick
So, maybe I can clarify why I was concerned with an example: Imagine something doing heavy I/O on a VDI disk. As I understand it VirtualBox does not just write the data to the host filesystem's VDI file, it actually inspects the bytes being written in order to detect "zeroed out sectors" to optimise host disk usage.

What this means is that for every sector it will look at the data to determine whether it is just 'all zeroes' and in my mind this raised alarms that "this will surely slow things down".

Then I had a reality check: how likely is it that data being written is all zeroes? Unless you are using some utility such as SDELETE, chances are that in 99% of cases, a non-zero byte will exist in the first 10 or so bytes of each sector (that's what EXE, DOC, HTML files have in them, not zeroes). This means that VirtualBox most likely does not need to check ALL the bytes being written, just a few per sector before it decides to skip the rest of it.

Hope this clarifies what was going on in my mind.
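The early-exit reasoning above can be sketched in a few lines. This only illustrates the argument; it is not VirtualBox code, and the function name `first_nonzero_offset` and the 512-byte sector size are assumptions made for the example.

```python
def first_nonzero_offset(sector):
    """Return the index of the first non-zero byte, or None if every byte is zero."""
    for offset, value in enumerate(sector):
        if value != 0:
            return offset  # early exit: the rest of the sector is never examined
    return None


SECTOR_SIZE = 512  # assumed sector size

exe_like = b"MZ" + bytes(SECTOR_SIZE - 2)  # starts with the two bytes an EXE header begins with
zeroed = bytes(SECTOR_SIZE)                # what a zero-fill utility would write

exe_result = first_nonzero_offset(exe_like)
zero_result = first_nonzero_offset(zeroed)
```

For the EXE-like sector the scan stops at the very first byte; only the all-zero sector forces a full pass over the data.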

Either way, I've run "defrag" on my Windows guest and then "sdelete" but I did not notice significant reduction in my VDI size. I might try doing this once a year or so, but not more often than that...

Re: Shrinking dynamic disks - why/how?

Posted: 24. Jun 2019, 01:30
by socratis
mynick wrote:Imagine something doing heavy I/O on a VDI disk
Something? Not the guest? The VM?
mynick wrote:As I understand it VirtualBox does not just write the data to the host filesystem's VDI file, it actually inspects the bytes being written in order to detect "zeroed out sectors" to optimise host disk usage.
Not sure where you got that, but that's not how things work. Data written to a guest's disk are written to the VDI unfiltered. That's not what the "dynamic" nature of the VDI means.
mynick wrote:This means that VirtualBox most likely does not need to check ALL the bytes being written, just a few per sector before it decides to skip the rest of it.
It doesn't check, it doesn't inspect, it doesn't do anything but pass the data to the host.
mynick wrote:I've run "defrag" on my Windows guest and then "sdelete" but I did not notice significant reduction in my VDI size
Did you compact the VDI after that?
    VBoxManage modifymedium <uuid|filename> --compact
You may need to read a little bit on "All about VDIs"...
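As a rough mental model of why zeroing free space and compacting are two separate steps, a dynamic image can be pictured as a map of allocated blocks, with compaction as a later pass that drops the blocks containing only zeroes. The sketch below is a toy model under assumptions (a 1MB block size, a plain dictionary as the block map, hypothetical names `compact` and `image_size`); it does not reflect the actual VDI file format or what `--compact` does internally.

```python
BLOCK_SIZE = 1024 * 1024  # assumed block size for this toy model


def compact(blocks):
    """Return a new block map without the blocks that contain only zero bytes."""
    return {index: data for index, data in blocks.items() if any(data)}


def image_size(blocks):
    """Approximate image size: only allocated blocks take up space."""
    return len(blocks) * BLOCK_SIZE


# Four allocated blocks; two of them hold nothing but zeroes after zeroing free space.
image = {
    0: b"\x01" * BLOCK_SIZE,  # live data
    1: bytes(BLOCK_SIZE),     # zeroed free space in the middle of the disk
    2: b"\x02" * BLOCK_SIZE,  # live data
    3: bytes(BLOCK_SIZE),     # zeroed free space at the end of the disk
}
compacted = compact(image)
print(image_size(image) // BLOCK_SIZE, "->", image_size(compacted) // BLOCK_SIZE)
```

In this model the zeroed blocks still occupy space until the compaction pass runs, and a zeroed block in the middle is dropped just as readily as one at the end.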

Re: Shrinking dynamic disks - why/how?

Posted: 24. Jun 2019, 10:59
by mpack
SDelete? Command line compact?

CloneVDI has been around since 2009, any particular reason for not using it?

Incidentally, VirtualBox does indeed filter 1MB blocks on a VDI write, looking for all zeros, but only if the block isn't already allocated. If the block is already allocated then there's nothing to be gained (a block doesn't get deallocated if it's overwritten with zeros later). Since writing to unallocated blocks is a rare event the processing time is negligible, especially next to I/O time overall.
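The write behaviour described above can be modelled roughly as follows. This is a simplified sketch, not VirtualBox's implementation: the class `ToyDynamicDisk` and its methods are hypothetical, and it assumes every write covers exactly one whole 1MB block, whereas real guest writes are much smaller.

```python
BLOCK_SIZE = 1024 * 1024  # 1MB blocks, as described in the post above


class ToyDynamicDisk:
    """Toy dynamically allocated disk: blocks take up space only once allocated."""

    def __init__(self):
        self.blocks = {}  # block index -> block contents

    def write_block(self, index, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("this toy model only accepts whole-block writes")
        # The zero check happens only when the block is not allocated yet.
        if index not in self.blocks and not any(data):
            return  # an all-zero write to an unallocated block stores nothing
        # Otherwise the data is stored as-is; an allocated block is never
        # deallocated here, even if it is overwritten with zeroes.
        self.blocks[index] = bytes(data)

    def read_block(self, index):
        # Unallocated blocks read back as zeroes.
        return self.blocks.get(index, bytes(BLOCK_SIZE))

    def allocated_blocks(self):
        return len(self.blocks)


disk = ToyDynamicDisk()
disk.write_block(0, bytes(BLOCK_SIZE))         # zeroes to an unallocated block
after_zero_write = disk.allocated_blocks()
disk.write_block(1, b"\xff" * BLOCK_SIZE)      # real data allocates the block
disk.write_block(1, bytes(BLOCK_SIZE))         # zeroing it later does not free it
after_overwrite = disk.allocated_blocks()
```

In this model, block 1 stays allocated after being zeroed, which is why a separate compaction pass is still needed to shrink the file.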

Re: Shrinking dynamic disks - why/how?

Posted: 24. Jun 2019, 11:30
by mynick
@mpack:

I had never before used the clone facility! I actually tried it before running the "VBoxManage modifymedium <file> --compact" command mentioned by Socratis: the clone had half the VDI size! Obviously cloning only copies the "used" part of the VDI. This is quite handy. Thanks for pointing it out and for the clarification of what the host actually does "inspect" and how that impact is minimal.

@Socratis:

I was not aware of the "compact" step. So indeed, after running the "VBoxManage modifymedium <file> --compact" there was a significant reduction in file size (it was halved!). With the clarification that this is a two-step process, I now understand that the host does not 'actively truncate' zeroed out sectors and hence there's no performance impact whatsoever (with the caveat of what mpack mentioned).

Thanks for the informative link, I'm reading through that to get a better understanding.

---

Thank you both!

Re: Shrinking dynamic disks - why/how?

Posted: 24. Jun 2019, 15:06
by mpack
mynick wrote:@mpack:
I had never before used the clone facility!
I think perhaps you misunderstood my point. I was not recommending that you use the "VBoxManage clonemedium" function (which gives you nothing that "modifyhd --compact" doesn't, except a safer (not in-place) process). I was recommending that you try CloneVDI, which can compact a virtual drive directly: it does not require you to run SDelete first, nor do you need to use the command line.

Re: Shrinking dynamic disks - why/how?

Posted: 24. Jun 2019, 23:13
by mynick
Ok, then this is indeed the best solution by far. Too many advantages: You keep the same VM (no risk of breaking something). You avoid the lengthy process of running defrag/sdelete. You get a new image which is smaller in size (the goal). Then I suppose you just remove the old image from the VM's disk controller and replace it with the new image. Will try this next time.

Re: Shrinking dynamic disks - why/how?

Posted: 25. Jun 2019, 03:52
by BillG
That is what I do. Select the option to keep the same UUID. Delete the old .vdi and rename the new one to the old name (ie the compacted .vdi simply replaces the old one).

Re: Shrinking dynamic disks - why/how?

Posted: 25. Jun 2019, 09:53
by mpack
If you want to be even lazier, just clone the VDI in situ, using the same path and filename for source and destination. Experienced computer users are by now freaking out at the suggestion, but in fact CloneVDI detects the naming conflict and handles it automatically: it'll do all that name juggling for you behind the scenes (only if cloning succeeded without errors of course), no need for you to rename anything, nor replace storage attachments in the VM settings. The old VDI is left in the folder, renamed as "Original <name.vdi>". You can delete this after you've tested the VM.
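The name juggling described above amounts to two renames once the clone has been written successfully. The sketch below shows that idea only; it is not CloneVDI's code, and the helper `swap_in_clone` and the temporary clone name `disk.vdi.tmp` are hypothetical.

```python
import os
import tempfile


def swap_in_clone(original_path, clone_path):
    """Keep the old image as 'Original <name>' and give the clone the old name."""
    folder, name = os.path.split(original_path)
    backup_path = os.path.join(folder, "Original " + name)
    os.rename(original_path, backup_path)  # the old image stays around as a fallback
    os.rename(clone_path, original_path)   # the clone takes over the old file name
    return backup_path


# Demonstration with small stand-in files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "disk.vdi")
    clone = os.path.join(folder, "disk.vdi.tmp")  # hypothetical temporary name
    with open(original, "wb") as f:
        f.write(b"old image")
    with open(clone, "wb") as f:
        f.write(b"compacted image")

    backup = swap_in_clone(original, clone)
    names = sorted(os.listdir(folder))
    with open(original, "rb") as f:
        current_contents = f.read()
    backup_name = os.path.basename(backup)
```

Because the file name the VM refers to never changes, the storage attachment keeps pointing at a valid image, and the renamed original can be deleted once the VM has been tested.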

As Bill said, you need to select "Keep UUID" to keep VirtualBox and the guest OS happy.

One caveat: never use snapshots or linked clones. It totally kills the option to manage VDIs in this way.