Is it implicit discard of transparent compression?

940607
Posts: 57
Joined: 24. Sep 2012, 10:32
Primary OS: MS Windows 10
VBox Version: OSE other
Guest OSses: Archlinux
Location: Russia

Is it implicit discard of transparent compression?

Post by 940607 »

Hi. In the past, when you wrote zeros to a big file in the guest, the dynamic `.vdi` container would grow to contain these zero blocks (even those blocks that were uninitialized). But now I have a CentOS 7 guest in a `.vdi` container running on VirtualBox 5.2, and I see some strange results when writing zeros to disk. For example, I wrote 16 GB with `dd`, but the container, originally 100 MB in size, has only grown by 5 GB. Please note that I created a snapshot before doing that, and the size I'm checking is that of the differencing image.

I also attached a new empty 8 GB image and repeated the test. This time the image only grew by a couple of megabytes. The file systems tested are xfs and ext4, and I also ran mkfs on entire devices.

It looks like I missed some improvement in VirtualBox, but I can't find it in the change log.
socratis
Site Moderator
Posts: 27329
Joined: 22. Oct 2010, 11:03
Primary OS: Mac OS X other
VBox Version: PUEL
Guest OSses: Win(*>98), Linux*, OSX>10.5
Location: Greece

Re: Is it implicit discard of transparent compression?

Post by socratis »

From a discussion that I had with a developer on this subject, the response was "We're smart about that!". ;)

So, there must be something in the logic that says "If sector_contents == 0 then dont_bother()". And I doubt that you'd find that piece of information in the change log, just as you wouldn't find that Audio_RDP is not enabled by default, but only when the RDP server is also enabled, or that 3D acceleration is specifically logged in the VBox.log these days.
Do NOT send me Personal Messages (PMs) for troubleshooting, they are simply deleted.
Do NOT reply with the "QUOTE" button, please use the "POST REPLY", at the bottom of the form.
If you obfuscate any information requested, I will obfuscate my response. These are virtual UUIDs, not real ones.
mpack
Site Moderator
Posts: 39134
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Mostly XP

Re: Is it implicit discard of transparent compression?

Post by mpack »

It has always been like that.
  • If you zero fill a 1MB block and NONE of the sectors in that block have ever been written to before (i.e. the 1MB block is not already allocated in the VDI) then VirtualBox will simply mark the block (in the blockmap) as containing zeros and still will not allocate any host disk space. VirtualBox lazy writes accommodate this.
  • If any part of the block is non zero then VirtualBox must allocate a 1MB block of disk space, hence you'll see the VDI grow and largely consist of zeros. If your guest disk contains lots of little files (<<1MB) then these are quite likely to occupy only part of a 1MB block, so this case will happen a lot.
  • If the block was already allocated before you zero filled it, then VirtualBox will fill the block with zeros on disk, because it can't undo an allocation. The zero filled block will however be discarded the next time you clone or compact the VDI.
In other words the precise effect on the VDI of zero filling a guest filesystem is quite hard to predict, since the result is content sensitive. I'm pretty certain that VirtualBox has not changed, only your scenario did.
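As a rough illustration of the three cases above, here is a toy model in Python. It is only a sketch of the behaviour described in this post, not VirtualBox's actual code: the class `ToyImage`, the marker values `UNALLOCATED` and `ZERO`, and the restriction to whole-block writes are all assumptions made for the example.

```python
BLOCK_SIZE = 1024 * 1024  # 1MB blocks, as described above

# Hypothetical marker values, chosen for this sketch only.
UNALLOCATED = -1
ZERO = -2


class ToyImage:
    """Toy model of a dynamic image: a block map plus allocated blocks."""

    def __init__(self, n_blocks):
        self.block_map = [UNALLOCATED] * n_blocks
        self.data = []  # allocated blocks, in file order

    def write_block(self, index, payload):
        # Simplification: the guest always writes one whole block.
        assert len(payload) == BLOCK_SIZE
        is_zero = not any(payload)
        slot = self.block_map[index]
        if slot >= 0:
            # Already allocated: overwrite in place, even with zeros,
            # because the allocation is not undone.
            self.data[slot] = bytes(payload)
        elif is_zero:
            # Never allocated and all zeros: only mark it in the map.
            self.block_map[index] = ZERO
        else:
            # Some non-zero content: allocate a full block.
            self.data.append(bytes(payload))
            self.block_map[index] = len(self.data) - 1

    def allocated_bytes(self):
        return len(self.data) * BLOCK_SIZE


img = ToyImage(4)
img.write_block(0, bytes(BLOCK_SIZE))             # zero fill, never written
size_after_zero_fill = img.allocated_bytes()
img.write_block(1, b"\x01" + bytes(BLOCK_SIZE - 1))  # one non-zero byte
size_after_data = img.allocated_bytes()
img.write_block(1, bytes(BLOCK_SIZE))             # zero fill, already allocated
size_after_rezero = img.allocated_bytes()
```

In this model the first write costs no space, the second allocates a full 1MB block for a single non-zero byte, and the third leaves that block allocated even though it now holds only zeros.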
socratis

Re: Is it implicit discard of transparent compression?

Post by socratis »

mpack wrote: If the block was already allocated before you zero filled it, then VirtualBox will fill the block with zeros on disk, because it can't undo an allocation.
Is there a technical reason why the allocation cannot be "undone"? Or is it time/CPU consuming and it can't happen online/real-time?
mpack

Re: Is it implicit discard of transparent compression?

Post by mpack »

Well, you could certainly write the "zero block" marker to the block map, but that (a) leaves an allocated block orphaned in the file (VDI has no structure for tracking deleted blocks for reuse), (b) invalidates one of the ways of checking a blockmap for corruption, which is to check that the number of allocated, unallocated and zero blocks tallies with the physical file size, and (c) is pointless anyway, since the zero marker is intended to indicate "no need to allocate a file block for this", which is beside the point if the block is already allocated.
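Point (b) can be sketched as a small check. This is a toy version under stated assumptions, not the real VDI validation: the function `blockmap_tallies`, the `data_offset` parameter (standing in for whatever precedes the block data in the file) and the negative marker values are hypothetical.

```python
BLOCK_SIZE = 1024 * 1024  # 1MB blocks

# Hypothetical marker values, chosen for this sketch only.
UNALLOCATED = -1
ZERO = -2


def blockmap_tallies(block_map, file_size, data_offset):
    """Return True if the map is consistent with the physical file size."""
    allocated = [entry for entry in block_map if entry >= 0]
    # Every allocated entry must point at a distinct slot with no holes.
    if sorted(allocated) != list(range(len(allocated))):
        return False
    # The file must be exactly big enough for the allocated blocks.
    return file_size == data_offset + len(allocated) * BLOCK_SIZE


data_offset = 4096  # assumed size of everything before the block data
file_size = data_offset + 3 * BLOCK_SIZE

healthy = blockmap_tallies([0, 1, 2, UNALLOCATED], file_size, data_offset)

# Writing the zero marker over an allocated entry orphans slot 1:
# the file still holds three blocks but the map only accounts for two.
orphaned = blockmap_tallies([0, ZERO, 2, UNALLOCATED], file_size, data_offset)
```

Here `healthy` is true and `orphaned` is false, which is the sense in which an orphaned block would make a sound image look corrupt to this kind of check.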

Hmm. Another possibility is to fill the gap, e.g. swap the last allocated block for the deleted block, amend the block table: a kind of trim operation. That puts it under the control of the guest OS, but not in a very convenient way. ISTR VirtualBox has support for an SSD trim operation, but that will be about deleted sectors, not zero sectors, with the guest OS (or the virtual drive? I have not looked) maintaining a structure which lists deleted sectors.
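The "fill the gap" idea could look roughly like this on a toy model. It is a sketch of the swap described above, not something VirtualBox is known to do: the function `release_block`, the in-memory `data` list standing in for the file's block area, and the marker value `ZERO` are assumptions for illustration.

```python
# Hypothetical marker value, chosen for this sketch only.
ZERO = -2


def release_block(block_map, data, index):
    """Free the block at `index` by moving the last block into its slot."""
    slot = block_map[index]
    if slot < 0:
        return  # nothing allocated, nothing to release
    last = len(data) - 1
    if slot != last:
        # Move the last physical block into the gap ...
        data[slot] = data[last]
        # ... and repoint whichever map entry owned the last block.
        owner = block_map.index(last)
        block_map[owner] = slot
    data.pop()               # the file shrinks by one block
    block_map[index] = ZERO  # the released block now reads as zeros


block_map = [0, 1, 2]
data = [b"a", b"b", b"c"]   # tiny stand-ins for 1MB blocks
release_block(block_map, data, 0)
```

After the call the map is `[ZERO, 1, 0]` and only two blocks remain, so the map and the file length stay in step, at the cost of copying a whole block for every release.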
socratis

Re: Is it implicit discard of transparent compression?

Post by socratis »

Thanks a lot mpack for the explanation, much appreciated!