Well, I think I found the explanation. Of course, it is related to the zero-blocks topic pointed out by mpack.
In case anyone else runs into this issue, here is a record of my findings:
On the drive there are some leftover P2P temporary files. While Windows reports that folder at 5.44GB, a zip compression brings it down to 1.6GB. The difference is 3.84GB, quite close to the 3.71GB discrepancy between the vdi size and the size Windows reports. Of course, 7zip uses a different compression approach than the vdi does; as mpack put it: "VirtualBox (and CloneVDI) doesn't allocate 1MB blocks which are filled with zeros, it simply marks the block as all zeros in the blockmap. You could consider this a form of run length encoding".
Just to confirm the findings, I also:
- mounted both vdis (before/after compaction) and did a byte-by-byte comparison of the files. Arsenal Image Mounter has a free version that can mount a vdi and show it as a drive letter in Windows; FreeFileSync, also free, did the comparison. The files are 100% equal.
- read up on sparse files in Windows, but I had not marked any files as sparse...
- found that "The files with the extension “.part” are downloads in P2P sw that are not yet finished... part files always have the size of the final download. The missing parts are zerofilled. In the newer versions, and when using the NTFS file system, you have the option to share your incomplete downloads as “sparse”. This counteracts the mentioned process and therefore saves space on your hard disk."
All in all, it looks like the P2P software creates and uses files that are "somewhat sparse", but on its own, without operating system support. That would explain why the file size and the size on disk reported by Windows are the same. And, as indicated by mpack, compacting the vdi removes the zero-filled areas of the .part files.
It seems to be expected that, on compacted vdis containing .part files, the P2P software will underperform, as it will no longer get the advantage of its zero-fill strategy.
The last check was to remove the P2P files in the non-compacted vdi, compact it, and check the sizes... Yes, now the compacted vdi file size (host) is not smaller than the contents size reported by Windows (guest).
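For anyone who wants a rough idea of how much of a given .part file is zero-filled before compacting, here is a minimal Python sketch. The function name count_zero_blocks is my own invention, and the 1MB block size is simply taken from mpack's quote. Note that it walks the file in 1MB steps from the start of the file, while the vdi blocks are aligned to disk offsets, so the result is only an estimate of what compaction could drop.

```python
BLOCK_SIZE = 1024 * 1024  # 1MB, the block size mentioned in mpack's quote


def count_zero_blocks(path, block_size=BLOCK_SIZE):
    """Return (zero_blocks, total_blocks) for the file at `path`.

    A block counts as "zero" only if every byte in it is 0.
    The last block may be shorter than block_size.
    """
    zero = 0
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += 1
            # bytes.count(0) counts the zero bytes in the chunk
            if chunk.count(0) == len(chunk):
                zero += 1
    return zero, total


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python zero_blocks.py some_download.part
    for name in sys.argv[1:]:
        zero, total = count_zero_blocks(name)
        print(f"{name}: {zero} of {total} blocks are all zeros")
```

If most blocks of a .part file come back as all zeros, that file is a likely contributor to the gap between the compacted vdi size and the size Windows reports.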
CONCLUSIONS
Now I fully understand mpack's statement "doesn't allocate 1MB blocks which are filled with zeros, it simply marks the block as all zeros in the blockmap", and I found the culprit/origin of the unexplained size mismatch...
.part files are "somewhat sparse" at application level. Now I will sleep better...
Thanks all.
Later edit:
By .part files being "somewhat sparse" at application level, I mean:
a) that the OS is not aware of the file being sparse. NTFS supports sparse files, which have the advantage of not consuming disk space. But files have to be marked as sparse; otherwise the OS is not aware that the size on disk is smaller than the file size. In the case of .part files, the P2P program, unless configured to do so, does not mark them as sparse to the OS.
b) as far as I understand, .part files are not true sparse files (https://en.wikipedia.org/wiki/Sparse_file), as the program allocates the file at full size (so there is no advantage of not consuming disk space). But the program uses sparse-file techniques, probably to be fast and to avoid disk-full situations, and perhaps to fill in new incoming data. Sparseness/hole punching have their complexities, which I will not dig into... whoever is interested can look here
https://stackoverflow.com/questions/139 ... it-be-used and here
https://stackoverflow.com/questions/385 ... zero-range
At the end of the day, what matters for the vdi is that CloneVDI identifies zero-filled/non-allocated parts inside the vdi (both in the areas not assigned to any file AND inside files) and works its compacting magic. CAVEAT EMPTOR: I simply do not know if the compaction has a functional impact on the "somewhat sparse" .part files... but it looks to me like the P2P program I use is able to carry on...
Should you wish to check (on Windows) whether a file is "somewhat sparse", there is a program called sparse_checker that can even convert the file to "OS sparse". It is 32bit, you would need VB2015dlls, and you have to retrieve it from the Wayback Machine, here:
https://web.archive.org/web/20110602032 ... ecker.html
Note that NTFS has to be used (a sparse-aware FS). I did a test with a .part file of 873KB size and size on disk, and it went to a size of 813 and a size on disk of 513KB.
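For the other side of the check, i.e. whether the OS already has a file flagged as sparse, a few lines of Python are enough. This is only a sketch under my own assumptions: the function names has_sparse_flag and is_marked_sparse are invented here, and it relies on the st_file_attributes field that Python's os.stat() only provides on Windows. It says nothing about "somewhat sparse" files at application level; it only reads the NTFS attribute.

```python
import os
import stat


def has_sparse_flag(attributes):
    """True if the Windows file attribute bits include the sparse flag."""
    return bool(attributes & stat.FILE_ATTRIBUTE_SPARSE_FILE)


def is_marked_sparse(path):
    """Return True/False on Windows, or None where the attribute is unavailable."""
    # st_file_attributes only exists on Windows, hence the getattr fallback
    attributes = getattr(os.stat(path), "st_file_attributes", None)
    if attributes is None:
        return None
    return has_sparse_flag(attributes)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python is_sparse.py some_download.part
    for name in sys.argv[1:]:
        print(name, "->", is_marked_sparse(name))
```

A .part file that reports False here but contains mostly zero blocks would match the "somewhat sparse at application level" case described above.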
OVER, promise