Cleaning/compacting an image

Era Scarecrow
Posts: 5
Joined: 22. Nov 2012, 04:27

Cleaning/compacting an image

Post by Era Scarecrow »

I've found a few different sources on how to blank a drive (prior to --compact), but some methods don't seem to be very useful. After blanking the free space on a 10 GB drive, its 150 MB differencing file grew to 230 MB. I used sdelete, which allocates new files and zeroes as much space as it can, repeats, and then deletes them all afterwards. This has helped in some cases, but it still leaves part of the image dirty (directory entries and junk data are not removed).

Is there a program that modifies the drive image, cleaning up as much as it can in preparation for --compact (or compression/application exporting)?

If there isn't I'm starting a project that may help in these areas, but that's a ways off still.
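
To make the goal of blanking concrete: compaction can only reclaim allocation blocks that are entirely zero. The following is a minimal sketch, not a VirtualBox tool. It assumes a raw (flat) image stream and a 1 MiB allocation unit; the function name count_zero_blocks is made up for illustration, and a real VDI has a header and block map that this ignores.

```python
import io

BLOCK_SIZE = 1024 * 1024  # assumed allocation unit (1 MiB)

def count_zero_blocks(stream, block_size=BLOCK_SIZE):
    """Return (zero_blocks, total_blocks) for a raw image stream."""
    zero = total = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        total += 1
        # A block is reclaimable only if every byte in it is zero.
        if block.count(0) == len(block):
            zero += 1
    return zero, total

# Tiny in-memory "image": one dirty block, one zero block,
# and one block spoiled by a single stray byte.
demo = io.BytesIO(b"\x01" * 16 + b"\x00" * 16 + b"\x00" * 15 + b"\x07")
print(count_zero_blocks(demo, block_size=16))  # (1, 3)
```

The third block shows why leftover junk matters: a single non-zero byte keeps the whole block allocated.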
noteirak
Site Moderator
Posts: 5231
Joined: 13. Jan 2012, 11:14
Primary OS: Debian other
VBox Version: OSE Debian
Guest OSses: Debian, Win 2k8, Win 7

Re: Cleaning/compacting an image

Post by noteirak »

sdelete only writes zeros to the space marked as free on your disk. It will not create folders or leave junk data. It also will not clean up files or folders for you; it only zeroes what has already been removed.
If you find empty folders and junk data, they were there before sdelete was run.

Also please note that a differencing disk records ALL changes that occurred on the disk, and that includes the zeros written by sdelete.
So you might not be able to get back all your space, since these zeros might not be in the parent image and therefore must be kept in the diff disk.
Undefined data is not the same as "0" data!
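
The point about zeros in a diff disk can be sketched with a toy model. This is an assumption-laden illustration, not VirtualBox's actual on-disk format: the DiffDisk class and its droppable method are hypothetical names, and blocks are just short byte strings.

```python
class DiffDisk:
    """Toy differencing image: stores only blocks written after the snapshot."""

    def __init__(self, parent_blocks):
        self.parent = parent_blocks   # read-only base image (list of bytes)
        self.changed = {}             # block index -> data written in the child

    def write(self, index, data):
        # Every write is recorded, even if the data is all zeros.
        self.changed[index] = data

    def read(self, index):
        return self.changed.get(index, self.parent[index])

    def droppable(self, index):
        """A stored block can go only if the parent already holds the same data."""
        return index in self.changed and self.changed[index] == self.parent[index]

ZERO = b"\x00" * 4
parent = [b"DATA", ZERO]   # block 0 holds old file data, block 1 is already zero
diff = DiffDisk(parent)
diff.write(0, ZERO)        # zeroing free space that sits over old data
diff.write(1, ZERO)        # zeroing a block that was zero in the parent too

print(diff.droppable(0))   # False: the zeros differ from the parent and must stay
print(diff.droppable(1))   # True: identical to the parent, nothing is lost
```

In this model, zeroing free space makes the diff grow whenever the parent holds non-zero data in those blocks, which matches the growth reported in the first post.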
Hyperbox - Virtual Infrastructure Manager - https://apps.kamax.lu/hyperbox/
Manage your VirtualBox infrastructure the free way!
mpack
Site Moderator
Posts: 39134
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Mostly XP

Re: Cleaning/compacting an image

Post by mpack »

The techniques you are discussing have been obsolete since 2009. CloneVDI will compact a drive in a single step, without requiring you to run sdelete first. This works with most popular Windows and Linux filesystems, e.g. NTFS, FATx, EXTx. CloneVDI runs on Windows Hosts directly, or runs happily inside Wine on Linux or Mac hosts. See the CloneVDI sticky in the "Windows Hosts" forum.

CloneVDI can compact a snapshot, but you can't use the resulting hdd clone in the original VM. This kind of disk management is always much easier if you avoid snapshots completely. Note that the VBoxManage compaction method doesn't work with snapshots at all.
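
For reference, the VBoxManage compaction method mentioned above is the modifyhd subcommand with --compact (newer releases also call the subcommand modifymedium). A minimal sketch that only builds the command line, assuming VBoxManage is on the PATH; the helper name compact_command and the path are hypothetical:

```python
import shlex

def compact_command(vdi_path, vboxmanage="VBoxManage"):
    """Build (but do not run) the command line that compacts a base VDI."""
    return [vboxmanage, "modifyhd", vdi_path, "--compact"]

cmd = compact_command("/path/to/disk.vdi")  # hypothetical path
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

This only helps after the guest's free space has been zeroed, and, as noted, not for snapshot (differencing) images.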
Era Scarecrow

Re: Cleaning/compacting an image

Post by Era Scarecrow »

Interesting. So to be clear, does the vboxmanage clonehd do the same thing? (No need for cloneVDI?).

As I understand it, cloneVDI will ignore unused sectors, but I'm not sure if that answered if it does filesystem cleanup (54 pages is a lot to sift through, and the download doesn't contain any documentation). I've worked in the original fat12/fat16 for example, and the 'deleted' files are renamed so they begin with a ?, but the entry isn't actually removed/zeroized (nor are the remaining files shifted). If those entries are removed entirely (and zeroized), then when you compress the entire image you will get a slightly better compression result. From the way it's written, it sounds like it doesn't do folder cleanup/compacting.

As for the snapshots, kinda dumb if you can't make them smaller too; although with the unused sectors zeroed it will compress a bit better anyway, it still seems a waste. I have a setup with a master file and about 6 mini-variants that I modify; after running Windows in them I want to clean them and save the new setup. However, on first boot Windows expands the blank (difference) image to some 50-80 MB. Hmmm... I guess the small variants aren't that important. I just wanted to avoid unnecessary extra work later. Gives me a few new ideas though...
mpack

Re: Cleaning/compacting an image

Post by mpack »

FYI: The CloneVDI download includes release notes, and the discussion thread included frequent mention of that fact.
Era Scarecrow wrote:Interesting. So to be clear, does the vboxmanage clonehd do the same thing? (No need for cloneVDI?).
No, VBoxManage does not do these things.
Era Scarecrow wrote:As I understand it, cloneVDI will ignore unused sectors, but I'm not sure if that answered if it does filesystem cleanup
I have no idea what you mean by "filesystem cleanup". CloneVDI knows what file spaces have been deleted and can therefore be reclaimed. It does not know which additional files you might like to delete.
Era Scarecrow wrote:I've worked in the original fat12/fat16 for example, and the 'deleted' files are renamed so they begin with a ?,
You are confusing directory entries with files. CloneVDI recovers the clusters of deleted files. The vacant directory slots previously used by those files are not relevant.
Era Scarecrow wrote:but the entry isn't actually removed/zeroized (nor are the remaining files shifted).
Neither of those things is necessary or relevant to compaction. Defragmentation of the guest filesystem may make a small difference to the efficiency of compaction - a few percent. I don't usually consider it worth much effort, but if you want that additional few percent then you can run a defrag tool inside the guest prior to compaction.
Era Scarecrow wrote:As for the snapshots, kinda dumb if you can't make them smaller too;
You seem to have misunderstood. CloneVDI doesn't make snapshots smaller, it deletes them and replaces them with a single (merged) compacted VDI clone. If you consider what CloneVDI does to be kinda dumb then by all means look for a better tool, or write your own and contribute it to the community.
Era Scarecrow

Re: Cleaning/compacting an image

Post by Era Scarecrow »

mpack wrote:
Era Scarecrow wrote:I've worked in the original fat12/fat16 for example, and the 'deleted' files are renamed so they begin with a ?,
You are confusing directory entries with files. CloneVDI recovers the clusters of deleted files. The vacant directory slots previously used by those files are not relevant.
True and yet not at the same time. Directories are technically files, just not normally accessed by programs except through special function calls. I'd rather have fully cleaned entries.
mpack wrote:Neither of those things is necessary or relevant to compaction. Defragmentation of the guest filesystem may make a small difference to the efficiency of compaction - a few percent. I don't usually consider it worth much effort, but if you want that additional few percent then you can run a defrag tool inside the guest prior to compaction.
That's not the issue I'm talking about; I'm talking about the directory entries. Defragmentation may help a bit by putting files in the proper order (and moving them toward the beginning of the disk as much as possible). Part of this is compression improvement (small, maybe...), but it's also about privacy. If I downloaded some questionable content (say... porn), deleted it afterwards, and made a backup some time later, I'd rather not have leftover data mentioning something that is no longer present. If, as a side effect, modifying the directory entries to move everything up happens to free a few sectors, great; if not, it still improves things slightly. There are some cases where every KB counts (like floppy images).
mpack wrote:You seem to have misunderstood. CloneVDI doesn't make snapshots smaller, it deletes them and replaces them with a single (merged) compacted VDI clone. If you consider what CloneVDI does to be kinda dumb then by all means look for a better tool, or write your own and contribute it to the community.
I didn't say it was dumb, but not being able to work on snapshots is silly. True it may require a little more indirection.

I've already begun work on a similar tool, but since CloneVDI already does this job nicely, I'll concentrate on directory cleanup/compacting. You'd probably run it before running CloneVDI. As for whether to support snapshots or not... I probably would (only a small addition is needed to allow that feature).
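
A minimal sketch of the directory cleanup idea for FAT12/FAT16, under stated assumptions: 32-byte directory entries, a first byte of 0xE5 marking a deleted entry, and a first byte of 0x00 marking the end of the directory. The function name scrub_deleted_entries is made up, and this works on a raw directory buffer rather than a whole image.

```python
ENTRY_SIZE = 32
DELETED = 0xE5   # first byte of a deleted FAT directory entry
END = 0x00       # first byte meaning "no more entries after this one"

def scrub_deleted_entries(directory):
    """Return a copy of a raw FAT directory with deleted entries blanked.

    The 0xE5 marker is kept so the slot still reads as "deleted";
    replacing it with 0x00 would hide every entry that follows.
    """
    out = bytearray(directory)
    for offset in range(0, len(out) - ENTRY_SIZE + 1, ENTRY_SIZE):
        first = out[offset]
        if first == END:
            break
        if first == DELETED:
            # Wipe the old name, timestamps, start cluster and size.
            out[offset + 1:offset + ENTRY_SIZE] = bytes(ENTRY_SIZE - 1)
    return bytes(out)

live = b"README  TXT" + b"\x20" + bytes(20)       # 11-byte name, attribute, rest
deleted = b"\xe5ECRET  JPG" + b"\x20" + bytes(20)  # deleted entry, name still readable

cleaned = scrub_deleted_entries(deleted + live + bytes(ENTRY_SIZE))
print(cleaned[:ENTRY_SIZE] == b"\xe5" + bytes(31))   # deleted slot blanked
print(cleaned[ENTRY_SIZE:2 * ENTRY_SIZE] == live)    # live entry untouched
```

Shifting live entries up to close the gaps is a separate step not shown here; it would have to keep long-filename entries together with their short entry.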
mpack

Re: Cleaning/compacting an image

Post by mpack »

Era Scarecrow wrote:True and yet not at the same time. Directories are technically files
Exactly so - and as such require no special treatment. If these files are deleted then the clusters will be recovered during compaction, the same as any other file type.

The goals of your project are somewhat confusing, as is your conflation of mostly unrelated subjects. Perhaps you need to understand the scope of the tool you are suggesting. E.g. the purpose of compaction is to free up disk space on the host. Detailed management of the guest's filesystem (e.g. optimization) is better left to guest tools IMHO.
Era Scarecrow

Re: Cleaning/compacting an image

Post by Era Scarecrow »

mpack wrote:The goals of your project are somewhat confusing, as is your conflation of mostly unrelated subjects. Perhaps you need to understand the scope of the tool you are suggesting. E.g. the purpose of compaction is to free up disk space on the host. Detailed management of the guest's filesystem (e.g. optimization) is better left to guest tools IMHO.
Maybe.... If you 7zip an image that has not had its directory entries cleaned vs one that has, you might see quite an improvement, maybe a few megabytes. This does effectively change the size on the 'host' machine. If you use filesystem compression like in NTFS you might see larger improvements.

I've built a small script before that compresses data (zlib) as you type and UUEncodes it in an output comparison box. The results were surprising: adding or removing one character could effectively double the size of the output (for relatively short strings, less than 1 KB). Compression works better when there are fewer breaks in the patterns it can identify.
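
A rough way to test the claim is to compress the same directory layout twice, once with stale deleted entries and once with them blanked. This is a sketch under loud assumptions: the stale bytes are pseudo-random stand-ins, which overstates the effect because real leftovers (old names, timestamps) compress better than random data.

```python
import random
import zlib

ENTRY_SIZE = 32
rng = random.Random(0)  # fixed seed so the run is repeatable

def stale_entry():
    # Stand-in for leftover metadata: 0xE5 marker plus arbitrary old bytes.
    return b"\xe5" + bytes(rng.getrandbits(8) for _ in range(ENTRY_SIZE - 1))

live = b"FILE    TXT" + bytes(21)          # one live 32-byte entry
blank = b"\xe5" + bytes(ENTRY_SIZE - 1)    # deleted entry after cleanup

# 256 live entries, each followed by one deleted slot.
dirty = b"".join(live + stale_entry() for _ in range(256))
clean = b"".join(live + blank for _ in range(256))

# Same raw size, different compressed size.
print(len(dirty), len(clean))
print(len(zlib.compress(dirty, 9)), len(zlib.compress(clean, 9)))
```

Whether the saving reaches megabytes on a real image depends on how many stale entries exist; the sketch only shows the direction of the effect.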
mpack

Re: Cleaning/compacting an image

Post by mpack »

I don't want to be distracted into a discussion of 7zip. In fact I don't think I'm getting anything useful from this discussion, so I think I'll stop now. By all means create your project, then perhaps once I've seen it in action I'll better understand what you were getting at.
Era Scarecrow

Re: Cleaning/compacting an image

Post by Era Scarecrow »

Alright. I'll give a proof of concept and a few files to compare against when it's done. It will probably be FAT12 (floppies), as that will give the smallest examples.