Size of Snapshot-VDI
Hey,
I have a VDI with Windows XP, and I noticed that snapshots are rather big. I installed new software of about 150 MB, and afterwards the snapshot VDI was about 600 MB in size. After "sdelete -z" and "VBoxManage modifyvdi FILE compact" it was still 500 MB. After a defrag (the base VDI was also defragmented before making the snapshot), zeroing and compacting again, it is now 1 GB.
How can I avoid such a dramatic increase in size when working with snapshots?!
-
mpack
- Site Moderator
- Posts: 39134
- Joined: 4. Sep 2008, 17:09
- Primary OS: MS Windows 10
- VBox Version: VirtualBox+Oracle ExtPack
- Guest OSses: Mostly XP
Re: Size of Snapshot-VDI
AFAIK, you can't compact a snapshot chain with VBoxManage (especially not when applied to the wrong VDI). Therefore sdelete will only make it bigger. Ditto defrag, though I don't even know what you mean by claiming that you defragged the base VDI and the current state separately: the only way I can think of to attempt that would involve creating a new branch off the base VDI and a new current state off that branch, which you then defragged, which will have made it shoot up in size.
You need to delete all the snapshots, merging everything into one file. Then VBoxManage's "compact" will work, or just clone the latest snapshot with CloneVDI with the "Compact" option enabled, then build a new VM around the clone (it must not be mounted back in the original snapshot-ridden VM).
Re: Size of Snapshot-VDI
I mentioned defrag because it moves most of the files to the first clusters of the disk, so there are fewer "new" filled clusters compared to the base VDI. But since the sectors which differ now have to be stored twice, I would have been surprised to see this work out. Still, I tried.
I know that I can merge snapshots into the base VDI or clone to a new VDI, but that's not what I want. In fact, after cloning the "snapshot" occupies even more space on disk, and when merging snapshots, I don't know whether the new snapshot is smaller than the sum of the 2 previous ones.
I want to understand why the snapshots become that large and (thereby) how I can prevent them from growing that fast.
BTW: compacting does work on the snapshot VDI, since it shrank e.g. from 600 MB to 500 MB.
-
mpack
Re: Size of Snapshot-VDI
Tsso wrote: in fact after cloning the "snapshot" occupies even more space on disk
In fact, after cloning there would be no snapshots. This plus setting the "Compact" option produces the minimum possible disk space usage. You would then of course build a new VM around the clone, and delete the old VM.
Re: Size of Snapshot-VDI
Well, I make snapshots because I want to keep a state. If I want to delete the old state(s), I don't need to "clone the snapshot" (in fact the chain of snapshots and the base image); I could just merge the snapshots together. With only relatively small changes in k snapshots, keeping all of those states as clones would occupy ~ k * BASE_IMAGE_SIZE, which is much larger than the snapshots themselves.
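To put rough numbers on that comparison, here is a minimal sketch. The sizes are made up for illustration, and it assumes each full clone ends up about as large as the base image because the changes are small.

```python
def full_clone_usage_mb(base_mb: int, states: int) -> int:
    """Space needed if every kept state is stored as its own full clone.

    Assumes each clone is roughly as large as the base image.
    """
    return states * base_mb


def snapshot_chain_usage_mb(base_mb: int, diff_mbs: list[int]) -> int:
    """Space needed for one base image plus one differencing image per snapshot."""
    return base_mb + sum(diff_mbs)


# Hypothetical sizes in MB, not measurements from this thread.
base_mb = 4000
diff_mbs = [600, 500, 700]

clones = full_clone_usage_mb(base_mb, states=len(diff_mbs))
chain = snapshot_chain_usage_mb(base_mb, diff_mbs)
print(f"k full clones: {clones} MB, snapshot chain: {chain} MB")
```

With these invented figures the chain is still far smaller than k clones, even when each differencing image is several times larger than the change that produced it.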
-
Etepetete
- Posts: 400
- Joined: 7. Oct 2009, 10:19
- Primary OS: MS Windows 10
- VBox Version: VirtualBox+Oracle ExtPack
- Guest OSses: Slackware 14.2
- Location: Berlin
Re: Size of Snapshot-VDI
Hey Tsso, the size of your VDI can only be reduced by overwriting free space with 0's. Defrag moves files by copying them to a new location, which leaves a lot of stale data behind on the disk. sdelete is supposed to overwrite free space with 0's, but I have never had any success with it, not even on my host. I don't use snapshots, so I cannot confirm that the following will work, but it is worth a try. (It works on my single VDIs.) Download and install Eraser in your VM and then create a custom erasing method (check the manual on how to do it). The custom erasing method should overwrite free space with 0's. You can also set this method as the default for erasing files as well as for erasing free space. Then run it in your VM to erase free space. This may take a while, depending on VDI size and PC horsepower. If you try it, let me know what results, if any, it brought you.
Re: Size of Snapshot-VDI
sdelete works for me; nevertheless I tried Eraser. After install: snapshot VDI 660 MB. After zeroing and compacting: 1.08 GB (however this is possible). Another round of sdelete & compact did not change anything.
-
noteirak
- Site Moderator
- Posts: 5231
- Joined: 13. Jan 2012, 11:14
- Primary OS: Debian other
- VBox Version: OSE Debian
- Guest OSses: Debian, Win 2k8, Win 7
- Contact:
Re: Size of Snapshot-VDI
I might misunderstand how all this works, but from my understanding you are approaching this issue in the wrong way:
A snapshot VDI contains the differences between a parent VDI at a point in time and now. This means that when you start zeroing and defragging all over the place, you actually create new differences against the parent all over the disk!
I do not think VirtualBox cares what is written; it still has to record somewhere that the places you've just zeroed actually contain zeros! Otherwise it would have no way to tell "there is no difference" apart from "0's were written". (0 is data in itself; what you make of it is something else.)
So the more you defrag, zero, etc., the bigger the delta you create... and the more your snapshot will grow.
It looks to me like you're seeing the differencing VDI as a complement on top of your parent VDI, which is not the case: it's a difference, not an addition.
The only way is to merge everything into one file, so that you can actually tell what is free space and what is not, since you then have only one place to look.
Hyperbox - Virtual Infrastructure Manager - https://apps.kamax.lu/hyperbox/
Manage your VirtualBox infrastructure the free way!
Re: Size of Snapshot-VDI
noteirak, that's a very good point. I think I also figured out why Eraser enlarged the snapshot: Eraser also erases the space between EOF and EOC (end of cluster), which sdelete most probably does not.
I think the snapshot doubled in size because all those EOF-EOC areas are now 0 and were undefined before. Maybe VBox doesn't recognize that there is no difference, and since it can only store whole clusters, a lot of such clusters have been copied. I don't know the internal structure of a VDI...
Maybe another virtual disk image type handles these problems in a better way?!
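The guess above can be pictured with a small model. This is only a sketch: it assumes the differencing image allocates fixed-size blocks and copies a whole block as soon as any byte in it is written. The 1 MiB block size and the write pattern are assumptions chosen for illustration, not values read from a real VDI.

```python
BLOCK_SIZE = 1024 * 1024  # assumed allocation unit of the image (1 MiB)


def blocks_touched(writes):
    """Return the set of block indices hit by (offset, length) byte writes."""
    touched = set()
    for offset, length in writes:
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        touched.update(range(first, last + 1))
    return touched


def diff_image_growth(writes):
    """Bytes the differencing image must allocate: whole blocks, not bytes written."""
    return len(blocks_touched(writes)) * BLOCK_SIZE


# Hypothetical workload: 1000 small slack-space fills of 512 bytes each,
# scattered so that every fill lands in a different block.
writes = [(i * 2 * BLOCK_SIZE + 4096, 512) for i in range(1000)]

written = sum(length for _, length in writes)
print(f"bytes written by the guest: {written}")
print(f"bytes allocated in the diff image: {diff_image_growth(writes)}")
```

Under these assumptions about 500 KB of scattered zero writes costs roughly 1 GB in the differencing image, which would be consistent with both the 150 MB install producing a 600 MB snapshot and the jump after running Eraser.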
-
noteirak
Re: Size of Snapshot-VDI
To my knowledge there is no other way to handle it. Since data in a differencing disk is undefined unless something is written to it (that's also the whole point), you must write anything that is actually defined.
Zeroing the space to be able to compact is only a gimmick to allow optimization. 0 is still data, unless you know it is not. And there is no way to know that in a differencing disk, since there is always the possibility that it is actual data containing some 0's.
Every virtual disk format handles things the same way: they must emulate sectors, clusters, etc. You still have to know whether your 0 is actual data or a marker for no data.
-
mpack
Re: Size of Snapshot-VDI
Tsso wrote: well, I make snapshots, because I want to keep a state.
That's your choice, but then as I said earlier, you cannot compact the disk.
Really, I gave all the necessary information in my first message in this thread.
Re: Size of Snapshot-VDI
In fact, all zero clusters and unused clusters could be handled the same way:
image := current snapshot
check:
if (cluster defined in image)
--then return cluster from image
--else if (parent snapshot/image exists)
----then image := parent, goto check
----else return 0
In fact the image might use blocks instead of clusters, which makes no real difference regarding this point. But zero blocks/clusters could be optimized by this process as well (and I assumed this is the way it works).
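The lookup described above can be written out as a small runnable sketch. It is a toy model of the mental picture in this post, not VirtualBox's actual implementation; the Image class, the 4-byte block size and the sample contents are all invented for illustration.

```python
BLOCK_SIZE = 4  # toy block size, chosen only to keep the example readable


class Image:
    """A toy disk image: a sparse map of block index -> data, plus an optional parent."""

    def __init__(self, blocks=None, parent=None):
        self.blocks = dict(blocks or {})
        self.parent = parent


def read_block(image, index):
    """Walk from the current image up the chain until a block is defined."""
    while image is not None:
        if index in image.blocks:
            return image.blocks[index]
        image = image.parent
    # No image in the chain defines this block: treat it as zeros.
    return bytes(BLOCK_SIZE)


base = Image({0: b"AAAA", 1: b"BBBB"})
snapshot = Image({1: b"bbbb"}, parent=base)   # block 1 changed after the snapshot
current = Image({}, parent=snapshot)          # nothing written yet

print(read_block(current, 0))  # comes from the base image
print(read_block(current, 1))  # comes from the snapshot
print(read_block(current, 2))  # defined nowhere, reads as zeros
```

Note that "undefined" in a child image means "ask the parent", so returning zeros is only possible once the whole chain has been searched. A child that wants a block to read as zeros while the parent holds data has to store that zero block explicitly.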
-
mpack
Re: Size of Snapshot-VDI
Before you persist in teaching granny how to suck eggs, perhaps I should mention that I'm the author of the aforesaid CloneVDI tool: so I'm well aware of how compaction works, and how to read data from a snapshot chain.
How exactly would you go about zeroing clusters in anything except the current state? Or are you assuming that previous states are already compacted?
-
noteirak
Re: Size of Snapshot-VDI
Tsso, you're confusing data value and data meaning.
Take your original virtual disk and write data to it from sector 100 to 120, for example.
Then you take a snapshot.
You then delete the data in sectors 100, 105 and 110.
Your snapshot differencing disk will then hold the filesystem metadata that marks 100, 105 and 110 as "empty".
If you sdelete your disk, 100, 105 and 110 will now have 0 in them, but since your diff disk has to hold the changes from your parent disk, you are going to have 0 stored in your diff disk for sectors 100, 105 and 110, so MORE data than you would have if you didn't snapshot. And since it is not possible to know whether that data is part of meaningful data or not, you have to keep it, so you will not save any space.
Next, you defrag and sdelete. Let's say the defrag moves the data from sectors 118, 119 & 120 to 100, 105 and 110.
Your snapshot diff disk will now have a copy of the data that was in 118 -> 120 in 100, 105 & 110.
Then you sdelete, which writes 0 into 118 -> 120.
It means that, compared to your original disk, you have 2 changes: 100, 105 & 110 contain other data, which must be recorded in the diff disk, and 118 to 120 contain zeros, which must also be recorded in the diff disk.
So by defragging and sdelete'ing your snapshot disk, you actually increased the differences that must be written to the diff disk. If, instead of a few sectors, you had 10 GB of data spread over these sectors, your diff disk would now be 20 GB in size: 10 GB for the changed data and 10 GB for the 0 data!
Finally, let's say you need to know the state of sector 118. If you compacted the diff disk the way you expect it to work, you would remove the 0's from the diff disk; but then, when you try to read the empty sector, it is not defined in the diff disk, so you get the value from the base disk... which actually contains data! So you couldn't compact it either.
In this case, the 0's in the diff disk have a meaning: you cannot remove them, or you won't know those sectors are 0 compared to the original disk!
In a diff disk, ALL data is meaningful and must be kept. If you didn't keep it, you would lose the changes made since your snapshot.
Only in a disk without a child can you assume that a full sector/cluster of 0 means absence of data.
On a final note, mpack gave you the answer all along: merge everything back into a single disk. There is no way around this, regardless of how you do it.
If you take a snapshot, it means you want to keep the data the way it was at that moment in time. If you think about modifying the data in the parent VDI, you're totally breaking and violating the snapshot idea and contract. If that is what you want, remove the snapshot...
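The sector walkthrough above can be replayed as a toy model. This is a sketch under simplifying assumptions: each disk is a dict of sector number -> contents, sectors are 4 bytes, the filesystem metadata change caused by the delete is not modeled, and the sector contents are invented labels.

```python
ZERO = bytes(4)  # an all-zero sector; 4-byte sectors are a toy assumption


def read(diff, base, sector):
    """Read a sector: the diff disk wins, otherwise fall back to the base disk."""
    if sector in diff:
        return diff[sector]
    return base.get(sector, ZERO)


# Base disk at snapshot time: sectors 100..120 hold data labeled by sector number.
base = {s: f"s{s}".encode() for s in range(100, 121)}
diff = {}  # differencing disk, empty right after the snapshot

# The file in sectors 100, 105, 110 was deleted; sdelete now zeroes the freed sectors.
for s in (100, 105, 110):
    diff[s] = ZERO
after_sdelete = len(diff)

# Defrag moves the data from 118..120 into the freed sectors...
for src, dst in zip((118, 119, 120), (100, 105, 110)):
    diff[dst] = read(diff, base, src)
# ...and a second sdelete zeroes the sectors that were vacated.
for s in (118, 119, 120):
    diff[s] = ZERO
after_defrag = len(diff)

# Naive "compaction" of the diff disk: drop every all-zero sector.
compacted = {s: data for s, data in diff.items() if data != ZERO}

print(after_sdelete, after_defrag)
print(read(diff, base, 118), read(compacted, base, 118))
```

In this model the diff disk holds 3 sectors after the first sdelete and 6 after defrag plus sdelete, and once the zero sectors are dropped a read of sector 118 falls through to the base disk and returns the old data instead of zeros.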
Re: Size of Snapshot-VDI
I never said/thought that you would not understand how it works. I just talked about how it could possibly work and what was my mental model of it.
I have to admit I forgot about 0 values which were not 0 before. So I'm surprised that compacting the snapshot VDI a) worked and b) caused no problem in that respect.
Of course the base image should not be modified. That's the idea of snapshots, you're absolutely right, and I never had in mind to change the base image. I want to prevent the snapshot from accumulating a lot of differences.
And as you already expected, of course I defragged & compacted the base image before making the snapshot. I'm sorry, I should have stated this explicitly.