
Snapshot basics

Posted: 19. Aug 2017, 10:53
by socratis
*** DRAFT ***

Please read the following explanation (original post here) of differencing disks and snapshots (which are built on the concept of differencing disks), and you'll quickly see why snapshots can be really bad when improperly used or misunderstood. Credit to ChipMcK for the nice write-up summarizing our collective knowledge.

  • When a virtual disk is first created for a new virtual machine, it is the base disk for the guest: all of the guest's data is read from and written to that disk image.

    A differencing disk records changes sector-by-sector to the whole disk image, not changes to any file on the disk. VirtualBox does not know which file system is used on the disk image and therefore cannot access any individual file on it; only the guest OS is aware of that information.

    First Snapshot creates a differencing disk for read/write access while the base disk becomes read-only - as the guest modifies its data, the data is written to the differencing disk and the base disk is untouched.

    Second Snapshot creates another, new, differencing disk for read/write access while the first differencing disk becomes read-only along with the base disk.

    Subsequent Snapshots create additional differencing disks, with the preceding differencing disk joining the hierarchy (pecking order/chain) of read-only disks.

    Keep in mind that access to/from the virtual disks is sector-by-sector, not file-by-file.

    When the guest requests that a sector be read, the latest Snapshot is read first. If the sector is not found there (Sector-Not-Found is returned), the next Snapshot in the chain (youngest to oldest) is read, until the base virtual disk is reached. Then the sector on/in the base virtual disk is either read or Sector-Not-Found is returned.
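
To make the read/write rules above a bit more concrete, here is a minimal Python sketch. It is only a toy model and not how VirtualBox is implemented: the names `Image`, `read_sector`, `write_sector` and `take_snapshot` are made up for illustration, and a disk image is assumed to be a plain dictionary from sector number to contents, where a missing key plays the role of Sector-Not-Found.

```python
from typing import Dict, List, Optional

# Hypothetical model: each disk image maps sector number -> contents.
# A sector that is absent from the dict stands for "Sector-Not-Found".
Image = Dict[int, str]


def read_sector(chain: List[Image], sector: int) -> Optional[str]:
    """Read a sector from a chain ordered oldest (base) to youngest."""
    for image in reversed(chain):  # youngest differencing disk first
        if sector in image:
            return image[sector]
    return None  # not present anywhere, not even in the base disk


def write_sector(chain: List[Image], sector: int, data: str) -> None:
    """Writes only ever land in the youngest image; older ones stay untouched."""
    chain[-1][sector] = data


def take_snapshot(chain: List[Image]) -> None:
    """Freeze the current top image and start a new, empty differencing disk."""
    chain.append({})


base: Image = {0: "A", 4: "A"}
chain = [base]
take_snapshot(chain)          # base disk is now treated as read-only
write_sector(chain, 4, "B")   # overwrite a sector that exists in the base
write_sector(chain, 9, "B")   # write a sector the base never had
```

After this runs, `read_sector(chain, 0)` still comes from the base disk, sectors 4 and 9 come from the differencing disk, and `base` itself is unchanged. That is the whole point: the newest data lives only in the youngest image.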

Now, let's try to visualize the above, again keeping in mind that we refer to sectors, not files:
      HD state 1                   Snapshot 1
A---AAA--A-A-AAAA-------    A---AAA--A-A-AAAA-------
AAA----AAA--------------    AAA----AAA--------------
------------------------    ------------------------
------------------------    ------------------------
------------------------    ------------------------

      HD state 2                   Snapshot 2
A---AAA--A-A-AAAA-------    ------------------------
ABA-B--AABBBBB----------    -B--B----BBBBB----------
--------------B----B----    --------------B----B----
------------------------    ------------------------
------------------------    ------------------------

      HD state 3                   Snapshot 3
A-C-AAA--A-A-AAAA--CCC--    --C----------------CCC--
ABA-B--AABCCBB----------    ----------CC------------
--CCC---------B----B----    --CCC-------------------
------------------------    ------------------------
------------------------    ------------------------

    HD state now                 Current state
A-C-AA**-A-A-AAAA--CCC--    ------**----------------
ABA-B--AA**CBB----------    ---------**-------------
--CCC---**----B----B----    --------**--------------
----------*-------------    ----------*-------------
------------------------    ------------------------
I've marked the changes from "Snapshot 2" (the "B" sectors) differently for a reason. Say that you go outside of VirtualBox and you delete "Snapshot 2". Or the file gets corrupted, truncated or somehow modified. And then you try to recreate your hard drive. With all the "B" sectors missing that would be an impossibility.
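
The same point as a small Python sketch, under the same toy assumption that an image is just a dictionary from sector number to contents. The names `flatten`, `layer_a`, `layer_b` and `layer_c` are hypothetical; they stand for the "A", "B" and "C" sectors in the picture and have nothing to do with VirtualBox's real file formats.

```python
from typing import Dict, List

Image = Dict[int, str]  # hypothetical: sector number -> contents


def flatten(chain: List[Image]) -> Image:
    """Merge a chain (oldest first) into the disk the guest would see."""
    disk: Image = {}
    for image in chain:      # younger images overwrite older sectors
        disk.update(image)
    return disk


layer_a = {0: "A", 1: "A", 2: "A"}   # the "A" sectors
layer_b = {1: "B", 3: "B"}           # the "B" sectors ("Snapshot 2")
layer_c = {2: "C"}                   # the "C" sectors

intact = flatten([layer_a, layer_b, layer_c])
broken = flatten([layer_a, layer_c])  # "Snapshot 2" removed behind VirtualBox's back

# Sectors whose contents differ from (or are missing in) the intact disk.
lost = {s for s in intact if intact[s] != broken.get(s)}
```

In this toy run `lost` contains sectors 1 and 3: sector 1 quietly falls back to the stale "A" data and sector 3 is gone altogether. Nothing in the remaining layers can tell you what the "B" contents were.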

People often confuse what a "Snapshot" is with what the "HD state" is. They think that taking a snapshot makes a differential backup of the files, and that if they delete a snapshot manually they can still recover from it. No.
  • Snapshots are NOT backups!
Another illustration of what you can save/recover if you delete "Snapshot 2" in our example:
+ Snapshot 1             <-- Usable
|
+-- Snapshot 2           <-- Deleted, modified or corrupted
  |
  +-- Snapshot 3         <-- Can NOT be used
    |
    +-- Current state    <-- Can NOT be used
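
The rule behind that tree is simple enough to sketch in a few lines of Python. This is only an illustration: the `PARENT` table and `is_usable` are made-up names mirroring the four states in the example, not anything VirtualBox exposes.

```python
from typing import Dict, Optional, Set

# Hypothetical parent links mirroring the tree above: child -> parent.
PARENT: Dict[str, Optional[str]] = {
    "Snapshot 1": None,
    "Snapshot 2": "Snapshot 1",
    "Snapshot 3": "Snapshot 2",
    "Current state": "Snapshot 3",
}


def is_usable(state: str, damaged: Set[str]) -> bool:
    """A state is usable only if it and every ancestor are undamaged."""
    node: Optional[str] = state
    while node is not None:
        if node in damaged:
            return False
        node = PARENT[node]
    return True


damaged = {"Snapshot 2"}
usable = [name for name in PARENT if is_usable(name, damaged)]
```

With "Snapshot 2" damaged, `usable` ends up holding only "Snapshot 1". Everything younger than the broken link depends on it.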
So, if the files that you've been working hard on for the last couple of months are in the "Current state", you're pretty much out of luck.
  • You can NOT mount a Snapshot in another VM.
  • You can NOT start modifying UUIDs in the snapshot chain and hope that the chain will be intact.
  • You can NOT recreate your virtual hard drive.
The only thing that can save you is a full and complete backup and a tested restore process.