Copy vdi directly to physical media on windows host

rhlee
Posts: 3
Joined: 17. Oct 2012, 16:53

Copy vdi directly to physical media on windows host

Post by rhlee »

Hi,

Is there a way on Windows hosts to copy a VDI directly to physical media?

I know there is VBoxManage internalcommands converttoraw, but on Windows I can't set the output file to \\.\PhysicalDrive1 for an external drive. Piping to stdout is also disabled on Windows hosts, so I can't pipe it to /dev/sdb in Cygwin.

I had a look at the VDI file format and it looks pretty straightforward, so I'm tempted to write a tool myself. I would just like to check beforehand so that I don't reinvent the wheel.
mpack
Site Moderator
Posts: 39134
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Mostly XP

Re: Copy vdi directly to physical media on windows host

Post by mpack »

The easiest way is to run the VM (any VM which can use the drive), and inside it use the disk imaging software of your choice to write a whole disk "backup" to a shared folder or USB drive. The same backup software can be used to "restore" that image onto another PC. I use Acronis TrueImage (not free). You can get "dd for Windows" and CloneZilla for free - both are IMHO worth every penny. :P
rhlee
Posts: 3
Joined: 17. Oct 2012, 16:53

Re: Copy vdi directly to physical media on windows host

Post by rhlee »

Took a shot at implementing it myself.

Code:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/time.h>
#include <signal.h>


#define HEADER_SIZE 0x200
#define HEADER_STRING "<<< Oracle VM VirtualBox Disk Image >>>"
#define TIME_BUFFER_SZ 10
#define BAR_SZ 40

const char usage[] =
  "Usage:\n"
  "  vddi -i inputFile\n"
  "  vddi [-s] inputFile outputFile\n"
  "    -i only print info, don't copy\n"
  "    -s write sparse data (don't use this for physical devices)\n";


int vdi, raw;
long *map;
unsigned char *block, *zero;


void error(int line, char * file)
{
  printf("[%s:%i] Last set error code is %i: %s\n"
    "Use gdb to catch this SIGTRAP\n",
    file, line, errno, strerror(errno));
  __asm__("int3");
  exit(errno);
}

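unsigned long quadToULong(char* quad)
{
  /* Decode a little-endian 32-bit value; cast before shifting so the
     top byte is never shifted into the sign bit of an int. */
  return
    ((unsigned long)(quad[0] & 0xff)) +
    ((unsigned long)(quad[1] & 0xff) << 8) +
    ((unsigned long)(quad[2] & 0xff) << 16) +
    ((unsigned long)(quad[3] & 0xff) << 24);
}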

long long now()
{
  struct timeval tv;
  gettimeofday(&tv, NULL);
  /* widen first: seconds * 1000000 does not fit in a 32-bit long */
  return ((long long)tv.tv_sec * 1000000) + tv.tv_usec;
}

void finally()
{
  free(zero);
  free(block);
  free(map);
  close(vdi);
  close(raw);
}

void sigInt(int signal)
{
  printf("\x1b[?25h\nAborted");
  finally();
  exit(2);
}


int
main(int argc, char *argv[])
{
  int opt; /* getopt() returns int; -1 is not representable if char is unsigned */
  int infoMode = 0;
  char headerBuffer[HEADER_SIZE];
  char *input, *output;
  long blockOffset, dataOffset, blockSize;
  long long diskSize, blockCount, seekTarget, i, back;
  int sparse = 0;
  long mapSize;
  long long time_buffer[TIME_BUFFER_SZ], deltaT, lastPrint = now();
  float speed;
  int bars, j;
  int timeRemaining;
  
  signal(SIGINT, sigInt);
  
  if(sizeof(long) != 4)
  {
    printf("Error: long is not 4 bytes long\n");
    exit(1);
  }
  
  while((opt = getopt(argc, argv, "i:s")) != -1)
  {
    switch(opt)
    {
      case 'i':
        if(*optarg == '-')
        {
          fputs(usage, stderr);
          exit(1);
        }
        infoMode = 1;
        input = optarg;
        break;
      case 's':
        sparse = 1;
        break;
      case '?':
        fputs(usage, stderr); /* perror() would append an unrelated errno message */
        exit(1);
        break;
    }
  }
  
  if((infoMode && (argc != optind)) ||
    (!infoMode && ((argc - optind) != 2)))
  {
    fputs(usage, stderr);
    exit(1);
  }
  
  if(!infoMode)
  {
    input = argv[optind];
    output = argv[optind + 1];
  }
  
  if((vdi = open(input, O_RDONLY)) == -1)
    error(__LINE__, __FILE__);
  
  if(read(vdi, headerBuffer, HEADER_SIZE) != HEADER_SIZE)
    error(__LINE__, __FILE__);
  
  if(strncmp(headerBuffer, HEADER_STRING, strlen(HEADER_STRING)))
  {
    printf("Could not find header string\n");
    exit(1);
  }
  printf("VDI type: %lu\n", quadToULong(headerBuffer + 0x4c));
  printf("Block offset: %#lx\n",
    blockOffset = quadToULong(headerBuffer + 0x154));
  printf("Data offset: %#lx\n", dataOffset = quadToULong(headerBuffer + 0x158));
  printf("Disk size: %llu\n", diskSize = quadToULong(headerBuffer + 0x170) +
    ((unsigned long long)quadToULong(headerBuffer + 0x174) << 040));
  printf("Block size: %lu\n", blockSize = quadToULong(headerBuffer + 0x178));
  printf("Block Count: %llu\n\n\x1b""7", blockCount = (diskSize / blockSize));
  
  if(infoMode) exit(0);
  
  if(lseek(vdi, blockOffset, SEEK_SET) != blockOffset)
    error(__LINE__, __FILE__);
  mapSize = blockCount * 4;
  map = malloc(mapSize);
  if(read(vdi, map, mapSize) != mapSize)
    error(__LINE__, __FILE__);
  
  if(lseek(vdi, dataOffset, SEEK_SET) != dataOffset)
    error(__LINE__, __FILE__);
  if((raw = open(output, O_WRONLY | O_CREAT, 0666)) == -1)
    error(__LINE__, __FILE__);

  block = malloc(blockSize);
  zero = malloc(blockSize);
  memset(zero, 0, blockSize);
  time_buffer[0] = now();
  for(i = 0; i < blockCount; i++)
  {
    if(map[i] == -1)
    {
      if(sparse)
      {
        if(lseek(raw, blockSize, SEEK_CUR) != ((i + 1) * blockSize))
          error(__LINE__, __FILE__);
      }
      else
      {
        if(write(raw, zero, blockSize) != blockSize)
          error(__LINE__, __FILE__);
      }
    }
    else
    {
      /* widen first: block index * block size overflows a 32-bit long past 2 GiB */
      seekTarget = dataOffset + ((long long)map[i] * blockSize);
      if(lseek(vdi, seekTarget, SEEK_SET) != seekTarget)
        error(__LINE__, __FILE__);
      if(read(vdi, block, blockSize) != blockSize)
        error(__LINE__, __FILE__);
      if(write(raw, block, blockSize) != blockSize)
        error(__LINE__, __FILE__);
    }
    
    back = i - TIME_BUFFER_SZ + 1;
    back = (0 > back) ? 0 : back;
    if((now() - lastPrint) > 250000)
    {
      bars = (i / (float)blockCount * BAR_SZ) + 0.5;
      printf("\x1b""8[");
      for(j = 0; j < bars; j++) printf("=");
      for(j = bars; j < BAR_SZ; j++) printf("-");
      printf("] %.1f%%, ", i / (float)blockCount * 100);

      if((deltaT = (now() - time_buffer[back % TIME_BUFFER_SZ])) == 0)
      {
        printf("v.fast xfer rate");
      }
      else
      {
        speed = TIME_BUFFER_SZ / (deltaT / 1000000.0);
        timeRemaining = ((blockCount - i) / speed) + 0.5;
        printf("%.2fMB/s, eta %02d:%02d:%02d ",
          blockSize * speed / (float)0x100000,
          timeRemaining / 3600, (timeRemaining / 60) % 60, timeRemaining % 60);
      }
      fflush(stdout);
      lastPrint = now();
    }
    time_buffer[(i) % TIME_BUFFER_SZ] = now();
  }
  
  if(sparse && ftruncate(raw, blockSize * blockCount))
    error(__LINE__, __FILE__);
  
  finally();
  
  return 0;
}
Compile with:

Code:

gcc vddi.c -o vddi
Use it in Cygwin, e.g. "vddi myvm.vdi /dev/sdb" or "vddi myvm.vdi image.raw". To dump directly to physical devices, you have to run Cygwin as administrator.
It also has a sparse option (-s) for dumping the VDI to a file. This will save a lot of disk space if the .vdi is not full. But don't use it on physical media, as you end up with old data mixed in with your data, corrupting the filesystem.
The only problem is that writing to flash is slow. I assume it would work a lot faster if I were writing to non-flash storage.

github: https://github.com/rhlee/vddi
mpack
Site Moderator
Posts: 39134
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Mostly XP

Re: Copy vdi directly to physical media on windows host

Post by mpack »

That's impressively compact, but by the same token not very readable. Why didn't you use a struct for the header? The object code would be the same and it would be easier to maintain.
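
To illustrate the struct suggestion, here is one possible sketch. It is not necessarily the layout meant above: a struct overlaid directly on the 512-byte buffer would also work, but depends on packing and a little-endian host, so this version decodes the fields one by one. The struct, its member names and the function names are made up for illustration; the offsets are only the ones the posted program already reads.

```c
#include <stdint.h>

/* Field offsets taken from the posted program; the rest of the
 * header is ignored in this sketch. */
enum {
    VDI_HEADER_SIZE    = 0x200,
    VDI_OFF_TYPE       = 0x4c,
    VDI_OFF_BLOCKS     = 0x154,  /* file offset of the block map */
    VDI_OFF_DATA       = 0x158,  /* file offset of the first data block */
    VDI_OFF_DISK_SIZE  = 0x170,  /* 64-bit virtual disk size in bytes */
    VDI_OFF_BLOCK_SIZE = 0x178
};

/* Hypothetical decoded view of the header. */
struct vdi_info {
    uint32_t image_type;
    uint32_t block_map_offset;
    uint32_t data_offset;
    uint64_t disk_size;
    uint32_t block_size;
};

/* Little-endian decoders that do not depend on host byte order. */
uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

uint64_t le64(const unsigned char *p)
{
    return (uint64_t)le32(p) | ((uint64_t)le32(p + 4) << 32);
}

/* Fill *out from a raw header of VDI_HEADER_SIZE bytes.
 * Returns 0 on success, -1 if the block size is zero (which would
 * make the later blockCount division meaningless). */
int vdi_parse_header(const unsigned char *hdr, struct vdi_info *out)
{
    out->image_type       = le32(hdr + VDI_OFF_TYPE);
    out->block_map_offset = le32(hdr + VDI_OFF_BLOCKS);
    out->data_offset      = le32(hdr + VDI_OFF_DATA);
    out->disk_size        = le64(hdr + VDI_OFF_DISK_SIZE);
    out->block_size       = le32(hdr + VDI_OFF_BLOCK_SIZE);
    return out->block_size ? 0 : -1;
}
```

With something like this, main() would read the header once, call vdi_parse_header(), and use named members instead of repeating the offsets inside printf() calls. Fixed-width types also remove the need for the sizeof(long) check.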

Probably this is already good enough for your own use, but I see a few problems that might affect others (only one that affects you):
  • I see you used the header string as a signature check. I also used that field as a signature, though technically it isn't fixed - it can take any value, and has taken several different values in the recent past, and there are several current tools (e.g. QEMU, CloneVDI) that put different strings there.
  • Likewise, there have been a few different header formats in times past, but perhaps that's long enough ago that you can ignore it.
  • You treat unallocated blocks correctly, but not zero blocks AFAICS. A zero block is indicated by 0xFFFFFFFE in the block map, and also indicates an unallocated block to be filled with zeros - so seeking and reading as you do now will replace zeros with junk (or parts of the VDI header).
  • You currently default to treating unallocated blocks as zero blocks. I'm not sure why you do that - since most disks are only fractionally full the copy would go a lot faster if you simply skipped them.
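
The block map points above can be sketched roughly as follows. The two marker values come from this thread (the posted program tests for -1, i.e. 0xFFFFFFFF, and zero blocks are 0xFFFFFFFE); the enum and function names are made up for illustration.

```c
#include <stdint.h>

#define VDI_BLOCK_UNALLOCATED 0xFFFFFFFFu  /* the -1 the posted code tests for */
#define VDI_BLOCK_ZERO        0xFFFFFFFEu  /* zero block: no data, reads as zeros */

enum block_kind { BLOCK_DATA, BLOCK_ZERO, BLOCK_UNALLOCATED };

/* Decide what a block map entry means before seeking anywhere. */
enum block_kind classify_block(uint32_t entry)
{
    if (entry == VDI_BLOCK_UNALLOCATED)
        return BLOCK_UNALLOCATED;
    if (entry == VDI_BLOCK_ZERO)
        return BLOCK_ZERO;
    return BLOCK_DATA;
}

/* File offset of an allocated block. Only meaningful for BLOCK_DATA.
 * The arithmetic is done in 64 bits so large images do not overflow. */
uint64_t block_file_offset(uint32_t data_offset, uint32_t block_size,
                           uint32_t entry)
{
    return (uint64_t)data_offset + (uint64_t)entry * block_size;
}
```

In the copy loop, BLOCK_DATA would seek and read as now, BLOCK_ZERO would write a block of zeros, and BLOCK_UNALLOCATED could either be written as zeros or skipped on the output, depending on whether the target is known to be clean.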
rhlee
Posts: 3
Joined: 17. Oct 2012, 16:53

Re: Copy vdi directly to physical media on windows host

Post by rhlee »

Thanks for the feedback. I'm relatively new to C, so I'm not always sure how to lay out programs and data structures when I write them myself.

The document I used for the VDI "specification" was viewtopic.php?t=8046. I didn't realise there were zero blocks.

I only just understood your last point about unallocated blocks. It doesn't really matter what they contain because they are unallocated, so yes I can skip them.

TBH, I don't think I'm going to be using this method anymore. Flash drives are just too slow. I'll probably just pass through a faster external drive and rsync the filesystem across to a second partition in single user mode.

I was also wondering, mpack, why not add more output formats (like physical devices or raw images) to your CloneVDI tool?
mpack
Site Moderator
Posts: 39134
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: VirtualBox+Oracle ExtPack
Guest OSses: Mostly XP

Re: Copy vdi directly to physical media on windows host

Post by mpack »

rhlee wrote: I was also wondering, mpack, why not add more output formats (like physical devices or raw images) to your CloneVDI tool?
You can in fact output raw images with CloneVDI: go into "Sector Viewer" and tell it to dump any block of sectors you like, including the whole disk, to a file.

I intentionally keep the focus of CloneVDI narrow and simple. The purpose is VDI optimization and repair, not generic disk cloning. It outputs dynamic VDIs, and that's it. I detest e.g. Linux tools whose user interface and source code have become bloated out of all proportion with little gimmicky features that only one person ever asked for, and in fact didn't really need.