A procedure to back up a large disk image across several DVDs.

Hi,

This is a procedure for backing up a large image file to DVD; in my case, a backup of the 3
W7H partitions of a laptop. The method requires enough free space on another disk to store the
entire image, with room to spare (I used a 500 GiB external disk, connected via USB and eSATA,
formatted as XFS). The DVDs are compressed using zisofs, and they carry extra recovery data created
with “dvdisaster”.

Requires:

dvdisaster: http://www.dvdisaster.com

dvdisaster provides a margin of safety against data loss on CD and
DVD media caused by scratches or aging. It creates error correction
data, which is used to recover unreadable sectors if the disc becomes
damaged at a later time.

From the Packman repo. Optional, but advisable.

zisofs.

A method to create compressed cdroms or dvds that can be read transparently by the kernel. It has
been included in openSUSE for several years. See mkzftree(1). No need to install anything.

Brief description:

  1. Create the big image file on another disk via dd.

  2. Split the image into as many smaller chunks (50 MiB each) as needed. This means thousands of
    small files.

  3. Compress each chunk using zisofs.

  4. Distribute the small files across several numbered directories, so that each one fits on one
    DVD. That means staying below the 4.37 GiB capacity of a DVD; as we need extra space for the
    recovery data, we’ll target 3700 MiB per directory (a).

  5. Create an ISO image from each directory. This has to be done with “mkisofs”; a GUI cannot be
    used (no zisofs support).

  6. Add the recovery data to each with dvdisaster (they call it “augmenting”).

Burn the ISO images, or test them first by recreating the original HD image file.

Note (a): I experimented with several sizes, and a lower margin makes dvdisaster complain that there
is not enough free space for augmenting. They recommend 20% free space, but that seems too much to
me, so I aimed for 15%. See the dvdisaster manual to make your own decision
(<file:///usr/share/doc/packages/dvdisaster/en/howtos30.html>), or don’t use it at all if you prefer.
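The arithmetic behind such a target can be sketched as follows. This is only an illustration: the function name `payload_budget_mib` is made up here, and the 4482 MiB capacity is the figure quoted in Step 4 below. The result is an upper bound for the chosen margin; the 3700 MiB target used in this post is a little more conservative.

```shell
#!/bin/sh
# Sketch: how much payload fits on one disc if a given share of the disc
# is kept free for dvdisaster's recovery data.
payload_budget_mib() {
    capacity_mib=$1    # usable disc size in MiB (assumed: 4482 for a DVD)
    free_percent=$2    # share of the disc reserved for recovery data
    echo $(( capacity_mib * (100 - free_percent) / 100 ))
}

payload_budget_mib 4482 15    # prints 3809
```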

Why this way?

The procedure is complex. But the alternative I contemplated was using the shareware version of rar,
which can compress a huge file into several small archives of a given size and, IIRC, add correction
data. So I devised this system. I’m sure that others exist.

No, the opensource ZIP can’t do it. The pkware version might.

Just one more system :slight_smile:

Why do it at all?

Well, I paid for the Windows 7 on this laptop, so I don’t want to waste that money. Anyway, I need
Windows because I have some hardware that is only handled by Windows (a Tom-Tom navigator). True,
the laptop has a recovery partition and a recovery DVD I made from it. But using that function
rewrites the entire HD (I tried), destroying the Linux partition, and that will not do.

A full recovery image is preferable, but I do not want to buy Ghost or whatever for just one
computer. So, why not use Linux to backup the other un-nameable os? :slight_smile:

Disclaimer

What I describe is the procedure I used on my machine, and it worked. Exercise caution if you use
this method, and adjust the command lines to your needs. Read the outputs and react if necessary. I
don’t cover all the details, so read the manuals. Some steps (dd) can destroy your system beyond
repair if you make a serious mistake.

Detailed description

Step 1)

For example:


dd if=/dev/sda1 of=sda1_windows_system.img
dd if=/dev/sda2 of=sda2_windows_C.img

or, in order to impact less the system:


time ionice -c 3 dd if=/dev/sda1 of=sda1_windows_system.img
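If you intend to delete the image later, it may help to record a checksum right after imaging. The following is a minimal sketch, not part of the procedure I ran: the function name `image_with_checksum` is hypothetical, and it assumes GNU `dd` and `sha256sum` are available.

```shell
#!/bin/sh
# Sketch: copy a source (partition or plain file) to an image file and
# store a SHA-256 checksum next to it for later verification.
image_with_checksum() {
    src=$1     # e.g. /dev/sda1 (needs root); any readable file works too
    dst=$2     # image file to create
    dd if="$src" of="$dst" bs=4M || return 1
    sha256sum "$dst" > "$dst.sha256"
}

# Usage idea (run as root, adjust names):
# image_with_checksum /dev/sda1 sda1_windows_system.img
```

Later, `sha256sum -c sda1_windows_system.img.sha256`, run from the same directory, reports whether the file still matches.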

Step 2)


mkdir splits
cd splits
time ionice -c 3 \
split --bytes=50M --numeric-suffixes --suffix-length=6  \
../sda2_windows_C.img sda2_windows_C.
cd ..

Notice that to start this step, you need to have as much free space as the size of the original
image. Once completed, you can delete that image, or keep it for comparison at the end.

This will produce thousands of small files in the directory “splits”, named like this:


sda2_windows_C.000000
sda2_windows_C.000001
...
sda2_windows_C.003250

In my case, 3250 files of 50 MiB each. Why this size? Because we need to pack as many of them as
will fit on a DVD, with reasonable granularity. As they are compressed, the real size will vary.
This size seemed adequate after a few tests I did (meaning: at worst, you waste 50 MiB on the DVD).
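To estimate beforehand how many chunk files to expect, a small calculation like this could be used. It is a sketch only; `expected_chunks` is a made-up helper and the 160 GiB image size is just an example value, not my real one.

```shell
#!/bin/sh
# Sketch: number of files split will produce for a given image size.
expected_chunks() {
    image_bytes=$1
    chunk_bytes=$2
    # Integer ceiling division: a partial last chunk still counts as a file.
    echo $(( (image_bytes + chunk_bytes - 1) / chunk_bytes ))
}

chunk=$((50 * 1024 * 1024))                               # 50 MiB, as in Step 2
expected_chunks $((160 * 1024 * 1024 * 1024)) "$chunk"    # prints 3277
```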

Step 3)

time nice mkzftree -p 4 splits/ splits.zisofs

Adjust “-p” to the number of cores or cpus in your system. I have four in this desktop, so I wrote 4
to maximize speed (yes, the “nice” is because I do not want to make my entire system sluggish). This
step is quite cpu intensive.

The result of this is a copy of the entire directory “splits” in “splits.zisofs”, but compressed. In
my case the sizes of each file varied a lot, from about 50M down to 6.3 K for files full of zeroes.

Warning: the destination directory must not exist, or the program fails.

If successful, you can delete the source directory “splits”.
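Before deleting anything, a quick sanity check is to confirm that both trees hold the same number of files. This is a rough sketch with hypothetical helper names (`count_files`, `same_file_count`); it checks only the count, not the contents.

```shell
#!/bin/sh
# Sketch: succeed only if two directory trees contain the same number of files.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

same_file_count() {
    [ "$(count_files "$1")" -eq "$(count_files "$2")" ]
}

# Usage idea: remove the uncompressed chunks only if the counts agree.
# same_file_count splits splits.zisofs && rm -r splits
```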

Step 4)

The previous step left me with thousands of files of various sizes, each smaller than 50 MiB, in
the directory “splits.zisofs”. Now we have to group the files so that each group adds up to about
the size of a DVD. I used “mc” for the purpose.

First I created numbered directories: 01, 02, 03, 04…

Then, in the directory “splits.zisofs”, I marked files, starting from the first one, by pressing the
“INS” key repeatedly until the status line said that the total size of the marked files was about
3700 MiB or less (a bit less than 4482 MiB if you are not going to use dvdisaster later). To be
precise, mc will display “3800000K”. Then I used the “F6” key to move them over to the other
directory, “01/”.

The procedure is repeated until the files are exhausted.
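The same grouping could be scripted instead of done by hand in mc. The following is a sketch of that idea, not what I actually ran: `pack_into_dirs` is a hypothetical function that fills one numbered directory after another, in file name order, up to a byte limit.

```shell
#!/bin/sh
# Sketch: move files from SRC, in name order, into DEST/01, DEST/02, ...
# A new directory is started whenever the next file would exceed LIMIT bytes.
pack_into_dirs() {
    src=$1; dest=$2; limit=$3
    dirno=1; used=0
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        size=$(wc -c < "$f")
        if [ "$used" -gt 0 ] && [ $((used + size)) -gt "$limit" ]; then
            dirno=$((dirno + 1)); used=0
        fi
        target=$(printf '%s/%02d' "$dest" "$dirno")
        mkdir -p "$target"
        mv "$f" "$target/"
        used=$((used + size))
    done
}

# Usage idea, with the 3700 MiB target from this post:
# pack_into_dirs splits.zisofs . $((3700 * 1024 * 1024))
```

Because the chunks are taken strictly in order, each DVD holds a contiguous range of the image, just as with the manual method.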

I ended up with directories 01/ to 16/, meaning 16 DVDs for my backup:


compressed  directory  files  expanded
size        number            size

3.7G        01         144     7595M
3.7G        02         102     5100M
3.7G        03         154
3.7G        04         108
3.7G        05         108
3.7G        06          75
3.6G        07          74
3.7G        08         431    21550M
3.7G        09         147
3.6G        10         313
3.7G        11         200
3.7G        12         791    39550M
3.6G        13         617    30834M
3.7G        14         119     5950M
3.6G        15          74     3700M
2.5G        16          50     2892M

Notice that they are well below the maximum size of a DVD, because we’ll add recovery data that
should allow us to recover a damaged DVD if we want to read it again in 5 years. Hopefully. We will
need to have access to this software then. You can also see how well some of them compress: up to
40 GiB of the original NTFS on one DVD when it comes to the empty space on the HD. Actually, I
thought it would compress more.

(The expanded size is simply the number of files times 50 MiB, of course, but the numbers in the
table above are all calculated.)

Step 5 and 6)

I used this script; modify it to your needs (or do it manually, but that is a lot of ISOs):


#!/bin/bash


function thread
{
NUMERO=$1
ISOFILE=sdax_$NUMERO.iso

DATE=`date --rfc-3339=seconds`
echo "Image $ISOFILE started on $DATE" >> masterlog

time mkisofs -quiet -iso-level 3 -R -r -z -V "Bck Compaq sda2" \
-volset "$NUMERO of 13" -P "Not Published, private backup" -p "


DATE=`date --rfc-3339=seconds`
echo "Image $ISOFILE finished on $DATE" >> masterlog
ls -lh $ISOFILE >> masterlog

time dvdisaster -i $ISOFILE -mRS02 -c
DATE=`date --rfc-3339=seconds`
echo "Image $ISOFILE augmented on $DATE" >> masterlog
}


echo > masterlog
DATE=`date --rfc-3339=seconds`
echo "--- Mark 1 --  $DATE" >> masterlog
echo >> masterlog

thread 01 2>&1 | tee log01  &
thread 02 2>&1 | tee log02  &
thread 03 2>&1 | tee log03  &
thread 04 2>&1 | tee log04  &

# stderr is redirected as well; without 2>&1, log01..04 stay empty.


wait

echo >> masterlog
DATE=`date --rfc-3339=seconds`
echo "--- Mark 2 --  $DATE" >> masterlog
echo >> masterlog


thread 05 2>&1 | tee log05 &
thread 06 2>&1 | tee log06 &
thread 07 2>&1 | tee log07 &
thread 08 2>&1 | tee log08 &

wait

echo >> masterlog
DATE=`date --rfc-3339=seconds`


# Repeat these sections for all directories (16 in my case).
# Each group starts four function calls in the background and then waits for all four to finish,
# because I have four cores on the desktop. If you have 2 or 8, adjust the group size to your
# system, or processing will be slower.
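The repeated groups of four can also be written as a loop. This is a sketch under assumptions: `run_in_batches` is a hypothetical helper, and unlike the script above it writes each job's output straight to a file named log plus the item number instead of using tee.

```shell
#!/bin/sh
# Sketch: run "CMD ITEM" for every item, at most BATCH jobs at a time.
run_in_batches() {
    batch=$1; cmd=$2; shift 2
    running=0
    for item in "$@"; do
        # stdout and stderr both go to the per-item log file
        "$cmd" "$item" > "log$item" 2>&1 &
        running=$((running + 1))
        if [ "$running" -ge "$batch" ]; then
            wait            # block until the whole batch has finished
            running=0
        fi
    done
    wait                    # pick up a final, partial batch
}
```

With the `thread` function from the script above defined in the same file, the call would be something like `run_in_batches 4 thread 01 02 03 04 05 06 07 08`, continuing up to the last directory.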

Notice that this process cannot be done with a GUI. I know of no Linux GUI that can create a
compressed zisofs image: a pity. If one existed, steps 3 to 5 would not be needed.

So first we create a compressed iso for each directory, then we augment each iso with the recovery
data using dvdisaster.

A test dvdisaster run goes like this:


Elessar:/mnt/Ext/Erebor/Images/Minas Tirith/conrecovery # time dvdisaster -i sda3_1.iso -mRS02 -c
dvdisaster-0.72  Copyright 2004-2009 Carsten Gnoerlich.
This software comes with  ABSOLUTELY NO WARRANTY.  This
is free software and you are welcome to redistribute it
under the conditions of the GNU GENERAL PUBLIC LICENSE.
See the file "COPYING" for further information.

Opening sda3_1.iso: 2000181 medium sectors.
Augmenting image with Method RS02:
3906 MB data, 569 MB ecc (32 roots; 14.3% redundancy).
* Warning: Using redundancies below 20%% may not give
*          the expected data loss protection.
Preparing image (checksums, adding space): 100%
Ecc generation: 100.0%
Image has been augmented with error correction data.
New image size is 4476 MB (2291744 sectors).

real    7m14.282s
user    4m19.090s
sys     0m10.795s

The actual runs were a bit different, but I did not keep their output. I have the one for DVD 16,
but that one is different (smaller “payload”):


dvdisaster-0.72  Copyright 2004-2009 Carsten Gnoerlich.
This software comes with  ABSOLUTELY NO WARRANTY.  This
is free software and you are welcome to redistribute it
under the conditions of the GNU GENERAL PUBLIC LICENSE.
See the file "COPYING" for further information.

Opening sdax_16.iso: 1276082 medium sectors.
Augmenting image with Method RS02:
2492 MB data, 1961 MB ecc (112 roots; 78.3% redundancy).
Preparing image (checksums, adding space): 100%
Ecc generation: 100.0%
Image has been augmented with error correction data.
New image size is 4453 MB (2280141 sectors).

See how the final sizes of the ISOs reach the maximum: the recovery data fills all the remaining
free space up to the DVD size (which is what the RS02 augmentation started with -c does):


4.4G Aug  5 22:55 sdax_01.iso
4.4G Aug  5 22:55 sdax_02.iso
4.4G Aug  5 22:55 sdax_03.iso
4.4G Aug  5 22:53 sdax_04.iso
4.4G Aug  5 23:24 sdax_05.iso
4.4G Aug  5 23:24 sdax_06.iso
4.4G Aug  5 23:24 sdax_07.iso
4.4G Aug  5 23:24 sdax_08.iso
4.4G Aug  5 23:53 sdax_09.iso
4.4G Aug  5 23:52 sdax_10.iso
4.4G Aug  5 23:52 sdax_11.iso
4.4G Aug  5 23:52 sdax_12.iso
4.4G Aug  6 00:29 sdax_13.iso
4.4G Aug  6 00:28 sdax_14.iso
4.4G Aug  6 00:28 sdax_15.iso
4.4G Aug 13 04:44 sdax_16.iso

Done :slight_smile:

Just burn the images however you like - I use the command line - and don’t forget to verify the burn :slight_smile:

Testing recovery

This loop-mounts a few of the ISOs so that I can test them:


mount -t iso9660 -o ro,loop=/dev/loop3 sdax_13.iso mnt/13
mount -t iso9660 -o ro,loop=/dev/loop4 sdax_14.iso mnt/14
mount -t iso9660 -o ro,loop=/dev/loop5 sdax_15.iso mnt/15
mount -t iso9660 -o ro,loop=/dev/loop6 sdax_16.iso mnt/16
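For many discs, the mount lines can be generated rather than typed. A minimal sketch, assuming the sdax_NN.iso and mnt/NN names used above; `print_mount_cmds` is a made-up helper that only prints the commands, and it uses plain `loop` so that mount picks a free loop device itself.

```shell
#!/bin/sh
# Sketch: print one loop-mount command per disc number given as argument.
# Nothing is mounted here; review the output, then pipe it to "sh" as root.
print_mount_cmds() {
    for n in "$@"; do
        printf 'mkdir -p mnt/%s && mount -t iso9660 -o ro,loop sdax_%s.iso mnt/%s\n' \
            "$n" "$n" "$n"
    done
}

print_mount_cmds 13 14 15 16
```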

This script (adapt it to your names and paths) recreates an HD image from the split files on a DVD
or ISO image. It calculates the position of each source chunk in the final HD image file or disk
from its name (the numeric extension part):


sda2_windows_C.000000  --> position 0
sda2_windows_C.000001  --> 50 MiB later
...
sda2_windows_C.003250  --> 3250 * 50 MiB later, i.e., 158 GiB later.
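The position rule above can be sketched in a few lines. The helper names `chunk_index` and `chunk_offset_mib` are made up for this illustration. The leading zeros matter: shell arithmetic would read a suffix such as 000010 as octal. The script below handles that with bash's `10#` prefix; this sketch strips the zeros instead, which also works in a plain POSIX sh.

```shell
#!/bin/sh
# Sketch: derive a chunk's index and its offset in MiB from the file name.
chunk_index() {
    suffix=${1##*.}                          # "000010" from "name.000010"
    stripped=${suffix#"${suffix%%[!0]*}"}    # drop leading zeros portably
    echo "${stripped:-0}"                    # an all-zero suffix is index 0
}

chunk_offset_mib() {
    echo $(( $(chunk_index "$1") * 50 ))     # 50 MiB per chunk
}

chunk_offset_mib sda2_windows_C.000010    # prints 500
chunk_offset_mib sda2_windows_C.003250    # prints 162500
```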

I have two big image files (13 and 4 DVDs each), so I take that into account. I also saved the MBR
and /boot separately; the last DVD is half full. Not forgetting the scripts :slight_smile:


#!/bin/bash


function procesar_un_iso
{
for FILES in $1
do
NOMBRE=`echo $FILES | cut -f 1 -d . `
EXTENSION=`echo $FILES | cut -f 2 -d . `
SALIDA=`basename $NOMBRE`
NUMERO=$((10#$EXTENSION))


#       Multiples of 50 MiB
PRINCIPIO=$((50 * $NUMERO))
echo "   Copying the 50 MiB of $NOMBRE.$EXTENSION (no. $NUMERO) to position $PRINCIPIO MiB of the output ($SALIDA.imagen)"
#       seek counts output blocks of obs=50M; conv=notrunc keeps data already written
dd if="$FILES" of="$SALIDA.imagen" seek=$NUMERO obs=50M conv=notrunc
echo

done
}


#for CAMINO in 01 02 03 04 05 06 07 08 09 10 11 12 13
#do
#  procesar_un_iso "mnt/$CAMINO/sda2_windows_C*"
#done



for CAMINO in 13 14 15 16
do
procesar_un_iso "mnt/$CAMINO/sda3_windows_recovery*"
done

This recreates the image file using the loop mounted DVDs; I simply use “cmp” to compare the
recreated image with the original - and it is the same. Bingo! The entire procedure works. O:-)
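The whole split, rebuild and compare cycle can be rehearsed on a small scale before trusting it with a real image. This is a sketch with made-up names and a tiny chunk size (1 KiB instead of 50 MiB); it assumes GNU split and works only inside a temporary directory.

```shell
#!/bin/sh
# Sketch: split a small random file, rebuild it chunk by chunk with dd,
# and compare the result with the original using cmp.
roundtrip_check() {
    work=$(mktemp -d) || return 1
    chunk=1024                                 # bytes per chunk (tiny, for the test)
    head -c 10000 /dev/urandom > "$work/orig.img"
    ( cd "$work" && split -b "$chunk" -d -a 6 orig.img part. ) || return 1
    for f in "$work"/part.*; do
        suffix=${f##*.}
        n=${suffix#"${suffix%%[!0]*}"}         # strip leading zeros
        dd if="$f" of="$work/rebuilt.img" bs="$chunk" seek="${n:-0}" \
            conv=notrunc 2>/dev/null || return 1
    done
    cmp "$work/orig.img" "$work/rebuilt.img"
    status=$?
    rm -r "$work"
    return "$status"
}

roundtrip_check && echo "round trip OK"
```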

For a real recovery, I could point the dd output at the real hard disk, or recreate the image
file on another disk and later copy that to the laptop HD, using Linux.

I hope I don’t need to ever use this, but… who knows, Windows is not as stable as Linux.
Eventually, it dies.

I hope this helps someone some time :slight_smile:

If you have questions, ask before I forget the answer :wink:


Cheers / Saludos,

Carlos E. R.
(from 11.2 x86_64 “Emerald” GM (Elessar))