Discussion:
[pieter.wuille@gmail.com: Re: backup the backuppc pool with bacula]
Pieter Wuille
2009-05-30 20:20:33 UTC
Hi,
there is a regular discussion on how to back up/move/copy the
backuppc pool. Did anyone try to back up the pool with bacula?
Hello there...
I don't know about bacula, but I would also like to get a backup of
the BackupPC server myself: does anyone have suggestions and practical
examples?
This is how we handle backups of the backuppc pool (a rough sketch of the
commands is given after this list):
* The pool itself is on a LUKS-encrypted XFS filesystem, on an LVM volume, on a
  software RAID1 of two 1TB disks.
* Twice a week the following procedure is run:
  * Freeze the XFS filesystem, sync, and take an LVM snapshot of the encrypted volume
  * Unfreeze
  * Send the snapshot over ssh to an offsite server (which thus only ever sees
    the encrypted data)
  * Remove the snapshot
* The offsite server has 2 smaller disks (not in RAID), and snapshots are sent
  in turn to one and to the other. This means we still have a complete pool if
  something goes wrong during the transfer (which takes about a day).
* The consistency of the offsite backups can be verified by exporting them
  over NBD (network block device) and mounting them on the
  normal backup server (which has the encryption keys).
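
To make that concrete, here is a minimal sketch of one such run. The volume
group (vg0), logical volume (pool), mount point, snapshot size and the host
name "offsite" are placeholders for illustration, not our actual configuration:

#!/bin/sh
# Sketch only: quiesce the filesystem, snapshot the (encrypted) LV,
# thaw, ship the raw snapshot offsite, drop the snapshot.
set -e

sync                                   # flush dirty buffers
xfs_freeze -f /var/lib/backuppc        # freeze the XFS filesystem
lvcreate --snapshot --size 20G --name pool_snap /dev/vg0/pool
xfs_freeze -u /var/lib/backuppc        # thaw; BackupPC can continue

# The offsite machine only ever sees the LUKS ciphertext.  Which of its
# two disks is written to alternates between runs.
dd if=/dev/vg0/pool_snap bs=1M | ssh offsite "dd of=/dev/sdb bs=1M"

lvremove -f /dev/vg0/pool_snap         # snapshot no longer needed

Verifying an offsite copy works along these lines (again with placeholder
names; nbd-server/nbd-client invocations vary a bit between versions):

# on the offsite server: export the disk holding the copy
nbd-server 2000 /dev/sdb

# on the backup server: attach it, unlock it, and mount it read-only
nbd-client offsite 2000 /dev/nbd0
cryptsetup luksOpen /dev/nbd0 pool_check
mount -o ro /dev/mapper/pool_check /mnt/check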

We use a blockdevice-based solution instead of a filesystem-based one, because
the many small files (16 million inodes and growing) make file-level tools very
disk- and CPU-intensive (simply doing a "find | wc -l" in the pool root takes
hours). Furthermore, it makes encryption easier.
We are also working on an rsync-like system for block devices (though that may
still take some time...), which would bring the time for synchronising the
backup server with the offsite one down to 1-2 hours.

Greetz,
--
Pieter
Stephane Rouleau
2009-05-31 15:22:13 UTC
Pieter,

This sounds rather close to what I'd like to have over the coming months. I just recently reset our backup pool, and rather stupidly did not select an encrypted filesystem (otherwise we're on XFS, LVM, RAID1 2x1.5TB). I figured I'd encrypt the offsite copy only, but I see now that it'd be much better to send data at the block level.

You mention the capacity of your pool file system, but how much space is typically used on it? Curious also what kind of connection speed you have with your offsite backup solution.

Thanks for detailing your setup,

--
Stephane
Pieter Wuille
2009-06-01 10:53:47 UTC
Post by Stephane Rouleau
You mention the capacity of your pool file system, but how much space is
typically used on it? Curious also what kind of connection speed you have
with your offsite backup solution.
Some numbers:
* backup server has 1TB of RAID1 storage
* contains, amongst others, a 400GiB XFS volume for backuppc
* daily/weekly backups of about 195GiB of data
* contains 256GiB of backups (expected to increase significantly still)
* contains 16.8 million inodes
* according to LVM snapshot usage, avg. 1.5 GiB of data blocks change on this volume daily
* offsite backup server has 2x 500GB of non-RAID storage
* twice a week, the whole 400GiB volume is sent over a 100Mbps connection (at about 8.1MiB/s; a rough timing check follows below)
* that's a huge waste for maybe 5GiB of changed data, but the bandwidth is generously provided by the university
* we hope to have a more efficient blockdevice-level synchronisation system in a few months
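
As a rough check of those figures (round numbers only): 400GiB is about
409,600MiB, and 409,600MiB / 8.1MiB/s is roughly 50,000 seconds, i.e. about
14 hours of pure transfer time per run - consistent with the "about a day"
mentioned earlier once the usual overhead is added.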

PS: sorry for the strange subject earlier - I used a wrong 'from' address first and forwarded it.
--
Pieter
Stephane Rouleau
2009-06-01 22:15:52 UTC
Thanks Pieter,

Is the block-device-level rsync-like solution going to be something
publicly available?

Stephane
Tino Schwarze
2009-06-02 09:44:11 UTC
Post by Stephane Rouleau
Is the block-device-level rsync-like solution going to be something
publicly available?
Block-device-level rsync smells like DRBD. I'm not sure whether it supports
such huge amounts of unsynchronized data, but it might just be a matter
of configuration.

Tino.
--
"What we nourish flourishes." - "Was wir nähren erblüht."

www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de
Pieter Wuille
2009-06-02 11:40:13 UTC
Post by Tino Schwarze
Post by Stephane Rouleau
Is the block-device-level rsync-like solution going to be something
publicly available?
We certainly intend to, but there is no guarantee it ever gets finished. Apart
from the implementation itself there's not that much work left, but we do it
in our free time.

It really seems strange that something like that doesn't exist yet (and rsync
itself doesn't support block devices).
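
To illustrate the basic idea (this is just a crude sketch of what "rsync for
block devices" boils down to, not our actual tool; the device path, the 16MiB
chunk size and the host name are invented): each side hashes the device in
fixed-size chunks, the two hash lists are compared, and only the chunks whose
hashes differ get copied.

#!/bin/sh
# Print "<chunk number> <md5>" for every 16MiB chunk of a block device.
# Run on both machines, diff the outputs, then copy only differing chunks.
DEV=/dev/vg0/pool_snap          # placeholder device
CHUNK_MB=16                     # placeholder chunk size

SIZE=$(blockdev --getsize64 "$DEV")
CHUNKS=$(( (SIZE + CHUNK_MB*1024*1024 - 1) / (CHUNK_MB*1024*1024) ))

i=0
while [ "$i" -lt "$CHUNKS" ]; do
    sum=$(dd if="$DEV" bs=${CHUNK_MB}M skip=$i count=1 2>/dev/null | md5sum | cut -d' ' -f1)
    echo "$i $sum"
    i=$((i + 1))
done

A chunk that differs can then be copied on its own, e.g.
dd if=$DEV bs=16M skip=42 count=1 | ssh offsite "dd of=/dev/sdb bs=16M seek=42 count=1"
so only the few GiB of changed data cross the wire instead of the whole volume.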
Post by Tino Schwarze
Block-device-level rsync smells like DRBD. I'm not sure whether it supports
such huge amounts of unsynchronized data, but it might just be a matter
of configuration.
In fact, I'd say LVM should be able to do this: generate a block-level diff
between two snapshots of the same volume, and create a new snapshot/volume
based on an old one plus a diff. ZFS, for example, supports this using
send/receive. So far, I haven't read about support for such a feature in LVM.
On the other hand, I think I've read on this list that using zfs send/receive
for backuppc pools was very slow (but that's on a filesystem level, not the
blockdev level).
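
For reference, the incremental ZFS send/receive mentioned above looks roughly
like this (pool, dataset and snapshot names are invented for illustration):

# on the backup server: take a new snapshot, then send only the delta
# since the previous snapshot to the offsite machine
zfs snapshot tank/backuppc@tuesday
zfs send -i tank/backuppc@saturday tank/backuppc@tuesday \
    | ssh offsite zfs receive -F tank/backuppc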

DRBD might be a solution too - I haven't looked at it closely. It seems
more meant for high availability, but it can probably be used for offsite
backups too. It has support for recovery after disconnection/failure, so maybe
you can use it to keep older versions on a remote system by forcibly
disconnecting the nodes. I don't know how easy it would be to migrate a
non-DRBD volume either.
Does anyone have experience with this in combination with backuppc?
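
For anyone who wants to experiment, a minimal DRBD resource definition looks
roughly like this (hostnames, devices and addresses are placeholders, and
whether asynchronous "protocol A" copes with this much unsynchronised data is
exactly the open question above):

# /etc/drbd.conf fragment - sketch only.  "meta-disk internal" needs some
# free space at the end of the backing device, which is part of the
# migration question raised above.
resource backuppc {
  protocol A;                    # asynchronous replication to the peer
  on onsite {
    device    /dev/drbd0;
    disk      /dev/vg0/pool;     # existing LV holding the pool
    address   192.0.2.1:7789;
    meta-disk internal;
  }
  on offsite {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.0.2.2:7789;
    meta-disk internal;
  }
}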
--
Pieter