Enhancement #497

Change Backup System

Added by Marc Dequènes almost 3 years ago. Updated almost 2 years ago.

Status: In Progress
Priority: Normal
Category: Service :: Backup
Start date: 2017-05-09
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Patch Available:
Confirmed: No
Branch:
Entity: DuckCorp
Security:
Help Needed: Yes

Description

Bacula has some drawbacks which have not improved over the last few years, mainly:
  • horrible CLI, difficult to search through backups and volumes
  • retention is awfully complex to set up
  • consolidated backups never worked with file volumes
  • cannot resume, so if a full backup with a lot of data fails midway, useless incomplete volumes pile up and eat all the available space
We may use another system for laptop/user backups, such as duplicity. For a trusted centralized backup system for servers, I would list these criteria:
  • secure transfer (TLS, SSH…)
  • delta transfer
  • compression
  • resumable
  • incremental backup: either full-incremental without full backup (except initial setup), or consolidation
  • open fifo option: useful to backup databases without reserving a huge local space, allows on-the-fly backup stream with unmodified tools (mysqldump, pg_dump…)
  • proper retention settings: we should be able to express this: keep 1 backup per day during 7 days, then 1 per week during 4 weeks, then 1 per month during 1 year
  • long-term restoration: the backup format rarely breaks, and either new software can read old formats or a straightforward command can convert them to the new format
  • maintained: at least one maintenance release per year, no critical bug without at least a workaround for more than a month
  • CLI
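
To make the retention criterion above concrete, here is a minimal Python sketch of the "1 per day for 7 days, then 1 per week for 4 weeks, then 1 per month for 1 year" rule. It is not tied to any backup tool; the function name, its parameters and the choice to keep the newest backup of each period are assumptions made for illustration.

```python
from datetime import date, timedelta


def select_backups_to_keep(backup_dates, today,
                           daily_days=7, weekly_weeks=4, monthly_days=365):
    """Return the backup dates to keep (hypothetical helper, not a tool API).

    Assumption: within each period (day, ISO week, calendar month) the
    newest backup is the one that is kept.
    """
    weekly_limit = daily_days + 7 * weekly_weeks
    monthly_limit = weekly_limit + monthly_days
    newest_per_bucket = {}
    for d in sorted(backup_dates):
        age = (today - d).days
        if age < 0:
            continue  # ignore backups dated in the future
        if age < daily_days:
            bucket = ("day", d.isoformat())
        elif age < weekly_limit:
            iso = d.isocalendar()
            bucket = ("week", iso[0], iso[1])
        elif age < monthly_limit:
            bucket = ("month", d.year, d.month)
        else:
            continue  # older than the whole retention window: expire
        # Dates are visited in ascending order, so the newest one wins.
        newest_per_bucket[bucket] = d
    return sorted(newest_per_bucket.values())


if __name__ == "__main__":
    today = date(2018, 1, 31)
    history = [today - timedelta(days=n) for n in range(500)]
    for kept in select_backups_to_keep(history, today):
        print(kept)
```

A candidate tool should be able to express an equivalent policy in its own configuration; the sketch is only a reference for what we expect to remain on disk.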
Also, nice-to-have features that we can live without:
  • deduplication
  • single entrypoint when different categories of data are to be saved (e.g. different retention): single daemon and open port
  • exclude dir if contains file: allows user to exclude their own dirs, like we did with Bacula, just 'touch .nobackup' and the backup software will skip the dir
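
As an illustration of the "exclude dir if contains file" behaviour, here is a small Python sketch of the expected semantics: a directory containing the marker file is skipped together with everything below it. The function name is hypothetical and this is not how any particular backup software implements it.

```python
import os


def iter_backup_files(root, marker=".nobackup"):
    """Yield the files under root, skipping any directory that contains marker.

    Hypothetical helper showing the intended semantics only.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        if marker in filenames:
            # Clearing dirnames in place stops os.walk from descending further.
            dirnames[:] = []
            continue
        for name in filenames:
            yield os.path.join(dirpath, name)
```

With this semantics a user only has to run 'touch .nobackup' in a directory to opt it, and its subdirectories, out of the backup.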

Backup scheduling:


Files

backup_scheduling.jpg (940 KB), Pierre-Louis Bonicoli, 2017-06-27 15:11

Subtasks

Enhancement #533: Install a Jessie LXC container with systemd enabled in order to test/validate Burp setup (Resolved, Pierre-Louis Bonicoli)

Related issues

Related to DuckCorp Infrastructure - Review #518: Review branch backup (In Progress, 2017-04-03)
Related to DuckCorp Infrastructure - Review #519: Review burp role (In Progress, 2017-04-03)
Blocked by DuckCorp Infrastructure - Review #585: backup_duck (In Progress, 2017-08-29)

History

#1

Updated by Marc Dequènes almost 3 years ago

  • Status changed from New to In Progress
  • Help Needed set to Yes

Potential successors:
- Burp: all needed features are there, and Pilou created an Ansible config we could use as a base (it needs some work), but each backup partition needs its own daemon instance on a specific port
- BorgBackup: similar to Burp in many ways, all needed features are there, and it works over SSH, so no daemons to run or ports to open

Asking for advice.

#2

Updated by Arnaud Fontaine almost 3 years ago

I use duplicity personally, not sure how it compares to other software you're mentioning. Will check when I have some time.

Here is a comparison of existing backup programs:
https://wiki.archlinux.org/index.php/Synchronization_and_backup_programs

It seems Burp does not intend to keep compatibility between major releases, which could be a problem when trying to access old backups. Not sure if there is an easy way to automatically update all of them after a major release though...

#3

Updated by Marc Dequènes almost 3 years ago

  • Description updated (diff)
#4

Updated by Marc Dequènes almost 3 years ago

Using our previous experience and the feature chart Arnaud linked, I listed the criteria I think we should have.

#5

Updated by Marc Dequènes almost 3 years ago

  • Description updated (diff)
#6

Updated by Pierre-Louis Bonicoli over 2 years ago

Arnaud Fontaine wrote:

It seems Burp does not intend to keep compatibility between major releases, which could be a problem when trying to access old backups. Not sure if there is an easy way to automatically update all of them after a major release though...

The Burp version in Jessie is 1.3.48; the Burp version in Stretch will be 2.0.54. There are two Burp protocols (protocol 1 and protocol 2); the second one is under development and is not stable yet. The default protocol in 2.0.54 is the first one.

At the beginning of protocol 2's development, it was possible to switch from protocol 1 to protocol 2; I don't have up-to-date information on that subject.

#7

Updated by Pierre-Louis Bonicoli over 2 years ago

#8

Updated by Pierre-Louis Bonicoli over 2 years ago

#9

Updated by Marc Dequènes about 2 years ago

So this is a transcription of the Bacula settings, with some adaptations (obsolete paths, new paths, forgotten paths), so we should double-check this list. I also simplified the retention by using only 3 classes. I wonder if we should also save the logs (~9GB total). The databases could probably go into class 2. All of this is open to discussion.

Class 1: Lightweight critical data, keep every day for 2 months, weekly for 4 months and monthly for 6 months (1 year total):

  • /boot
  • /etc
  • /var/lib/dpkg/available
  • /var/lib/dpkg/diversions
  • /var/lib/dpkg/statoverride
  • /var/lib/dpkg/status
  • /var/spool/cron
  • /var/backups
  • /root
  • /usr/local

Additional for Elwing:

  • /var/lib/opendnssec
  • /var/lib/softhsm

Additional for Orfeo:

  • /var/lib/opendnssec
  • /var/lib/softhsm

Additional for Korutopi:

  • /var/lib/postgresql/zbx_config_backup.sql

Class 2: Lightweight-Heavyweight important data, keep every day for 2 weeks, weekly for 6 weeks and monthly for 4 months (6 months total):

  • /home

Additional for Elwing:

  • /var/www

Additional for Thorfinn:

  • /srv/bouncer
  • /srv/www

Additional for Orfeo:

  • /var/lib/minbif
  • /var/lib/prosody
  • /srv/www
  • /var/local/vmail
  • /var/lib/mailman
  • /var/spool/dspam

Additional for Toushirou:

  • /srv/vcs
  • /srv/www
  • /srv/projects
  • /var/local/stuffcloud-data

Class 3: Lightweight-Heavyweight less important data, keep every day for 1 week, weekly for 3 weeks and monthly for 5 months (6 months total):

  • /opt

Additional for Elwing:

  • /usr/local/share/mibs
  • /srv/tftp
  • /var/lib/smokeping

Additional for Jinta:

  • /var/games/minetest-server

Additional for Orfeo:

  • /var/lib/roundcube/plugins
  • /usr/local/share/roundcube-plugins/

Additional for Toushirou:

  • /var/lib/smokeping
  • /srv/ftp/ftp.duckcorp.org/private/
  • /srv/ftp/ftp.duckcorp.org/public/
#10

Updated by Marc Dequènes about 2 years ago

  • Assignee changed from Marc Dequènes to Pierre-Louis Bonicoli
#12

Updated by Marc Dequènes about 2 years ago

So I'll try to roughly translate the retention into Burp's keep parameter:
  • class 1: 30/11
  • class 2: 14/3/2
  • class 3: 7/3/4

What do you think about this?

#13

Updated by Marc Dequènes about 2 years ago

As for scheduling, I think it is very difficult to find good time ranges: with all the data we have, backups can take some time and bandwidth. So I think the best option is to target the best range of the week and limit the bandwidth to a rate that is acceptable for a fiber connection and for the poor I/O some of the servers can deliver.

#14

Updated by Pierre-Louis Bonicoli about 2 years ago

Marc Dequènes wrote:

So I'll try to roughly translate the retention into Burp's keep parameter:
  • class 1: 30/11
  • class 2: 14/3/2
  • class 3: 7/3/4

What do you think about this?

Fine.

Translated:

  • class 1
    • before: keep every day for 2 months, weekly for 4 months and monthly for 6 months (1 year total)
    • now: keep every day for 1 month and monthly for 11 months (1 year total)
  • class 2
    • before: keep every day for 2 weeks, weekly for 6 weeks and monthly for 4 months (6 months total)
    • now: keep every day for 2 weeks, every 2 weeks for 6 weeks, and every month and a half for 3 months (6 months total)
    • now: keep every day for 2 weeks, every 2 weeks for 6 weeks, and every 2 months for 2 months (6 months total)
  • class 3
    • before: keep every day for 1 week, weekly for 3 weeks and monthly for 5 months (6 months total)
    • now: keep every day for 1 week, weekly for 3 weeks, and every 3 weeks for 3 months (4 months total)

For the scheduling, we will adapt: it depends on how long the backup operations take. Burp logs every duration, so it will be easy to improve the scheduling.
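
To cross-check the translation above, here is a small Python sketch that expands a list of keep values into per-level intervals. It assumes one backup per day and the multiplicative reading of multiple keep entries given in Burp's documentation (keep 7, 4, 6 meaning 7 daily backups, 4 weekly ones and 6 four-weekly ones, for a horizon of about 168 days). The function name is hypothetical, and the result should be verified against a real Burp instance.

```python
def keep_schedule(keep):
    """Expand Burp-style keep values into (interval_days, count) levels.

    Assumptions: one backup per day, and each level's interval is the
    product of all previous keep values. Returns (levels, horizon_days).
    """
    levels = []
    interval = 1
    for count in keep:
        levels.append((interval, count))
        interval *= count
    return levels, interval


if __name__ == "__main__":
    for name, keep in (("class 1", [30, 11]),
                       ("class 2", [14, 3, 2]),
                       ("class 3", [7, 3, 4])):
        levels, horizon = keep_schedule(keep)
        print(name, levels, "horizon:", horizon, "days")
```

Under these assumptions the intervals match the "every 2 weeks", "every month and a half" and "every 3 weeks" figures above, while the horizons come out at 330, 84 and 84 days, so the quoted totals may be worth double-checking.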

#15

Updated by Marc Dequènes about 2 years ago

There were changes recently, due to Stretch migration and obsolete things, so I updated the directory list above.

#16

Updated by Marc Dequènes almost 2 years ago

Tasks which must not be forgotten:
  • cleaning up the packages, configs, directories… of the previous system
  • updating the documentation (https://admin.duckcorp.org/index.php/Services/Backup) and clarifying the restoration steps (including the best way to restore a database, if possible without writing the full dump locally, as some DBs might be very big and not fit in our limited space)
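
On the question of handling a big database dump without writing it to local disk, the usual answer is to stream it through a named pipe (the "open fifo" criterion from the description). Below is a minimal Python sketch of the mechanism, for a POSIX system. The produce and consume callables are hypothetical stand-ins for the real tools (for example a dump or restore command on one side and the backup client on the other); no actual Burp or database command is shown.

```python
import os
import tempfile
import threading


def stream_through_fifo(produce, consume):
    """Connect produce(writer) to consume(reader) through a named pipe.

    The data is streamed; it is never stored as a regular file on disk.
    Assumption: consume reads until end of file.
    """
    with tempfile.TemporaryDirectory() as tmp:
        fifo = os.path.join(tmp, "dump.fifo")
        os.mkfifo(fifo)

        def writer():
            # Opening the FIFO for writing blocks until a reader opens it.
            with open(fifo, "wb") as w:
                produce(w)

        thread = threading.Thread(target=writer)
        thread.start()
        with open(fifo, "rb") as r:
            result = consume(r)
        thread.join()
        return result


if __name__ == "__main__":
    def fake_dump(w):
        # Stand-in for a database dump stream.
        for _ in range(1000):
            w.write(b"x" * 1024)

    def count_bytes(r):
        # Stand-in for the tool reading the stream.
        total = 0
        while chunk := r.read(65536):
            total += len(chunk)
        return total

    print(stream_through_fifo(fake_dump, count_bytes))
```

Whether the chosen backup software can read from or write to such a pipe is exactly what the restoration documentation should state.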
#17

Updated by Marc Dequènes almost 2 years ago
