OS Upgrade

These steps are guidelines for upgrading a host to a new OS release.

  • create a temporary group_vars/<new-suite>/system.yml in a git branch:
      codename: bullseye
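A minimal sketch of this first step, assuming it is run from the root of the Ansible repository. The function name `write_suite_vars` and the branch name in the usage comment are hypothetical, not conventions from this page:

```shell
# Sketch only: write the temporary group_vars file for the new suite.
# write_suite_vars is an illustrative helper, not part of the repository.
write_suite_vars() {
    suite="$1"    # new suite codename, e.g. bullseye
    mkdir -p "group_vars/$suite" &&
    printf 'codename: %s\n' "$suite" > "group_vars/$suite/system.yml"
}

# Usage (on a dedicated branch):
#   git checkout -b upgrade-bullseye   # hypothetical branch name
#   write_suite_vars bullseye
```

Keeping the file on a branch makes it easy to drop once the upgrade is finished.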

  • run the apt tasks of the common playbook on the first host (I usually start with Elwing): ansible-playbook --diff -l Elwing -t apt playbooks/common.yml | tee /tmp/dc.log
  • apt upgrade
  • apt full-upgrade (check whether any removed package is problematic)
    • accept new version of:
      • /etc/services, then copy the local service entries back from /etc/services.dpkg-old to avoid having to redeploy all services
      • /etc/grub.d/10_linux (but see the warning below)
      • /etc/ssh/ssh_config (we use ssh_config.d in Ansible now), but NOT /etc/ssh/sshd_config!
    • do not accept new versions for:
      • /etc/smartd.conf
      • /etc/snmp/snmp.conf
      • /etc/oidentd.conf
      • /etc/sudoers
      • /etc/rsyslog.conf
      • /etc/apt-cacher-ng/acng.conf
      • /etc/zabbix/*
      • /etc/logrotate.d/*
    • check the diff manually for other files
    • purge facts_cache/<host> before running Ansible, so that it detects the new major version
  • if the host runs PHP-FPM, migrate the pools to avoid having to redeploy all vhosts (example for PHP 7.3 -> 7.4):
    • rm /etc/php/7.4/fpm/pool.d/www.conf
    • cp /etc/php/7.3/fpm/pool.d/* /etc/php/7.4/fpm/pool.d/
    • sed -i 's/7\.3/7.4/g' /etc/php/7.4/fpm/pool.d/*
    • systemctl restart php7.4-fpm.service
    • sed -i 's/7\.3/7.4/g' /etc/apache2/sites-enabled/*.conf.d/php.conf
    • systemctl restart apache2
    • run the common web playbook, playbooks/tenants/duckcorp/web.yml, with -t web-common
  • apt purge libpython2.7-minimal
  • run the playbooks/common.yml playbook with --skip-tags monitoring (until a recent zabbix-cli is packaged)
  • on MX1 servers, run the playbooks/tenants/duckcorp/mail.yml playbook with -t antispam (to switch the Rspamd repo to the new suite)
  • run the playbooks/tenants/duckcorp/accounts.yml playbook
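For the /etc/services step above, the local entries can be listed rather than hunted for by eye. This is a rough sketch: the helper name `list_local_services` is hypothetical, and it only compares whole lines, so an entry that upstream merely reformatted will also show up and the output still needs a manual review:

```shell
# Sketch only: print lines found in the old services file but not in the new
# one, skipping comments and blank lines. Nothing is modified.
list_local_services() {
    old="${1:-/etc/services.dpkg-old}"
    new="${2:-/etc/services}"
    # -F fixed strings, -x whole-line match, -v invert: keep lines of "$old"
    # that do not appear verbatim in "$new".
    grep -Fxv -f "$new" "$old" | grep -Ev '^[[:space:]]*(#|$)'
}

# Usage:
#   list_local_services    # review the output, then append what is needed to /etc/services
```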

It is critical that the common playbook runs successfully before rebooting. In particular, /etc/grub.d/10_linux must contain the --unrestricted option and the GRUB configuration must be regenerated; otherwise the server will stop at the GRUB screen waiting for a login.
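A quick pre-reboot check along these lines can catch a missing option. It is a sketch under the assumption that searching the script for the literal string is enough; the helper name `check_grub_unrestricted` is hypothetical:

```shell
# Sketch only: fail when the GRUB script lacks the --unrestricted option.
check_grub_unrestricted() {
    script="${1:-/etc/grub.d/10_linux}"
    # -e keeps grep from treating the pattern as one of its own options.
    if grep -q -e '--unrestricted' "$script"; then
        echo "OK: $script contains --unrestricted"
    else
        echo "STOP: $script lacks --unrestricted, do not reboot" >&2
        return 1
    fi
}

# Usage:
#   check_grub_unrestricted || echo "re-run the common playbook first"
```

The check only covers the script itself. The generated configuration still has to be rebuilt, which the common playbook is expected to do (on Debian this is typically done with update-grub).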

  • reboot
  • check failed services: systemctl --failed

Updated by Marc Dequènes over 1 year ago · 15 revisions