I update (apt-get update; apt-get upgrade) every Sunday. I have never had an update break something I need working, though I only do this on Debian Stable.
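A minimal sketch of how such a weekly run could be scripted and logged. The function name `weekly_update`, the `APT_GET` override and the log path are assumptions made for illustration, not part of the setup described here:

```shell
#!/bin/bash
# Hypothetical wrapper around the weekly update; all names are illustrative.
# APT_GET can be overridden (e.g. for a dry run); it defaults to apt-get.
APT_GET="${APT_GET:-apt-get}"

weekly_update() {
    local log="$1"    # log file to append to (the path is an assumption)
    {
        echo "=== update started: $(date +"%Y%m%d%H%M%S") ==="
        # Refresh the package lists first; upgrade only if that succeeded.
        "$APT_GET" update && "$APT_GET" -y upgrade
    } >> "$log" 2>&1
}
```

Calling `weekly_update /var/log/weekly-update.log` from a root cron job scheduled for Sunday would match the routine above; the function returns non-zero if either apt-get step fails.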
On local boxes, I keep staged binary backups of the partition, which make restoring a broken system very easy. I have a script that writes a partition into a file (overwriting deleted data with zeros first, then compressing the output file), like this:
#!/bin/bash
echo "[iNFO] - Writing zerofile"
TTIsWheezyMounted=`df -h|grep /wheezy|wc -l`
if [ "$TTIsWheezyMounted" = "0" ]; then
echo "[iNFO] - Mounting /wheezy"
mount /wheezy
fi
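# Fill the free space with zeros so it compresses well; dd stops with
# "No space left on device" once the partition is full, which is expected
dd if=/dev/zero of=/wheezy/zerofile bs=256M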
echo "[ OK ] - Zerofile written, sleeping, syncing"
sync
sleep 3
sync
echo
echo "[iNFO] - Removing zerofile"
rm -rf /wheezy/zerofile
echo "[iNFO] - Unmounting wheezy"
umount /wheezy
echo "[iNFO] - Creating backup"
dd if=/dev/mapper/vg-wheezy bs=256M|pigz -c -1|dd of=/backup/img/vg-wheezy-backup-`date +"%Y%m%d%H%M%S"`.img.gz bs=256M
echo "[ OK ] - OK, backup done:"
ls -ltr /backup/img|tail -1
mount /wheezy
exit 0
When I need to restore, I just decompress the file and write it to the partition using dd.
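The restore step could look roughly like the sketch below. The helper name `restore_image` and the 4M block size are assumptions made for illustration. The target must be unmounted first, and dd overwrites it without asking:

```shell
#!/bin/bash
# Sketch of a restore: decompress an image and write it to a target.
# restore_image is a hypothetical helper, not part of the backup script.
restore_image() {
    local image="$1"     # compressed image, e.g. a file under /backup/img
    local target="$2"    # block device (or file) to overwrite

    if [ ! -r "$image" ]; then
        echo "[ERR0] - Cannot read image $image" >&2
        return 1
    fi

    # pigz was used for compression; plain gzip reads the same format.
    local unpack=gzip
    command -v pigz > /dev/null 2>&1 && unpack=pigz

    # Stream the decompressed data straight to the target, then flush.
    # pipefail makes a decompression error fail the whole pipeline.
    ( set -o pipefail
      "$unpack" -dc "$image" | dd of="$target" bs=4M 2> /dev/null ) && sync
}
```

For the backup script above, the call would take one of the `.img.gz` files under /backup/img as the first argument and /dev/mapper/vg-wheezy as the second.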
On remote machines, there is always some kind of reinstall, which brings the machine to a known, functional state. From that point I run other scripts in sequence to bring the machine to the correct state as soon as possible. One of the scripts:
#!/bin/bash
#######################################################################################
# This script sets /etc/hosts file, changes root password, sets the hostname,
# creates user catcher, sets up ssh keys for catcher and root, updates apt-get source
# file, updates OS and installs basic set of packages
#######################################################################################
# TODO: Timestamping
#######################################################################################
# Environment setup
TTicRootPass=password
TTicCatcPass=password
export DEBIAN_FRONTEND=noninteractive
TTicFilesPath=/home/catcher/scripts/instconf-base-files
TTicPackList=`cat $TTicFilesPath/TTicPackList`
TTLog=/var/log/TTinstconf-base-`date +"%Y%m%d%H%M%S"`.log
#######################################################################################
# Code BEGIN
# Checking parameters
if [ "$#" != "1" ]; then
echo "[ERR0] - Incorrect number of parameters"
echo
echo "instconf-base - HDCS base system installation script"
echo
echo "Usage:"
echo "instconf-base host"
echo -e "host\t- hostname of the installed node"
echo
echo "Mind that script exits with exit code 1, if no parameters are set"
echo "or incorrect number of parameters is set"
exit 1
fi
echo "[iNFO] - This is HDCS Base system installation and configuration script."
# Setting hostname
echo "[iNFO] - Setting hostname"
echo "$1" > /etc/hostname
echo "[ OK ] - Hostname set"
# Hostname setting check
TTicHostF=`cat /etc/hostname`
if [ "$1" != "$TTicHostF" ]; then
echo "[ERR0] - Writing hostname into /etc/hostname file failed"
exit 4
fi
# Changing root password
echo "[iNFO] - Changing root password"
echo "root:$TTicRootPass"|chpasswd
if [ "$?" != "0" ]; then
echo "[ERR0] - Changing root password failed"
exit 5
fi
echo "[ OK ] - Root password changed"
# Verifying if existing /etc/hosts file already has HDCS Namespace in it
echo "[iNFO] - Verifying existing hosts file"
TTicExistingHostsFileCheck=`grep "HDCS Namespace" /etc/hosts 2> /dev/null |wc -l`
if [ "$TTicExistingHostsFileCheck" != "0" ]; then
echo "[ERR0] - HDCS Namespace already imported! Exiting!"
exit 2
else
echo "[ OK ] - Existing hosts file seems OK"
fi
# Appending a newline and downloaded namespace into /etc/hosts
echo "[iNFO] - Importing namespace into hosts file"
# Backing up existing hosts file
mkdir /root/backup 2> /dev/null
cp /etc/hosts /root/backup/hosts-backup-`date +"%Y%m%d%H%M%S"`
# Adding HDCS namespace to /etc/hosts file
echo >> /etc/hosts
cat $TTicFilesPath/TThosts >> /etc/hosts
echo "[ OK ] - Namespace imported"
# Verifying that the import succeeded
echo "[iNFO] - Verifying imported data"
TTicImportedHostsCheck=`grep "HDCS Namespace" /etc/hosts 2> /dev/null |wc -l`
if [ "$TTicImportedHostsCheck" = "1" ]; then
echo "[ OK ] - Imported Data Seems OK"
else
echo "[ERR0] - Imported data NOT OK! Exiting!"
exit 3
fi
# Creating user catcher and setting its password
echo "[iNFO] - Creating user catcher"
adduser --disabled-login --gecos Tomas catcher
if [ "$?" != "0" ]; then
echo "[WARN] - User creation failed (user catcher)"
fi
chmod 700 /home/catcher
if [ "$?" != "0" ]; then
echo "[ERR0] - ACL securing catcher's home (chmod 700) failed"
exit 6
fi
echo "[ OK ] - Done creating user catcher"
echo "[iNFO] - Setting catcher's password"
echo "catcher:$TTicCatcPass"|chpasswd
if [ "$?" != "0" ]; then
echo "[ERR0] - Changing catcher's password failed"
exit 7
fi
echo "[ OK ] - Catcher's password set"
# Downloading catcher's rsa keys and known_hosts file from conman storage
# and copying them into .ssh of catcher
echo "[iNFO] - Setting up .ssh files for catcher"
mkdir /home/catcher/.ssh 2> /dev/null
chown catcher:catcher /home/catcher/.ssh
chmod 700 /home/catcher/.ssh
cat $TTicFilesPath/id_rsac > /home/catcher/.ssh/id_rsa
cat $TTicFilesPath/id_rsac.pub > /home/catcher/.ssh/id_rsa.pub
cat $TTicFilesPath/authorized_keys > /home/catcher/.ssh/authorized_keys
chown catcher:catcher /home/catcher/.ssh/*
chmod 600 /home/catcher/.ssh/authorized_keys
chmod 600 /home/catcher/.ssh/id_rsa
chmod 640 /home/catcher/.ssh/id_rsa.pub
echo "[ OK ] - Done setting up .ssh files for catcher"
# Downloading root's rsa keys and known_hosts file from conman storage
# and copying them into .ssh of root
echo "[iNFO] - Setting up .ssh files for root"
mkdir /root/.ssh 2> /dev/null
chown root:root /root/.ssh
chmod 700 /root/.ssh
cat $TTicFilesPath/id_rsar > /root/.ssh/id_rsa
cat $TTicFilesPath/id_rsar.pub > /root/.ssh/id_rsa.pub
chown root:root /root/.ssh/*
chmod 600 /root/.ssh/id_rsa
chmod 640 /root/.ssh/id_rsa.pub
echo "[ OK ] - Done setting up .ssh files for root"
# Downloading sources file from conman storage and replacing existing
# /etc/apt/sources.list with it
echo "[iNFO] - Installing apt sources file"
mkdir /root/backup 2> /dev/null
cp /etc/apt/sources.list /root/backup/sources.list-backup-`date +"%Y%m%d%H%M%S"`
cat $TTicFilesPath/TTicSL > /etc/apt/sources.list
echo "[ OK ] - Apt sources files installed"
# Update the OS before installing new packages
echo "[iNFO] - Updating OS"
apt-get -y --force-yes update > $TTLog 2>&1
if [ "$?" != "0" ]; then
echo "[ERR0] - 'apt-get update' failed in the first run"
exit 8
fi
apt-get -q -y --force-yes upgrade > /dev/null 2>&1
if [ "$?" != "0" ]; then
echo "[ERR0] - 'apt-get -q -y upgrade' failed"
exit 9
fi
echo "[ OK ] - OS update done"
# Downloading the list of packages for base installation from conman remote storage
# and installing them
echo "[iNFO] - Installing packages"
apt-get -q -y --force-yes install $TTicPackList >> $TTLog 2>&1
if [ "$?" != "0" ]; then
echo "[ERR0] - 'apt-get -q -y install' failed"
exit 10
fi
echo "[ OK ] - Packages installed"
# All seems to be done and OK, informing and exiting
echo "[iNFO] - Installation of base system is done. Exiting"
exit 0
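The recurring pattern in the script above (run a command, test `$?`, print an error, exit with a step-specific code) could be folded into one helper. This is only a sketch; `run_step` is a hypothetical name, and the message format simply mirrors the one used above:

```shell
#!/bin/bash
# Hypothetical helper: run a command and exit with a given code if it fails.
# Usage: run_step EXIT_CODE "description" command [args...]
run_step() {
    local code="$1"
    local desc="$2"
    shift 2
    echo "[iNFO] - $desc"
    if ! "$@"; then
        echo "[ERR0] - $desc failed"
        exit "$code"
    fi
    echo "[ OK ] - $desc done"
}
```

A step such as the home directory permission change would then read `run_step 6 "Securing catcher's home" chmod 700 /home/catcher`. Steps that use a pipe or a redirection, like the chpasswd calls, would need to be wrapped in a small function first, because `run_step` only runs a single simple command.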
Depending on the actual node/VPS role, other scripts are run after this one, installing the correct software and applying the correct configuration and content.
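A sketch of how role-specific scripts could be chained after the base script. The roles (`web`, `db`), the script names `instconf-web` and `instconf-db`, and the `SCRIPT_DIR` variable are made up for illustration; only `instconf-base` appears above:

```shell
#!/bin/bash
# Hypothetical dispatcher: map a node role to the scripts run for it.
# All role and script names except instconf-base are illustrative.
SCRIPT_DIR="${SCRIPT_DIR:-/home/catcher/scripts}"

scripts_for_role() {
    case "$1" in
        web) echo "instconf-base instconf-web" ;;
        db)  echo "instconf-base instconf-db" ;;
        *)   echo "[ERR0] - Unknown role: $1" >&2; return 1 ;;
    esac
}

# Run each script in order, passing the hostname; stop at the first failure.
provision() {
    local role="$1" host="$2" script list
    list=$(scripts_for_role "$role") || return 1
    for script in $list; do
        bash "$SCRIPT_DIR/$script" "$host" || return 1
    done
}
```

With this layout, `provision web node1` would run the base script and then the web script, each with `node1` as its hostname argument, and would stop as soon as one of them exits non-zero.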
As mentioned above, I have never encountered problems with upgrading Debian Stable; I have only had to use those restore techniques for other reasons.
Hope this sheds enough light on how updates may be managed.