
Heads up: OpenVZ updates will probably break your system

sean

New Member
Hi Guys,

Many of you may already know, but for those of you who do not: be very careful when updating your OpenVZ host nodes in the future.

We have just been stung by two major changes that crept into vzctl version 4.7 released 15-April-2014:

  1. The default container layout has changed from simfs to ploop. This broke all new containers created by our system. We had to fix this by setting the following in vz.conf:

     VE_LAYOUT=simfs

  2. A new option, --netfilter, has been added to vzctl. This broke NAT/connection tracking and most other netfilter modules for containers. We fixed this by adding --netfilter full (see the sketch after this list).
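
For anyone hit by the same thing, here's a minimal sketch of both fixes; the CTID 101 is just an example:

# /etc/vz/vz.conf -- keep simfs as the default layout for new containers
VE_LAYOUT=simfs

# re-enable the netfilter modules (NAT, conntrack, etc.) for a container
vzctl set 101 --netfilter full --save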
It's a little worrying that changes like this made it into openvz.org's RHEL/CentOS repository!
 

zionvps

Member
Verified Provider
It's safer to update only after SolusVM and the other panels have rolled out updates for their own configured kernels.
 

SkylarM

Well-Known Member
Verified Provider
Sean,

We've deployed a few servers since the updates with Solus and haven't had issues, though we did go in and re-enable conntrack, of course. The bulk of providers are likely fine, as they aren't running anything custom.

Still find it rather amusing OpenVZ is running ploop as default, but doesn't install ploop on initial install.
 

Magiobiwan

Insert Witty Statement Here
Verified Provider
The fact is that OpenVZ neglected to throw warnings when attempting to use --iptables (they didn't even provide a deprecation period...). They also didn't add the ploop package to the dependency list of the latest vzctl RPMs, so it doesn't get installed automatically. We ran into issues with vzctl trying to default to ploop without the ploop libraries installed. Thankfully that only affected rebuilds and new provisions. OpenVZ is just such a mess...
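
A cheap guard before updating is to confirm the ploop userspace tools are actually present; a sketch, assuming the stock OpenVZ package names:

# make sure the ploop tools exist before vzctl starts defaulting to ploop
rpm -q ploop ploop-lib || yum install ploop ploop-lib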
 

sean

New Member
The warning shouldn't be "be careful when updating your OpenVZ host nodes in the future", it should be "ALWAYS READ THE EFFING CHANGELOG BEFORE YOU UPDATE YOUR SOFTWARE".
As we're running an enterprise distribution, changes like this should not be making it into their repositories for RHEL/CentOS. If we wanted to be subject to that crap we'd be on a rolling release distribution.
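
For what it's worth, yum can show changelogs before you commit to anything; a sketch, assuming the yum-plugin-changelog package is available in your repos:

yum install yum-plugin-changelog
# print each package's changelog before confirming the update
yum update vzctl --changelog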
 

SkylarM

Well-Known Member
Verified Provider
As we're running an enterprise distribution, changes like this should not be making it into their repositories for RHEL/CentOS. If we wanted to be subject to that crap we'd be on a rolling release distribution.
So because you're on an "enterprise distribution" that means you shouldn't read changelogs prior to updates? o_O
 

KuJoe

Well-Known Member
Verified Provider
As we're running an enterprise distribution, changes like this should not be making it into their repositories for RHEL/CentOS. If we wanted to be subject to that crap we'd be on a rolling release distribution.
The OpenVZ repos have nothing to do with RHEL/CentOS; they are independent of each other. Any company that does not read the changelog AND does not install the software in dev before production has bigger problems than can be addressed on a public forum.
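
You can see that for yourself on any node; a sketch, assuming the repo file is named as in the OpenVZ install docs:

# vzctl comes from the openvz.org repo, not the CentOS base/updates repos
yum info vzctl | grep -i "from repo"
cat /etc/yum.repos.d/openvz.repo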
 

Kruno

New Member
Verified Provider
Has anyone managed to convert existing Nodes to ploop? Any major issues down the road?
 

devonblzx

New Member
Verified Provider
We switched over nearly a year ago.  Ploop is the way to go now.   I wrote a script to help the migration from simfs to ploop.  The vzctl conversion process takes the VPS offline during the conversion which could be hours of downtime for large virtual servers.  My script only takes the server down for a couple of minutes while rsync syncs up files.

http://blog.byteonsite.com/?p=10
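
For comparison, the stock path is vzctl's built-in converter, which keeps the container offline for the whole copy; a minimal sketch (CTID 101 is hypothetical):

# built-in simfs -> ploop conversion; the container stays down throughout
vzctl stop 101
vzctl convert 101 --layout ploop
vzctl start 101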
 

Magiobiwan

Insert Witty Statement Here
Verified Provider
It's worth noting that if you do want to switch to ploop, you can just change the VE_LAYOUT option: as people rebuild their VPSes or as new ones are created, they'll switch over without affecting the ones using simfs. Likewise for going the other way around.
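
A sketch of that, with a hypothetical CTID and template name:

# global default for newly created containers, in /etc/vz/vz.conf
VE_LAYOUT=ploop

# or override it per container at creation time
vzctl create 101 --layout simfs --ostemplate centos-6-x86_64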
 

Kruno

New Member
Verified Provider
We switched over nearly a year ago.  Ploop is the way to go now.   I wrote a script to help the migration from simfs to ploop.  The vzctl conversion process takes the VPS offline during the conversion which could be hours of downtime for large virtual servers.  My script only takes the server down for a couple of minutes while rsync syncs up files.

http://blog.byteonsite.com/?p=10
Nice script, but it doesn't work as expected.

It took me a while, but I fixed it for everyone else who will need it. The script does a nice job :)


#!/bin/sh
# Usage: ./convert VEID
# Converts a simfs container to ploop with minimal downtime.
rsync_options='-aHv'
partition='vz'

if [ -z "$1" ]; then
    echo "Usage: $0 VEID"
    exit 1
fi
if [ ! -e /etc/vz/conf/$1.conf ]; then
    echo "Virtual server configuration file /etc/vz/conf/$1.conf does not exist."
    exit 1
fi
if [ -d /$partition/private/$1/root.hdd ]; then
    echo "Server already has a ploop device"
    exit 1
fi
if [ ! -d /$partition/private/$1 ]; then
    echo "Server does not exist"
    exit 1
fi

# Get the disk size (in G) of the running container; head -n1 in case there
# are multiple simfs mounts (for example, cPanel's virtfs).
disk=`vzctl exec $1 df -BG | grep simfs | awk '{print $2}' | head -n1`
if [ -z "$disk" ]; then
    echo "Could not retrieve disk space figure. Is the VPS running?"
    exit 1
fi

# Create and mount the ploop file system under a temporary CTID (1000$1)
mkdir -p /$partition/private/1000$1/root.hdd
ploop init -s $disk /$partition/private/1000$1/root.hdd/root.hdd
cp /etc/vz/conf/$1.conf /etc/vz/conf/1000$1.conf
vzctl mount 1000$1

# Rsync files over while the container is still running (sync 1)
rsync $rsync_options /$partition/root/$1/. /$partition/root/1000$1/

# Stop the container, mount it, and do the final sync
vzctl stop $1
vzctl mount $1
rsync $rsync_options /$partition/root/$1/. /$partition/root/1000$1/
vzctl umount $1
vzctl umount 1000$1

# Swap the ploop private area into place, keeping the simfs one as a backup
mv /$partition/private/$1 /$partition/private/$1.backup
mv /$partition/private/1000$1 /$partition/private/$1
vzctl start $1

# Cleanup
rm -f /etc/vz/conf/1000$1.conf
rmdir /$partition/root/1000$1

# Verification
verify=`vzlist -H -o status $1`
if [ "$verify" = "running" ]; then
    echo "Conversion successful. Verify manually, then run: rm -Rf /$partition/private/$1.backup to remove the backup."
else
    echo "Server conversion was not successful. Reverting..."
    mv -f /$partition/private/$1 /$partition/private/$1.fail
    mv /$partition/private/$1.backup /$partition/private/$1
    vzctl start $1
fi
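
Hypothetical usage, assuming you save the script above as convert.sh on the node:

chmod +x convert.sh
./convert.sh 101   # converts CTID 101, keeping the simfs copy as a backup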


Fixes:

1) Changed disk=`vzctl exec $1 df -BG | grep simfs | awk '{print $2}'` to disk=`vzctl exec $1 df -BG | grep simfs | awk '{print $2}' | head -n1`, because there may be multiple simfs mounts per VPS (for example, cPanel's virtfs).

2) Changed mkdir /$partition/private/1000$1/root.hdd to mkdir -p /$partition/private/1000$1/root.hdd. /$partition/private/1000$1 doesn't exist in the first place, so creating root.hdd would fail; the -p flag does the trick.

3) Changed ploop init -s $/$partition/private/1000$1/root.hdd/root.hdd to ploop init -s $disk /$partition/private/1000$1/root.hdd/root.hdd. I assume Devon had a typo there.

Either way, the updated script works fine now. Thanks Devon!
 

devonblzx

New Member
Verified Provider
Nice script, but it doesn't work as expected.

It took me a while, but I fixed it for everyone else who will need it. The script does a nice job :)
I have PMed you back and updated the script with these changes, along with a few more I found. The new ploop no longer sets a default filesystem, so I made sure ext4 is now the default. I tested the script on the newest vzctl/ploop available and it works now. Thanks for the input!
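
If anyone is patching the older version by hand, the idea is just to pin the filesystem at image creation; a sketch, assuming ploop init's -t flag:

# explicitly request ext4 instead of relying on ploop's default
ploop init -s $disk -t ext4 /$partition/private/1000$1/root.hdd/root.hdd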
 

Kruno

New Member
Verified Provider
FLY, the script fails on a small number of containers for unexplained reasons. I talked to Devon and another WHT guy who is experienced with ploop, and none of us could explain it. It was only 1-2 per 100 containers in our case. vzctl convert worked for those, though. If you want to minimize the downtime, you can set up a NEW container on the ploop node with the same resources, and then rsync the old simfs container over to the newly created ploop container (a rough sketch below). That way it works.
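
A rough sketch of that fallback; the CTIDs, template, and paths are hypothetical:

# create a fresh ploop container with matching resources
vzctl create 999 --layout ploop --ostemplate centos-6-x86_64
# stop the old simfs container and mount both private areas
vzctl stop 101
vzctl mount 101
vzctl mount 999
# copy the data across, then bring up the new container
rsync -aHv /vz/root/101/. /vz/root/999/
vzctl umount 101
vzctl umount 999
vzctl start 999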

Make sure to run vzctl compact against all containers once they're converted to ploop, otherwise disk usage will be messed up. We got more than 100GB of free disk space back on certain nodes after this.
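
Compacting is easy to script across a whole node; a minimal sketch:

# reclaim unused blocks in every running container's ploop image
for ctid in `vzlist -H -o ctid`; do
    vzctl compact $ctid
done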

Note: every ploop container reserves 5% of usable disk space for itself, so don't let df output surprise you :)
 

eddynetweb

New Member
I've seen hosts having issues upgrading in the past few days. Kernel panics and such.
 