@drmike - If I recall correctly, you're not a big fan of OVZ (or at least of OVZ done wrong). Keep that in mind if I use too many layman's terms by your standards.
Having seen your opinions of some other OpenVZ providers but still wanting to help, I'm going out on a limb here!
So, here's my sleep-deprived diatribe about ghost containers....
I don't doubt that SolusVM has issues in a lot of places with VZ (not to mention /usr/local/solusvm/tmp/extras/ has some fun stuff in there that doesn't look entirely safe), but I can make zombies happen pretty easily without even trying, so I believe this is more of an OpenVZ thing. It usually happens when a container slightly exceeds a resource threshold (physpages, in the examples below) and a failcnt is recorded. Kir told me a hell of a long time ago that it's just "how it is": by default, beancounters aren't reset until the node itself is rebooted; stopping, terminating, or even migrating the container doesn't clear them.
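For what it's worth, here's roughly how I go looking for them (just a quick sketch, assumes bash and a stock OVZ kernel): any CTID the kernel still tracks in /proc/user_beancounters that vzlist doesn't know about is a candidate ghost.
# CTIDs still in the kernel's beancounters but unknown to vzctl
comm -23 \
  <(awk -F: '/^[[:space:]]*[1-9][0-9]*:/ {gsub(/[[:space:]]/, "", $1); print $1}' /proc/user_beancounters | sort) \
  <(vzlist -a -H -o ctid | awk '{print $1}' | sort)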
Here are a couple of phantom containers that I have in production right now. I replaced all my X5650s with dual E5s this past month, and these are two examples of containers that migrated fine (no kid sis here)... but notice physpages is just a hair over its limit on both? Kernel memory is being held somehow and/or reclamation is askew. Anyway, that's pretty much what I believe creates a phantom, and I've never been able to "fix" it. There's a hokey ub-reset doc somewhere on the OVZ wiki that may work, although I think it did diddly shit for me.
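If you just want to see which CTs have shoved physpages past the limit like the two below, a one-liner along these lines does it (sketch; in that file maxheld is the 3rd column, the limit is the 5th, and the 7-field lines carry the CTID):
# flag physpages entries whose maxheld ($3) went over the limit ($5)
awk 'NF == 7 { ct = $1 } $1 == "physpages" && $3 > $5 { print ct, $0 }' /proc/user_beancounters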
The part where SolusVM could fit into this is that, according to its DB, the container no longer exists. SolusVM may try to use that CTID again during, say, an automated provision, run into the /etc/vz/conf/CTID.conf.destroyed file, and attempt to restore from that instead of starting with a fresh config (there's a quick manual check for that after the dump below).
(columns: resource, held, maxheld, barrier, limit, failcnt)
2312: kmemsize 319 147173376 268435456 268435456 0
lockedpages 0 1023 9223372036854775807 9223372036854775807 0
privvmpages 0 411797 9223372036854775807 9223372036854775807 0
shmpages 0 262335 9223372036854775807 9223372036854775807 0
dummy 0 0 9223372036854775807 9223372036854775807 0
numproc 0 150 9223372036854775807 9223372036854775807 0
physpages 63 131100 0 131072 0
vmguarpages 0 0 0 9223372036854775807 0
oomguarpages 0 170449 0 9223372036854775807 0
numtcpsock 0 40 9223372036854775807 9223372036854775807 0
numflock 0 41 9223372036854775807 9223372036854775807 0
numpty 0 1 9223372036854775807 9223372036854775807 0
numsiginfo 0 48 9223372036854775807 9223372036854775807 0
tcpsndbuf 0 697600 9223372036854775807 9223372036854775807 0
tcprcvbuf 0 4100920 9223372036854775807 9223372036854775807 0
othersockbuf 0 603584 9223372036854775807 9223372036854775807 0
dgramrcvbuf 0 8720 9223372036854775807 9223372036854775807 0
numothersock 0 48 9223372036854775807 9223372036854775807 0
dcachesize 0 134217728 134217728 134217728 0
numfile 0 1995 9223372036854775807 9223372036854775807 0
dummy 0 0 9223372036854775807 9223372036854775807 0
dummy 0 0 9223372036854775807 9223372036854775807 0
dummy 0 0 9223372036854775807 9223372036854775807 0
numiptent 0 14504 9223372036854775807 9223372036854775807 0
2230: kmemsize 319 135000064 805306368 805306368 0
lockedpages 0 12 9223372036854775807 9223372036854775807 0
privvmpages 0 439581 9223372036854775807 9223372036854775807 0
shmpages 0 2017 9223372036854775807 9223372036854775807 0
dummy 0 0 9223372036854775807 9223372036854775807 0
numproc 0 180 9223372036854775807 9223372036854775807 0
physpages 1 393250 0 393216 0
vmguarpages 0 0 0 9223372036854775807 0
oomguarpages 0 278749 0 9223372036854775807 0
numtcpsock 0 120 9223372036854775807 9223372036854775807 0
numflock 0 318 9223372036854775807 9223372036854775807 0
numpty 0 2 9223372036854775807 9223372036854775807 0
numsiginfo 0 87 9223372036854775807 9223372036854775807 0
tcpsndbuf 0 16773712 9223372036854775807 9223372036854775807 0
tcprcvbuf 0 7803408 9223372036854775807 9223372036854775807 0
othersockbuf 0 724904 9223372036854775807 9223372036854775807 0
dgramrcvbuf 0 210848 9223372036854775807 9223372036854775807 0
numothersock 0 239 9223372036854775807 9223372036854775807 0
dcachesize 0 97813079 402653184 402653184 0
numfile 0 3555 9223372036854775807 9223372036854775807 0
dummy 0 0 9223372036854775807 9223372036854775807 0
dummy 0 0 9223372036854775807 9223372036854775807 0
dummy 0 0 9223372036854775807 9223372036854775807 0
numiptent 0 17650 9223372036854775807 9223372036854775807 0
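And on the CTID-reuse theory above: before SolusVM hands that ID out again, it's easy enough to check by hand whether vzctl and the kernel have actually let go of it. Quick sketch, using 2312 purely as an example CTID:
ls -l /etc/vz/conf/2312.conf /etc/vz/conf/2312.conf.destroyed 2>/dev/null   # stale configs left behind?
vzlist -a 2312                             # does vzctl still think it exists?
cat /proc/bc/2312/resources 2>/dev/null    # kernel still holding its beancounters? (if your kernel exposes /proc/bc)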
Now... if you mean a client cancels, the termination module runs, but absolutely nothing was even attempted according to vzctl.log, then I'm afraid I don't know. Blame Ploop.
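The first thing I'd check there is whether vzctl was ever asked to touch the CT at all (2312 as a stand-in again; crude grep, adjust to taste):
grep 2312 /var/log/vzctl.log | tail -n 20   # any stop/destroy attempts logged for that CTID?
If nothing shows up at all, the request never even made it as far as vzctl.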
Or add "VE_STOP_MODE=stop" to the vz.conf and prevent the damn things from checkpointing in the first place ... 500 day uptime my ass...
Anyway, I'm pretty sure it's got something to do with checkpointing. When a CT is restored, for some reason all of the mount points inside the container get screwed up. Sure, by default all of the containers are mounted with the node's memory segments, which creates unnecessary load on the HWN, but hey, at least people get to show off their uptime. I've found that VE_STOP_MODE="stop" in vz.conf, instead of "suspend", fixes most of the issues with phantoms.
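For anyone following along at home, that's the global /etc/vz/vz.conf, not the per-CT config; the relevant bit looks something like this:
# /etc/vz/vz.conf (global OpenVZ config)
# "suspend" = checkpoint running CTs on node shutdown and restore them on boot
#             (preserved uptime, plus the screwed-up mounts described above)
# "stop"    = just shut them down normally
VE_STOP_MODE="stop"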
I hope I gave ya somethin' worthwhile.