amuck-landowner

Test Services / Help with cgroups / KVM abusability?

GoodHosting

New Member
Hello VPSBoard,

<introrant> First, I would like to start out by saying that I am by no means an expert in this field; while I have been providing hosting services for many years, there is still quite a lot I could learn.  The things I often present as "facts" could very well be fictitious information based on improper research or bad ideas; who knows.  Hence I figured I would open up a thread and ask what other aspiring / startup providers, as well as the industry-leading providers, have to say regarding these issues that are "pressing" to me. </introrant>

Test Service

I am currently offering trials on our Chicago1 and Phoenix1 locations for our Nebula Enterprise Cloud on packages Ci1 through Ci3 (therefore, no Windows OS; sorry, abusers clients.)

URL: http://goodhosting.co/

Voucher Code: 40CNWJ36VQ

Valid For: Ci1, Ci2, Ci3 [Feb 01-Feb 08]

Period: FIRST MONTH 100% DISCOUNT

cgroup / KVM abuse troubles

I've recently had a steady increase in trouble (almost all of it coming from LowEndTalk) with Windows VPS clients purchasing the smallest possible plan, then abusing the absolute ever-living hell out of the CPU they are assigned.  Regardless of the intermediate layer or panel I use, 1 core = 1 core.  Shares don't seem to mean anything unless the host node is already over 100% CPU (i.e. not usable for me, since I'd like to keep the host nodes with at least 30-50% free CPU of breathing room.)  It doesn't seem to matter if the CFS quota is set to 1 or 100000, or if the CFS period is 1 or 10000; the Windows VPS guest is still able to use 100% of one core as a minimum, regardless of how low I try to set the limitations; which makes density of Windows VPS literally impossible for me.

Surely other hosts have found some way to overcome this, so I figured I would open this thread in the hopes that someone might know.
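For reference, this is roughly what I expected to work (a minimal sketch; the /cgroup mount point and the one-5 libvirt domain name are taken from my dump below, so adjust for your own host):

```shell
# Hypothetical sketch: cap the guest's vcpu0 group at 25% of one core.
# Path and the "one-5" domain name match the dump below; adjust to taste.
CG=/cgroup/cpu/libvirt/qemu/one-5/vcpu0
PERIOD=100000                     # scheduling window, in microseconds
PCT=25                            # desired cap, as a percent of one core
QUOTA=$(( PERIOD * PCT / 100 ))   # runtime allowed per window

echo "cap: ${QUOTA}us of every ${PERIOD}us window"
if [ -d "$CG" ]; then             # only write if the cgroup actually exists
    echo "$PERIOD" > "$CG/cpu.cfs_period_us"
    echo "$QUOTA"  > "$CG/cpu.cfs_quota_us"
fi
```

In theory the guest should then be throttled after 25ms of runtime in every 100ms window, but as described above that is not what I observe.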

Example config in cgroups:

Code:
[[email protected] ~]# for FH in /cgroup/cpu/libvirt/qemu/one-5/vcpu0/*; do echo "----- $FH -----" ; cat $FH ; echo ; done
----- /cgroup/cpu/libvirt/qemu/one-5/vcpu0/cgroup.procs -----
16013
 
----- /cgroup/cpu/libvirt/qemu/one-5/vcpu0/cpu.cfs_period_us -----
100000
 
----- /cgroup/cpu/libvirt/qemu/one-5/vcpu0/cpu.cfs_quota_us -----
-1
 
----- /cgroup/cpu/libvirt/qemu/one-5/vcpu0/cpu.rt_period_us -----
1000000
 
----- /cgroup/cpu/libvirt/qemu/one-5/vcpu0/cpu.rt_runtime_us -----
0
 
----- /cgroup/cpu/libvirt/qemu/one-5/vcpu0/cpu.shares -----
1024
 

raindog308

vpsBoard Premium Member
Moderator
Your site doesn't have an AUP - the link just goes to the terms.  So are people free to run forex, bitcoin miners, IRC servers, bit torrent, etc.?  

I think I've found the source of your abuse  :lol:
 

drmike

100% Tier-1 Gogent
I think I've found the source of your abuse  :lol:
"I've recently had a steady increase of trouble (almost all of it coming from LowEndTalk) of Windows VPS"

Abuse from LET users. Ha!  That would never happen. 

Two ways to get rid of that chronic LET disease, it's worse than AIDS.

    1. Don't post / advertise there

    2. Offer services that cost more than $7 a month.

Presto! Cured of Lowend AIDS.
 
Last edited by a moderator:

serverian

Well-Known Member
Verified Provider
"I've recently had a steady increase of trouble (almost all of it coming from LowEndTalk) of Windows VPS"

Abuse from LET users. Ha!  That would never happen. 

Two ways to get rid of that chronic LET disease, it's worse than AIDS.

    1. Don't post / advertise there

    2. Offer services that cost more than $7 a month.

Presto! Cured of Lowend AIDS.
I wonder how many posts you would have if that website didn't exist.
 

GoodHosting

New Member
Your site doesn't have an AUP - the link just goes to the terms.  So are people free to run forex, bitcoin miners, IRC servers, bit torrent, etc.?  

I think I've found the source of your abuse  :lol:
The acceptable use is explained in the terms (not laid out in a separate document), but most abusers don't read it and don't care.  Most charge back the VPS after the first month anyway, and their bank always wins since it's a virtual service (etc. etc., #providerwoes.)  We allow anything legal; even bitcoin mining is fine if it could be limited properly (such as the way OpenVZ can, where 25 units actually means 25 units.)  BitTorrent is fine as long as the torrents are legal (you can seed Linux ISOs all day if you want, we don't care), as this doesn't even use any CPU compared to what they're using.
 

tchen

New Member
At the risk of being that guy that posts 'the manual', 

https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt

Specifically, are you running the normal scheduler or the RT one?  You kinda dumped it all.  Things to do / look at: check cpu.stat for the number of times your current settings have been enforced.  Your current dump also shows no bandwidth restrictions.  Go through the examples on the manual page above and try them; be sure to set both quota and window simultaneously.

P.S. A period of 1 means that slice is reset every microsecond.  You're effectively wiping the slate clean and giving them a full quota again.
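In other words, something like this (a rough sketch; I'm assuming the vcpu0 group path from your earlier dump, and the 50% figure is just an example):

```shell
# Sketch: set the window, then the quota, then read cpu.stat to confirm
# the cap is actually being enforced.
CG=/cgroup/cpu/libvirt/qemu/one-5/vcpu0
PERIOD=100000                    # 100ms window (a sane default)
QUOTA=50000                      # 50ms of runtime per window
PCT=$(( QUOTA * 100 / PERIOD ))  # => 50% of one core
echo "effective cap: ${PCT}% of one core"
if [ -d "$CG" ]; then
    echo "$PERIOD" > "$CG/cpu.cfs_period_us"
    echo "$QUOTA"  > "$CG/cpu.cfs_quota_us"
    # nr_throttled should start climbing once the guest hits the cap:
    grep nr_throttled "$CG/cpu.stat"
fi
```

If nr_throttled stays at zero while the guest pegs the core, the limit isn't being applied at all, which points at the kernel rather than your values.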
 

GoodHosting

New Member
At the risk of being that guy that posts 'the manual', 

https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt

Specifically, are you running the normal scheduler or the RT one?  You kinda dumped it all.  Things to do / look at: check cpu.stat for the number of times your current settings have been enforced.  Your current dump also shows no bandwidth restrictions.  Go through the examples on the manual page above and try them; be sure to set both quota and window simultaneously.

P.S. A period of 1 means that slice is reset every microsecond.  You're effectively wiping the slate clean and giving them a full quota again.

To test, on a development node where the following was created:

5x 1GB VPS (100 CPU units, 4 guest cores) running dogecoin CPU mining via scrypt-jane (the most consistent real-world CPU drain I could find), with the following settings:

echo 10000 > cpu.cfs_quota_us

echo 50000 > cpu.cfs_period_us

Each VPS was still able to use 400% CPU (as shown in top) and effectively tie up 4 cores on the host machine for an extended period of time (never throttled.)

--

I'm not sure what I'm doing wrong, but that's what I tried based on the manual you linked above.


Code:
[[email protected] cpu]# pwd ; echo ; for FH in *cfs*; do echo "----- $FH -----" ; cat $FH ; echo ; done
/cgroup/cpu
 
----- cpu.cfs_period_us -----
0
 
----- cpu.cfs_quota_us -----
0
 
[[email protected] cpu]# cd libvirt
[[email protected] libvirt]# pwd ; echo ; for FH in *cfs*; do echo "----- $FH -----" ; cat $FH ; echo ; done
/cgroup/cpu/libvirt
 
----- cpu.cfs_period_us -----
100000
 
----- cpu.cfs_quota_us -----
-1
 
[[email protected] libvirt]# cd qemu
[[email protected] qemu]# pwd ; echo ; for FH in *cfs*; do echo "----- $FH -----" ; cat $FH ; echo ; done
/cgroup/cpu/libvirt/qemu
 
----- cpu.cfs_period_us -----
100000
 
----- cpu.cfs_quota_us -----
-1
 
[[email protected] qemu]# cd one-5
[[email protected] one-5]# pwd ; echo ; for FH in *cfs*; do echo "----- $FH -----" ; cat $FH ; echo ; done
/cgroup/cpu/libvirt/qemu/one-5
 
----- cpu.cfs_period_us -----
100000
 
----- cpu.cfs_quota_us -----
-1
 
[[email protected] one-5]# cd vcpu0
[[email protected] vcpu0]# pwd ; echo ; for FH in *cfs*; do echo "----- $FH -----" ; cat $FH ; echo ; done
/cgroup/cpu/libvirt/qemu/one-5/vcpu0
 
----- cpu.cfs_period_us -----
50000
 
----- cpu.cfs_quota_us -----
10000


EDIT:

All the way up to /cgroup/cpu the "cpu.stat" file contains:

Code:
nr_periods 0
nr_throttled 0
throttled_time 0
 
Last edited by a moderator:

GoodHosting

New Member
Please check your kernel compilation options

CONFIG_FAIR_GROUP_SCHED=y


CONFIG_CFS_BANDWIDTH=y

And here.  Follow this

http://www.blaess.fr/christophe/2012/01/07/linux-3-2-cfs-cpu-bandwidth-english-version/
Unfortunately, as I'm using CentOS, I have a heavily edited 2.6 kernel with a lot of half-implemented back-ported features.  I'm not sure how much of that article applies, but I'm reading through it now.  Both of the switches you asked about are enabled in my compilation.  After reading through the article and following the steps, I was not able to make the loops obtained per second fall dramatically.  I tried setting the period to 10000 and the quota to 100, and the loops achieved did not waver from when the quota was -1.
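For anyone else checking the same thing, this is roughly how I verified the switches (standard paths; /proc/config.gz only exists if the kernel was built with IKCONFIG_PROC):

```shell
# Sketch: confirm the running kernel was built with CFS bandwidth control.
# Without CONFIG_CFS_BANDWIDTH the quota knob is missing or never enforced
# (nr_periods stays 0, exactly as in the cpu.stat dump above).
CONFIG=/boot/config-$(uname -r)
PATTERN='CONFIG_(FAIR_GROUP_SCHED|CFS_BANDWIDTH)='
if [ -r "$CONFIG" ]; then
    grep -E "$PATTERN" "$CONFIG"
elif [ -r /proc/config.gz ]; then     # fallback if IKCONFIG_PROC is enabled
    zcat /proc/config.gz | grep -E "$PATTERN"
fi
```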
 

tchen

New Member
As far as I can tell, it's 3.2 that brought in the bandwidth control (the hard cap) on top of the old bandwidth scheduler.  Maybe that's where your trouble lies.  There were a couple patches floating around before that circa 2011 if you want to try those.
 

kaniini

Beware the bunny-rabbit!
Verified Provider
You want to edit cpu.shares, not the other ones.

1024 = 100% cpu

128 = 10% cpu

Let me know if you have any other problems.
 

GoodHosting

New Member
You want to edit cpu.shares, not the other ones.

1024 = 100% cpu

128 = 10% cpu

Let me know if you have any other problems.
@kaniini @tchen

That's the thing: shares are only useful (as even the manual explains) as a way to prioritize CPU to a specific process or group of processes.  Shares only made it worse, with larger VMs getting more CPU than smaller ones, in a huge way.

---

That being said, I was able to recompile my 2.6 kernel with a few 2012-ish patches to get CFS working as a hard limiter!  Machines are happily using exactly as much CPU as they're allowed to use, unlike before.  Thanks goes to SingleHOP for having some damned wonderful software partnership, able to replace the kernel without taking the system down.  Kplice or something?  Forgot the name.
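Quick back-of-envelope on why shares can't work as a cap (hypothetical weights):

```shell
# cpu.shares is a relative weight that only matters under contention.
# With weights 1024 and 256, a fully contended core splits roughly 80/20,
# but on an idle host either group may still burn 100% of the core.
# A hard ceiling needs cpu.cfs_quota_us instead.
BIG=1024
SMALL=256
TOTAL=$(( BIG + SMALL ))
echo "big VM:   $(( BIG   * 100 / TOTAL ))% of a contended core"
echo "small VM: $(( SMALL * 100 / TOTAL ))% of a contended core"
```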
 
Last edited by a moderator:

Virtovo

New Member
Verified Provider
@kaniini @tchen

That's the thing: shares are only useful (as even the manual explains) as a way to prioritize CPU to a specific process or group of processes.  Shares only made it worse, with larger VMs getting more CPU than smaller ones, in a huge way.

---

That being said, I was able to recompile my 2.6 kernel with a few 2012-ish patches to get CFS working as a hard limiter!  Machines are happily using exactly as much CPU as they're allowed to use, unlike before.  Thanks goes to SingleHOP for having some damned wonderful software partnership, able to replace the kernel without taking the system down.  Kplice or something?  Forgot the name.
Indeed it was most likely Ksplice! :)  A wonderful piece of software that sadly fell under the Oracle umbrella.
 

Magiobiwan

Insert Witty Statement Here
Verified Provider
If you're running KVM (assuming you're not using SolusVM), I'd recommend using Ubuntu 13.10 for your hypervisor. For the first part, it comes with a MUCH MUCH MUCH newer QEMU version (1.5 compared to 0.12), it has a newer kernel, and overall it works much better. If you like being edgy, you can compile your own QEMU 1.7 (which fixes some minor issues with ACPI in 1.5 and overall seems to perform better, as shown by my testing) and use that instead. You'll need to compile your own libvirtd too, but that's not too hard. If you're using SolusVM... Idk then. LONG LIVE FEATHUR!
 

concerto49

New Member
Verified Provider
If you're running KVM (assuming you're not using SolusVM), I'd recommend using Ubuntu 13.10 for your hypervisor. For the first part, it comes with a MUCH MUCH MUCH newer QEMU version (1.5 compared to 0.12), it has a newer kernel, and overall it works much better. If you like being edgy, you can compile your own QEMU 1.7 (which fixes some minor issues with ACPI in 1.5 and overall seems to perform better, as shown by my testing) and use that instead. You'll need to compile your own libvirtd too, but that's not too hard. If you're using SolusVM... Idk then. LONG LIVE FEATHUR!
Is it stable for production? Ubuntu 13.10 that is.
 

Magiobiwan

Insert Witty Statement Here
Verified Provider
I have 13.10 on my Dedi I'm doing testing with QEMU 1.7 on, and BlueVM runs it on our new KVM nodes. Works MUCH better than CentOS 6 does for KVM. Newer kernel, newer QEMU... Yeah. MUCH better. 
 

Wintereise

New Member
There's nothing at all stopping you from building the bleeding edge kernel and qemu for CentOS, if you really wanted to -- useless argument.

That said, I hate the RHEL family of OSes with a passion -- and would recommend Debian with own maintained kernel for the job instead.

If you must use Ubuntu, wait for the new 14.04 LTS release.
 