# VPS Benchmark Testing is Useless. Creating better benchmarking tests.



## drmike (Apr 27, 2014)

So, some of the valued community members have pointed to highly technical reasons why benchmark testing inside of VPS containers and other virtualized environments is useless.   Useless on gauging CPU, useless on disk IO and throughput information, generally useless.

Goal is to get input on what should be gleaned from testing and benchmarking and to inspire a new test suite/battery.

As things stand, ping times, network latency, raw network throughput, and the high/low of each over time are stable measurements, with the disclaimer that you are dealing with shared resources and, at times, contention for them.

What else should be taken from containers in testing?  Beancounter info? 

Give us your thoughts.


----------



## tonyg (Apr 27, 2014)

drmike it is my take that probably the most vocal of the "vps benchmarks are useless" voices are VPS hosts that don't quite measure up.

From my own benchmark testing, RamNode and BuyVM are the best performers and those benchmark results translate properly to actual production performance.

It is like 0-60 times from a performance vehicle; sure, most drivers can't match those numbers, but it gives one a performance gauge for comparison testing.


----------



## drmike (Apr 27, 2014)

Well, there are legitimate reasons why the old school benchmarks intended for baremetal (dedicated servers) aren't up to snuff.

Smart folks who are NOT providers pointed at various things and they make sense.

If anything, all/most providers don't want discovery of how things really work and an informed buying public armed with tools or results therefrom. Providers benefit from ignorant customers.  I get that.

Thus, the collective silence.


----------



## datarealm (Apr 27, 2014)

The major hurdle is consistency, and even within providers that will change drastically over time.  If a provider puts up a new node and you are the first VM on it, performance will be stellar.  However, over the course of a month, as new VMs fire up, your performance will change.

As an end-user you have no way to determine this information.

It would seem that true benchmarking for a VM would need to occur with routine samples over an extended period of time.
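A minimal sketch of that idea, assuming Python is available on the VM. The names `collect_samples` and `summarize` are made up for illustration, and `measure` stands for whatever lightweight measurement you choose (a small write, a page fetch, a ping). Keep the individual measurement cheap so the sampling itself does not burden the shared node.

```python
import statistics
import time
from typing import Callable


def collect_samples(measure: Callable[[], float], count: int, interval_s: float) -> list[float]:
    """Call `measure` `count` times, sleeping `interval_s` seconds between calls."""
    samples = []
    for i in range(count):
        samples.append(measure())
        if i < count - 1:
            time.sleep(interval_s)
    return samples


def summarize(samples: list[float]) -> dict[str, float]:
    """Reduce a series of samples to figures that expose spread, not one moment."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

In practice the samples would be collected over days or weeks (for example, one run per day from a scheduler) and the summary compared month to month, rather than gathered in a single session.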


----------



## imperio (Apr 27, 2014)

A good example.

http://geekahost.com/db/show.php


----------



## peterw (Apr 28, 2014)

The common scripts are useless. And the pseudo reviews that come out of them are useless. They push a system to its peak and measure how well the VPS can stand the pressure, but only for a short amount of time. I like the approach of running a service on each VPS and measuring over time how good the page load times are. Or a cluster where you can query each node. Or doing the same tasks again on each VPS, like compiling Python.
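As a rough sketch of the "same task, measured over time" approach: the helper below times one run of a fixed job. `time_task` and `fixed_workload` are hypothetical names, and the workload is only a stand-in for a real job such as a compile or a page fetch.

```python
import time
from typing import Callable


def time_task(task: Callable[[], object]) -> float:
    """Return the wall-clock seconds that one run of `task` takes."""
    start = time.perf_counter()
    task()
    return time.perf_counter() - start


def fixed_workload() -> int:
    """Hypothetical stand-in for a real, repeatable job."""
    return sum(i * i for i in range(200_000))


if __name__ == "__main__":
    # One timestamped data point; a scheduler would append these to a log over weeks.
    print(int(time.time()), round(time_task(fixed_workload), 4))
```

A single run says little on its own. The value comes from logging the same task at intervals and watching how the timings move.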

A review should always be about a time frame, not about a moment. I like @wlanboy's reviews because he says which services are running and how they perform. It is important to me that he updates them every month, so you can see how the hoster performs over time.


----------



## Deleted (Apr 28, 2014)

Running the same tasks again is futile, since some/most things will be in the VFS cache after the first iteration. This is why when you remount a filesystem your VFS cache layer is cleared.

All in all, benchmarking VPS containers is silly; you need to benchmark the host node. As I've said before, the containers do NOT have direct access to the hardware: no PCI address space, some privileged CPU instructions are emulated (e.g. cmpxchg was emulated on vmware and was dogshit slow for atomic ops), timers/interrupts are emulated from the host node, and context switching is/can be horrendously bad.


----------



## blergh (Apr 28, 2014)

While higher figures and results are always nice to look at, you need to put things into perspective and rather look at "Does my application/site/ require over 150MB/s constant writes and 20MB/s upload at all times?". The answer is usually no, your applications do not need these resources but if they actually do you shouldn't be looking at a $20-a-month solution (And most probably not a virtualized solution at that)

While spare resources for the occasional bursts are perfectly sane and healthy, they do not magically make your applications or site run faster. Unfortunately this sort of "misconception" appears to be fairly widespread, as an awful lot of people seem to dismiss hosts who, as an example, get you a sub 200MB/s DD from inside your VM.

KISS!


----------



## HalfEatenPie (Apr 28, 2014)

All I have to say is this.

You want to benchmark the system? Awesome! Go for it! Do whatever test you need to do.

You want to do this on a regular basis and run a cron running this test every hour? No. The resource needs to be available for other people to use.

Benchmarking in itself is useful on the provider-side because it shows how much capacity the node can take. Benchmarking on the client side is ok as well, but it only gives the situation at that time and period. Conditions can change rapidly.

I'm all ok for real-use testing though.

*Clarification:* "You" isn't directed at one specific individual. It's just generally anyone who uses a VPS. I love you all equally and independently. Some more equal than others.


----------



## DomainBop (Apr 28, 2014)

> "Does my application/site/ require over 150MB/s constant writes and 20MB/s upload at all times?". The answer is usually no, your applications do not need these resources but if they actually do you shouldn't be looking at a $20-a-month solution (*And most probably not a virtualized solution at that*)


I both agree and disagree with your _"probably not a virtualized solution at that"_ statement.

Virtualization can be a very good solution for mission critical, resource intensive, or any app that requires a stable environment IF you're using your own server and divide it up into a handful of VPS.

On the other hand, the average low end VPS provider, especially those that offer containerization not true virtualization (i.e. openvz) is selling a glorified shared hosting service and your VPS is going to be fighting for CPU time and sharing that 2 x 1 Gbps connection with dozens of other customers (and in some openvz cases hundreds of other customers if the offer contains the word "Buffalo"), some of whom will inevitably be abusing the hell out of the node...not a good solution for running anything mission critical, resource intensive, or any app that requires a stable environment.

Benchmark results on a VPS you rent from a provider can vary widely depending on current node abuse/usage and aren't a good indication of the performance (or available resources) that you can expect over the longer term.



> If anything, all/most providers don't want discovery of how things really work and an informed buying public armed with tools or results therefrom.


suggestion for the next topic: _"why the 365 days uptime that the uptime command results show on your VPS from top provider x might not reflect reality"_

example: one of my servers, an E3 (32GB RAM) with 1 dom0 and 3 domU Xen-HVM guests.  A kernel upgrade and reboot was performed on the dom0 this morning:

dom0:


```
uptime
 15:09:00 up 3:30, 1 users, load average: 0.08, 0.21, 0.12
```

a domU guest:


```
uptime
 15:10:00 up 106 days, 23:58,  2 users,  load average: 0.33, 0.58, 0.45
```


----------



## MartinD (Apr 28, 2014)

tonyg said:


> drmike it is my take that probably the most vocal of the "vps benchmarks are useless" voices are VPS hosts that don't quite measure up.
> 
> From my own benchmark testing, RamNode and BuyVM are the best performers and those benchmark results translate properly to actual production performance.
> 
> It is like 0-60 times from a performance vehicle; sure, most drivers can't match those numbers, but it gives one a performance gauge for comparison testing.


Unfortunately, that's not always true. It's quite easy to change the host configuration to ensure that when people run these benchmarks, and specifically the 'usual' dd test, the figures come back looking great. These are never really an accurate way of measuring performance.


----------



## Lee (Apr 28, 2014)

@tonyg I remember Ramnode themselves posting in a thread on WHT that their nodes are by design setup to deliver maximum return on the DD tests because they know the impact it has on clients.

Not saying they are bad, they are very good.  It's all in the marketing.


----------



## Lee (Apr 28, 2014)

@drmike

Here is a different suggestion.  Instead of coming up with "how can I hammer my provider's server with more tests" type ideas, how about this.

Someone come up with a suite of tests in the form of a package that the providers install on each node that does its job without pulling performance down.  Data regarding up time, CPU use, memory use and so on are sent to an external site where people can come along and see true unmolested performance metrics on how those nodes have really performed over the past 7 days, month, 3 months or whatever.  No messing around with figures from the providers, it has to be clear, worthwhile data over time to base opinion on, not a DD test that covers only a single point in time.
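Purely as an illustration of what such a package might collect on each node, here is a small sketch that builds one report as JSON. The function name `build_report` and the field names are assumptions made for this example, not an existing tool or format, and the step of sending the report to an external site is left out.

```python
import json
import os
import time


def build_report(node_name: str) -> str:
    """Build one cheap, read-only metrics report for a node as a JSON string."""
    # os.getloadavg() is available on Unix-like systems only.
    load_1m, load_5m, load_15m = os.getloadavg()
    return json.dumps({
        "node": node_name,                # hypothetical node label
        "timestamp": int(time.time()),    # seconds since the epoch
        "load_average": [load_1m, load_5m, load_15m],
    })


if __name__ == "__main__":
    print(build_report("node-1"))
```

Reading the load average costs next to nothing, which fits the requirement that the collector must not pull performance down.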

Doubt providers would like it, but clients would, and it will allow for much more rewarding evidence of a provider's ability.

/2c


----------



## DomainBop (Apr 28, 2014)

> Data regarding up time, CPU use, memory use and so on are sent to an external site where people can come along and see true unmolested performance metrics on how those nodes have really performed over the past 7 days, month, 3 months or whatever.


you mean like this --> (user: guest/ passwd: ginernet) http://observium.ginernet.com/


----------



## Lee (Apr 28, 2014)

@DomainBop

Well kind of.  My point for example is that you can have a provider with the shiniest nodes reporting the highest IO but you may well be better off with a provider that has older nodes with IO that is perfectly adequate for your needs and have the same or better experience.  I guess that is what I would be looking to demonstrate, that it's not just as simple as DD, but we all (mostly) know that.

Not trying to say don't use the providers that have the shiny nodes.  Just that there are plenty of providers out there that can deliver top results not based on how new their hardware is, but simply because of their good old fashioned ability and talent.

I could start a new VPS service tomorrow on my own and chuck $25k at it no problem, however on my own, the minute something went wrong I would be sweating like GVH during a DDoS attack.  But hey my DD results are just awesome..


----------



## tonyg (Apr 28, 2014)

~Lee~ said:


> @tonyg I remember Ramnode themselves posting in a thread on WHT that their nodes are by design setup to deliver maximum return on the DD tests because they know the impact it has on clients.
> 
> Not saying they are bad, they are very good.  It's all in the marketing.


It's not just dd tests that they excel at.

They consistently turn in some of the best CPU performance and ioping figures.


----------



## drmike (Apr 28, 2014)

~Lee~ said:


> @tonyg I remember Ramnode themselves posting in a thread on WHT that their nodes are by design setup to deliver maximum return on the DD tests because they know the impact it has on clients.
> 
> Not saying they are bad, they are very good.  It's all in the marketing.


This is something that has been pointed out elsewhere and by others...

People are testing dd with ZERO data, which is like 99.99% compressible.  That practice needs to stop.   If anything, the test should use random data, with a payload that defeats caching and explicitly removes cache and other workarounds from the mix.

The same trick has been played at other layers too, such as with download speedtest files.
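The compressibility point is easy to check for yourself. The sketch below uses Python's `zlib` as a stand-in for whatever compression or deduplication a storage stack might apply (an assumption; the actual layer on any given host is unknown) and compares a zero-filled buffer with random bytes. `compression_ratio` is a name made up for this example.

```python
import os
import zlib


def compression_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(payload)) / len(payload)


if __name__ == "__main__":
    size = 1024 * 1024  # 1 MiB test payload
    print("zeros :", compression_ratio(bytes(size)))
    print("random:", compression_ratio(os.urandom(size)))
```

The zero-filled buffer shrinks to a tiny fraction of its size, while the random buffer does not shrink at all, which is why a zero-filled write can flatter any layer that compresses or short-circuits such data.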


----------



## drmike (Apr 28, 2014)

tonyg said:


> It's not just dd tests that they excel at.
> 
> They consistently turn in some of the best CPU performance and ioping figures.


ioping is entirely inconsistent/useless in these environments, especially where SSDs and these controllers are rolled in.


----------



## drmike (Apr 28, 2014)

DomainBop said:


> suggestion for the next topic: _"why the 365 days uptime that the uptime command results show on your VPS from top provider x might not reflect reality"_
> 
> example: one of my servers, an E3 (32GB RAM) with 1 dom0 and 3 domU Xen-HVM guests.  A kernel upgrade and reboot was performed on the dom0 this morning:
> 
> ...



That sure is interesting.  I am unclear here, the baremetal has uptime as it wasn't rebooted... The container has no real uptime since it was rebooted.

Is the idea here that some providers may be pulling uptime from the master vs. the container/VPS?


----------



## Virtovo (Apr 28, 2014)

drmike said:


> That sure is interesting.  I am unclear here, the baremetal has uptime as it wasn't rebooted... The container has no real uptime since it was rebooted.
> 
> Is the idea here that some providers may be pulling uptime from the master vs. the container/VPS?


You can suspend guests before a host node reboot.  They retain their uptime. 

As for the Xen reference in the post, dom0 is the initial domain.  The domUs are guests.  dom0 has no uptime as the node was rebooted yet the guests retained theirs.
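To make the comparison concrete, here is a small sketch that parses the `up ...` part of an `uptime` line into minutes, so a guest's figure can be compared with its host's. The function name `uptime_minutes` is made up for this example, and the parser only handles the formats that appear in this thread (`up 3:30`, `up 106 days, 23:58`, `up 8 min`, `up 11 days`).

```python
import re


def uptime_minutes(line: str) -> int:
    """Parse the 'up ...' part of an `uptime` output line into whole minutes."""
    # Capture everything after "up" until the ", N users" field or end of line.
    match = re.search(r"up\s+(.*?)(?:,\s*\d+\s+users?\b|$)", line.strip())
    if match is None:
        raise ValueError(f"no uptime found in: {line!r}")
    span = match.group(1)

    days = re.search(r"(\d+)\s+days?", span)
    clock = re.search(r"(\d+):(\d+)", span)   # hours:minutes
    mins = re.search(r"(\d+)\s+min", span)

    total = 0
    if days:
        total += int(days.group(1)) * 24 * 60
    if clock:
        total += int(clock.group(1)) * 60 + int(clock.group(2))
    if mins:
        total += int(mins.group(1))
    return total


if __name__ == "__main__":
    host = uptime_minutes("15:09:00 up 3:30, 1 users, load average: 0.08, 0.21, 0.12")
    guest = uptime_minutes("15:10:00 up 106 days, 23:58,  2 users,  load average: 0.33, 0.58, 0.45")
    # A guest that reports more uptime than its host was suspended across a host reboot.
    print(guest > host)
```

A customer normally cannot see the host's uptime, which is the whole point: the guest's own `uptime` figure says nothing about how often the node underneath it went down.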


----------



## Virtovo (Apr 30, 2014)

Just performed maintenance on one of our KVM nodes.  Downtime was 14 minutes.

Node uptime post maintenance:


```
uptime
 13:52:03 up 8 min
```

VPS uptime post maintenance:


```
uptime
 16:36:29 up 11 days
```


----------



## jmginer (May 5, 2014)

DomainBop said:


> you mean like this --> (user: guest/ passwd: ginernet) http://observium.ginernet.com/


I know it


----------



## Neo (May 5, 2014)

Such *transparency*, but I thought you had more nodes.


----------

