
Scaleway Cloud Launches x86-64 C2 Servers: Dedicated Avoton C2550 and C2750

DomainBop

Dormant VPSB Pathogen
I think it is delusional to talk about "mission critical" applications running on a VPS.

That depends on who controls the hypervisor. If you control it and use virtualization strictly to run your own company's apps, then it is a reliable way to stretch your resources and save money, and you can achieve uptime that is comparable to, or equals, a dedicated server. If you're using a typical VPS company,** however, where you have no control over the environment (node setup, your neighbors, provider errors that result in downtime/data loss, maintenance scheduling, etc.), then a VPS probably isn't the best place for anything mission critical.


**I'm defining 'typical VPS company' as one using off-the-shelf software (which usually contains obfuscated code that requires opening a ticket with the developer when something goes wrong) and, more often than not, rented servers and rented IP space... and did I mention overselling?
 

drmike

100% Tier-1 Gogent
I think it is delusional to talk about "mission critical" applications running on a VPS.

Broken iPB quote... so....


I am on the fence about running anything important at all on a VPS. I do it, and annually I am reminded with horror why I shouldn't. I've had downed-instance issues, been abuse-slapped when something went afoul another time, and had providers changing this and that. It really adds up to something like 3-6 events a year at the average provider. The ones without events usually just aren't building anything (i.e. coasting).


I've had issues with dedis at the same annual frequency: the infamous power failure of a DC, a DDoS of their network, etc.


So I am in the cheap-dedicated boat at this point. ARM is fine, cheap is fine. I think this is the current evolution of the market, at least for those of us long into VPS.
 

willie

Active Member
Certainly people build businesses around AWS all the time, where EC2 basically amounts to an overpriced VPS.  I think the most important thing is to eliminate SPOFs: multiple servers from multiple vendors in multiple locations, etc.
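One minimal way to act on the multi-vendor idea is a health-check loop that probes the same service at several providers and picks a live endpoint; this is just a sketch with placeholder hostnames, and a real setup would feed the result into DNS failover or a load balancer rather than a print statement.

# Minimal sketch: probe the same service hosted at several providers and
# pick a healthy endpoint. Hostnames are placeholders (not anyone's real
# setup); real deployments would drive DNS failover or a load balancer.
import socket

ENDPOINTS = [
    ("app.provider-a.example", 443),   # a VPS in one datacenter
    ("app.provider-b.example", 443),   # a dedi at a second vendor
    ("app.provider-c.example", 443),   # a third location
]

def healthy(host, port, timeout=3):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

alive = [(host, port) for host, port in ENDPOINTS if healthy(host, port)]
print("serving from:", alive[0] if alive else "no healthy endpoint!")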
 

drmike

100% Tier-1 Gogent
Certainly people build businesses around AWS all the time, where EC2 basically amounts to an overpriced VPS.  I think the most important thing is to eliminate SPOFs: multiple servers from multiple vendors in multiple locations, etc.

No doubt AWS, Google, and some other monoliths offer things that are more business-reliable. But you are going to pay heavily for it. The amount people spend on services like that is often pretty insane. At the numbers I see from folks, I'd be dealing directly with dedis from better DC suppliers myself.


Definitely the many-vendors-in-many-locations route appeals more to me than monolith worship. Quite a big niche AWS and the others have carved out, though.
 

fm7

Active Member
Certainly people build businesses around AWS all the time, where EC2 basically amounts to an overpriced VPS.  I think the most important thing is to eliminate SPOFs: multiple servers from multiple vendors in multiple locations, etc.

SPOF would be the catastrophic impediment, but how about the more mundane noisy neighbor? Or the vastly different performance characteristics of AWS instances?


BTW (Wikipedia):


A mission critical system is typically an online banking system, a railway or aircraft operating and control system, an electric power system, or any other computer system that will seriously affect business and society if it goes down.
 

drmike

100% Tier-1 Gogent
SPOF would be the catastrophic impediment, but how about the more mundane noisy neighbor? Or the vastly different performance characteristics of AWS instances?

Anyone here using / tried / aware of the AWS platform and what they are actually running to make it all work? What is the virtualization based on?


I haven't heard anything that I recall about noisy neighbors with AWS or similar large competitors. It remains sort of, ahh, magical in some ways. Clearly there have been complaints, like overall performance on smaller instances being really slow.


I am betting they invested heavily in setting proper resource limits to keep things from getting too ugly. That seems to be where the real shops differ from the run-of-the-mill 'Solus and pray' brands.
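(As an aside: the kind of per-tenant cap being guessed at here can be sketched with the Linux cgroup v1 CPU controller, which is also what the Joe Beda slide further down points at. A minimal sketch, assuming cgroup v1 is mounted at the usual path and run as root; the group name is made up, and a real host would do this per guest from the hypervisor side.)

# Minimal sketch: cap a process tree at ~50% of one core using the
# cgroup v1 "cpu" controller. Requires root; assumes cgroup v1 mounted
# at /sys/fs/cgroup. The group name "tenant42" is illustrative only.
import os

CGROUP = "/sys/fs/cgroup/cpu/tenant42"

def write(path, value):
    with open(path, "w") as f:
        f.write(str(value))

os.makedirs(CGROUP, exist_ok=True)
write(os.path.join(CGROUP, "cpu.cfs_period_us"), 100000)  # 100 ms accounting period
write(os.path.join(CGROUP, "cpu.cfs_quota_us"), 50000)    # 50 ms of CPU per period ~= half a core
write(os.path.join(CGROUP, "cgroup.procs"), os.getpid())  # move this process into the group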
 

fm7

Active Member
Anyone here using / tried / aware of the AWS platform and what they are actually running to make it all work? What is the virtualization based on?


I haven't heard anything that I recall about noisy neighbors with AWS or similar large competitors. It remains sort of, ahh, magical in some ways. Clearly there have been complaints, like overall performance on smaller instances being really slow.


I am betting they invested heavily in setting proper resource limits to keep things from getting too ugly. That seems to be where the real shops differ from the run-of-the-mill 'Solus and pray' brands.



Google Compute Engine and Predictable Performance



 

 


Tim Freeman


July 1, 2012

I raised my eyebrows at one statement Google is making about Google Compute Engine:



Deploy your applications on an infrastructure that provides consistent performance. Benefit from a system designed from the ground up to provide strong isolation of users’ actions. Use our consistently fast and dependable core technologies, such as our persistent block device, to store and host your data.




While many talk about how one IaaS solution will give you better performance than another, one of the more bothersome issues in clouds is whether or not an instance will give you consistent performance. This is especially true with I/O.


A lot of this performance consistency problem is due to the “noisy neighbor” issue. IaaS solutions typically have some kind of multi-tenant support, multiple isolated containers (VM instances, zones, etc.) on each physical server. The underlying kernel/hypervisor is responsible for cutting each tenant off at the proper times to make sure the raw resources are shared correctly (according to whatever policy is appropriate).


AWS, while nailing many things, has struggled with this. I’ve heard from many users that they’re running performance tests on every EC2 instance they create in order to see if the neighbor situation looks good. This only gets you so far, of course: a particularly greedy neighbor could be provisioned to the same physical node at a later time.


Taking the concept further, I’ve been in a few conversations where the suggestion is to play “whack-a-mole” and constantly monitor the relative performance, steal time, etc., and move things around whenever it’s necessary. (That sounds like a great CS paper, but stepping back… that’s just kind of weird and crazy to me if this is the best we can do.)
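(From inside a guest, that whack-a-mole monitoring mostly comes down to watching steal time. A minimal sketch that samples it from /proc/stat; the 5% threshold and 10-second interval are arbitrary choices, not anything the article prescribes.)

# Minimal sketch of the "whack-a-mole" idea: sample CPU steal time from
# /proc/stat and flag when neighbors appear to be eating this guest's CPU.
# Threshold and interval are arbitrary.
import time

def cpu_times():
    # First line of /proc/stat: cpu user nice system idle iowait irq softirq steal ...
    fields = open("/proc/stat").readline().split()[1:]
    values = [int(v) for v in fields]
    steal = values[7] if len(values) > 7 else 0
    return sum(values), steal  # (total jiffies, steal jiffies)

prev_total, prev_steal = cpu_times()
while True:
    time.sleep(10)
    total, steal = cpu_times()
    delta_total, delta_steal = total - prev_total, steal - prev_steal
    prev_total, prev_steal = total, steal
    steal_pct = 100.0 * delta_steal / delta_total if delta_total else 0.0
    if steal_pct > 5.0:
        print(f"noisy neighbor suspected: steal time at {steal_pct:.1f}%")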


The best approach on most clouds (except Joyent who claims to have a better situation) is to therefore use the biggest instances, if you can afford them. These will take up either half or all of the typical ~64-70GB RAM in the servers underlying the VM: no neighbors, no problems. Though other kinds of “neighbors” are still an issue, like if you’re using a centralized, network-based disk.


So how serious is Google in the opening quote above? What different technology is being used on GCE?


A Google employee (who does not work on the GCE team but who I assume is fairly reporting from the Google I/O conference) tweeted the following:



Google compute is based on KVM Linux VMs. Storage: local ephemeral, network block, google storage #io12




KVM.


Years ago, we investigated various techniques we could use in the Nimbus IaaS stack to guarantee that guests only used a given amount of CPU percentage and network bandwidth while also allowing colocated guests to enjoy their own quota. Pure CPU workloads fared well against “hostile” CPU based workloads. But once you introduced networking, the situation was very bad.


The key to these investigations is introducing pathologically hostile neighbors and seeing what you can actually guarantee to other guests, including all of the overhead that goes into accounting and enforcement.


That was on Xen, and it’s not even something the Xen community was ignoring, it’s just a hard problem. And since then I’ve seen that the techniques and Xen guest schedulers have improved.


But I haven’t seen much attention to this in KVM (though I admit I haven’t had the focus on this area that I had in the past).


So we have this situation:

  • AWS uses Xen.
  • AWS and Xen historically have issues with noisy neighbors.
  • Google uses KVM, not historically known for strong resource isolation.
  • Google is claiming consistent performance as a strong selling point.

Do they have their own branch, a new technique? Are they actually running SmartOS zones + KVM? I’m really curious what is happening here. Surely they’ve seen this has been an issue for people on AWS for years and would not make such a bold claim without testing the hell out of it, right?


Another thing they’re claiming is a “consistently fast and dependable” network block device. Given the a priori failure mode problems of these solutions, I’m doubly curious.


UPDATE: This talk from Joe Beda has some new information, slide 14: Linux cgroups – I also heard via @lusis that they worked with RedHat on this.


UPDATE: comment from Joe Beda:


“We are obviously worried about cascading failures and failure modes in general. Our industry, as a whole, has more work to do here. This is an enormously difficult problem and I’m not going to start throwing rocks.


That being said, I can tell you that our architecture for our block store is fundamentally different from what we can guess others are doing and, I think, provides benefits in these situations. We can take advantage of some of the great storage infrastructure at Google (BigTable, Colossus) and build on that. Our datacenters are really built to fit these software systems well.”


http://www.peakscale.com/noisyneighbors/

 


BTW DigitalOcean is Xen; Vultr, Atlantic.net, and Profitbricks are KVM; and Linode replaced Xen with KVM.


 



Linode: goodbye Xen and welcome KVM!


 

June 16, 2015 12:01 pm

Happy 12th birthday to us!

Welp, time keeps on slippin’ into the future, and we find ourselves turning 12 years old today. To celebrate, we’re kicking off the next phase of Linode’s transition from Xen to KVM by making KVM Linodes generally available, starting today.

Better performance, versatility, and faster booting

Using identical hardware, KVM Linodes are much faster compared to Xen. For example, in our UnixBench testing a KVM Linode scored 3x better than a Xen Linode. During a kernel compile, a KVM Linode completed 28% faster compared to a Xen Linode. KVM has much less overhead than Xen, so now you will get the most out of our investment in high-end processors.

KVM Linodes are, by default, paravirtualized, supporting the Virtio disk and network drivers. However, we also now support fully virtualized guests – which means you can run alternative operating systems like FreeBSD, BSD, Plan 9, or even Windows – using emulated hardware (PIIX IDE and e1000). We’re also working on a graphical console (GISH?) which should be out in the next few weeks.

In a recent study of VM creation and SSH accessibility times performed by Cloud 66, Linode did well. The average Linode ‘create, boot, and SSH availability’ time was 57 seconds. KVM Linodes boot much faster – we’re seeing them take just a few seconds.

How do I upgrade a Linode from Xen to KVM?

On a Xen Linode’s dashboard, you will see an “Upgrade to KVM” link on the right sidebar. It’s a one-click migration to upgrade your Linode to KVM from there. Essentially, our KVM upgrade means you get a much faster Linode just by clicking a button.

How do I set my account to default to KVM for new stuff?

In your Account Settings you can set ‘Hypervisor Preference’ to KVM. After that, any new Linodes you create will be KVM.

What will happen to Xen Linodes?

New customers and new Linodes will, by default, still get Xen. Xen will cease being the default in the next few weeks. Eventually we will transition all Xen Linodes over to KVM, however this is likely to take quite a while. Don’t sweat it.

On behalf of the entire Linode team, thank you for the past 12 years and here’s to another 12! Enjoy!

-Chris


 


 


https://blog.linode.com/2015/06/16/linode-turns-12-heres-some-kvm/

 
 

willie

Active Member
EC2 used to have bad noisy-neighbor problems unless you used very big instances, in which case it was merely incredibly expensive. I don't know if it's better now, but they've introduced a CPU-time accounting system where you get a certain amount of CPU credit for each hour you pay for on the instance, up to some maximum. So e.g. if you idle for 2 hours you're then allowed to use 100% CPU for 10 minutes before getting throttled, that sort of thing; the parameters depend on the instance type.
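A rough back-of-the-envelope model of that credit scheme, treating one credit as one vCPU-minute at 100%; the earn rate and cap below are illustrative guesses, since the real parameters vary by instance type.

# Toy model of the burstable-CPU credit scheme described above.
# EARN_PER_HOUR and CREDIT_CAP are illustrative, not any instance type's
# real numbers; the provider publishes the actual values per type.
EARN_PER_HOUR = 6      # credits earned per hour (~10% baseline of one vCPU)
CREDIT_CAP = 144       # maximum credits that can be banked

def unthrottled_minutes(hours_idle, burst_minutes):
    """How many of `burst_minutes` at 100% CPU run before throttling."""
    credits = min(hours_idle * EARN_PER_HOUR, CREDIT_CAP)
    return min(burst_minutes, credits)

# Idle 2 hours -> 12 credits -> roughly 12 minutes of full-speed CPU,
# the same ballpark as the "idle 2 hours, burst ~10 minutes" example.
print(unthrottled_minutes(hours_idle=2, burst_minutes=30))  # -> 12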


I hate doing anything computation intensive on VPS's these days.  I love my cheap-ass Hetzner dedicated server, or my Scaleways since that's what we're talking about here.  I remember starting a 9 hour computation on the Hetzner one night, 100% cpu on all 4 cores, then checking the result in the morning.  It had worked properly but had misformatted the output, printed 2 columns in the wrong order or something like that.  I could have spent 15 minutes whipping up a script to re-order the output file, but instead I spent 30 seconds fixing the relevant print statement in the original program, then restarted the 9 hour computation and left for work, so I had correct output waiting when I got home.  It was incredibly satisfying to be able to do that.


For really big-time cheap single-box CPU, this was near unbelievable (out of stock now but might return): https://www.wholesaleinternet.net/out-of-stock/?id=277  a dual E5-2670 (i.e. 16 cores, 32 threads) with 32GB ram and 240GB SSD for $49/month.  A bit underprovisioned in ram and disk for typical server uses, but amazing if all you wanted was to compute.  They have some bigger setups in stock right now at still very attractive prices, though I have no idea how good their network etc. is.
 

drmike

100% Tier-1 Gogent
  • AWS uses Xen.
  • AWS and Xen historically have issues with noisy neighbors.
  • Google uses KVM, not historically known for strong resource isolation.
  • Google is claiming consistent performance as a strong selling point.

EC2 used to have bad noisy-neighbor problems unless you used very big instances, in which case it was merely incredibly expensive.

So you basically buy the other tenants off a box to get isolation. Dedi wins, unless their software / panel is truly that awesome or the API advantage in your world is that developed.


The Xen vs. KVM question in these big farms is interesting. Xen hasn't been getting much love for years, and performance on it hasn't been keeping up (as indicated above, too).

I love my cheap-ass Hetzner dedicated server, or my Scaleways since that's what we're talking about here

Count me in. This is how to roll. Scaleway is making it affordable. Really compelling offers, more so than most cheapie hosts even.

For really big-time cheap single-box CPU, this was near unbelievable (out of stock now but might return): https://www.wholesaleinternet.net/out-of-stock/?id=277  a dual E5-2670 (i.e. 16 cores, 32 threads) with 32GB ram and 240GB SSD for $49/month

Those boxes are mutations that are going to go sideways on you. I believe that's all the same hardware as their infamous 96GB boxes. Those are computation boards, not meant to be sold for this kind of use to random consumers. Yeah, decent deal; you can buy these outright for around $150.


WSI is alright, even though I took to jabbing Aaron for his intimate dealings and for having his hands and feet in other pockets while publicly insisting they are all different companies. The guy sure wrongly endorses / shills, so that's how I feel about that. Their network there is alright though, and with the new DC on the other side of town it should be better all around. I remember when staff had to get in a vehicle and drive across town to do support. It's a hobby location; on a limited budget I wouldn't put my stuff there without a live redundant second location elsewhere.
 

willie

Active Member
Scaleway, Hetzner, and that WSI E5 cost roughly the same per passmark; the difference is that Scaleway bills hourly, making it tempting to spin up ten Scaleways for a short period (a few days or whatever) instead of running your task a lot longer on 3-4 monthly-billed E3s or 1-2 E5s. The trouble is Scaleway has constant hardware shortages, so it's not at all clear that you can spin up five of them whenever you want.


I don't understand the issue with those E5 servers or the 96GB ones. I saw an LET thread saying they have no KVM, but Hetzner doesn't either and I haven't needed it (the Hetzner rescue system was enough). The E5s let you boot an ISO image so you can always reinstall; other than that you need good, frequent backups. Yeah, I'm not sure what WSI's situation is in other regards. I was surprised to see some overlap with Joe's Datacenter, where I've sometimes thought of parking a box.


What do you mean about being able to buy those E5-2670s for $150 outright? I've never seen anything like that. L5520s or whatever, maybe, but those are much, much slower. WSI now has E5 configs with two SSDs instead of one, making them more useful (RAIDable), FWIW.
 

drmike

100% Tier-1 Gogent
Scaleway, Hetzner, and that WSI E5 cost roughly the same per passmark; the difference is that Scaleway bills hourly, making it tempting to spin up ten Scaleways for a short period (a few days or whatever) instead of running your task a lot longer on 3-4 monthly-billed E3s or 1-2 E5s. The trouble is Scaleway has constant hardware shortages, so it's not at all clear that you can spin up five of them whenever you want.


I don't understand the issue with those E5 servers or the 96GB ones. I saw an LET thread saying they have no KVM, but Hetzner doesn't either and I haven't needed it (the Hetzner rescue system was enough). The E5s let you boot an ISO image so you can always reinstall; other than that you need good, frequent backups. Yeah, I'm not sure what WSI's situation is in other regards. I was surprised to see some overlap with Joe's Datacenter, where I've sometimes thought of parking a box.


What do you mean about being able to buy those E5-2670s for $150 outright? I've never seen anything like that. L5520s or whatever, maybe, but those are much, much slower. WSI now has E5 configs with two SSDs instead of one, making them more useful (RAIDable), FWIW.

Scaleway remains killer, and thus forget about seeing inventory steadily available any time soon. Not going to happen.


I believe WSI is using these Quanta Windmill systems; ignore the price on this:


http://www.ebay.com/itm/QUANTA-WINDMILL-SYTEM-2-NODES-4x-XEON-8-CORE-E5-2660-2-2GHz-16GB-RAM-2x-250GB-/131639736542


A rack full of them:
http://www.ebay.com/itm/52X-QUANTA-WINDMILL-OPEN-COMPUTE-NODES-4x-E5-2660-2-2GHZ-16GB-2x-250GB-WITH-RACK-/201426399499


Not the prices I said, but I know the base boards are available super cheap... even loaded nodes are like $500 per populated unit at full sale price there.


These were made by Quanta for Facebook and they spec out like a computation-farm build.


The Delimiter guys are familiar with the boards.  I think they acquired some at some point and decided against using those.


No traditional ports. Just a NIC, which doubles as the IPMI interface and is allegedly insecure.


@mikeyur
 

fm7

Active Member
So you basically buy the other tenants off a box to get isolation. Dedi wins, unless their software / panel is truly that awesome or the API advantage in your world is that developed.

I think dedi always wins. :)


Google's Predictable Performance angle was used to attract number-crunching users, especially consultants and engineering firms hired by big corporations to solve complex problems. Instead of those firms spending CAPEX to build their own (scientific/engineering) clusters, Google's siren song promised no upfront costs, no cancellation fees, and paying only for what you use. Considering numerical methods usually take a lot of CPU, memory, and I/O, the "predictable performance" pitch is sort of a marketing ploy, because you want/need one VM per server. :)

Scaleway, Hetzner, and that WSI E5 cost roughly the same per passmark,

If you are using a cluster of servers to run solvers, you will be interested in LINPACK or benchmarks like that. Or, roughly, BYTE's Double-Precision Whetstone.
 

willie

Active Member
Holy cow Drmike, thanks for those ebay links, it's tempting to buy some of those and colo them.  Where's the security issue with the pseudo-IPMI if you're on a routed ethernet port? 


Passmark has been a very good estimate of actual performance for the stuff I've been doing, basically distilling database dumps in a way that parallelizes well.  So far I do it semi-manually with a few python scripts but if I had the inclination and access to a ton of machines, I could put some more serious orchestration together.  I had been thinking of doing that with 100 or so Scaleway C1's last year, before meeting the reality that it's not possible to get that many on demand.


One annoying misfeature of the WSI offers is that all network traffic counts against your 33TB monthly allocation, even to another server in the same data center.  That makes it hard to use separate servers for computation and storage, because of the high traffic between them.
 

fm7

Active Member
Double-Precision Whetstone (MWIPS):

E3-1240 v3: 3113 single copy; 35840 with 8 parallel copies (8 CPUs in system)

Dedibox Kidéchire (VIA Nano U2250) (2€): 1654 single copy (1 CPU in system)

Scaleway C1 (ARMv7 32-bit) (3€): 553 single copy; 2221 with 4 parallel copies (4 CPUs in system)

Scaleway C2 S (C2550) (9€): 1997 single copy; 7985 with 4 parallel copies (4 CPUs in system)

Scaleway VPS (C2750) (3€): 1989 single copy; 3978 with 2 parallel copies (2 CPUs in system)
 

willie

Active Member
Thanks FM7.  The E3 numbers are interesting because the E3-1240 is a 4-core machine with 8 threads, so I'd expect the parallel benchmark result to be at best about 5x the single threaded result.  But instead it is over 11x.  My current stuff is all integer but the floating benchmarks are nice to have.  I think though that people doing heavy duty numerics these days tend to use GPUs.
 

fm7

Active Member
Thanks FM7.  The E3 numbers are interesting because the E3-1240 is a 4-core machine with 8 threads, so I'd expect the parallel benchmark result to be at best about 5x the single threaded result.  But instead it is over 11x.



Not that interesting :)


GOVERNOR is set to ondemand


# of Cores: 4
# of Threads: 8
Processor Base Frequency: 3.4 GHz
Max Turbo Frequency: 3.8 GHz




:~# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Xeon(R) CPU E3-1240 v3 @ 3.40GHz
stepping        : 3
microcode       : 0x17
cpu MHz         : 2800.000
cache size      : 8192 KB


processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Xeon(R) CPU E3-1240 v3 @ 3.40GHz
stepping        : 3
microcode       : 0x17
cpu MHz         : 800.000
 


processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Xeon(R) CPU E3-1240 v3 @ 3.40GHz
stepping        : 3
microcode       : 0x17
cpu MHz         : 800.000


=================


# of Cores: 4
# of Threads: 8
Processor Base Frequency: 3.3 GHz
Max Turbo Frequency: 3.7 GHz




:~# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 58
model name      : Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz
stepping        : 9
microcode       : 0x15
cpu MHz         : 1600.000


E3-1230 V2


8 CPUs in system; running 1 parallel copy of tests
Double-Precision Whetstone                     3499 MWIPS


8 CPUs in system; running 8 parallel copies of tests
Double-Precision Whetstone                    33143 MWIPS


9.5x :)


I posted that E3 result as a reference.
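For anyone who wants to rule the governor out before benchmarking, a quick sketch that reads the standard sysfs cpufreq paths (they may be absent on some virtualized hosts):

# Print the cpufreq governor and current clock for each core before a
# single-threaded benchmark run; uses the standard sysfs cpufreq paths,
# which may not exist on some virtualized hosts.
import glob

for cpu in sorted(glob.glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq")):
    governor = open(f"{cpu}/scaling_governor").read().strip()
    mhz = int(open(f"{cpu}/scaling_cur_freq").read()) / 1000
    print(f"{cpu.split('/')[-2]}: governor={governor}, {mhz:.0f} MHz")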
 

DomainBop

Dormant VPSB Pathogen
Scaleway added two more VPS offers to their lineup today:


€5.99/Month
4 x86 64bit Cores
4GB Memory
100GB SSD Disk
1 Flexible public IPv4
200Mbit/s Unmetered bandwidth


€9.99/Month
6 x86 64bit Cores
8GB Memory
200GB SSD Disk
1 Flexible public IPv4
200Mbit/s Unmetered bandwidth


Compare Scaleway's new VPS offerings to the competition they're trying to kill (OVH Public Cloud Instances); a rough per-core/per-GB breakdown follows after the lists:


1 vCore
2.4 GHz
4 GB RAM
20 GB SSD
100 Mbps best effort
€5.99 /month


2 vCores
2.4 GHz
8 GB RAM
40 GB SSD
100 Mbps best effort
€11.99 /month
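For a rough sense of the gap, price per vCore and per GB of RAM for the four offers above (list prices only, ignoring the SSD and bandwidth differences):

# Rough EUR/vCore and EUR/GB comparison of the offers listed above
# (ignores SSD size and bandwidth differences).
offers = [
    ("Scaleway 4C/4GB", 5.99, 4, 4),
    ("Scaleway 6C/8GB", 9.99, 6, 8),
    ("OVH 1C/4GB", 5.99, 1, 4),
    ("OVH 2C/8GB", 11.99, 2, 8),
]
for name, eur, cores, ram_gb in offers:
    print(f"{name:17s}  {eur/cores:5.2f} EUR/core  {eur/ram_gb:5.2f} EUR/GB")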
 