
What is the software solution for high-availability VPS reselling?

ICPH

Member
Hello,

If I get 2 dedicated servers and want to run virtualization on them (create VPSs), which software should I use to achieve a state where:

- if one server dies, VPS uptime won't be affected

- I can at any time add one or several new servers into the cloud to increase its resources

The aim is to offer high availability of the VPSs and ease of running such a server cloud. The software should be free or cheap, no enterprise kind of pricing.

I know about OpenStack (open source), but it probably does not provide high availability (HA), only flexibility of the resources in the cloud.
 

TheLinuxBug

New Member
Saw this and figured I would give a quick reply here. There is a lot more to be discussed, but I wanted to address just a few points real quick:

1. As far as I know there is no 'Cloud' platform that can have a failure where your server will not be at least restarted; this includes OnApp, Virtualizor, etc. The 'HA' functions of these platforms are usually 'HA' in the sense that they use two volume streams for each disk volume. In other words, if you have a 10GB SSD volume you are really using 2x 10GB SSD volumes in a RAID 1, where the two volumes are taken from 2 different hypervisors. This allows, if one hypervisor crashes, quickly restarting the VM on another hypervisor even if the main data volume was on the crashed hypervisor. Then, once the secondary stream comes back online, the volume rebuilds, just like in a normal RAID. As far as I know there is no platform where there can be a physical hardware failure and the server just continues on another piece of hardware (a rough sketch of the restart-on-another-hypervisor idea follows point 2). It is possible to migrate the VM between hypervisors before, say, a reboot of one hypervisor or the other, but this isn't the same as surviving a crash of the physical hardware the VM is currently on.

2. With OnApp, for example, you can add new hypervisors to the 'Cloud' anytime you like, but again, this doesn't work the way you are thinking. You can't provision more resources than exist on one single physical node. So if your thought was to take two 8-core dedicated servers and run one 16-core server across them, it isn't going to happen.
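To make point 1 concrete: below is a rough, hypothetical sketch of the restart-on-another-hypervisor idea using the libvirt Python bindings. It assumes the VM's disk lives on storage both hypervisors can reach (e.g. the mirrored volume streams described above) and that a copy of the domain XML is kept on the backup host; the URIs, paths, and VM name are placeholders, not any platform's actual implementation.

Code:
# Hypothetical failover sketch: restart a VM on a backup hypervisor when the
# primary stops responding. Assumes shared/replicated storage and a saved
# copy of the domain XML; URIs, paths, and names are placeholders.
import libvirt

PRIMARY = "qemu+ssh://hv1.example.com/system"
BACKUP = "qemu+ssh://hv2.example.com/system"
VM_NAME = "vps100"

def hypervisor_alive(uri):
    """Treat the hypervisor as dead if we cannot open a libvirt connection."""
    try:
        libvirt.open(uri).close()
        return True
    except libvirt.libvirtError:
        return False

if not hypervisor_alive(PRIMARY):
    # A real deployment must fence (power off) the primary first; otherwise a
    # recovered host could run the same VM twice and corrupt the shared disk.
    conn = libvirt.open(BACKUP)
    with open("/etc/libvirt/backup-xml/%s.xml" % VM_NAME) as f:
        dom = conn.defineXML(f.read())  # register the domain on the backup host
    dom.create()  # boot it; a restart from scratch, not a live continuation
    conn.close()

Note the last line: this is exactly the restart-style 'HA' described above, not true fault tolerance; the VM reboots and its in-memory state is lost.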

Most people who start asking questions like yours are really looking to build a fully redundant 'Cluster', not to use 'HA' virtual servers. You can place this platform on virtual servers, but what you really need is a setup that has servers for all the different functions and a failover for each of those functions. Something like:

  • 2 x gateways running heartbeat, with one in standby at all times for failover. If one fails, the IP is taken over by the other and service continues (a minimal sketch of this takeover follows the list)
  • 2 (or more) web servers or backends, so that if one fails the gateway in front can load-balance around it
  • a mechanism to handle balancing DB requests
  • 2 or more database servers in replication
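As a minimal illustration of the gateway takeover in the first bullet, here is a hypothetical Python sketch of the mechanism that heartbeat automates: the standby pings the active gateway and, if it stops answering, claims the floating IP and announces it via gratuitous ARP. The interface name and addresses are placeholders, and a real deployment should use heartbeat/keepalived, which also handle split-brain and fencing.

Code:
# Hypothetical standby-gateway loop; heartbeat/keepalived do this (and much
# more) in production. Interface and IP addresses are made-up placeholders.
import subprocess
import time

FLOATING_IP = "203.0.113.10"  # the service IP that clients use
PEER_IP = "198.51.100.2"      # the active gateway's own address
IFACE = "eth0"

def peer_alive():
    """Single ping with a 1-second timeout; True if the peer answered."""
    return subprocess.call(
        ["ping", "-c", "1", "-W", "1", PEER_IP],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) == 0

while True:
    if not peer_alive() and not peer_alive():  # double-check before acting
        # Claim the floating IP on this box...
        subprocess.call(["ip", "addr", "add", FLOATING_IP + "/24", "dev", IFACE])
        # ...and send gratuitous ARP so neighbors learn the new MAC quickly.
        subprocess.call(["arping", "-U", "-c", "3", "-I", IFACE, FLOATING_IP])
        break
    time.sleep(1)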
The company I work with builds these types of clusters for enterprise companies, so I do have some experience with this.  If this is really what you are looking for I can try and help get you started, but if you want specific details or trade secrets you will have to spend some money ;)

There is probably more to be answered here, but this may answer some of the initial questions you had at least.

As I said, this is to my knowledge; maybe there is some incredibly expensive platform out there that does this, but it definitely isn't going to fall into your 'cheap or free' requirement. Even CA AppLogic (which is EOL and has been discontinued) only automated reboots onto a second hypervisor in case of a failure (OnApp doesn't do this to my knowledge, at least by default), and it was touted as one of the better pre-'CloudStack' Xen HA platforms.

my 2 cents.

Cheers!
 

perennate

New Member
Verified Provider
As far as I know there is no platform where there can be a physical hardware failure and the server just continues on another piece of hardware. [...]

http://wiki.xen.org/wiki/Remus

(https://www.usenix.org/event/nsdi08/tech/full_papers/cully/cully.pdf)

Edit: apparently drserver can support it, probably expensive though: http://lowendbox.com/blog/drserver-4-plans-starting-at-0-50month-for-a-64mb-ipv6-only-xen-vps/#comment-282141
 

TheLinuxBug

New Member
Thanks for that info @perennate. I do imagine it will be a bit expensive at DrServer, though. I didn't know about Remus; that is an interesting read. It does look open source, so it may be possible for the OP to set this up himself, though it seems like it might have a bit of a learning curve.

Do you guys at Lunanode have any plans to offer something similar on your OpenStack setup? It seems like something that would have a more specialized use case, though. I'm not sure you could make much money outside the enterprise sector with it, because of the cost of the resources needed to run the full redundancy.

Cheers!
 

perennate

New Member
Verified Provider
Do you guys at Lunanode have any plans to offer something similar on your OpenStack setup? It seems like something that would have a more specialized use case, though. I'm not sure you could make much money outside the enterprise sector with it, because of the cost of the resources needed to run the full redundancy.

Certainly not (there is a micro-checkpointing patch for KVM, but it seems unstable). IMO these days it's more worthwhile to design your application to be fault tolerant (like you were saying above) than to focus on per-VM availability, especially when more than half of downtime events are going to be caused by network issues; I guess you could checkpoint across datacenters, but then you'd have to pay for expensive dedicated links. But checkpointing at the hypervisor level is still an interesting idea.
 

pcan

New Member
- if one server dies, VPS uptime won't be affected

VMware vSphere with the FT (Fault Tolerance) feature enabled does this, but it is neither free nor cheap, nor easy to deploy.

The cluster approach is simpler and more cost-effective. As an added bonus, the secondary server of the cluster takes a portion of the load from the primary one; on an FT system, the failover server works in lockstep with the primary and does not perform any useful work.
 

TheLinuxBug

New Member
@perennate ah yes, I forgot you guys were using KVM and not Xen on your hypervisors. Yeah, everything I have seen for KVM doesn't look very stable. I agree, building out a cluster is a much better use of time and money than fighting to set up a fully FT setup, as @pcan was suggesting.

Cheers!
 

perennate

New Member
Verified Provider
@ICPH: e.g. for a web application, a typical approach is:

  • Use a distributed database for any persistent data. This may be master-slave MySQL, MySQL Galera/NDB cluster, MongoDB, Cassandra, etc.
  • Write your application code so that requests can be processed in parallel across multiple application instances. Most modern web code is already like this, although it may be lacking transactions. (e.g. PHP is by nature like this, since your code doesn't persist across requests.)
  • Optional: have a distributed cache (like Redis) that is separate from your distributed database to store temporary session data, and perhaps to also cache database values. This is only needed when your application becomes very resource intensive.
  • Use at least two load balancers to distribute load across your application servers, and use DNS round robin with failover to pull a load balancer out of rotation if it becomes unreachable. Note: this is only necessary if you have high load; otherwise you can just put your application servers in DNS and not have load balancers.
Note: if your application code cannot be easily distributed (e.g. doesn't use transactions), you can set up haproxy to do primary-backup load balancing, where both load balancers pass requests to the primary application server unless it goes offline. If you're worried about the load balancers having inconsistent state (one forwarding to the primary, one to the backup), you can have a third 'view' server that conducts the monitoring and switches the load balancers atomically (simplified sketch below).
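As a simplified (and deliberately non-atomic) sketch of that view-server idea: assuming each haproxy exposes its runtime API over a TCP stats socket with admin level, the monitor below health-checks the primary and flips both load balancers using the runtime 'disable server'/'enable server' commands. The backend name "app", the server names, hosts, and ports are all made up for illustration.

Code:
# Hypothetical "view server": monitor the primary app server and switch both
# haproxy load balancers together so they never disagree on the target.
# Assumes each haproxy is configured with something like
#   stats socket ipv4@0.0.0.0:9999 level admin
# and has a backend "app" with servers "primary" and "backup" (made-up names).
import socket

LOAD_BALANCERS = [("lb1.example.com", 9999), ("lb2.example.com", 9999)]

def haproxy_cmd(host, port, command):
    """Send one command to the haproxy runtime API and return the reply."""
    s = socket.create_connection((host, port), timeout=5)
    s.sendall((command + "\n").encode())
    reply = s.recv(4096).decode()
    s.close()
    return reply

def primary_alive():
    """Crude health check: can we open a TCP connection to the primary?"""
    try:
        socket.create_connection(("app-primary.example.com", 80), timeout=2).close()
        return True
    except OSError:
        return False

if not primary_alive():
    # Flip both load balancers in one pass so neither keeps sending traffic
    # to the dead primary while the other has already moved to the backup.
    for host, port in LOAD_BALANCERS:
        haproxy_cmd(host, port, "disable server app/primary")
        haproxy_cmd(host, port, "enable server app/backup")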
 

unixhost

New Member
Verified Provider
I can advise trying Proxmox (https://www.proxmox.com/en/). Very easy to use virtualization.
 

VPSSoldiers

New Member
I use Proxmox in my home environment and love it (though I personally feel it's aimed towards "private VPS" type setups). I haven't ever had the equipment to set up HA with it, but I've been looking into it just to play with it.
 

perennate

New Member
Verified Provider
Yeah, simple and stable, I have 2 nodes on Proxmox, all works fine, so if you have time just install and try it

Sorry, I wasn't clear, I meant "really?" as in "really? why would you post that unrelated garbage in this thread?".

Proxmox by itself is just VM management software; it has nothing to do with VM failover technology.
 

mitgib

New Member
Verified Provider

perennate

New Member
Verified Provider

Don't think so. It was suggested simply as "very easy to use virtualization", and anyway there are more robust platforms for VM failover via simple shared storage / distributed block storage, like OpenStack (which the OP already acknowledged), OpenNebula, etc.

I don't think there's any VM management software that makes it easy to set up instantaneous failover (with connection persistence across failovers) via micro-checkpointing or similar technology.
 

mitgib

New Member
Verified Provider

Don't think so. It was suggested simply as "very easy to use virtualization", and anyway there are more robust platforms for VM failover via simple shared storage / distributed block storage, like OpenStack (which the OP already acknowledged), OpenNebula, etc.

I don't think there's any VM management software that makes it easy to set up instantaneous failover (with connection persistence across failovers) via micro-checkpointing or similar technology.
I'm not trying to start a debate about it, but OpenStack and Proxmox both use Ceph for distributed storage. I do admit Proxmox is not as robust as OpenStack, but at the same time it does not have the learning curve either.
 

willie

Active Member
BuyVM has a nice setup involving IP anycast to three locations (western US, eastern US, and Luxembourg in Europe). You buy three VPSes from them (one at each location) and, at your request, all three get the same IP address, so in the event of a node failure, client packets automatically get routed to another node. Of course, it is up to you to make use of that at the application level.

I don't think there are any general-purpose HA solutions with connection persistence for Linux, and anyway there would have to be a load balancer or router that would be a single point of failure. You could do something like that at the user level, and some languages like Erlang have features for it, but the application has to be written specifically to implement such a capability. I've done some HA embedded programming (phone switch software written in C), and the failover logic was complicated and pervasive in the program. But it meant you could do things like rip the CPU card right out of the box while it was running, and the users wouldn't notice anything, since the backup CPU would keep chugging along.

If you're satisfied with a failover restart when a node crashes, OpenStack/KVM/Ceph is set up for that, and I see OVH offers it with their "Cloud" VPS line. I presume that failover means the VM being restarted on the new node, but I haven't looked into it or tried it.
 

perennate

New Member
Verified Provider
I don't think there are any general-purpose HA solutions with connection persistence for Linux, and anyway there would have to be a load balancer or router that would be a single point of failure.

You could run Remus between two host nodes in different datacenters with a dedicated 1Gbps link; you'd just have to wait for BGP failover.
 