
Clustering Systems

HalfEatenPie

The Irrational One
Retired Staff
Alright.  Let's break out the thinking caps and jump right in!

Let's say, hypothetically, I have 12 dual-E5 nodes with 128 GB of RAM and a 500 GB HDD each as the slave nodes.  The master would be an E3 node or something similar with 32 GB RAM (or should I use a CPU that supports more RAM?) and, let's say, 12 4 TB HDDs.  (Oh do I wish I had the money to buy this hardware...) 

What's the best way to set up clustering for high performance computing needs?  I know there's OpenMPI, MATLAB's in-house clustering system, and StackIQ.  I mean, if I wanted to use this for a cloud system I'd simply run OpenStack or CloudStack or OpenNebula or whatnot, but for HPC, does anyone have any recommendations? 

Thanks!  This is currently a concept idea that I'm trying to refine a bit more for... future use ;) 
 

GIANT_CRAB

New Member
You should post this on LET instead since Jon Biloh has the best in-house clustering system. 

On a more serious note, OpenStack is split into well-separated services (Keystone, etc.), which is quite good. Whether you want to set up a "cloud" or a cluster, OpenStack can do just fine. 

I recommend installing OpenStack on Fedora 20. (*fedora tipping intensifies*)

Instructions available here - https://openstack.redhat.com
 

HalfEatenPie

The Irrational One
Retired Staff
Huh interesting, thanks @DomainBop.  
 
A Hadoop/HPCC system would actually be a very useful tool since we'd need tools to sift through a ton of data.  My biggest concern I guess would be handling HPC job tasks (like how Amazon handles it).  
 
Also...
 

I didn't know OpenStack had something like that.  I'll look more into it @GIANT_CRAB.
 

splitice

Just a little bit crazy...
Verified Provider
It all depends on what you are trying to achieve.

Data processing? Theorem proving? AI / NN? Data clustering (Lingo3 / k-means, etc.)? Pattern mining? The list goes on...

Or even just a high performance web cluster (well it fits the definition)...
 

hibernate

New Member
Well, if you want custom distributed processing you can go with MPI. There are a few MPI implementations, for example MPICH2; there is a beginner tutorial here: http://mpitutorial.com/beginner-mpi-tutorial/. There are also a few other implementations, including ones available from Red Hat.

However, to make it efficient you also need a very fast, low-latency network. In most cases I know of, people are using InfiniBand. 
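
To give a rough idea of how an MPI-style job splits work, here is a minimal sketch in plain Python. It does not use any MPI library: the ranks are simulated in an ordinary loop on one machine, and the function names `partial_pi` and `simulated_reduce` are made up for illustration. The size of 12 just mirrors the 12 slave nodes from the first post. In a real run each rank would be a separate process, and the MPI library's sum-reduction would combine the partial results over the network.

```python
import math


def partial_pi(rank: int, size: int, n: int) -> float:
    """Work one rank would do: integrate 4/(1+x^2) over its share of n strips."""
    h = 1.0 / n
    total = 0.0
    # Cyclic distribution: rank r takes strips r, r+size, r+2*size, ...
    for i in range(rank, n, size):
        x = h * (i + 0.5)  # midpoint of strip i
        total += 4.0 / (1.0 + x * x)
    return total * h


def simulated_reduce(size: int, n: int) -> float:
    """Stand-in for a sum-reduction: add up every rank's partial result."""
    return sum(partial_pi(rank, size, n) for rank in range(size))


if __name__ == "__main__":
    # The estimate should be about the same regardless of the number of ranks.
    for size in (1, 4, 12):
        estimate = simulated_reduce(size, 100_000)
        print(size, estimate, abs(estimate - math.pi))
```

Each rank only touches its own strips and sends back a single number, so this kind of job needs very little communication. Workloads that exchange data between ranks at every step are where the low-latency network starts to matter.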
 

HalfEatenPie

The Irrational One
Retired Staff
Totally unrelated to this, InfiniBand!  That was it!  I was trying to remember that for a few weeks because that's what our CFD research lab uses on their cluster.  I personally would like to copy their hardware (well...  at least network setup) but would like to implement my own software. 

Thanks a ton, this is a good part of what I was looking for!
 

DomainBop

Dormant VPSB Pathogen
InfiniBand is available as an option on OVH's HPC line, and there is a free 1-month beta test application for the hourly HPCspot service.
 