Mammoth's Binary Lane e-mail:
2016-May-6
Migrating to local storage
From today, most new Binary Lane cloud servers will be deployed onto direct-attached RAID-10 SSD ("local storage") instead of the Ceph network-distributed cluster ("cloud storage") that Binary Lane has used to date.
Existing cloud servers will be live-migrated to local storage on an ongoing basis from June 1, as internal resources permit.
Local RAID-10 SSD storage will provide customers with increased server performance, while improvements to KVM will allow us to continue to use live migration for scheduled maintenance.
We have limited capacity for early adopters who would like to migrate a VPS to local storage prior to June. To do so, please contact [email protected].
Requests will be processed on a first-come, first-served basis and, depending on demand, may not be completed prior to June; however, we will continue to prioritise these requests ahead of the service-wide migration.
What changes can I expect?
By far the biggest impact of the migration is on disk performance. In our testing, we have seen improvements of up to 500%.
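For customers who want to compare their own before-and-after numbers, a short random-I/O benchmark along these lines can be run inside the VPS with fio (the job parameters here are illustrative, not the exact ones used in our testing):

```shell
# Run a 30-second 4K random-read test against a 1 GB scratch file.
# Parameters are examples only; adjust size/runtime to suit your disk.
fio --name=randread \
    --filename=/tmp/fio-testfile \
    --rw=randread \
    --bs=4k \
    --size=1G \
    --ioengine=libaio \
    --iodepth=32 \
    --direct=1 \
    --runtime=30 \
    --time_based \
    --group_reporting
```

Compare the reported IOPS and bandwidth figures before and after migration, and remove /tmp/fio-testfile afterwards.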
The primary drawback concerns using Change Plan to increase disk size. With the current cloud SSD solution, all SSDs are combined into a single, massive pool.
By comparison, with local SSD each cloud server's disk size will be limited by the amount of storage available on the individual host node where the VPS is located.
In the future we plan to work around this limitation by enabling Change Plan to automatically perform a live migration when the desired upgrade is not available on the current host. For now, customers will need to contact support to request a live migration if Change Plan reports that sufficient resources are not available.
Why are we dropping cloud storage now?
Early in our solution design process, before Binary Lane launched in February 2014, we reached two key decisions:
- The service would be 100% SSD for fantastic disk performance
- The service would utilise Ceph network-distributed storage, for increased reliability and better availability
In our testing of Ceph before launch, it was apparent that its disk performance was significantly below that of a local RAID solution, achieving somewhere around 20% to 30% of the local figures.
From our investigation at the time, it was apparent that using SSDs with Ceph was a relatively unexplored use case (typical deployments being massive 1 PB+ clusters of slow disks) and that the software was not yet optimised enough to bridge the gap.
However, the then-upcoming "Firefly" release was adding support for SSD caching, and we felt confident that we could essentially "ride the wave" of new releases to reach a solution as fast as local storage while providing more functionality.
Instead, we saw several things happen that, in combination, have (at least for us) changed the preferred solution:
- While Ceph has improved performance to some degree, it has not been the focus of the developers and is still far behind local storage.
- Inktank (the company that developed Ceph) was purchased by Red Hat, and the focus now appears to be on enterprise functionality instead.
- KVM (our virtualisation platform) has implemented a variety of new features allowing for live "mirror" migrations, which allow a VPS and its local storage to be moved from one host to another by transparently copying the disk (along with any writes made during the copy) to the new host.
This has left us with a scenario where the majority of the functionality that we wanted from Ceph is now available in KVM with local storage, without paying the performance penalty that is still associated with Ceph.
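As a rough sketch of how such a storage-copying live migration is driven on a libvirt/KVM host, virsh can request that the guest's local disks be mirrored to the destination as part of the move (the guest and host names below are hypothetical examples, not our actual infrastructure):

```shell
# Live-migrate guest "vps101" to host "node2", copying its local
# disk images in full while the guest keeps running.
# Guest name and destination URI are examples only.
virsh migrate --live --copy-storage-all --verbose \
    vps101 qemu+ssh://node2.example.com/system
```

Once the disk copy converges, the remaining dirty blocks and guest memory are transferred and the VPS resumes on the destination with only a brief pause.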