Ceph

Tue, Jan 29, 2013

The Storage Stack for OpenStack

Florian Haas - florian@hastexo.com

  • Friday tutorial for Ceph
  • Native Object Storage
  • Ignores permissions, hierarchy (e.g. directories), owners
  • Limited to GET, PUT, DELETE, etc.
  • Not a file or a block, it’s an object
  • Can scale to exabytes
  • Stored redundantly
  • Ceph can be used directly or with a layer above it, e.g. with OpenStack
  • OpenStack can use Ceph to present a block device
    • Thin provisioned
    • Snapshotting, caching
  • RESTful Object Storage
    • access via RESTful HTTP calls
  • Distributed Filesystem
    • POSIX stuff
  • Lots of client APIs / command line tools
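
A minimal sketch of those GET/PUT/DELETE-style operations through the python-rados bindings (the pool name 'data' and the object name are placeholders; assumes a running cluster and a readable /etc/ceph/ceph.conf with a keyring):

```python
import rados

# Connect using the local ceph.conf (assumed to exist and point at the cluster).
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx('data')        # 'data' is an assumed pool name
    ioctx.write_full('hello-object', b'hi')   # PUT: store the whole object
    print(ioctx.read('hello-object'))         # GET: read it back
    ioctx.remove_object('hello-object')       # DELETE: drop it again
    ioctx.close()
finally:
    cluster.shutdown()
```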

RADOS

  • Flat namespace
  • Objects are assigned placement groups
  • placement group has an ordered list of object storage devices
    • contents stored redundantly
  • Placement is algorithmically generated
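
A toy illustration of that idea, assuming a made-up pg_num and a generic hash (real Ceph uses its own hash function and per-pool settings):

```python
# Toy sketch: the object name deterministically maps to a placement group,
# so placement can be computed anywhere instead of looked up in a table.
import zlib

PG_NUM = 64  # assumed pool setting, for illustration only

def object_to_pg(object_name: str) -> int:
    return zlib.crc32(object_name.encode()) % PG_NUM

print(object_to_pg('vm-disk-0001'))  # the same name always yields the same PG
```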

CRUSH

  • CRUSH map - rules on how data is saved
    • Ceph's default rule set makes sure no two copies of data end up on the same device
    • More complex rule sets can provide datacentre-level redundancy
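
A hedged sketch of that failure-domain constraint, continuing the toy model above (the OSD-to-host layout is invented, and real CRUSH performs a deterministic pseudo-random descent of the map rather than this simple loop):

```python
# Toy sketch: choose replica OSDs for a placement group such that no two
# copies share the same host (the default failure-domain idea).
OSD_HOST = {0: 'node1', 1: 'node1',   # assumed cluster layout:
            2: 'node2', 3: 'node2',   # osd id -> host
            4: 'node3', 5: 'node3'}

def place_replicas(pg: int, replicas: int = 2) -> list:
    chosen, used_hosts = [], set()
    # Walk OSDs in a PG-dependent order, skipping hosts that already hold a copy.
    for osd in sorted(OSD_HOST, key=lambda o: hash((pg, o))):
        if OSD_HOST[osd] not in used_hosts:
            chosen.append(osd)
            used_hosts.add(OSD_HOST[osd])
        if len(chosen) == replicas:
            break
    return chosen

print(place_replicas(17))  # two OSD ids, guaranteed to be on different hosts
```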

Block storage

  • RADOS block device (RBD): thin-provisioned, striped across multiple RADOS objects
  • Easy snapshots and writable clones (see the sketch after this list)
  • Used in OpenStack for template-based virtual machines / cloning VMs
  • rbd - kernel-level block device driver (/dev/rbd), in Linux since 2.6.37
    • most people use libvirt / KVM, so this is not the preferred choice
  • qemu-rbd driver based on the librados C API, used with qemu and KVM
  • Integrated with Glance for image storage
  • RBD also integrated with Cinder
    • Allows for boot volumes and block storage
  • Integrated with nova boot-from-volume
  • cinder / nova handover
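
A hedged sketch of the image / snapshot / clone workflow with the python rbd bindings (pool and image names are invented; cloning assumes a format-2 image with layering enabled):

```python
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')            # assumed pool name

r = rbd.RBD()
# Thin-provisioned 10 GiB image, format 2 with layering so it can be cloned.
r.create(ioctx, 'vm-base', 10 * 1024 ** 3,
         old_format=False, features=rbd.RBD_FEATURE_LAYERING)

with rbd.Image(ioctx, 'vm-base') as img:
    img.create_snap('golden')     # snapshot the template image
    img.protect_snap('golden')    # a snapshot must be protected before cloning

# Writable clone: the basis for template-based VM provisioning.
r.clone(ioctx, 'vm-base', 'golden', ioctx, 'vm-instance-1',
        features=rbd.RBD_FEATURE_LAYERING)

ioctx.close()
cluster.shutdown()
```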

RESTful

  • radosgw (rados gateway)
  • Supports S3 / OpenStack Swift APIs
  • native load balancing and scale-out
    • round-robin DNS, etc.
  • Supports Keystone
    • no separate auth mechanism needed any more
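
A minimal sketch of talking to radosgw through its S3-compatible API with boto (the endpoint, keys, and bucket name are placeholders; a radosgw user with these keys is assumed to exist, e.g. created with radosgw-admin):

```python
import boto
import boto.s3.connection

# Placeholder credentials for an assumed radosgw user.
conn = boto.connect_s3(
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
    host='rgw.example.com',          # assumed radosgw endpoint
    is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

bucket = conn.create_bucket('demo-bucket')
key = bucket.new_key('hello.txt')
key.set_contents_from_string('hello radosgw')    # PUT an object
print(key.get_contents_as_string())              # GET it back
```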

Ease of use?

  • A few issues still need to be ironed out
  • 3 nodes = a Ceph cluster
  • Targeted at private clouds
  • Suggests XFS for the OSD backend storage