Diffstat (limited to 'doc/rados/operations/data-placement.rst')
-rw-r--r-- doc/rados/operations/data-placement.rst | 47
1 file changed, 47 insertions(+), 0 deletions(-)
diff --git a/doc/rados/operations/data-placement.rst b/doc/rados/operations/data-placement.rst
new file mode 100644
index 000000000..3d3be65ec
--- /dev/null
+++ b/doc/rados/operations/data-placement.rst
@@ -0,0 +1,47 @@
+=========================
+ Data Placement Overview
+=========================
+
+Ceph stores, replicates, and rebalances data objects across a RADOS cluster
+dynamically. Because different users store objects in different pools for
+different purposes on many OSDs, Ceph operations require a certain amount of
+data-placement planning. The main data-placement planning concepts in Ceph
+include:
+
+- **Pools:** Ceph stores data within pools, which are logical groups used for
+ storing objects. Pools manage the number of placement groups, the number of
+ replicas, and the CRUSH rule for the pool. To store data in a pool, it is
+ necessary to be an authenticated user with permissions for the pool. Ceph is
+ able to make snapshots of pools. For additional details, see `Pools`_. A
+ brief command-line example of creating a pool follows this list.
+
+- **Placement Groups:** Ceph maps objects to placement groups. Placement
+ groups (PGs) are shards or fragments of a logical object pool that place
+ objects as a group into OSDs. Placement groups reduce the amount of
+ per-object metadata that is necessary for Ceph to store the data in OSDs. A
+ greater number of placement groups (for example, 100 PGs per OSD as compared
+ with 50 PGs per OSD) leads to better balancing. For additional details, see
+ :ref:`placement groups`. An example of reading and adjusting a pool's PG
+ count follows this list.
+
+- **CRUSH Maps:** CRUSH plays a major role in allowing Ceph to scale while
+ avoiding certain pitfalls, such as performance bottlenecks, limitations to
+ scalability, and single points of failure. CRUSH maps provide the physical
+ topology of the cluster to the CRUSH algorithm, so that it can determine both
+ (1) where the data for an object and its replicas should be stored and (2)
+ how to store that data across failure domains so as to improve data safety.
+ For additional details, see `CRUSH Maps`_. Commands for inspecting the
+ CRUSH hierarchy and rules follow this list.
+
+- **Balancer:** The balancer automatically optimizes the distribution of
+ placement groups across devices in order to achieve a balanced data
+ distribution, maximize the amount of data that can be stored in the
+ cluster, and distribute the workload evenly across OSDs. For additional
+ details, see `Balancer`_. Example balancer commands follow this list.
+
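+As a concrete illustration of the pool concept above, the following commands
+create a replicated pool, list the pools in the cluster, and set the pool's
+replica count. The pool name ``mypool`` and the PG count of 128 are
+hypothetical placeholders, not recommendations::
+
+   ceph osd pool create mypool 128    # create a pool with 128 placement groups
+   ceph osd lspools                   # list all pools in the cluster
+   ceph osd pool set mypool size 3    # keep three replicas of each object
+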
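+To see how a pool's placement-group count can be read and adjusted in
+practice, the following sketch again uses the hypothetical pool ``mypool``;
+the new PG count of 256 is only an example::
+
+   ceph osd pool get mypool pg_num      # show the pool's current number of PGs
+   ceph osd pool set mypool pg_num 256  # raise the PG count to improve balancing
+   ceph osd pool autoscale-status       # review the PG autoscaler's view of each pool
+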
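+The CRUSH hierarchy and rules described above can be inspected with commands
+such as the following; the output file names are arbitrary::
+
+   ceph osd tree                              # show the CRUSH hierarchy of hosts and OSDs
+   ceph osd crush rule ls                     # list the CRUSH rules defined in the cluster
+   ceph osd getcrushmap -o crushmap.bin       # export the compiled CRUSH map
+   crushtool -d crushmap.bin -o crushmap.txt  # decompile it into readable text
+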
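+The balancer is managed with the ``ceph balancer`` commands; for example::
+
+   ceph balancer status      # report the balancer's mode and whether it is active
+   ceph balancer mode upmap  # select the upmap balancing mode
+   ceph balancer on          # enable automatic optimization
+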
+It is possible to use the default values for each of the above components.
+Default values are recommended for a test cluster's initial setup. However,
+when planning a large Ceph cluster, values should be customized for
+data-placement operations with reference to the different roles played by
+pools, placement groups, and CRUSH.
+
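+One way to review the cluster-wide defaults discussed here is with the
+``ceph config get`` command; for example, the following commands read the
+default replica count and the default PG count applied to new pools::
+
+   ceph config get mon osd_pool_default_size    # default replica count for new pools
+   ceph config get mon osd_pool_default_pg_num  # default PG count for new pools
+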
+.. _Pools: ../pools
+.. _CRUSH Maps: ../crush-map
+.. _Balancer: ../balancer