diff --git a/doc/rados/operations/control.rst b/doc/rados/operations/control.rst
new file mode 100644
index 000000000..d7a512618
--- /dev/null
+++ b/doc/rados/operations/control.rst
@@ -0,0 +1,601 @@
+.. index:: control, commands
+
+==================
+ Control Commands
+==================
+
+
+Monitor Commands
+================
+
+Monitor commands are issued using the ``ceph`` utility:
+
+.. prompt:: bash $
+
+ ceph [-m monhost] {command}
+
+The command is usually (though not always) of the form:
+
+.. prompt:: bash $
+
+ ceph {subsystem} {command}
+
+
+System Commands
+===============
+
+Execute the following to display the current cluster status:
+
+.. prompt:: bash $
+
+ ceph -s
+ ceph status
+
+Execute the following to display a running summary of cluster status
+and major events:
+
+.. prompt:: bash $
+
+ ceph -w
+
+Execute the following to show the monitor quorum, including which monitors are
+participating and which one is the leader:
+
+.. prompt:: bash $
+
+ ceph mon stat
+ ceph quorum_status
+
+Execute the following to query the status of a single monitor, including whether
+or not it is in the quorum:
+
+.. prompt:: bash $
+
+ ceph tell mon.[id] mon_status
+
+where the value of ``[id]`` can be determined, e.g., from ``ceph -s``.
+
+
+Authentication Subsystem
+========================
+
+To add a keyring for an OSD, execute the following:
+
+.. prompt:: bash $
+
+ ceph auth add {osd} {--in-file|-i} {path-to-osd-keyring}
+
+To list the cluster's keys and their capabilities, execute the following:
+
+.. prompt:: bash $
+
+ ceph auth ls
+
+
+Placement Group Subsystem
+=========================
+
+To display the statistics for all placement groups (PGs), execute the following:
+
+.. prompt:: bash $
+
+ ceph pg dump [--format {format}]
+
+The valid formats are ``plain`` (default), ``json``, ``json-pretty``, ``xml``, and ``xml-pretty``.
+When implementing monitoring and other tools, it is best to use ``json`` format.
+JSON parsing is more deterministic than the human-oriented ``plain``, and the layout is much
+less variable from release to release. The ``jq`` utility can be invaluable when extracting
+data from JSON output.
+
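+For example, assuming a recent release in which the JSON dump is rooted at a
+``pg_map`` object (field names vary somewhat across releases), the following
+lists the ID of every PG:
+
+.. prompt:: bash $
+
+ ceph pg dump --format json | jq -r '.pg_map.pg_stats[].pgid'
+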
+To display the statistics for all placement groups stuck in a specified state,
+execute the following:
+
+.. prompt:: bash $
+
+ ceph pg dump_stuck inactive|unclean|stale|undersized|degraded [--format {format}] [-t|--threshold {seconds}]
+
+
+``--format`` may be ``plain`` (default), ``json``, ``json-pretty``, ``xml``, or ``xml-pretty``.
+
+``--threshold`` defines how many seconds "stuck" is (default: 300).
+
+**Inactive** Placement groups cannot process reads or writes because they are waiting for an OSD
+with the most up-to-date data to come back.
+
+**Unclean** Placement groups contain objects that are not replicated the desired number
+of times. They should be recovering.
+
+**Stale** Placement groups are in an unknown state - the OSDs that host them have not
+reported to the monitor cluster in a while (configured by
+``mon_osd_report_timeout``).
+
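+For example, to list placement groups that have been stuck ``inactive`` for at
+least ten minutes:
+
+.. prompt:: bash $
+
+ ceph pg dump_stuck inactive --threshold 600
+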
+Delete "lost" objects or revert them to their prior state, either a previous version
+or delete them if they were just created. :
+
+.. prompt:: bash $
+
+ ceph pg {pgid} mark_unfound_lost revert|delete
+
+
+.. _osd-subsystem:
+
+OSD Subsystem
+=============
+
+Query OSD subsystem status:
+
+.. prompt:: bash $
+
+ ceph osd stat
+
+Write a copy of the most recent OSD map to a file (see
+:ref:`osdmaptool <osdmaptool>`):
+
+.. prompt:: bash $
+
+ ceph osd getmap -o file
+
+Write a copy of the CRUSH map from the most recent OSD map to a
+file:
+
+.. prompt:: bash $
+
+ ceph osd getcrushmap -o file
+
+The foregoing is functionally equivalent to:
+
+.. prompt:: bash $
+
+ ceph osd getmap -o /tmp/osdmap
+ osdmaptool /tmp/osdmap --export-crush file
+
+Dump the OSD map. Valid formats for ``-f`` are ``plain``, ``json``, ``json-pretty``,
+``xml``, and ``xml-pretty``. If no ``--format`` option is given, the OSD map is
+dumped as plain text. As above, JSON format is best for tools, scripting, and other automation:
+
+.. prompt:: bash $
+
+ ceph osd dump [--format {format}]
+
+Dump the OSD map as a tree with one line per OSD containing weight
+and state:
+
+.. prompt:: bash $
+
+ ceph osd tree [--format {format}]
+
+Find out where a specific object is or would be stored in the system:
+
+.. prompt:: bash $
+
+ ceph osd map <pool-name> <object-name>
+
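+For example, with a hypothetical pool named ``rbd`` and an object named
+``myobject``, the following reports the PG the object maps to and its OSDs:
+
+.. prompt:: bash $
+
+ ceph osd map rbd myobject
+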
+Add or move a new item (OSD) with the given id/name/weight at the specified
+location:
+
+.. prompt:: bash $
+
+ ceph osd crush set {id} {weight} [{loc1} [{loc2} ...]]
+
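+For example, assuming a hypothetical host named ``node1`` under the default
+root, the following places ``osd.0`` there with a CRUSH weight of 1.0:
+
+.. prompt:: bash $
+
+ ceph osd crush set osd.0 1.0 root=default host=node1
+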
+Remove an existing item (OSD) from the CRUSH map:
+
+.. prompt:: bash $
+
+ ceph osd crush remove {name}
+
+Remove an existing bucket from the CRUSH map:
+
+.. prompt:: bash $
+
+ ceph osd crush remove {bucket-name}
+
+Move an existing bucket from one position in the hierarchy to another:
+
+.. prompt:: bash $
+
+ ceph osd crush move {id} {loc1} [{loc2} ...]
+
+Set the weight of the item given by ``{name}`` to ``{weight}``:
+
+.. prompt:: bash $
+
+ ceph osd crush reweight {name} {weight}
+
+Mark an OSD as ``lost``. This may result in permanent data loss. Use with caution:
+
+.. prompt:: bash $
+
+ ceph osd lost {id} [--yes-i-really-mean-it]
+
+Create a new OSD. If no UUID is given, it will be set automatically when the OSD
+starts up:
+
+.. prompt:: bash $
+
+ ceph osd create [{uuid}]
+
+Remove the given OSD(s):
+
+.. prompt:: bash $
+
+ ceph osd rm [{id}...]
+
+Query the current ``max_osd`` parameter in the OSD map:
+
+.. prompt:: bash $
+
+ ceph osd getmaxosd
+
+Import the given CRUSH map:
+
+.. prompt:: bash $
+
+ ceph osd setcrushmap -i file
+
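+A typical round trip edits the map offline with ``crushtool``: export it,
+decompile it to text, recompile after editing, and import the result. A sketch
+(the filenames are arbitrary):
+
+.. prompt:: bash $
+
+ ceph osd getcrushmap -o crush.bin
+ crushtool -d crush.bin -o crush.txt
+ crushtool -c crush.txt -o crush.new
+ ceph osd setcrushmap -i crush.new
+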
+Set the ``max_osd`` parameter in the OSD map. This defaults to 10000 now, so
+most admins will never need to adjust it:
+
+.. prompt:: bash $
+
+ ceph osd setmaxosd {n}
+
+Mark OSD ``{osd-num}`` down:
+
+.. prompt:: bash $
+
+ ceph osd down {osd-num}
+
+Mark OSD ``{osd-num}`` out of the distribution (i.e., it will be allocated no data):
+
+.. prompt:: bash $
+
+ ceph osd out {osd-num}
+
+Mark ``{osd-num}`` in the distribution (i.e., it will be allocated data):
+
+.. prompt:: bash $
+
+ ceph osd in {osd-num}
+
+Set or clear the pause flags in the OSD map. If set, no IO requests
+will be sent to any OSD. Clearing the flags via ``unpause`` results in
+resending pending requests:
+
+.. prompt:: bash $
+
+ ceph osd pause
+ ceph osd unpause
+
+Set the override weight (reweight) of ``{osd-num}`` to ``{weight}``. Two OSDs with the
+same weight will receive roughly the same number of I/O requests and
+store approximately the same amount of data. ``ceph osd reweight``
+sets an override weight on the OSD. This value is in the range 0 to 1,
+and forces CRUSH to re-place (1-weight) of the data that would
+otherwise live on this drive. It does not change the weights assigned
+to the buckets above the OSD in the CRUSH map, and is a corrective
+measure in case the normal CRUSH distribution is not working out quite
+right. For instance, if one of your OSDs is at 90% and the others are
+at 50%, you could reduce this weight to compensate:
+
+.. prompt:: bash $
+
+ ceph osd reweight {osd-num} {weight}
+
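+Continuing the example above, if the overfull OSD is the hypothetical
+``osd.7``, the following forces roughly 20% of its data to be re-placed:
+
+.. prompt:: bash $
+
+ ceph osd reweight 7 0.8
+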
+Balance OSD fullness by reducing the override weight of OSDs which are
+overly utilized. Note that these override (``reweight``) values
+default to 1.00000 and are relative only to each other; they are not absolute.
+It is crucial to distinguish them from CRUSH weights, which reflect the
+absolute capacity of a bucket in TiB. By default this command adjusts the
+override weight of OSDs whose utilization deviates from the average by 20% or more,
+but if you include a ``threshold`` that percentage will be used instead:
+
+.. prompt:: bash $
+
+ ceph osd reweight-by-utilization [threshold [max_change [max_osds]]] [--no-increasing]
+
+To limit the step by which any OSD's reweight will be changed, specify
+``max_change``, which defaults to 0.05. To limit the number of OSDs that will
+be adjusted, specify ``max_osds`` as well; the default is 4. Increasing these
+parameters can speed leveling of OSD utilization, at the potential cost of
+greater impact on client operations due to more data moving at once.
+
+To determine which and how many PGs and OSDs a given invocation will affect,
+you can test it before executing:
+
+.. prompt:: bash $
+
+ ceph osd test-reweight-by-utilization [threshold [max_change [max_osds]]] [--no-increasing]
+
+Adding ``--no-increasing`` to either command prevents increasing any
+override weights that are currently < 1.00000. This can be useful when
+you are balancing in a hurry to remedy ``full`` or ``nearfull`` OSDs or
+when some OSDs are being evacuated or slowly brought into service.
+
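+For example, a dry run that would consider OSDs deviating more than 10% from
+the average utilization, adjusting at most 8 of them in steps of at most 0.025:
+
+.. prompt:: bash $
+
+ ceph osd test-reweight-by-utilization 110 0.025 8 --no-increasing
+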
+Deployments utilizing Nautilus (or later revisions of Luminous and Mimic)
+that have no pre-Luminous clients may wish instead to enable the
+``balancer`` module for ``ceph-mgr``.
+
+Add/remove an IP address or CIDR range to/from the blocklist.
+When adding to the blocklist,
+you can specify how long it should be blocklisted in seconds; otherwise,
+it will default to 1 hour. A blocklisted address is prevented from
+connecting to any OSD. If you blocklist an IP or range containing an OSD, be aware
+that the OSD will also be prevented from performing operations on its peers where it
+acts as a client. (This includes tiering and copy-from functionality.)
+
+If you want to blocklist a range (in CIDR format), you may do so by
+including the ``range`` keyword.
+
+These commands are mostly useful only for failure testing, as
+blocklists are normally maintained automatically and shouldn't need
+manual intervention:
+
+.. prompt:: bash $
+
+ ceph osd blocklist ["range"] add ADDRESS[:source_port][/netmask_bits] [TIME]
+ ceph osd blocklist ["range"] rm ADDRESS[:source_port][/netmask_bits]
+
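+For example, to blocklist a single hypothetical client address for one hour
+(3600 seconds), then blocklist an entire /24 range for the same period:
+
+.. prompt:: bash $
+
+ ceph osd blocklist add 192.168.1.123 3600
+ ceph osd blocklist range add 192.168.1.0/24 3600
+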
+Creates/deletes a snapshot of a pool:
+
+.. prompt:: bash $
+
+ ceph osd pool mksnap {pool-name} {snap-name}
+ ceph osd pool rmsnap {pool-name} {snap-name}
+
+Creates/deletes/renames a storage pool:
+
+.. prompt:: bash $
+
+ ceph osd pool create {pool-name} [pg_num [pgp_num]]
+ ceph osd pool delete {pool-name} [{pool-name} --yes-i-really-really-mean-it]
+ ceph osd pool rename {old-name} {new-name}
+
+Changes a pool setting:
+
+.. prompt:: bash $
+
+ ceph osd pool set {pool-name} {field} {value}
+
+Valid fields are:
+
+ * ``size``: Sets the number of copies (replicas) of objects in the pool.
+ * ``pg_num``: The number of placement groups in the pool.
+ * ``pgp_num``: The effective number of placement groups used when calculating placement.
+ * ``crush_rule``: The CRUSH rule used to map object placement.
+
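+For example, to make a hypothetical pool named ``mypool`` keep three copies of
+each object:
+
+.. prompt:: bash $
+
+ ceph osd pool set mypool size 3
+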
+Get the value of a pool setting:
+
+.. prompt:: bash $
+
+ ceph osd pool get {pool-name} {field}
+
+Valid fields are:
+
+ * ``pg_num``: The number of placement groups in the pool.
+ * ``pgp_num``: The effective number of placement groups used when calculating placement.
+
+
+Sends a scrub command to OSD ``{osd-num}``. To send the command to all OSDs, use ``*``:
+
+.. prompt:: bash $
+
+ ceph osd scrub {osd-num}
+
+Sends a repair command to OSD.N. To send the command to all OSDs, use ``*``:
+
+.. prompt:: bash $
+
+ ceph osd repair N
+
+Runs a simple throughput benchmark against OSD.N, writing ``TOTAL_DATA_BYTES``
+in write requests of ``BYTES_PER_WRITE`` each. By default, the test
+writes 1 GB in total in 4-MB increments.
+The benchmark is non-destructive and will not overwrite existing live
+OSD data, but might temporarily affect the performance of clients
+concurrently accessing the OSD:
+
+.. prompt:: bash $
+
+ ceph tell osd.N bench [TOTAL_DATA_BYTES] [BYTES_PER_WRITE]
+
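+For example, to write 100 MB in total to a hypothetical ``osd.0`` as one
+hundred 1 MB (1048576-byte) requests:
+
+.. prompt:: bash $
+
+ ceph tell osd.0 bench 104857600 1048576
+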
+To clear an OSD's caches between benchmark runs, use the ``cache drop`` command:
+
+.. prompt:: bash $
+
+ ceph tell osd.N cache drop
+
+To get the cache statistics of an OSD, use the ``cache status`` command:
+
+.. prompt:: bash $
+
+ ceph tell osd.N cache status
+
+MDS Subsystem
+=============
+
+Change configuration parameters on a running MDS:
+
+.. prompt:: bash $
+
+ ceph tell mds.{mds-id} config set {setting} {value}
+
+Example:
+
+.. prompt:: bash $
+
+ ceph tell mds.0 config set debug_ms 1
+
+This example enables debug messages on the rank-0 MDS.
+
+Display the status of all metadata servers:
+
+.. prompt:: bash $
+
+   ceph mds stat
+
+Mark the active MDS as failed, triggering failover to a standby if present:
+
+.. prompt:: bash $
+
+   ceph mds fail 0
+
+.. todo:: ``ceph mds`` subcommands missing docs: set, dump, getmap, stop, setmap
+
+
+Mon Subsystem
+=============
+
+Show monitor stats:
+
+.. prompt:: bash $
+
+ ceph mon stat
+
+::
+
+ e2: 3 mons at {a=127.0.0.1:40000/0,b=127.0.0.1:40001/0,c=127.0.0.1:40002/0}, election epoch 6, quorum 0,1,2 a,b,c
+
+
+The ``quorum`` list at the end lists monitor nodes that are part of the current quorum.
+
+This is also available more directly:
+
+.. prompt:: bash $
+
+ ceph quorum_status -f json-pretty
+
+.. code-block:: javascript
+
+ {
+ "election_epoch": 6,
+ "quorum": [
+ 0,
+ 1,
+ 2
+ ],
+ "quorum_names": [
+ "a",
+ "b",
+ "c"
+ ],
+ "quorum_leader_name": "a",
+ "monmap": {
+ "epoch": 2,
+ "fsid": "ba807e74-b64f-4b72-b43f-597dfe60ddbc",
+ "modified": "2016-12-26 14:42:09.288066",
+ "created": "2016-12-26 14:42:03.573585",
+ "features": {
+ "persistent": [
+ "kraken"
+ ],
+ "optional": []
+ },
+ "mons": [
+ {
+ "rank": 0,
+ "name": "a",
+ "addr": "127.0.0.1:40000\/0",
+ "public_addr": "127.0.0.1:40000\/0"
+ },
+ {
+ "rank": 1,
+ "name": "b",
+ "addr": "127.0.0.1:40001\/0",
+ "public_addr": "127.0.0.1:40001\/0"
+ },
+ {
+ "rank": 2,
+ "name": "c",
+ "addr": "127.0.0.1:40002\/0",
+ "public_addr": "127.0.0.1:40002\/0"
+ }
+ ]
+ }
+ }
+
+
+The above will block until a quorum is reached.
+
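+Because the output is JSON, fields are easy to extract with ``jq``. For
+example, to print just the leader's name:
+
+.. prompt:: bash $
+
+ ceph quorum_status -f json | jq -r '.quorum_leader_name'
+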
+For a status of just a single monitor:
+
+.. prompt:: bash $
+
+ ceph tell mon.[name] mon_status
+
+where the value of ``[name]`` can be taken from ``ceph quorum_status``. Sample
+output::
+
+ {
+ "name": "b",
+ "rank": 1,
+ "state": "peon",
+ "election_epoch": 6,
+ "quorum": [
+ 0,
+ 1,
+ 2
+ ],
+ "features": {
+ "required_con": "9025616074522624",
+ "required_mon": [
+ "kraken"
+ ],
+ "quorum_con": "1152921504336314367",
+ "quorum_mon": [
+ "kraken"
+ ]
+ },
+ "outside_quorum": [],
+ "extra_probe_peers": [],
+ "sync_provider": [],
+ "monmap": {
+ "epoch": 2,
+ "fsid": "ba807e74-b64f-4b72-b43f-597dfe60ddbc",
+ "modified": "2016-12-26 14:42:09.288066",
+ "created": "2016-12-26 14:42:03.573585",
+ "features": {
+ "persistent": [
+ "kraken"
+ ],
+ "optional": []
+ },
+ "mons": [
+ {
+ "rank": 0,
+ "name": "a",
+ "addr": "127.0.0.1:40000\/0",
+ "public_addr": "127.0.0.1:40000\/0"
+ },
+ {
+ "rank": 1,
+ "name": "b",
+ "addr": "127.0.0.1:40001\/0",
+ "public_addr": "127.0.0.1:40001\/0"
+ },
+ {
+ "rank": 2,
+ "name": "c",
+ "addr": "127.0.0.1:40002\/0",
+ "public_addr": "127.0.0.1:40002\/0"
+ }
+ ]
+ }
+ }
+
+A dump of the monitor state:
+
+.. prompt:: bash $
+
+ ceph mon dump
+
+::
+
+ dumped monmap epoch 2
+ epoch 2
+ fsid ba807e74-b64f-4b72-b43f-597dfe60ddbc
+ last_changed 2016-12-26 14:42:09.288066
+ created 2016-12-26 14:42:03.573585
+ 0: 127.0.0.1:40000/0 mon.a
+ 1: 127.0.0.1:40001/0 mon.b
+ 2: 127.0.0.1:40002/0 mon.c
+