diff options
Diffstat (limited to 'ctdb/doc/ctdb-statistics.7.xml')
-rw-r--r-- | ctdb/doc/ctdb-statistics.7.xml | 689 |
1 files changed, 689 insertions, 0 deletions
diff --git a/ctdb/doc/ctdb-statistics.7.xml b/ctdb/doc/ctdb-statistics.7.xml new file mode 100644 index 0000000..387852d --- /dev/null +++ b/ctdb/doc/ctdb-statistics.7.xml @@ -0,0 +1,689 @@ +<?xml version="1.0" encoding="iso-8859-1"?> +<!DOCTYPE refentry + PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" + "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"> + +<refentry id="ctdb-statistics.7"> + + <refmeta> + <refentrytitle>ctdb-statistics</refentrytitle> + <manvolnum>7</manvolnum> + <refmiscinfo class="source">ctdb</refmiscinfo> + <refmiscinfo class="manual">CTDB - clustered TDB database</refmiscinfo> + </refmeta> + + <refnamediv> + <refname>ctdb-statistics</refname> + <refpurpose>CTDB statistics output</refpurpose> + </refnamediv> + + <refsect1> + <title>OVERALL STATISTICS</title> + + <para> + CTDB maintains information about various messages communicated + and some of the important operations per node. See the + <citerefentry><refentrytitle>ctdb</refentrytitle> + <manvolnum>1</manvolnum></citerefentry> commands + <command>statistics</command> and <command>statisticsreset</command> + for displaying statistics. + </para> + + <refsect2> + <title>Example: ctdb statistics</title> + <screen> +CTDB version 1 +Current time of statistics : Fri Sep 12 13:32:32 2014 +Statistics collected since : (000 01:49:20) Fri Sep 12 11:43:12 2014 + num_clients 6 + frozen 0 + recovering 0 + num_recoveries 2 + client_packets_sent 281293 + client_packets_recv 296317 + node_packets_sent 452387 + node_packets_recv 182394 + keepalive_packets_sent 3927 + keepalive_packets_recv 3928 + node + req_call 48605 + reply_call 1 + req_dmaster 23404 + reply_dmaster 24917 + reply_error 0 + req_message 958 + req_control 197513 + reply_control 153705 + client + req_call 130866 + req_message 770 + req_control 168921 + timeouts + call 0 + control 0 + traverse 0 + locks + num_calls 220 + num_current 0 + num_pending 0 + num_failed 0 + total_calls 130866 + pending_calls 0 + childwrite_calls 1 + pending_childwrite_calls 0 + memory_used 334490 + max_hop_count 18 + total_ro_delegations 2 + total_ro_revokes 2 + hop_count_buckets: 42816 5464 26 1 0 0 0 0 0 0 0 0 0 0 0 0 + lock_buckets: 9 165 14 15 7 2 2 0 0 0 0 0 0 0 0 0 + locks_latency MIN/AVG/MAX 0.000685/0.160302/6.369342 sec out of 214 + reclock_ctdbd MIN/AVG/MAX 0.004940/0.004969/0.004998 sec out of 2 + reclock_recd MIN/AVG/MAX 0.000000/0.000000/0.000000 sec out of 0 + call_latency MIN/AVG/MAX 0.000006/0.000719/4.562991 sec out of 126626 + childwrite_latency MIN/AVG/MAX 0.014527/0.014527/0.014527 sec out of 1 + </screen> + </refsect2> + + <refsect2> + <title>CTDB version</title> + <para> + Version of the ctdb protocol used by the node. + </para> + </refsect2> + + <refsect2> + <title>Current time of statistics</title> + <para> + Time when the statistics are generated. + </para> + <para> + This is useful when collecting statistics output periodically + for post-processing. + </para> + </refsect2> + + <refsect2> + <title>Statistics collected since</title> + <para> + Time when ctdb was started or the last time statistics was reset. + The output shows the duration and the timestamp. + </para> + </refsect2> + + <refsect2> + <title>num_clients</title> + <para> + Number of processes currently connected to CTDB's unix socket. + This includes recovery daemon, ctdb tool and samba processes + (smbd, winbindd). + </para> + </refsect2> + + <refsect2> + <title>frozen</title> + <para> + 1 if the databases are currently frozen, 0 otherwise. + </para> + </refsect2> + + <refsect2> + <title>recovering</title> + <para> + 1 if recovery is active, 0 otherwise. + </para> + </refsect2> + + <refsect2> + <title>num_recoveries</title> + <para> + Number of recoveries since the start of ctdb or since the last + statistics reset. + </para> + </refsect2> + + <refsect2> + <title>client_packets_sent</title> + <para> + Number of packets sent to client processes via unix domain socket. + </para> + </refsect2> + + <refsect2> + <title>client_packets_recv</title> + <para> + Number of packets received from client processes via unix domain socket. + </para> + </refsect2> + + <refsect2> + <title>node_packets_sent</title> + <para> + Number of packets sent to the other nodes in the cluster via TCP. + </para> + </refsect2> + + <refsect2> + <title>node_packets_recv</title> + <para> + Number of packets received from the other nodes in the cluster via TCP. + </para> + </refsect2> + + <refsect2> + <title>keepalive_packets_sent</title> + <para> + Number of keepalive messages sent to other nodes. + </para> + <para> + CTDB periodically sends keepalive messages to other nodes. + See <citetitle>KeepaliveInterval</citetitle> tunable in + <citerefentry><refentrytitle>ctdb-tunables</refentrytitle> + <manvolnum>7</manvolnum></citerefentry> for more details. + </para> + </refsect2> + + <refsect2> + <title>keepalive_packets_recv</title> + <para> + Number of keepalive messages received from other nodes. + </para> + </refsect2> + + <refsect2> + <title>node</title> + <para> + This section lists various types of messages processed which + originated from other nodes via TCP. + </para> + + <refsect3> + <title>req_call</title> + <para> + Number of REQ_CALL messages from the other nodes. + </para> + </refsect3> + + <refsect3> + <title>reply_call</title> + <para> + Number of REPLY_CALL messages from the other nodes. + </para> + </refsect3> + + <refsect3> + <title>req_dmaster</title> + <para> + Number of REQ_DMASTER messages from the other nodes. + </para> + </refsect3> + + <refsect3> + <title>reply_dmaster</title> + <para> + Number of REPLY_DMASTER messages from the other nodes. + </para> + </refsect3> + + <refsect3> + <title>reply_error</title> + <para> + Number of REPLY_ERROR messages from the other nodes. + </para> + </refsect3> + + <refsect3> + <title>req_message</title> + <para> + Number of REQ_MESSAGE messages from the other nodes. + </para> + </refsect3> + + <refsect3> + <title>req_control</title> + <para> + Number of REQ_CONTROL messages from the other nodes. + </para> + </refsect3> + + <refsect3> + <title>reply_control</title> + <para> + Number of REPLY_CONTROL messages from the other nodes. + </para> + </refsect3> + + <refsect3> + <title>req_tunnel</title> + <para> + Number of REQ_TUNNEL messages from the other nodes. + </para> + </refsect3> + + </refsect2> + + <refsect2> + <title>client</title> + <para> + This section lists various types of messages processed which + originated from clients via unix domain socket. + </para> + + <refsect3> + <title>req_call</title> + <para> + Number of REQ_CALL messages from the clients. + </para> + </refsect3> + + <refsect3> + <title>req_message</title> + <para> + Number of REQ_MESSAGE messages from the clients. + </para> + </refsect3> + + <refsect3> + <title>req_control</title> + <para> + Number of REQ_CONTROL messages from the clients. + </para> + </refsect3> + + <refsect3> + <title>req_tunnel</title> + <para> + Number of REQ_TUNNEL messages from the clients. + </para> + </refsect3> + + </refsect2> + + <refsect2> + <title>timeouts</title> + <para> + This section lists timeouts occurred when sending various messages. + </para> + + <refsect3> + <title>call</title> + <para> + Number of timeouts for REQ_CALL messages. + </para> + </refsect3> + + <refsect3> + <title>control</title> + <para> + Number of timeouts for REQ_CONTROL messages. + </para> + </refsect3> + + <refsect3> + <title>traverse</title> + <para> + Number of timeouts for database traverse operations. + </para> + </refsect3> + </refsect2> + + <refsect2> + <title>locks</title> + <para> + This section lists locking statistics. + </para> + + <refsect3> + <title>num_calls</title> + <para> + Number of completed lock calls. This includes database locks + and record locks. + </para> + </refsect3> + + <refsect3> + <title>num_current</title> + <para> + Number of scheduled lock calls. This includes database locks + and record locks. + </para> + </refsect3> + + <refsect3> + <title>num_pending</title> + <para> + Number of queued lock calls. This includes database locks and + record locks. + </para> + </refsect3> + + <refsect3> + <title>num_failed</title> + <para> + Number of failed lock calls. This includes database locks and + record locks. + </para> + </refsect3> + + </refsect2> + + <refsect2> + <title>total_calls</title> + <para> + Number of req_call messages processed from clients. This number + should be same as client --> req_call. + </para> + </refsect2> + + <refsect2> + <title>pending_calls</title> + <para> + Number of req_call messages which are currently being processed. + This number indicates the number of record migrations in flight. + </para> + </refsect2> + + <refsect2> + <title>childwrite_calls</title> + <para> + Number of record update calls. Record update calls are used to + update a record under a transaction. + </para> + </refsect2> + + <refsect2> + <title>pending_childwrite_calls</title> + <para> + Number of record update calls currently active. + </para> + </refsect2> + + <refsect2> + <title>memory_used</title> + <para> + The amount of memory in bytes currently used by CTDB using + talloc. This includes all the memory used for CTDB's internal + data structures. This does not include the memory mapped TDB + databases. + </para> + </refsect2> + + <refsect2> + <title>max_hop_count</title> + <para> + The maximum number of hops required for a record migration request + to obtain the record. High numbers indicate record contention. + </para> + </refsect2> + + <refsect2> + <title>total_ro_delegations</title> + <para> + Number of readonly delegations created. + </para> + </refsect2> + + <refsect2> + <title>total_ro_revokes</title> + <para> + Number of readonly delegations that were revoked. The difference + between total_ro_revokes and total_ro_delegations gives the + number of currently active readonly delegations. + </para> + </refsect2> + + <refsect2> + <title>hop_count_buckets</title> + <para> + Distribution of migration requests based on hop counts values. + Buckets are 0, < 2, < 4, < 8, + < 16, < 32, < 64, < 128, + < 256, < 512, < 1024, < 2048, + < 4096, < 8192, < 16384, ≥ 16384. + </para> + </refsect2> + + <refsect2> + <title>lock_buckets</title> + <para> + Distribution of record lock requests based on time required to + obtain locks. Buckets are < 1ms, < 10ms, + < 100ms, < 1s, < 2s, < 4s, + < 8s, < 16s, < 32s, < 64s, + ≥ 64s. + </para> + </refsect2> + + <refsect2> + <title>locks_latency</title> + <para> + The minimum, the average and the maximum time (in seconds) + required to obtain record locks. + </para> + </refsect2> + + <refsect2> + <title>reclock_ctdbd</title> + <para> + The minimum, the average and the maximum time (in seconds) + required to check if recovery lock is still held by recovery + daemon when recovery mode is changed. This check is done in ctdb daemon. + </para> + </refsect2> + + <refsect2> + <title>reclock_recd</title> + <para> + The minimum, the average and the maximum time (in seconds) + required to check if recovery lock is still held by recovery + daemon during recovery. This check is done in recovery daemon. + </para> + </refsect2> + + <refsect2> + <title>call_latency</title> + <para> + The minimum, the average and the maximum time (in seconds) required + to process a REQ_CALL message from client. This includes the time + required to migrate a record from remote node, if the record is + not available on the local node. + </para> + </refsect2> + + <refsect2> + <title>childwrite_latency</title> + <para>Default: 0</para> + <para> + The minimum, the average and the maximum time (in seconds) + required to update records under a transaction. + </para> + </refsect2> + </refsect1> + + <refsect1> + <title>DATABASE STATISTICS</title> + + <para> + CTDB maintains per database statistics about important operations. + See the <citerefentry><refentrytitle>ctdb</refentrytitle> + <manvolnum>1</manvolnum></citerefentry> command + <command>dbstatistics</command> for displaying database statistics. + </para> + + <refsect2> + <title>Example: ctdb dbstatistics notify_index.tdb</title> + <screen> +DB Statistics: notify_index.tdb + ro_delegations 0 + ro_revokes 0 + locks + total 131 + failed 0 + current 0 + pending 0 + hop_count_buckets: 9890 5454 26 1 0 0 0 0 0 0 0 0 0 0 0 0 + lock_buckets: 4 117 10 0 0 0 0 0 0 0 0 0 0 0 0 0 + locks_latency MIN/AVG/MAX 0.000683/0.004198/0.014730 sec out of 131 + Num Hot Keys: 3 + Count:7 Key:2f636c75737465726673 + Count:18 Key:2f636c757374657266732f64617461 + Count:7 Key:2f636c757374657266732f646174612f636c69656e7473 + </screen> + </refsect2> + + <refsect2> + <title>DB Statistics</title> + <para> + Name of the database. + </para> + </refsect2> + + <refsect2> + <title>ro_delegations</title> + <para> + Number of readonly delegations created in the database. + </para> + </refsect2> + + <refsect2> + <title>ro_revokes</title> + <para> + Number of readonly delegations revoked. The difference in + ro_delegations and ro_revokes indicates the currently active + readonly delegations. + </para> + </refsect2> + + <refsect2> + <title>locks</title> + <para> + This section lists locking statistics. + </para> + + <refsect3> + <title>total</title> + <para> + Number of completed lock calls. This includes database locks + and record locks. + </para> + </refsect3> + + <refsect3> + <title>failed</title> + <para> + Number of failed lock calls. This includes database locks and + record locks. + </para> + </refsect3> + + <refsect3> + <title>current</title> + <para> + Number of scheduled lock calls. This includes database locks + and record locks. + </para> + </refsect3> + + <refsect3> + <title>pending</title> + <para> + Number of queued lock calls. This includes database locks and + record locks. + </para> + </refsect3> + + </refsect2> + + <refsect2> + <title>hop_count_buckets</title> + <para> + Distribution of migration requests based on hop counts values. + Buckets are 0, < 2, < 4, < 8, + < 16, < 32, < 64, < 128, + < 256, < 512, < 1024, < 2048, + < 4096, < 8192, < 16384, ≥ 16384. + </para> + </refsect2> + + <refsect2> + <title>lock_buckets</title> + <para> + Distribution of record lock requests based on time required to + obtain locks. Buckets are < 1ms, < 10ms, + < 100ms, < 1s, < 2s, < 4s, + < 8s, < 16s, < 32s, < 64s, + ≥ 64s. + </para> + </refsect2> + + <refsect2> + <title>locks_latency</title> + <para> + The minimum, the average and the maximum time (in seconds) + required to obtain record locks. + </para> + </refsect2> + + <refsect2> + <title>Num Hot Keys</title> + <para> + Number of contended records determined by hop count. CTDB keeps + track of top 10 hot records and the output shows hex encoded + keys for the hot records. + </para> + </refsect2> + </refsect1> + + <refsect1> + <title>SEE ALSO</title> + <para> + <citerefentry><refentrytitle>ctdb</refentrytitle> + <manvolnum>1</manvolnum></citerefentry>, + + <citerefentry><refentrytitle>ctdbd</refentrytitle> + <manvolnum>1</manvolnum></citerefentry>, + + <citerefentry><refentrytitle>ctdb-tunables</refentrytitle> + <manvolnum>7</manvolnum></citerefentry>, + + <ulink url="http://ctdb.samba.org/"/> + </para> + </refsect1> + + <refentryinfo> + <author> + <contrib> + This documentation was written by + Amitay Isaacs, + Martin Schwenke + </contrib> + </author> + + <copyright> + <year>2007</year> + <holder>Andrew Tridgell</holder> + <holder>Ronnie Sahlberg</holder> + </copyright> + <legalnotice> + <para> + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 3 of + the License, or (at your option) any later version. + </para> + <para> + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR + PURPOSE. See the GNU General Public License for more details. + </para> + <para> + You should have received a copy of the GNU General Public + License along with this program; if not, see + <ulink url="http://www.gnu.org/licenses"/>. + </para> + </legalnotice> + </refentryinfo> + +</refentry> + |