summaryrefslogtreecommitdiffstats
path: root/doc/rados/troubleshooting/memory-profiling.rst
blob: 8e58f2d76fbbd4521bb06261dbe8cae0bde464b3 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
==================
 Memory Profiling
==================

Ceph Monitor, OSD, and MDS can report ``TCMalloc`` heap profiles. Install
``google-perftools`` if you want to generate these. Your OS distribution might
package this under a different name (for example, ``gperftools``), and your OS
distribution might use a different package manager. Run a command similar to
this one to install ``google-perftools``: 

.. prompt:: bash 

    sudo apt-get install google-perftools

The profiler dumps output to your ``log file`` directory (``/var/log/ceph``).
See `Logging and Debugging`_ for details.

To view the profiler logs with Google's performance tools, run the following
command:

.. prompt:: bash

    google-pprof --text {path-to-daemon}  {log-path/filename}

For example::

    $ ceph tell osd.0 heap start_profiler
    $ ceph tell osd.0 heap dump
    osd.0 tcmalloc heap stats:------------------------------------------------
    MALLOC:        2632288 (    2.5 MiB) Bytes in use by application
    MALLOC: +       499712 (    0.5 MiB) Bytes in page heap freelist
    MALLOC: +       543800 (    0.5 MiB) Bytes in central cache freelist
    MALLOC: +       327680 (    0.3 MiB) Bytes in transfer cache freelist
    MALLOC: +      1239400 (    1.2 MiB) Bytes in thread cache freelists
    MALLOC: +      1142936 (    1.1 MiB) Bytes in malloc metadata
    MALLOC:   ------------
    MALLOC: =      6385816 (    6.1 MiB) Actual memory used (physical + swap)
    MALLOC: +            0 (    0.0 MiB) Bytes released to OS (aka unmapped)
    MALLOC:   ------------
    MALLOC: =      6385816 (    6.1 MiB) Virtual address space used
    MALLOC:
    MALLOC:            231              Spans in use
    MALLOC:             56              Thread heaps in use
    MALLOC:           8192              Tcmalloc page size
    ------------------------------------------------
    Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
    Bytes released to the OS take up virtual address space but no physical memory.
    $ google-pprof --text \
                   /usr/bin/ceph-osd  \
                   /var/log/ceph/ceph-osd.0.profile.0001.heap
     Total: 3.7 MB
     1.9  51.1%  51.1%      1.9  51.1% ceph::log::Log::create_entry
     1.8  47.3%  98.4%      1.8  47.3% std::string::_Rep::_S_create
     0.0   0.4%  98.9%      0.0   0.6% SimpleMessenger::add_accept_pipe
     0.0   0.4%  99.2%      0.0   0.6% decode_message
     ...

Performing another heap dump on the same daemon creates another file. It is
convenient to compare the new file to a file created by a previous heap dump to
show what has grown in the interval. For example::

    $ google-pprof --text --base out/osd.0.profile.0001.heap \
          ceph-osd out/osd.0.profile.0003.heap
     Total: 0.2 MB
     0.1  50.3%  50.3%      0.1  50.3% ceph::log::Log::create_entry
     0.1  46.6%  96.8%      0.1  46.6% std::string::_Rep::_S_create
     0.0   0.9%  97.7%      0.0  26.1% ReplicatedPG::do_op
     0.0   0.8%  98.5%      0.0   0.8% __gnu_cxx::new_allocator::allocate

See `Google Heap Profiler`_ for additional details.

After you have installed the heap profiler, start your cluster and begin using
the heap profiler. You can enable or disable the heap profiler at runtime, or
ensure that it runs continuously. When running commands based on the examples
that follow, do the following:

#. replace ``{daemon-type}`` with ``mon``, ``osd`` or ``mds`` 
#. replace ``{daemon-id}`` with the OSD number or the MON ID or the MDS ID 


Starting the Profiler
---------------------

To start the heap profiler, run a command of the following form: 

.. prompt:: bash

   ceph tell {daemon-type}.{daemon-id} heap start_profiler

For example:

.. prompt:: bash

   ceph tell osd.1 heap start_profiler

Alternatively, if the ``CEPH_HEAP_PROFILER_INIT=true`` variable is found in the
environment, the profile will be started when the daemon starts running.

Printing Stats
--------------

To print out statistics, run a command of the following form:

.. prompt:: bash

   ceph  tell {daemon-type}.{daemon-id} heap stats

For example:

.. prompt:: bash

   ceph tell osd.0 heap stats

.. note:: The reporting of stats with this command does not require the
   profiler to be running and does not dump the heap allocation information to
   a file.


Dumping Heap Information
------------------------

To dump heap information, run a command of the following form:

.. prompt:: bash

   ceph tell {daemon-type}.{daemon-id} heap dump

For example:

.. prompt:: bash

   ceph tell mds.a heap dump

.. note:: Dumping heap information works only when the profiler is running.


Releasing Memory
----------------

To release memory that ``tcmalloc`` has allocated but which is not being used
by the Ceph daemon itself, run a command of the following form:

.. prompt:: bash

   ceph tell {daemon-type}{daemon-id} heap release

For example:

.. prompt:: bash

    ceph tell osd.2 heap release


Stopping the Profiler
---------------------

To stop the heap profiler, run a command of the following form:

.. prompt:: bash

   ceph tell {daemon-type}.{daemon-id} heap stop_profiler

For example:

.. prompt:: bash

   ceph tell osd.0 heap stop_profiler

.. _Logging and Debugging: ../log-and-debug
.. _Google Heap Profiler: http://goog-perftools.sourceforge.net/doc/heap_profiler.html

Alternative Methods of  Memory Profiling
----------------------------------------

Running Massif heap profiler with Valgrind
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Massif heap profiler tool can be used with Valgrind to measure how much
heap memory is used. This method is well-suited to troubleshooting RadosGW.

See the `Massif documentation
<https://valgrind.org/docs/manual/ms-manual.html>`_ for more information.

Install Valgrind from the package manager for your distribution then start the
Ceph daemon you want to troubleshoot:

.. prompt:: bash

   sudo -u ceph valgrind --max-threads=1024 --tool=massif /usr/bin/radosgw -f --cluster ceph --name NAME --setuser ceph --setgroup ceph

When this command has completed its run, a file with a name of the form
``massif.out.<pid>`` will be saved in your current working directory. To run
the command above, the user who runs it must have write permissions in the
current directory.

Run the ``ms_print`` command to get a graph and statistics from the collected
data in the ``massif.out.<pid>`` file:

.. prompt:: bash

   ms_print massif.out.12345

The output of this command is helpful when submitting a bug report.