1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
|
<?xml version='1.0'?>
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<!-- SPDX-License-Identifier: LGPL-2.1-or-later -->
<refentry id="sd_event_add_memory_pressure" xmlns:xi="http://www.w3.org/2001/XInclude">
<refentryinfo>
<title>sd_event_add_memory_pressure</title>
<productname>systemd</productname>
</refentryinfo>
<refmeta>
<refentrytitle>sd_event_add_memory_pressure</refentrytitle>
<manvolnum>3</manvolnum>
</refmeta>
<refnamediv>
<refname>sd_event_add_memory_pressure</refname>
<refname>sd_event_source_set_memory_pressure_type</refname>
<refname>sd_event_source_set_memory_pressure_period</refname>
<refname>sd_event_trim_memory</refname>
<refpurpose>Add and configure an event source run as result of memory pressure</refpurpose>
</refnamediv>
<refsynopsisdiv>
<funcsynopsis>
<funcsynopsisinfo>#include <systemd/sd-event.h></funcsynopsisinfo>
<funcsynopsisinfo><token>typedef</token> struct sd_event_source sd_event_source;</funcsynopsisinfo>
<funcprototype>
<funcdef>int <function>sd_event_add_memory_pressure</function></funcdef>
<paramdef>sd_event *<parameter>event</parameter></paramdef>
<paramdef>sd_event_source **<parameter>ret_source</parameter></paramdef>
<paramdef>sd_event_handler_t <parameter>handler</parameter></paramdef>
<paramdef>void *<parameter>userdata</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>int <function>sd_event_source_set_memory_pressure_type</function></funcdef>
<paramdef>sd_event_source *<parameter>source</parameter></paramdef>
<paramdef>const char *<parameter>type</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>int <function>sd_event_source_set_memory_pressure_period</function></funcdef>
<paramdef>sd_event_source *<parameter>source</parameter></paramdef>
<paramdef>uint64_t <parameter>threshold_usec</parameter></paramdef>
<paramdef>uint64_t <parameter>window_usec</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>int <function>sd_event_trim_memory</function></funcdef>
<paramdef>void</paramdef>
</funcprototype>
</funcsynopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
<para><function>sd_event_add_memory_pressure()</function> adds a new event source that is triggered
whenever memory pressure is seen. This functionality is built around the Linux kernel's <ulink
url="https://docs.kernel.org/accounting/psi.html">Pressure Stall Information (PSI)</ulink> logic.</para>
<para>Expects an event loop object as first parameter, and returns the allocated event source object in
the second parameter, on success. The <parameter>handler</parameter> parameter is a function to call when
memory pressure is seen, or <constant>NULL</constant>. The handler function will be passed the
<parameter>userdata</parameter> pointer, which may be chosen freely by the caller. The handler may return
negative to signal an error (see below), other return values are ignored. If
<parameter>handler</parameter> is <constant>NULL</constant>, a default handler that compacts allocation
caches maintained by <filename>libsystemd</filename> as well as glibc (via <citerefentry
project='man-pages'><refentrytitle>malloc_trim</refentrytitle><manvolnum>3</manvolnum></citerefentry>)
will be used.</para>
<para>To destroy an event source object use
<citerefentry><refentrytitle>sd_event_source_unref</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
but note that the event source is only removed from the event loop when all references to the event
source are dropped. To make sure an event source does not fire anymore, even if it is still referenced,
disable the event source using
<citerefentry><refentrytitle>sd_event_source_set_enabled</refentrytitle><manvolnum>3</manvolnum></citerefentry>
with <constant>SD_EVENT_OFF</constant>.</para>
<para>If the second parameter of <function>sd_event_add_memory_pressure()</function> is
<constant>NULL</constant> no reference to the event source object is returned. In this case the event
source is considered "floating", and will be destroyed implicitly when the event loop itself is
destroyed.</para>
<para>The event source will fire according to the following logic:</para>
<orderedlist>
<listitem><para>If the
<varname>$MEMORY_PRESSURE_WATCH</varname>/<varname>$MEMORY_PRESSURE_WRITE</varname> environment
variables are set at the time the event source is established, it will watch the file, FIFO or AF_UNIX
socket specified via <varname>$MEMORY_PRESSURE_WATCH</varname> (which must contain an absolute path
name) for <constant>POLLPRI</constant> (in case it is a regular file) or <constant>POLLIN</constant>
events (otherwise). After opening the inode, it will write the (decoded) Base64 data provided via
<varname>$MEMORY_PRESSURE_WRITE</varname> into it before it starts polling on it (the variable may be
unset in which case this is skipped). Typically, if used, <varname>$MEMORY_PRESSURE_WATCH</varname>
will contain a path such as <filename>/proc/pressure/memory</filename> or a path to a specific
<filename>memory.pressure</filename> file in the control group file system
(cgroupfs).</para></listitem>
<listitem><para>If these environment variables are not set, the local PSI interface file
<filename>memory.pressure</filename> of the control group the invoking process is running in is
used.</para></listitem>
<listitem><para>If that file does not exist, the system-wide PSI interface file
<filename>/proc/pressure/memory</filename> is watched instead.</para></listitem>
</orderedlist>
<para>Or in other words: preferably any explicit configuration passed in by an invoking service manager
(or similar) is used as notification source, before falling back to local notifications of the service,
and finally to global notifications of the system.</para>
<para>Well-behaving services and applications are recommended to react to memory pressure events by
executing one or more of the following operations, in order to ensure optimal behaviour even on loaded
and resource-constrained systems:</para>
<itemizedlist>
<listitem><para>Release allocation caches such as <function>malloc_trim()</function> or similar, both
implemented in the libraries consumed by the program and in private allocation caches of the program
itself.</para></listitem>
<listitem><para>Release any other form of in-memory caches that can easily be recovered if
needed (e.g. browser caches).</para></listitem>
<listitem><para>Terminate idle worker threads or processes, or similar.</para></listitem>
<listitem><para>Even exit entirely from the program if it is idle and can be automatically started when
needed (for example via socket or bus activation).</para></listitem>
</itemizedlist>
<para>Any of the suggested operations should help easing memory pressure situations and allowing the
system to make progress by reclaiming the memory for other purposes.</para>
<para>This event source typically fires on memory pressure stalls, i.e. when operational latency above a
configured threshold already has been seen. This should be taken into consideration when discussing
whether later latency to re-aquire any released resources is acceptable: it's usually more important to
think of the latencies that already happened than those coming up in future.</para>
<para>The <function>sd_event_source_set_memory_pressure_type()</function> and
<function>sd_event_source_set_memory_pressure_period()</function> functions can be used to fine-tune the
PSI parameters for pressure notifications. The former takes either <literal>some</literal>,
<literal>full</literal> as second parameter, the latter takes threshold and period times in microseconds
as parameters. For details about these three parameters see the PSI documentation. Note that these two
calls must be invoked immediately after allocating the event source, as they must be configured before
polling begins. Also note that these calls will fail if memory pressure parameterization has been passed
in via the <varname>$MEMORY_PRESSURE_WATCH</varname>/<varname>$MEMORY_PRESSURE_WRITE</varname>
environment variables (or in other words: configuration supplied by a service manager wins over internal
settings).</para>
<para>The <function>sd_event_trim_memory()</function> function releases various internal allocation
caches maintained by <filename>libsystemd</filename> and then invokes glibc's <citerefentry
project='man-pages'><refentrytitle>malloc_trim</refentrytitle><manvolnum>3</manvolnum></citerefentry>. This
makes the operation executed when the handler function parameter of
<function>sd_event_add_memory_pressure</function> is passed as <constant>NULL</constant> directly
accessible for invocation at any time (see above). This function will log a structured log message at
<constant>LOG_DEBUG</constant> level (with message ID f9b0be465ad540d0850ad32172d57c21) about the memory
pressure operation.</para>
<para>For further details see <ulink url="https://systemd.io/MEMORY_PRESSURE">Memory Pressure Handling in
systemd</ulink>.</para>
</refsect1>
<refsect1>
<title>Return Value</title>
<para>On success, these functions return 0 or a positive
integer. On failure, they return a negative errno-style error
code.</para>
<refsect2>
<title>Errors</title>
<para>Returned errors may indicate the following problems:</para>
<variablelist>
<varlistentry>
<term><constant>-ENOMEM</constant></term>
<listitem><para>Not enough memory to allocate an object.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
<varlistentry>
<term><constant>-EINVAL</constant></term>
<listitem><para>An invalid argument has been passed.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
<varlistentry>
<term><constant>-EHOSTDOWN</constant></term>
<listitem><para>The <varname>$MEMORY_PRESSURE_WATCH</varname> variable has been set to the literal
string <filename>/dev/null</filename>, in order to explicitly disable memory pressure
handling.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
<varlistentry>
<term><constant>-EBADMSG</constant></term>
<listitem><para>The <varname>$MEMORY_PRESSURE_WATCH</varname> variable has been set to an invalid
string, for example a relative rather than an absolute path.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
<varlistentry>
<term><constant>-ENOTTY</constant></term>
<listitem><para>The <varname>$MEMORY_PRESSURE_WATCH</varname> variable points to a regular file
outside of the procfs or cgroupfs file systems.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
<varlistentry>
<term><constant>-EOPNOTSUPP</constant></term>
<listitem><para>No configuration via <varname>$MEMORY_PRESSURE_WATCH</varname> has been specified
and the local kernel does not support the PSI interface.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
<varlistentry>
<term><constant>-EBUSY</constant></term>
<listitem><para>This is returned by <function>sd_event_source_set_memory_pressure_type()</function>
and <function>sd_event_source_set_memory_pressure_period()</function> if invoked on event sources
at a time later than immediately after allocating them.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
<varlistentry>
<term><constant>-ESTALE</constant></term>
<listitem><para>The event loop is already terminated.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
<varlistentry>
<term><constant>-ECHILD</constant></term>
<listitem><para>The event loop has been created in a different process, library or module instance.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
<varlistentry>
<term><constant>-EDOM</constant></term>
<listitem><para>The passed event source is not a signal event source.</para>
<xi:include href="version-info.xml" xpointer="v254"/></listitem>
</varlistentry>
</variablelist>
</refsect2>
</refsect1>
<xi:include href="libsystemd-pkgconfig.xml" />
<refsect1>
<title>History</title>
<para><function>sd_event_add_memory_pressure()</function>,
<function>sd_event_source_set_memory_pressure_type()</function>,
<function>sd_event_source_set_memory_pressure_period()</function>, and
<function>sd_event_trim_memory()</function> were added in version 254.</para>
</refsect1>
<refsect1>
<title>See Also</title>
<para>
<citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd-event</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_new</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_add_io</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_add_time</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_add_child</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_add_inotify</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_add_defer</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_source_set_enabled</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_source_set_description</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_source_set_userdata</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
<citerefentry><refentrytitle>sd_event_source_set_floating</refentrytitle><manvolnum>3</manvolnum></citerefentry>
</para>
</refsect1>
</refentry>
|