1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
|
.\" Copyright (C) 2003 Davide Libenzi
.\" Davide Libenzi <davidel@xmailserver.org>
.\" and Copyright 2009, 2014, 2016, 2018, 2019 Michael Kerrisk <tk.manpages@gmail.com>
.\"
.\" SPDX-License-Identifier: GPL-2.0-or-later
.\"
.TH epoll_ctl 2 2023-04-03 "Linux man-pages 6.05.01"
.SH NAME
epoll_ctl \- control interface for an epoll file descriptor
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.B #include <sys/epoll.h>
.PP
.BI "int epoll_ctl(int " epfd ", int " op ", int " fd ,
.BI " struct epoll_event *_Nullable " event );
.fi
.SH DESCRIPTION
This system call is used to add, modify, or remove
entries in the interest list of the
.BR epoll (7)
instance
referred to by the file descriptor
.IR epfd .
It requests that the operation
.I op
be performed for the target file descriptor,
.IR fd .
.PP
Valid values for the
.I op
argument are:
.TP
.B EPOLL_CTL_ADD
Add an entry to the interest list of the epoll file descriptor,
.IR epfd .
The entry includes the file descriptor,
.IR fd ,
a reference to the corresponding open file description (see
.BR epoll (7)
and
.BR open (2)),
and the settings specified in
.IR event .
.TP
.B EPOLL_CTL_MOD
Change the settings associated with
.I fd
in the interest list to the new settings specified in
.IR event .
.TP
.B EPOLL_CTL_DEL
Remove (deregister) the target file descriptor
.I fd
from the interest list.
The
.I event
argument is ignored and can be NULL (but see BUGS below).
.PP
The
.I event
argument describes the object linked to the file descriptor
.IR fd .
The
.I struct epoll_event
is described in
.BR epoll_event (3type).
.PP
The
.I data
member of the
.I epoll_event
structure specifies data that the kernel should save and then return (via
.BR epoll_wait (2))
when this file descriptor becomes ready.
.PP
The
.I events
member of the
.I epoll_event
structure is a bit mask composed by ORing together zero or more event types,
returned by
.BR epoll_wait (2),
and input flags, which affect its behaviour, but aren't returned.
The available event types are:
.TP
.B EPOLLIN
The associated file is available for
.BR read (2)
operations.
.TP
.B EPOLLOUT
The associated file is available for
.BR write (2)
operations.
.TP
.BR EPOLLRDHUP " (since Linux 2.6.17)"
Stream socket peer closed connection,
or shut down writing half of connection.
(This flag is especially useful for writing simple code to detect
peer shutdown when using edge-triggered monitoring.)
.TP
.B EPOLLPRI
There is an exceptional condition on the file descriptor.
See the discussion of
.B POLLPRI
in
.BR poll (2).
.TP
.B EPOLLERR
Error condition happened on the associated file descriptor.
This event is also reported for the write end of a pipe when the read end
has been closed.
.IP
.BR epoll_wait (2)
will always report for this event; it is not necessary to set it in
.I events
when calling
.BR epoll_ctl ().
.TP
.B EPOLLHUP
Hang up happened on the associated file descriptor.
.IP
.BR epoll_wait (2)
will always wait for this event; it is not necessary to set it in
.I events
when calling
.BR epoll_ctl ().
.IP
Note that when reading from a channel such as a pipe or a stream socket,
this event merely indicates that the peer closed its end of the channel.
Subsequent reads from the channel will return 0 (end of file)
only after all outstanding data in the channel has been consumed.
.PP
And the available input flags are:
.TP
.B EPOLLET
Requests edge-triggered notification for the associated file descriptor.
The default behavior for
.B epoll
is level-triggered.
See
.BR epoll (7)
for more detailed information about edge-triggered and
level-triggered notification.
.TP
.BR EPOLLONESHOT " (since Linux 2.6.2)"
Requests one-shot notification for the associated file descriptor.
This means that after an event notified for the file descriptor by
.BR epoll_wait (2),
the file descriptor is disabled in the interest list and no other events
will be reported by the
.B epoll
interface.
The user must call
.BR epoll_ctl ()
with
.B EPOLL_CTL_MOD
to rearm the file descriptor with a new event mask.
.TP
.BR EPOLLWAKEUP " (since Linux 3.5)"
.\" commit 4d7e30d98939a0340022ccd49325a3d70f7e0238
If
.B EPOLLONESHOT
and
.B EPOLLET
are clear and the process has the
.B CAP_BLOCK_SUSPEND
capability,
ensure that the system does not enter "suspend" or
"hibernate" while this event is pending or being processed.
The event is considered as being "processed" from the time
when it is returned by a call to
.BR epoll_wait (2)
until the next call to
.BR epoll_wait (2)
on the same
.BR epoll (7)
file descriptor,
the closure of that file descriptor,
the removal of the event file descriptor with
.BR EPOLL_CTL_DEL ,
or the clearing of
.B EPOLLWAKEUP
for the event file descriptor with
.BR EPOLL_CTL_MOD .
See also BUGS.
.TP
.BR EPOLLEXCLUSIVE " (since Linux 4.5)"
Sets an exclusive wakeup mode for the epoll file descriptor that is being
attached to the target file descriptor,
.IR fd .
When a wakeup event occurs and multiple epoll file descriptors
are attached to the same target file using
.BR EPOLLEXCLUSIVE ,
one or more of the epoll file descriptors will receive an event with
.BR epoll_wait (2).
The default in this scenario (when
.B EPOLLEXCLUSIVE
is not set) is for all epoll file descriptors to receive an event.
.B EPOLLEXCLUSIVE
is thus useful for avoiding thundering herd problems in certain scenarios.
.IP
If the same file descriptor is in multiple epoll instances,
some with the
.B EPOLLEXCLUSIVE
flag, and others without, then events will be provided to all epoll
instances that did not specify
.BR EPOLLEXCLUSIVE ,
and at least one of the epoll instances that did specify
.BR EPOLLEXCLUSIVE .
.IP
The following values may be specified in conjunction with
.BR EPOLLEXCLUSIVE :
.BR EPOLLIN ,
.BR EPOLLOUT ,
.BR EPOLLWAKEUP ,
and
.BR EPOLLET .
.B EPOLLHUP
and
.B EPOLLERR
can also be specified, but this is not required:
as usual, these events are always reported if they occur,
regardless of whether they are specified in
.IR events .
Attempts to specify other values in
.I events
yield the error
.BR EINVAL .
.IP
.B EPOLLEXCLUSIVE
may be used only in an
.B EPOLL_CTL_ADD
operation; attempts to employ it with
.B EPOLL_CTL_MOD
yield an error.
If
.B EPOLLEXCLUSIVE
has been set using
.BR epoll_ctl (),
then a subsequent
.B EPOLL_CTL_MOD
on the same
.IR epfd ,\~ fd
pair yields an error.
A call to
.BR epoll_ctl ()
that specifies
.B EPOLLEXCLUSIVE
in
.I events
and specifies the target file descriptor
.I fd
as an epoll instance will likewise fail.
The error in all of these cases is
.BR EINVAL .
.SH RETURN VALUE
When successful,
.BR epoll_ctl ()
returns zero.
When an error occurs,
.BR epoll_ctl ()
returns \-1 and
.I errno
is set to indicate the error.
.SH ERRORS
.TP
.B EBADF
.I epfd
or
.I fd
is not a valid file descriptor.
.TP
.B EEXIST
.I op
was
.BR EPOLL_CTL_ADD ,
and the supplied file descriptor
.I fd
is already registered with this epoll instance.
.TP
.B EINVAL
.I epfd
is not an
.B epoll
file descriptor,
or
.I fd
is the same as
.IR epfd ,
or the requested operation
.I op
is not supported by this interface.
.TP
.B EINVAL
An invalid event type was specified along with
.B EPOLLEXCLUSIVE
in
.IR events .
.TP
.B EINVAL
.I op
was
.B EPOLL_CTL_MOD
and
.I events
included
.BR EPOLLEXCLUSIVE .
.TP
.B EINVAL
.I op
was
.B EPOLL_CTL_MOD
and the
.B EPOLLEXCLUSIVE
flag has previously been applied to this
.IR epfd ,\~ fd
pair.
.TP
.B EINVAL
.B EPOLLEXCLUSIVE
was specified in
.I event
and
.I fd
refers to an epoll instance.
.TP
.B ELOOP
.I fd
refers to an epoll instance and this
.B EPOLL_CTL_ADD
operation would result in a circular loop of epoll instances
monitoring one another or a nesting depth of epoll instances
greater than 5.
.TP
.B ENOENT
.I op
was
.B EPOLL_CTL_MOD
or
.BR EPOLL_CTL_DEL ,
and
.I fd
is not registered with this epoll instance.
.TP
.B ENOMEM
There was insufficient memory to handle the requested
.I op
control operation.
.TP
.B ENOSPC
The limit imposed by
.I /proc/sys/fs/epoll/max_user_watches
was encountered while trying to register
.RB ( EPOLL_CTL_ADD )
a new file descriptor on an epoll instance.
See
.BR epoll (7)
for further details.
.TP
.B EPERM
The target file
.I fd
does not support
.BR epoll .
This error can occur if
.I fd
refers to, for example, a regular file or a directory.
.SH STANDARDS
Linux.
.SH HISTORY
Linux 2.6,
.\" To be precise: Linux 2.5.44.
.\" The interface should be finalized by Linux 2.5.66.
glibc 2.3.2.
.SH NOTES
The
.B epoll
interface supports all file descriptors that support
.BR poll (2).
.SH BUGS
Before Linux 2.6.9, the
.B EPOLL_CTL_DEL
operation required a non-null pointer in
.IR event ,
even though this argument is ignored.
Since Linux 2.6.9,
.I event
can be specified as NULL
when using
.BR EPOLL_CTL_DEL .
Applications that need to be portable to kernels before Linux 2.6.9
should specify a non-null pointer in
.IR event .
.PP
If
.B EPOLLWAKEUP
is specified in
.IR flags ,
but the caller does not have the
.B CAP_BLOCK_SUSPEND
capability, then the
.B EPOLLWAKEUP
flag is
.IR "silently ignored" .
This unfortunate behavior is necessary because no validity
checks were performed on the
.I flags
argument in the original implementation, and the addition of the
.B EPOLLWAKEUP
with a check that caused the call to fail if the caller did not have the
.B CAP_BLOCK_SUSPEND
capability caused a breakage in at least one existing user-space
application that happened to randomly (and uselessly) specify this bit.
.\" commit a8159414d7e3af7233e7a5a82d1c5d85379bd75c (behavior change)
.\" https://lwn.net/Articles/520198/
A robust application should therefore double check that it has the
.B CAP_BLOCK_SUSPEND
capability if attempting to use the
.B EPOLLWAKEUP
flag.
.SH SEE ALSO
.BR epoll_create (2),
.BR epoll_wait (2),
.BR poll (2),
.BR epoll (7)
|