.\" This manpage is Copyright (C) 2023 Collabora;
.\" Written by Muhammad Usama Anjum <usama.anjum@collabora.com>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
.TH ioctl_pagemap_scan 2 2024-05-02 "Linux man-pages 6.8"
.SH NAME
ioctl_pagemap_scan \- get and/or clear page flags
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.BR "#include <linux/fs.h>" " /* Definition of " "struct pm_scan_arg" ,
.BR " struct page_region" ", and " PAGE_IS_* " constants */"
.B #include <sys/ioctl.h>
.P
.BI "int ioctl(int " pagemap_fd ", PAGEMAP_SCAN, struct pm_scan_arg *" arg );
.fi
.SH DESCRIPTION
This
.BR ioctl (2)
is used to get and optionally clear some specific flags from page table entries.
The information is returned with
.B PAGE_SIZE
granularity.
.P
To start tracking the written state (flag) of a page or range of memory,
the
.B UFFD_FEATURE_WP_ASYNC
feature must be enabled with the
.B UFFDIO_API
.BR ioctl (2)
on a
.BR userfaultfd (2)
file descriptor,
and the memory range must be registered with the
.B UFFDIO_REGISTER
.BR ioctl (2)
in
.B UFFDIO_REGISTER_MODE_WP
mode.
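.P
The two steps above can be sketched as follows.
This is a minimal illustration, not a reference implementation:
the helper name
.I ex_enable_wp_async
is hypothetical,
the range is assumed to be page-aligned,
and the code compiles to a stub that fails with
.B ENOSYS
when the installed headers do not define
.BR UFFD_FEATURE_WP_ASYNC .
.P
.EX
```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical helper: enable asynchronous write-protect tracking
 * on [start, start + len). Returns the userfaultfd descriptor on
 * success (the caller keeps it open while tracking), or -1 with
 * errno set on failure.
 */
int ex_enable_wp_async(void *start, size_t len)
{
#if defined(UFFD_FEATURE_WP_ASYNC) && defined(SYS_userfaultfd)
    struct uffdio_api api;
    struct uffdio_register reg;
    int uffd, saved;

    assert(start != NULL && len > 0);

    uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    if (uffd < 0)
        return -1;

    /* Step 1: negotiate the API and request UFFD_FEATURE_WP_ASYNC. */
    memset(&api, 0, sizeof(api));
    api.api = UFFD_API;
    api.features = UFFD_FEATURE_WP_ASYNC;
    if (ioctl(uffd, UFFDIO_API, &api) < 0)
        goto fail;

    /* Step 2: register the range in write-protect mode. */
    memset(&reg, 0, sizeof(reg));
    reg.range.start = (uint64_t)(uintptr_t)start;
    reg.range.len = len;
    reg.mode = UFFDIO_REGISTER_MODE_WP;
    if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0)
        goto fail;

    return uffd;

fail:
    saved = errno;
    close(uffd);
    errno = saved;
    return -1;
#else
    /* Headers predate UFFD_FEATURE_WP_ASYNC: nothing to do. */
    (void)start;
    (void)len;
    errno = ENOSYS;
    return -1;
#endif
}
```
.EE
.P
The helper may fail on systems where unprivileged use of
userfaultfd is restricted or where the kernel lacks the feature;
callers should check
.I errno
rather than assume success.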
.SS Supported page flags
The following page table entry flags are supported:
.TP
.B PAGE_IS_WPALLOWED
The page has asynchronous write-protection enabled.
.TP
.B PAGE_IS_WRITTEN
The page has been written to since it was write-protected.
.TP
.B PAGE_IS_FILE
The page is file backed.
.TP
.B PAGE_IS_PRESENT
The page is present in memory.
.TP
.B PAGE_IS_SWAPPED
The page is swapped out.
.TP
.B PAGE_IS_PFNZERO
The page has zero PFN.
.TP
.B PAGE_IS_HUGE
The page is THP or Hugetlb backed.
.SS Supported operations
The get operation is always performed
if the output buffer is specified.
The other operations are as follows:
.TP
.B PM_SCAN_WP_MATCHING
Write protect the matched pages.
.TP
.B PM_SCAN_CHECK_WPASYNC
Abort the scan
when a page is found
which doesn't have the Userfaultfd Asynchronous Write protection enabled.
.SS The \f[I]struct pm_scan_arg\f[] argument
.EX
struct pm_scan_arg {
    __u64  size;
    __u64  flags;
    __u64  start;
    __u64  end;
    __u64  walk_end;
    __u64  vec;
    __u64  vec_len;
    __u64  max_pages;
    __u64  category_inverted;
    __u64  category_mask;
    __u64  category_anyof_mask;
    __u64  return_mask;
};
.EE
.TP
.B size
This field should be set to the size of the structure in bytes,
as in
.IR sizeof(struct\~pm_scan_arg) .
.TP
.B flags
Specifies the operations to be performed.
.TP
.B start
Specifies the starting address of the scan.
.TP
.B end
Specifies the ending address of the scan.
.TP
.B walk_end
The kernel returns the address at which the scan ended in this field.
A
.I walk_end
equal to
.I end
means that the scan has covered the entire range.
.TP
.B vec
The address of the
.I page_region
array used for output.
.IP
.in +4n
.EX
struct page_region {
    __u64  start;
    __u64  end;
    __u64  categories;
};
.EE
.in
.TP
.B vec_len
The length of the
.I page_region
struct array.
.TP
.B max_pages
An optional limit on the number of pages reported in the output.
.TP
.B category_inverted
.BI PAGE_IS_ *
categories whose values match if 0 instead of 1.
.TP
.B category_mask
Skip pages for which any
.BI PAGE_IS_ *
category doesn't match.
.TP
.B category_anyof_mask
Skip pages for which no
.BI PAGE_IS_ *
category matches.
.TP
.B return_mask
.BI PAGE_IS_ *
categories that are to be reported in
.IR page_region .
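.P
The interaction of the four category fields can be sketched as a pure function.
This is an illustration of the matching rules described above,
not kernel code:
the
.I EX_*
constants,
.IR "struct ex_filter" ,
and both function names are hypothetical,
and the sketch assumes that an empty
.I category_anyof_mask
places no restriction
and that reported categories are not affected by
.IR category_inverted .
.P
.EX
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative stand-ins for the PAGE_IS_* bits; real code should
 * use the constants from <linux/fs.h>.
 */
#define EX_PAGE_IS_WPALLOWED (1ULL << 0)
#define EX_PAGE_IS_WRITTEN   (1ULL << 1)
#define EX_PAGE_IS_FILE      (1ULL << 2)
#define EX_PAGE_IS_PRESENT   (1ULL << 3)
#define EX_PAGE_IS_SWAPPED   (1ULL << 4)

/* Hypothetical mirror of the four filter fields of pm_scan_arg. */
struct ex_filter {
    uint64_t category_inverted;
    uint64_t category_mask;
    uint64_t category_anyof_mask;
    uint64_t return_mask;
};

/* Return true if a page with these categories passes the filter. */
bool ex_page_matches(uint64_t categories, const struct ex_filter *f)
{
    uint64_t c;

    assert(f != NULL);

    /* Inverted categories match when the bit is 0 instead of 1. */
    c = categories ^ f->category_inverted;

    /* Every category in category_mask must match. */
    if ((c & f->category_mask) != f->category_mask)
        return false;

    /* If category_anyof_mask is set, at least one must match. */
    if (f->category_anyof_mask && !(c & f->category_anyof_mask))
        return false;

    return true;
}

/* Categories that would be reported for a matching page. */
uint64_t ex_reported(uint64_t categories, const struct ex_filter *f)
{
    assert(f != NULL);
    return categories & f->return_mask;
}
```
.EE
.P
For example, setting both
.I category_mask
and
.I category_inverted
to the written bit selects pages that have not been written to.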
.SH RETURN VALUE
On error, \-1 is returned, and
.I errno
is set to indicate the error.
.SH ERRORS
Error codes can be one of, but are not limited to, the following:
.TP
.B EINVAL
Invalid arguments, i.e.,
invalid
.I size
of the argument,
invalid
.IR flags ,
invalid
.IR categories ,
the
.I start
address isn't aligned with
.BR PAGE_SIZE ,
or
.I vec_len
is specified when
.I vec
is NULL.
.TP
.B EFAULT
Invalid
.I arg
pointer,
invalid
.I vec
pointer,
or invalid address range specified by
.I start
and
.IR end .
.TP
.B ENOMEM
No memory is available.
.TP
.B EINTR
A fatal signal is pending.
.SH STANDARDS
Linux.
.SH HISTORY
Linux 6.7.
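.SH EXAMPLES
The following sketch counts the pages of the calling process
that are present in memory within a given range.
It is a minimal illustration under stated assumptions:
the helper name
.I ex_count_present_pages
and the constant
.I EX_VEC_LEN
are hypothetical,
.I pagemap_fd
is assumed to refer to
.IR /proc/self/pagemap ,
a successful call is assumed to return
the number of
.I page_region
entries filled,
and the start address is assumed to be page-aligned.
If the installed headers do not define
.BR PAGEMAP_SCAN ,
the function compiles to a stub that fails with
.BR ENOSYS .
.P
.EX
```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

#define EX_VEC_LEN 32   /* hypothetical output buffer length */

/*
 * Hypothetical helper: count present pages in [start, start + len).
 * Returns the page count, or -1 with errno set on failure.
 */
long ex_count_present_pages(void *start, size_t len)
{
#ifdef PAGEMAP_SCAN
    struct page_region regions[EX_VEC_LEN];
    struct pm_scan_arg arg;
    long page_size = sysconf(_SC_PAGESIZE);
    long pages = 0;
    int fd, n, i, saved;

    assert(page_size > 0);

    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    memset(&arg, 0, sizeof(arg));
    arg.size = sizeof(arg);
    arg.start = (uint64_t)(uintptr_t)start;
    arg.end = arg.start + len;
    arg.vec = (uint64_t)(uintptr_t)regions;
    arg.vec_len = EX_VEC_LEN;
    arg.category_mask = PAGE_IS_PRESENT;  /* only present pages */
    arg.return_mask = PAGE_IS_PRESENT;    /* report that category */

    /* Repeat until the kernel has walked the whole range. */
    while (arg.start < arg.end) {
        n = ioctl(fd, PAGEMAP_SCAN, &arg);
        if (n < 0) {
            saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        for (i = 0; i < n; i++)
            pages += (long)((regions[i].end - regions[i].start)
                            / (uint64_t)page_size);
        if (arg.walk_end <= arg.start)
            break;                        /* no progress; stop */
        arg.start = arg.walk_end;         /* resume after last stop */
    }

    close(fd);
    return pages;
#else
    /* Headers predate PAGEMAP_SCAN: nothing to do. */
    (void)start;
    (void)len;
    errno = ENOSYS;
    return -1;
#endif
}
```
.EE
.P
Resuming from
.I walk_end
lets a small fixed-size output array cover an arbitrarily large range,
at the cost of one extra
.BR ioctl (2)
call each time the array fills up.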
.SH SEE ALSO
.BR ioctl (2)