****************************************
IP Address Anonymization Module (mmanon)
****************************************

===========================  ===========================================================================
**Module Name:**             **mmanon**
**Author:**                  `Rainer Gerhards <https://rainer.gerhards.net/>`_ <rgerhards@adiscon.com>
**Available since:**         7.3.7
===========================  ===========================================================================


Purpose
=======

The mmanon module anonymizes IP addresses in log messages. It is a message
modification module that changes the IP address inside the message itself,
so the original message can no longer be obtained after mmanon has run.
Note that anonymization breaks digital signatures on the message, if any
exist.

Please note that log files can also be anonymized via
`SLFA <http://jan.gerhards.net/p/slfa.html>`_ after they
have been created.

*How are IP addresses defined?*

We assume that an IPv4 address consists of four octets in dotted notation,
where each of the octets has a value between 0 and 255, inclusive.

An IPv6 address is assumed to consist of up to eight blocks of hexadecimal
digits, each with a value between 0 and ffff, separated by ':'. Leading zeros
in a block can be omitted, and blocks consisting entirely of zeros can be
abbreviated by using '::'. However, this abbreviation can only occur once in
an IP address.

An IPv6 address with embedded IPv4 is an IPv6 address where the last two blocks
have been replaced by an IPv4 address (see also RFC 4291, section 2.2.3).


Configuration Parameters
========================

.. note::

   Parameter names are case-insensitive.


Action Parameters
-----------------

Parameters starting with 'IPv4.' will configure IPv4 anonymization,
while 'IPv6.' parameters do the same for IPv6 anonymization.


ipv4.enable
^^^^^^^^^^^

.. csv-table::
   :header: "type", "default", "mandatory", "|FmtObsoleteName| directive"
   :widths: auto
   :class: parameter-table

   "binary", "on", "no", "none"

Enables or disables the anonymization of IPv4 addresses.


ipv4.mode
^^^^^^^^^

.. csv-table::
   :header: "type", "default", "mandatory", "|FmtObsoleteName| directive"
   :widths: auto
   :class: parameter-table

   "word", "zero", "no", "none"

The available modes are "simple", "random", "random-consistent", and "zero".
In simple mode, only whole octets can be anonymized and the length of the
message never changes. This means that when the last three octets of the
address 10.1.12.123 are anonymized, each digit of those octets is
overwritten with the replacement character, giving 10.x.xx.xxx with the
default "x". As a consequence, the length of the original octets is still
visible and may be used to draw some privacy-invasive conclusions. This mode
is slightly faster than the other modes, which may matter in
high-throughput environments.

The modes "random" and "random-consistent" are very similar, in
that they both anonymize IP addresses by randomizing the last bits (any
number) of a given address. However, while "random" mode assigns a new
random IP address for every address in a message, "random-consistent"
assigns the same randomized address to every instance of the same original
address.

The default "zero" mode will do full anonymization of any number
of bits and it will also normalize the address, so that no information
about the original IP address is available. So in the above example,
10.1.12.123 would be anonymized to 10.0.0.0.
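
The following is a minimal sketch of selecting a mode explicitly. The file
path is a placeholder, and the bit count is simply the documented default
written out. With "random-consistent", the same original address should map
to the same randomized address each time it occurs, which keeps related log
lines correlatable.

.. code-block:: none

   module(load="mmanon")
   # randomize the last 16 bits, consistently per original address
   action(type="mmanon" ipv4.mode="random-consistent" ipv4.bits="16")
   action(type="omfile" file="/path/to/anon.log")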


ipv4.bits
^^^^^^^^^

.. csv-table::
   :header: "type", "default", "mandatory", "|FmtObsoleteName| directive"
   :widths: auto
   :class: parameter-table

   "positive integer", "16", "no", "none"

This sets the number of bits that should be anonymized (bits are counted
from the right, so lower bits are anonymized first). This setting makes it
possible to preserve network information while still anonymizing
user-specific data. The more bits you discard, the stronger the
anonymization. The default of 16 bits reflects what German data
privacy rules consider sufficiently anonymized. We assume that
this can also be used as a rough but conservative guideline for other
countries.

Note: in simple mode, only bit counts on a byte boundary can be
specified. As such, any value other than 8, 16, 24 or 32 is invalid.
An invalid value is rounded up to the next byte boundary
(so we favor stronger anonymization in that case). For example, a bit
value of 12 becomes 16 in simple mode (an error message is also
emitted).
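
As a worked illustration of the bit count in zero mode, the sketch below
uses the sample address from the mode description. The results in the
comments follow from zeroing the rightmost bits of that address; the file
path is a placeholder.

.. code-block:: none

   module(load="mmanon")
   # zero mode, 24 bits: 10.1.12.123 -> 10.0.0.0
   # zero mode, 16 bits: 10.1.12.123 -> 10.1.0.0
   # zero mode,  8 bits: 10.1.12.123 -> 10.1.12.0
   action(type="mmanon" ipv4.mode="zero" ipv4.bits="24")
   action(type="omfile" file="/path/to/anon.log")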


ipv4.replaceChar
^^^^^^^^^^^^^^^^

.. csv-table::
   :header: "type", "default", "mandatory", "|FmtObsoleteName| directive"
   :widths: auto
   :class: parameter-table

   "char", "x", "no", "none"

In simple mode, this sets the character that the to-be-anonymized
part of the IP address is to be overwritten with. In any other
mode the parameter is ignored if set.
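
A minimal sketch of combining this parameter with simple mode, assuming a
placeholder file path. The replacement character "#" is chosen purely for
illustration. Based on the description above, an address such as
10.1.12.123 should then keep its length and have the form 10.1.##.###.

.. code-block:: none

   module(load="mmanon")
   # simple mode: overwrite the last two octets with '#'
   action(type="mmanon" ipv4.mode="simple" ipv4.bits="16" ipv4.replaceChar="#")
   action(type="omfile" file="/path/to/anon.log")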


ipv6.enable
^^^^^^^^^^^

.. csv-table::
   :header: "type", "default", "mandatory", "|FmtObsoleteName| directive"
   :widths: auto
   :class: parameter-table

   "binary", "on", "no", "none"

Enables or disables the anonymization of IPv6 addresses.


ipv6.anonmode
^^^^^^^^^^^^^

.. csv-table::
   :header: "type", "default", "mandatory", "|FmtObsoleteName| directive"
   :widths: auto
   :class: parameter-table

   "word", "zero", "no", "none"

This defines the mode in which IPv6 addresses will be anonymized.
The available modes are "random", "random-consistent", and "zero".

The modes "random" and "random-consistent" are very similar, in
that they both anonymize IP addresses by randomizing the last bits (any
number) of a given address. However, while "random" mode assigns a new
random IP address for every address in a message, "random-consistent"
assigns the same randomized address to every instance of the same original
address.

The default "zero" mode will do full anonymization of any number
of bits and it will also normalize the address, so that no information
about the original IP address is available.

Also note that an anonymized IPv6 address will be normalized, meaning
there will be no abbreviations, leading zeros will **not** be displayed,
and capital letters in the hex numerals will be converted to lowercase.
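
A minimal sketch of choosing an IPv6 mode, assuming a placeholder file path
and a bit count of 64 picked only for illustration. IPv4 settings are left
at their defaults here.

.. code-block:: none

   module(load="mmanon")
   # randomize the last 64 bits of each IPv6 address, consistently per address
   action(type="mmanon" ipv6.anonmode="random-consistent" ipv6.bits="64")
   action(type="omfile" file="/path/to/anon.log")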


ipv6.bits
^^^^^^^^^

.. csv-table::
   :header: "type", "default", "mandatory", "|FmtObsoleteName| directive"
   :widths: auto
   :class: parameter-table

   "positive integer", "96", "no", "none"

This sets the number of bits that should be anonymized (bits are counted
from the right, so lower bits are anonymized first). This setting makes it
possible to preserve network information while still anonymizing
user-specific data. The more bits you discard, the stronger the
anonymization. The default of 96 bits reflects what German data
privacy rules consider sufficiently anonymized. We assume that
this can also be used as a rough but conservative guideline for other
countries.
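
To illustrate the default, anonymizing 96 of the 128 bits leaves only the
first 32 bits, that is, the first two blocks, untouched. The sketch below
writes the defaults out explicitly. The sample address is hypothetical, the
file path is a placeholder, and the result shown in the comment is what the
zeroing and normalization rules described for ``ipv6.anonmode`` would be
expected to produce.

.. code-block:: none

   module(load="mmanon")
   # zero mode, 96 bits (expected, per the rules above):
   #   2001:db8:85a3:8d3:1319:8a2e:370:7344 -> 2001:db8:0:0:0:0:0:0
   action(type="mmanon" ipv6.anonmode="zero" ipv6.bits="96")
   action(type="omfile" file="/path/to/anon.log")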


embeddedipv4.enable
^^^^^^^^^^^^^^^^^^^

.. csv-table::
   :header: "type", "default", "mandatory", "|FmtObsoleteName| directive"
   :widths: auto
   :class: parameter-table

   "binary", "on", "no", "none"

Enables or disables the anonymization of IPv6 addresses with embedded IPv4.


embeddedipv4.anonmode
^^^^^^^^^^^^^^^^^^^^^

.. csv-table::
   :header: "type", "default", "mandatory", "|FmtObsoleteName| directive"
   :widths: auto
   :class: parameter-table

   "word", "zero", "no", "none"

This defines the mode in which IPv6 addresses with embedded IPv4 will be
anonymized. The available modes are "random", "random-consistent", and "zero".

The modes "random" and "random-consistent" are very similar, in
that they both anonymize IP addresses by randomizing the last bits (any
number) of a given address. However, while "random" mode assigns a new
random IP address for every address in a message, "random-consistent"
assigns the same randomized address to every instance of the same original
address.

The default "zero" mode will do full anonymization of any number
of bits and it will also normalize the address, so that no information
about the original IP address is available.

Also note that an anonymized IPv6 address will be normalized, meaning
there will be no abbreviations, leading zeros will **not** be displayed,
and capital letters in the hex numerals will be converted to lowercase.


embeddedipv4.bits
^^^^^^^^^^^^^^^^^

.. csv-table::
   :header: "type", "default", "mandatory", "|FmtObsoleteName| directive"
   :widths: auto
   :class: parameter-table

   "positive integer", "96", "no", "none"

This sets the number of bits that should be anonymized (bits are counted
from the right, so lower bits are anonymized first). This setting makes it
possible to preserve network information while still anonymizing
user-specific data. The more bits you discard, the stronger the
anonymization. The default of 96 bits reflects what German data
privacy rules consider sufficiently anonymized. We assume that
this can also be used as a rough but conservative guideline for other
countries.
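
Since the embedded IPv4 address occupies the last 32 bits of such an
address, a bit count of 32 covers exactly the IPv4 part. The following is a
minimal sketch under that assumption. The value 32 is chosen for
illustration only and is weaker than the default, and the file path is a
placeholder.

.. code-block:: none

   module(load="mmanon")
   # zero only the embedded IPv4 part (last 32 bits) of such addresses
   action(type="mmanon" embeddedipv4.anonmode="zero" embeddedipv4.bits="32")
   action(type="omfile" file="/path/to/anon.log")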


See Also
========

-  `Howto anonymize messages that go to specific
   files <http://www.rsyslog.com/howto-anonymize-messages-that-go-to-specific-files/>`_


Caveats/Known Bugs
==================

-  mmanon will **not** anonymize addresses in the message header


Examples
========

Anonymizing messages
--------------------

In this snippet, we write one file without anonymization and another one
with the message anonymized. Note that once mmanon has run, access to
the original message is no longer possible (except if it was stored in user
variables before anonymization).

.. code-block:: none

   module(load="mmanon")
   action(type="omfile" file="/path/to/non-anon.log")
   action(type="mmanon" ipv6.enable="off")
   action(type="omfile" file="/path/to/anon.log")


Anonymizing a specific part of the IP address
---------------------------------------------

This next snippet is almost identical to the first one, but here we
anonymize the full IPv4 address. By modifying the number of
bits, you can anonymize different parts of the address. Keep in mind
that in simple mode (used here), the bit value must fall on a byte boundary,
so for IPv4 only the values 8, 16, 24 and 32 are valid. In
this example, the address is overwritten with asterisks instead of lower-case
"x" letters. The replacement character only takes effect in simple mode.

.. code-block:: none

   module(load="mmanon")
   action(type="omfile" file="/path/to/non-anon.log")
   action(type="mmanon" ipv4.bits="32" ipv4.mode="simple" ipv4.replaceChar="*" ipv6.enable="off")
   action(type="omfile" file="/path/to/anon.log")


Anonymizing an odd number of bits
---------------------------------

The next snippet is also based on the first one, but anonymizes an "odd"
number of bits, 12. Some users choose 12 as a compromise between preserving
privacy and still being able to gain more in-depth insight from log files.
Note that anonymizing 12 bits may be insufficient to fulfill legal
requirements (if such exist).

.. code-block:: none

   module(load="mmanon")
   action(type="omfile" file="/path/to/non-anon.log")
   action(type="mmanon" ipv4.bits="12" ipv6.enable="off")
   action(type="omfile" file="/path/to/anon.log")


Anonymizing IPv4 and IPv6 addresses
-----------------------------------

You can also anonymize IPv4 and IPv6 in one go using a configuration like this.

.. code-block:: none

   module(load="mmanon")
   action(type="omfile" file="/path/to/non-anon.log")
   action(type="mmanon" ipv4.bits="12" ipv6.bits="128" ipv6.anonmode="random")
   action(type="omfile" file="/path/to/anon.log")


Anonymizing with default values
-------------------------------

It is also possible to use the default configuration for both types of
anonymization. This will result in IPv4 addresses being anonymized in zero
mode anonymizing 16 bits. IPv6 addresses will also be anonymized in zero
mode anonymizing 96 bits.

.. code-block:: none

   module(load="mmanon")
   action(type="omfile" file="/path/to/non-anon.log")
   action(type="mmanon")
   action(type="omfile" file="/path/to/anon.log")


Anonymizing only IPv6 addresses
-------------------------------

Another option is to anonymize only IPv6 addresses. To do this, you have to
disable IPv4 anonymization. In this example, only IPv6 addresses are
anonymized (using the random-consistent mode).

.. code-block:: none

   module(load="mmanon")
   action(type="omfile" file="/path/to/non-anon.log")
   action(type="mmanon" ipv4.enable="off" ipv6.anonmode="random-consistent")
   action(type="omfile" file="/path/to/anon.log")