.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
. ds C` ""
. ds C' ""
'br\}
.el\{\
. ds C`
. ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
. if \nF \{\
. de IX
. tm Index:\\$1\t\\n%\t"\\$2"
..
. if !\nF==2 \{\
. nr % 0
. nr F 2
. \}
. \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "Encode::MIME::Header 3pm"
.TH Encode::MIME::Header 3pm 2023-11-28 "perl v5.38.2" "Perl Programmers Reference Guide"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
Encode::MIME::Header \-\- MIME encoding for an unstructured email header
.SH SYNOPSIS
.IX Header "SYNOPSIS"
.Vb 1
\& use Encode qw(encode decode);
\&
\& my $mime_str = encode("MIME\-Header", "Sample:Text \eN{U+263A}");
\& # $mime_str is "=?UTF\-8?B?U2FtcGxlOlRleHQg4pi6?="
\&
\& my $mime_q_str = encode("MIME\-Q", "Sample:Text \eN{U+263A}");
\& # $mime_q_str is "=?UTF\-8?Q?Sample=3AText_=E2=98=BA?="
\&
\& my $str = decode("MIME\-Header",
\& "=?ISO\-8859\-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=\er\en " .
\& "=?ISO\-8859\-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?="
\& );
\& # $str is "If you can read this you understand the example."
\&
\& use Encode qw(decode :fallbacks);
\& use Encode::MIME::Header;
\& local $Encode::MIME::Header::STRICT_DECODE = 1;
\& my $strict_string = decode("MIME\-Header", $mime_string, FB_CROAK);
\& # use strict decoding and croak on errors
.Ve
.SH ABSTRACT
.IX Header "ABSTRACT"
This module implements RFC 2047 <https://tools.ietf.org/html/rfc2047> MIME
encoding for an unstructured field body of an email header. It can also be
used for the RFC 822 <https://tools.ietf.org/html/rfc822> 'text' token. However,
it cannot be used directly for a whole header that includes the field name, or
for structured header fields such as From, To, Cc and Message-Id. This module
supports three encoding names: \f(CW\*(C`MIME\-Header\*(C'\fR, \f(CW\*(C`MIME\-B\*(C'\fR and
\&\f(CW\*(C`MIME\-Q\*(C'\fR.
.SH DESCRIPTION
.IX Header "DESCRIPTION"
The decode method takes an unstructured field body of an email header (or an
RFC 822 <https://tools.ietf.org/html/rfc822> 'text' token) as its input and
decodes each MIME encoded-word in the input string to a sequence of bytes
according to RFC 2047 <https://tools.ietf.org/html/rfc2047> and
RFC 2231 <https://tools.ietf.org/html/rfc2231>. Each sequence of bytes is
then decoded with the Encode module using the corresponding MIME charset, and
a single output string is returned. Parts of the input string that do not
contain a MIME encoded-word stay unmodified in the output string. Folded
newlines between two consecutive MIME encoded-words are discarded; all others
are preserved in the output string.
\&\f(CW\*(C`MIME\-B\*(C'\fR can decode the Base64 variant, \f(CW\*(C`MIME\-Q\*(C'\fR can decode the
Quoted-Printable variant and \f(CW\*(C`MIME\-Header\*(C'\fR can decode both. If the Encode
module does not support a particular MIME charset or the chosen variant, an
action based on the CHECK flags is performed (by default, the
MIME encoded-word is not decoded).
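.PP
A minimal sketch of the decoding behaviour described above, reusing the two
encoded-words from the SYNOPSIS. The \f(CW\*(C`Re:\*(C'\fR prefix is an illustrative
assumption, added only to show that text outside an encoded-word is passed
through:
.PP
.Vb 3
\& use strict;
\& use warnings;
\& use Encode qw(decode);
\&
\& binmode(STDOUT, ":encoding(UTF\-8)");
\&
\& # Both encoded\-words are taken from the SYNOPSIS.
\& my $b_word = "=?UTF\-8?B?U2FtcGxlOlRleHQg4pi6?=";
\& my $q_word = "=?UTF\-8?Q?Sample=3AText_=E2=98=BA?=";
\&
\& # MIME\-Header accepts both the Base64 and the Quoted\-Printable variant.
\& my $from_b = decode("MIME\-Header", $b_word);
\& my $from_q = decode("MIME\-Header", $q_word);
\& print "both variants decode to the same text\en" if $from_b eq $from_q;
\&
\& # Hypothetical field body mixing plain text with one encoded\-word.
\& my $mixed = decode("MIME\-Header", "Re: $b_word");
\& print "$mixed\en";
.Ve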
.PP
The encode method takes a scalar string as its input and uses the
strict UTF\-8 encoder to encode it to UTF\-8
bytes. The sequence of UTF\-8 bytes is then encoded into MIME encoded-words
(\f(CW\*(C`MIME\-Header\*(C'\fR and \f(CW\*(C`MIME\-B\*(C'\fR use a Base64 variant while \f(CW\*(C`MIME\-Q\*(C'\fR uses a
Quoted-Printable variant) where each MIME encoded-word is limited to 75
characters. The MIME encoded-words are separated by \f(CW\*(C`CRLF SPACE\*(C'\fR and joined
into one output string. The output string is suitable for an unstructured field
body of an email header.
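.PP
The length limit and the \f(CW\*(C`CRLF SPACE\*(C'\fR separator can be observed with a short
sketch like the following. The input string is a made-up example, chosen only
because it is long enough to need several encoded-words:
.PP
.Vb 3
\& use strict;
\& use warnings;
\& use Encode qw(encode);
\&
\& # Hypothetical long input that does not fit into one encoded\-word.
\& my $subject = "Smile \ex{263A} " x 20;
\& my $encoded = encode("MIME\-Header", $subject);
\&
\& # Encoded\-words are joined by CRLF SPACE, so split on that separator.
\& my @words = split /\er\en /, $encoded;
\& printf "%d encoded\-words\en", scalar @words;
\& for my $word (@words) {
\&     printf "%2d chars: %s\en", length($word), $word;
\& }
.Ve
.PP
Each reported length should be at most 75 characters.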
.PP
Both the encode and decode methods propagate
CHECK flags when encoding and decoding the
MIME charset.
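.PP
As a hedged illustration of CHECK handling, the sketch below decodes an
encoded-word whose charset name, \f(CW\*(C`X\-NO\-SUCH\-CHARSET\*(C'\fR, is invented for this
example and assumed to be unknown to Encode. It compares the default behaviour
with \f(CW\*(C`FB_CROAK\*(C'\fR; the exact error text is not shown because it may differ
between versions:
.PP
.Vb 3
\& use strict;
\& use warnings;
\& use Encode qw(decode :fallbacks);
\&
\& # Made\-up charset name, assumed to be unsupported by Encode.
\& my $input = "=?X\-NO\-SUCH\-CHARSET?B?YWJj?=";
\&
\& # Default CHECK: the encoded\-word is expected to stay undecoded.
\& my $lenient = decode("MIME\-Header", $input);
\& print "default: $lenient\en";
\&
\& # With FB_CROAK, errors are reported by dying, so trap them with eval.
\& my $copy = $input;    # decode may modify its argument when CHECK is set
\& my $result = eval { decode("MIME\-Header", $copy, FB_CROAK) };
\& if (defined $result) {
\&     print "decoded: $result\en";
\& }
\& else {
\&     print "croaked: $@";
\& }
.Ve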
.SH BUGS
.IX Header "BUGS"
Versions prior to 2.22 (part of Encode 2.83) have a malfunctioning decoder
and encoder. The MIME encoder infamously inserted additional spaces or
discarded white spaces between consecutive MIME encoded-words, which led to
invalid MIME headers produced by this module. The MIME decoder had a tendency
to discard white spaces, incorrectly interpret data or attempt to decode Base64
MIME encoded-words as Quoted-Printable. These problems were fixed in version
2.22. It is highly recommended not to use any version prior to 2.22!
.PP
Versions prior to 2.24 (part of Encode 2.87) ignored
CHECK flags. The MIME encoder used a
non-strict utf8 encoder for input Unicode
strings, which could lead to invalid UTF\-8 sequences. The MIME decoder also
used a non-strict utf8 decoder and additionally
called the decode method with the \f(CW\*(C`Encode::FB_PERLQQ\*(C'\fR flag (thus user-specified
CHECK flags were ignored). Moreover, it
automatically croaked when a MIME encoded-word contained an unknown encoding.
Since version 2.24, this module uses the
strict UTF\-8 encoder and decoder, and
CHECK flags are correctly propagated.
.PP
Since version 2.22 (part of Encode 2.83), the MIME encoder should be fully
compliant with RFC 2047 <https://tools.ietf.org/html/rfc2047> and
RFC 2231 <https://tools.ietf.org/html/rfc2231>. Due to the aforementioned
bugs in previous versions of the MIME encoder, there is a \fIless strict\fR
compatible mode for the MIME decoder, which is used by default. It should be
able to decode MIME encoded-words encoded by pre\-2.22 versions of this module.
However, note that this mode is not correct according to
RFC 2047 <https://tools.ietf.org/html/rfc2047>.
.PP
In the default \fInot strict\fR mode, the MIME decoder attempts to decode every
substring that looks like a MIME encoded-word. Therefore, the MIME encoded-words
do not need to be separated by white space. To enforce the correct \fIstrict\fR
mode, set the variable \f(CW$Encode::MIME::Header::STRICT_DECODE\fR to 1, e.g. by
localizing it:
.PP
.Vb 2
\& use Encode::MIME::Header;
\& local $Encode::MIME::Header::STRICT_DECODE = 1;
.Ve
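.PP
To compare the two modes, a sketch along these lines can be used. The input is
a hypothetical field body with two encoded-words and no white space between
them. The results are printed rather than stated here, since the exact strict
mode output is not described in this document:
.PP
.Vb 4
\& use strict;
\& use warnings;
\& use Encode qw(decode);
\& use Encode::MIME::Header;
\&
\& # Two encoded\-words ("foo" and "bar" in Base64) with no white space
\& # between them.
\& my $input = "=?UTF\-8?B?Zm9v?==?UTF\-8?B?YmFy?=";
\&
\& # Default, less strict mode.
\& my $lenient = decode("MIME\-Header", $input);
\&
\& # Strict mode, limited to this block by local.
\& my $strict = do {
\&     local $Encode::MIME::Header::STRICT_DECODE = 1;
\&     decode("MIME\-Header", $input);
\& };
\&
\& print "default: $lenient\en";
\& print "strict:  $strict\en";
.Ve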
.SH AUTHORS
.IX Header "AUTHORS"
Pali <pali@cpan.org>
.SH "SEE ALSO"
.IX Header "SEE ALSO"
Encode,
RFC 822 <https://tools.ietf.org/html/rfc822>,
RFC 2047 <https://tools.ietf.org/html/rfc2047>,
RFC 2231 <https://tools.ietf.org/html/rfc2231>