.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
. ds C` ""
. ds C' ""
'br\}
.el\{\
. ds C`
. ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
. if \nF \{\
. de IX
. tm Index:\\$1\t\\n%\t"\\$2"
..
. if !\nF==2 \{\
. nr % 0
. nr F 2
. \}
. \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "Pod::Man 3perl"
.TH Pod::Man 3perl 2024-02-11 "perl v5.38.2" "Perl Programmers Reference Guide"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
Pod::Man \- Convert POD data to formatted *roff input
.SH SYNOPSIS
.IX Header "SYNOPSIS"
.Vb 2
\& use Pod::Man;
\& my $parser = Pod::Man\->new (release => $VERSION, section => 8);
\&
\& # Read POD from STDIN and write to STDOUT.
\& $parser\->parse_file (\e*STDIN);
\&
\& # Read POD from file.pod and write to file.1.
\& $parser\->parse_from_file (\*(Aqfile.pod\*(Aq, \*(Aqfile.1\*(Aq);
.Ve
.SH DESCRIPTION
.IX Header "DESCRIPTION"
Pod::Man is a module to convert documentation in the POD format (the
preferred language for documenting Perl) into *roff input using the man
macro set. The resulting *roff code is suitable for display on a terminal
using \fBnroff\fR\|(1), normally via \fBman\fR\|(1), or printing using \fBtroff\fR\|(1).
It is conventionally invoked using the driver script \fBpod2man\fR, but it can
also be used directly.
.PP
By default (on non-EBCDIC systems), Pod::Man outputs UTF\-8. Its output should
work with the \fBman\fR program on systems that use \fBgroff\fR (most Linux
distributions) or \fBmandoc\fR (most BSD variants), but may result in mangled
output on older UNIX systems. To choose a different, possibly more
backward-compatible output mangling on such systems, set the \f(CW\*(C`encoding\*(C'\fR
option to \f(CW\*(C`roff\*(C'\fR (the default in earlier Pod::Man versions). See the
\&\f(CW\*(C`encoding\*(C'\fR option and "ENCODING" for more details.
.PP
See "COMPATIBILITY" for the versions of Pod::Man with significant
backward-incompatible changes (other than constructor options, whose versions
are documented below), and the versions of Perl that included them.
.SH "CLASS METHODS"
.IX Header "CLASS METHODS"
.IP new(ARGS) 4
.IX Item "new(ARGS)"
Create a new Pod::Man object. ARGS should be a list of key/value pairs, where
the keys are chosen from the following. Each option is annotated with the
version of Pod::Man in which that option was added with its current meaning.
.RS 4
.IP center 4
.IX Item "center"
[1.00] Sets the centered page header for the \f(CW\*(C`.TH\*(C'\fR macro. The default, if
this option is not specified, is \f(CW\*(C`User Contributed Perl Documentation\*(C'\fR.
.IP date 4
.IX Item "date"
[4.00] Sets the left-hand footer for the \f(CW\*(C`.TH\*(C'\fR macro. If this option is not
set, the contents of the environment variable POD_MAN_DATE, if set, will be
used. Failing that, the value of SOURCE_DATE_EPOCH, the modification date of
the input file, or the current time if \fBstat()\fR can't find that file (which will
be the case if the input is from \f(CW\*(C`STDIN\*(C'\fR) will be used. If taken from any
source other than POD_MAN_DATE (which is used verbatim), the date will be
formatted as \f(CW\*(C`YYYY\-MM\-DD\*(C'\fR and will be based on UTC (so that the output will
be reproducible regardless of local time zone).
.IP encoding 4
.IX Item "encoding"
[5.00] Specifies the encoding of the output. The value must be an encoding
recognized by the Encode module (see Encode::Supported), or the special
values \f(CW\*(C`roff\*(C'\fR or \f(CW\*(C`groff\*(C'\fR. The default on non-EBCDIC systems is UTF\-8.
.Sp
If the output contains characters that cannot be represented in this encoding,
that is an error that will be reported as configured by the \f(CW\*(C`errors\*(C'\fR option.
If error handling is other than \f(CW\*(C`die\*(C'\fR, the unrepresentable character will be
replaced with the Encode substitution character (normally \f(CW\*(C`?\*(C'\fR).
.Sp
If the \f(CW\*(C`encoding\*(C'\fR option is set to the special value \f(CW\*(C`groff\*(C'\fR (the default on
EBCDIC systems), or if the Encode module is not available and the encoding is
set to anything other than \f(CW\*(C`roff\*(C'\fR, Pod::Man will translate all non-ASCII
characters to \f(CW\*(C`\e[uNNNN]\*(C'\fR Unicode escapes. These are not traditionally part
of the *roff language, but are supported by \fBgroff\fR and \fBmandoc\fR and thus by
the majority of manual page processors in use today.
.Sp
If the \f(CW\*(C`encoding\*(C'\fR option is set to the special value \f(CW\*(C`roff\*(C'\fR, Pod::Man will
do its historic transformation of (some) ISO 8859\-1 characters into *roff
escapes that may be adequate in troff and may be readable (if ugly) in nroff.
This was the default behavior of versions of Pod::Man before 5.00. With this
encoding, all other non-ASCII characters will be replaced with \f(CW\*(C`X\*(C'\fR. It may
be required for very old troff and nroff implementations that do not support
UTF\-8, but its representation of any non-ASCII character is very poor and
often specific to European languages.
.Sp
If the output file handle has a PerlIO encoding layer set, setting \f(CW\*(C`encoding\*(C'\fR
to anything other than \f(CW\*(C`groff\*(C'\fR or \f(CW\*(C`roff\*(C'\fR will be ignored and no encoding
will be done by Pod::Man. It will instead rely on the encoding layer to make
whatever output encoding transformations are desired.
.Sp
WARNING: The input encoding of the POD source is independent from the output
encoding, and setting this option does not affect the interpretation of the
POD input. Unless your POD source is US-ASCII, its encoding should be
declared with the \f(CW\*(C`=encoding\*(C'\fR command in the source. If this is not done,
Pod::Simple will attempt to guess the encoding and may be successful if
it's Latin\-1 or UTF\-8, but it will produce warnings. See \fBperlpod\fR\|(1) for
more information.
.IP errors 4
.IX Item "errors"
[2.27] How to report errors. \f(CW\*(C`die\*(C'\fR says to throw an exception on any POD
formatting error. \f(CW\*(C`stderr\*(C'\fR says to report errors on standard error, but not
to throw an exception. \f(CW\*(C`pod\*(C'\fR says to include a POD ERRORS section in the
resulting documentation summarizing the errors. \f(CW\*(C`none\*(C'\fR ignores POD errors
entirely, as much as possible.
.Sp
The default is \f(CW\*(C`pod\*(C'\fR.
.IP fixed 4
.IX Item "fixed"
[1.00] The fixed-width font to use for verbatim text and code. Defaults to
\&\f(CW\*(C`CW\*(C'\fR. Some systems prefer \f(CW\*(C`CR\*(C'\fR instead. Only matters for \fBtroff\fR output.
.IP fixedbold 4
.IX Item "fixedbold"
[1.00] Bold version of the fixed-width font. Defaults to \f(CW\*(C`CB\*(C'\fR. Only matters
for \fBtroff\fR output.
.IP fixeditalic 4
.IX Item "fixeditalic"
[1.00] Italic version of the fixed-width font (something of a misnomer, since
most fixed-width fonts only have an oblique version, not an italic version).
Defaults to \f(CW\*(C`CI\*(C'\fR. Only matters for \fBtroff\fR output.
.IP fixedbolditalic 4
.IX Item "fixedbolditalic"
[1.00] Bold italic (in theory, probably oblique in practice) version of the
fixed-width font. Pod::Man doesn't assume you have this, and defaults to
\&\f(CW\*(C`CB\*(C'\fR. Some systems (such as Solaris) have this font available as \f(CW\*(C`CX\*(C'\fR.
Only matters for \fBtroff\fR output.
.IP guesswork 4
.IX Item "guesswork"
[5.00] By default, Pod::Man applies some default formatting rules based on
guesswork and regular expressions that are intended to make writing Perl
documentation easier and require less explicit markup. These rules may not
always be appropriate, particularly for documentation that isn't about Perl.
This option allows turning all or some of it off.
.Sp
The special value \f(CW\*(C`all\*(C'\fR enables all guesswork. This is also the default for
backward compatibility reasons. The special value \f(CW\*(C`none\*(C'\fR disables all
guesswork. Otherwise, the value of this option should be a comma-separated
list of one or more of the following keywords:
.RS 4
.IP functions 4
.IX Item "functions"
Convert function references like \f(CWfoo()\fR to bold even if they have no markup.
The function name accepts valid Perl characters for function names (including
\&\f(CW\*(C`:\*(C'\fR), and the trailing parentheses must be present and empty.
.IP manref 4
.IX Item "manref"
Make the first part (before the parentheses) of manual page references like
\&\f(CWfoo(1)\fR bold even if they have no markup. The section must be a single
number optionally followed by lowercase letters.
.IP quoting 4
.IX Item "quoting"
If no guesswork is enabled, any text enclosed in C<> is surrounded by
double quotes in nroff (terminal) output unless the contents are already
quoted. When this guesswork is enabled, quote marks will also be suppressed
for Perl variables, function names, function calls, numbers, and hex
constants.
.IP variables 4
.IX Item "variables"
Convert Perl variable names to a fixed-width font even if they have no markup.
This transformation will only be apparent in troff output, or some other
output format (unlike nroff terminal output) that supports fixed-width fonts.
.RE
.RS 4
.Sp
Any unknown guesswork name is silently ignored (for potential future
compatibility), so be careful about spelling.
.RE
.IP language 4
.IX Item "language"
[5.00] Add commands telling \fBgroff\fR that the input file is in the given
language. The value of this setting must be a language abbreviation for which
\&\fBgroff\fR provides supplemental configuration, such as \f(CW\*(C`ja\*(C'\fR (for Japanese) or
\&\f(CW\*(C`zh\*(C'\fR (for Chinese).
.Sp
Specifically, this adds:
.Sp
.Vb 2
\& .mso <language>.tmac
\& .hla <language>
.Ve
.Sp
to the start of the file, which configure correct line breaking for the
specified language. Without these commands, groff may not know how to add
proper line breaks for Chinese and Japanese text if the manual page is
installed into the normal manual page directory, such as \fI/usr/share/man\fR.
.Sp
On many systems, this will be done automatically if the manual page is
installed into a language-specific manual page directory, such as
\&\fI/usr/share/man/zh_CN\fR. In that case, this option is not required.
.Sp
Unfortunately, the commands added with this option are specific to \fBgroff\fR
and will not work with other \fBtroff\fR and \fBnroff\fR implementations.
.IP lquote 4
.IX Item "lquote"
.PD 0
.IP rquote 4
.IX Item "rquote"
.PD
[4.08] Sets the quote marks used to surround C<> text. \f(CW\*(C`lquote\*(C'\fR sets the
left quote mark and \f(CW\*(C`rquote\*(C'\fR sets the right quote mark. Either may also be
set to the special value \f(CW\*(C`none\*(C'\fR, in which case no quote mark is added on that
side of C<> text (but the font is still changed for troff output).
.Sp
Also see the \f(CW\*(C`quotes\*(C'\fR option, which can be used to set both quotes at once.
If both \f(CW\*(C`quotes\*(C'\fR and one of the other options is set, \f(CW\*(C`lquote\*(C'\fR or \f(CW\*(C`rquote\*(C'\fR
overrides \f(CW\*(C`quotes\*(C'\fR.
.IP name 4
.IX Item "name"
[4.08] Set the name of the manual page for the \f(CW\*(C`.TH\*(C'\fR macro. Without this
option, the manual name is set to the uppercased base name of the file being
converted unless the manual section is 3, in which case the path is parsed to
see if it is a Perl module path. If it is, a path like \f(CW\*(C`.../lib/Pod/Man.pm\*(C'\fR
is converted into a name like \f(CW\*(C`Pod::Man\*(C'\fR. This option, if given, overrides
any automatic determination of the name.
.Sp
If generating a manual page from standard input, the name will be set to
\&\f(CW\*(C`STDIN\*(C'\fR if this option is not provided. In this case, providing this option
is strongly recommended to set a meaningful manual page name.
.IP nourls 4
.IX Item "nourls"
[2.27] Normally, L<> formatting codes that have a URL and anchor text are
formatted to show both the anchor text and the URL. In other words:
.Sp
.Vb 1
\& L<foo|http://example.com/>
.Ve
.Sp
is formatted as:
.Sp
.Vb 1
\& foo <http://example.com/>
.Ve
.Sp
This option, if set to a true value, suppresses the URL when anchor text
is given, so this example would be formatted as just \f(CW\*(C`foo\*(C'\fR. This can
produce less cluttered output in cases where the URLs are not particularly
important.
.IP quotes 4
.IX Item "quotes"
[4.00] Sets the quote marks used to surround C<> text. If the value is a
single character, it is used as both the left and right quote. Otherwise, it
is split in half, and the first half of the string is used as the left quote
and the second is used as the right quote.
.Sp
This may also be set to the special value \f(CW\*(C`none\*(C'\fR, in which case no quote
marks are added around C<> text (but the font is still changed for troff
output).
.Sp
Also see the \f(CW\*(C`lquote\*(C'\fR and \f(CW\*(C`rquote\*(C'\fR options, which can be used to set the
left and right quotes independently. If both \f(CW\*(C`quotes\*(C'\fR and one of the other
options is set, \f(CW\*(C`lquote\*(C'\fR or \f(CW\*(C`rquote\*(C'\fR overrides \f(CW\*(C`quotes\*(C'\fR.
.IP release 4
.IX Item "release"
[1.00] Set the centered footer for the \f(CW\*(C`.TH\*(C'\fR macro. By default, this is set
to the version of Perl you run Pod::Man under. Setting this to the empty
string will cause some *roff implementations to use the system default value.
.Sp
Note that some system \f(CW\*(C`an\*(C'\fR macro sets assume that the centered footer will be
a modification date and will prepend something like \f(CW\*(C`Last modified: \*(C'\fR. If
this is the case for your target system, you may want to set \f(CW\*(C`release\*(C'\fR to the
last modified date and \f(CW\*(C`date\*(C'\fR to the version number.
.IP section 4
.IX Item "section"
[1.00] Set the section for the \f(CW\*(C`.TH\*(C'\fR macro. The standard section numbering
convention is to use 1 for user commands, 2 for system calls, 3 for functions,
4 for devices, 5 for file formats, 6 for games, 7 for miscellaneous
information, and 8 for administrator commands. There is a lot of variation
here, however; some systems (like Solaris) use 4 for file formats, 5 for
miscellaneous information, and 7 for devices. Still others use 1m instead of
8, or some mix of both. About the only section numbers that are reliably
consistent are 1, 2, and 3.
.Sp
By default, section 1 will be used unless the file ends in \f(CW\*(C`.pm\*(C'\fR in which
case section 3 will be selected.
.IP stderr 4
.IX Item "stderr"
[2.19] If set to a true value, send error messages about invalid POD to
standard error instead of appending a POD ERRORS section to the generated
*roff output. This is equivalent to setting \f(CW\*(C`errors\*(C'\fR to \f(CW\*(C`stderr\*(C'\fR if
\&\f(CW\*(C`errors\*(C'\fR is not already set.
.Sp
This option is for backward compatibility with Pod::Man versions that did not
support \f(CW\*(C`errors\*(C'\fR. Normally, the \f(CW\*(C`errors\*(C'\fR option should be used instead.
.IP utf8 4
.IX Item "utf8"
[2.21] This option used to set the output encoding to UTF\-8. Since this is
now the default, it is ignored and does nothing.
.RE
.RS 4
.RE
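.PP
As an illustration of how several of these options combine, here is a minimal
sketch, not a recommended configuration. The page name, header and footer
strings, and file names are hypothetical, and the \f(CW\*(C`guesswork\*(C'\fR option assumes
Pod::Man 5.00 or later.
.PP
.Vb 17
\& use Pod::Man;
\&
\& # Hypothetical values; adjust them for the page being built.
\& my $parser = Pod::Man\->new(
\&     name      => \*(AqMYTOOL\*(Aq,
\&     section   => 1,
\&     center    => \*(AqMy Tool Manual\*(Aq,
\&     release   => \*(Aqmytool 1.0\*(Aq,
\&     errors    => \*(Aqdie\*(Aq,
\&     guesswork => \*(Aqfunctions,manref\*(Aq,
\&     quotes    => \*(Aqnone\*(Aq,
\&     nourls    => 1,
\& );
\&
\& # With errors set to die, POD syntax errors throw an exception.
\& my $ok = eval { $parser\->parse_from_file(\*(Aqmytool.pod\*(Aq, \*(Aqmytool.1\*(Aq); 1 };
\& warn "conversion failed: $@" unless $ok;
.Ve
.PP
Wrapping the call in \f(CW\*(C`eval\*(C'\fR is one way to turn the exception into a warning;
a build script may prefer to let it propagate.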
.SH "INSTANCE METHODS"
.IX Header "INSTANCE METHODS"
As a derived class from Pod::Simple, Pod::Man supports the same methods and
interfaces. See Pod::Simple for all the details. This section summarizes
the most-frequently-used methods and the ones added by Pod::Man.
.IP output_fh(FH) 4
.IX Item "output_fh(FH)"
Direct the output from \fBparse_file()\fR, \fBparse_lines()\fR, or \fBparse_string_document()\fR
to the file handle FH instead of \f(CW\*(C`STDOUT\*(C'\fR.
.IP output_string(REF) 4
.IX Item "output_string(REF)"
Direct the output from \fBparse_file()\fR, \fBparse_lines()\fR, or \fBparse_string_document()\fR
to the scalar variable pointed to by REF, rather than \f(CW\*(C`STDOUT\*(C'\fR. For example:
.Sp
.Vb 4
\& my $man = Pod::Man\->new();
\& my $output;
\& $man\->output_string(\e$output);
\& $man\->parse_file(\*(Aq/some/input/file\*(Aq);
.Ve
.Sp
Be aware that the output in that variable will already be encoded in UTF\-8.
.IP parse_file(PATH) 4
.IX Item "parse_file(PATH)"
Read the POD source from PATH and format it. By default, the output is sent
to \f(CW\*(C`STDOUT\*(C'\fR, but this can be changed with the \fBoutput_fh()\fR or \fBoutput_string()\fR
methods.
.IP "parse_from_file(INPUT, OUTPUT)" 4
.IX Item "parse_from_file(INPUT, OUTPUT)"
.PD 0
.IP "parse_from_filehandle(FH, OUTPUT)" 4
.IX Item "parse_from_filehandle(FH, OUTPUT)"
.PD
Read the POD source from INPUT, format it, and output the results to OUTPUT.
.Sp
\&\fBparse_from_filehandle()\fR is provided for backward compatibility with older
versions of Pod::Man. \fBparse_from_file()\fR should be used instead.
.IP "parse_lines(LINES[, ...[, undef]])" 4
.IX Item "parse_lines(LINES[, ...[, undef]])"
Parse the provided lines as POD source, writing the output to either \f(CW\*(C`STDOUT\*(C'\fR
or the file handle set with the \fBoutput_fh()\fR or \fBoutput_string()\fR methods. This
method can be called repeatedly to provide more input lines. An explicit
\&\f(CW\*(C`undef\*(C'\fR should be passed to indicate the end of input.
.Sp
This method expects raw bytes, not decoded characters.
.IP parse_string_document(INPUT) 4
.IX Item "parse_string_document(INPUT)"
Parse the provided scalar variable as POD source, writing the output to either
\&\f(CW\*(C`STDOUT\*(C'\fR or the file handle set with the \fBoutput_fh()\fR or \fBoutput_string()\fR
methods.
.Sp
This method expects raw bytes, not decoded characters.
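.PP
The following minimal sketch shows the in-memory methods together. It assumes
the default UTF\-8 output encoding; the page name and the POD text are made up
for illustration.
.PP
.Vb 20
\& use Pod::Man;
\& use Encode qw(decode);
\&
\& # POD source as raw bytes. It is pure ASCII, so no =encoding is needed.
\& my $pod = "=head1 NAME\en\enexample \- a sample page\en";
\&
\& my $man = Pod::Man\->new(name => \*(AqEXAMPLE\*(Aq, section => 1);
\& my $output;
\& $man\->output_string(\e$output);
\& $man\->parse_string_document($pod);
\&
\& # $output holds encoded bytes; decode it before treating it as characters.
\& my $text = decode(\*(AqUTF\-8\*(Aq, $output);
\&
\& # parse_lines() accepts the same input one line at a time, ending with undef.
\& my $by_line = Pod::Man\->new(name => \*(AqEXAMPLE\*(Aq, section => 1);
\& my $copy;
\& $by_line\->output_string(\e$copy);
\& $by_line\->parse_lines(\*(Aq=head1 NAME\*(Aq, \*(Aq\*(Aq);
\& $by_line\->parse_lines(\*(Aqexample \- a sample page\*(Aq, undef);
.Ve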
.SH ENCODING
.IX Header "ENCODING"
As of Pod::Man 5.00, the default output encoding for Pod::Man is UTF\-8. This
should work correctly on any modern system that uses either \fBgroff\fR (most
Linux distributions) or \fBmandoc\fR (Alpine Linux and most BSD variants,
including macOS).
.PP
The user will probably have to use a UTF\-8 locale to see correct output. This
may be done by default; if not, set the LANG or LC_CTYPE environment variables
to an appropriate locale. The locale \f(CW\*(C`C.UTF\-8\*(C'\fR is available on most systems
if one wants correct output without changing the other things locales affect,
such as collation.
.PP
The backward-compatible output format used in Pod::Man versions before 5.00 is
available by setting the \f(CW\*(C`encoding\*(C'\fR option to \f(CW\*(C`roff\*(C'\fR. This may produce
marginally nicer results on older UNIX versions that do not use \fBgroff\fR or
\&\fBmandoc\fR, but none of the available options will correctly render Unicode
characters on those systems.
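.PP
As a short sketch of the three output styles discussed here (the variable
names are arbitrary, and the \f(CW\*(C`encoding\*(C'\fR option assumes Pod::Man 5.00 or later):
.PP
.Vb 10
\& use Pod::Man;
\&
\& # Default on non\-EBCDIC systems: UTF\-8 output.
\& my $utf8 = Pod::Man\->new;
\&
\& # \e[uNNNN] escapes, understood by groff and mandoc.
\& my $groff = Pod::Man\->new(encoding => \*(Aqgroff\*(Aq);
\&
\& # Pre\-5.00 behavior for old troff and nroff implementations.
\& my $roff = Pod::Man\->new(encoding => \*(Aqroff\*(Aq);
.Ve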
.PP
Below are some additional details about how this choice was made and some
discussion of alternatives.
.SS History
.IX Subsection "History"
The default output encoding for Pod::Man has been a long-standing problem.
\&\fBtroff\fR and \fBnroff\fR predate Unicode by a significant margin, and their
implementations for many UNIX systems reflect that legacy. It's common for
Unicode to not be supported in any form.
.PP
Because of this, versions of Pod::Man prior to 5.00 maintained the highly
conservative output of the original pod2man, which output pure ASCII with
complex macros to simulate common western European accented characters when
processed with troff. The nroff output was awkward and sometimes incorrect,
and characters not used in western European scripts were replaced with \f(CW\*(C`X\*(C'\fR.
This choice maximized backwards compatibility with \fBman\fR and
\&\fBnroff\fR/\fBtroff\fR implementations at the cost of incorrect rendering of many
POD documents, particularly those containing people's names.
.PP
The modern implementations, \fBgroff\fR (used in most Linux distributions) and
\&\fBmandoc\fR (used by most BSD variants), do now support Unicode. Other UNIX
systems often do not, but they're now a tiny minority of the systems people
use on a daily basis. It's increasingly common (for very good reasons) to use
Unicode characters for POD documents rather than using ASCII conversions of
people's names or avoiding non-English text, making the limitations in the old
output format more apparent.
.PP
Four options have been proposed to fix this:
.IP \(bu 2
Optionally support UTF\-8 output but don't change the default. This is the
approach taken since Pod::Man 2.1.0, which added the \f(CW\*(C`utf8\*(C'\fR option. Some
Pod::Man users use this option for better output on platforms known to support
Unicode, but since the defaults have not changed, people continued to
encounter (and file bug reports about) the poor default rendering.
.IP \(bu 2
Convert characters to troff \f(CW\*(C`\e(xx\*(C'\fR escapes. This requires maintaining a
large translation table and addresses only a tiny part of the problem, since
many Unicode characters have no standard troff name. \fBgroff\fR has the largest
list, but if one is willing to assume \fBgroff\fR is the formatter, the next
option is better.
.IP \(bu 2
Convert characters to groff \f(CW\*(C`\e[uNNNN]\*(C'\fR escapes. This is implemented as the
\&\f(CW\*(C`groff\*(C'\fR encoding for those who want to use it, and is supported by both
\&\fBgroff\fR and \fBmandoc\fR. However, it is no better than UTF\-8 output for
portability to other implementations. See "Testing results" for more
details.
.IP \(bu 2
Change the default output format to UTF\-8 and ask those who want maximum
backward compatibility to explicitly select the old encoding. This fixes the
issue for most users at the cost of backwards compatibility. While the
rendering of non-ASCII characters is different on older systems that don't
support UTF\-8, it's not always worse than the old output.
.PP
Pod::Man 5.00 and later makes the last choice. This arguably produces worse
output when manual pages are formatted with \fBtroff\fR into PostScript or PDF,
but doing this is rare and normally manual, so the encoding can be changed in
those cases. The older output encoding is available by setting \f(CW\*(C`encoding\*(C'\fR to
\&\f(CW\*(C`roff\*(C'\fR.
.SS "Testing results"
.IX Subsection "Testing results"
Here are the results of testing \f(CW\*(C`encoding\*(C'\fR values of \f(CW\*(C`utf\-8\*(C'\fR and \f(CW\*(C`groff\*(C'\fR on
various operating systems. The testing methodology was to create \fIman/man1\fR
in the current directory, copy \fIencoding.utf8\fR or \fIencoding.groff\fR from the
podlators 5.00 distribution to \fIman/man1/encoding.1\fR, and then run:
.PP
.Vb 1
\& LANG=C.UTF\-8 MANPATH=$(pwd)/man man 1 encoding
.Ve
.PP
If the locale is not explicitly set to one that includes UTF\-8, the Unicode
characters were usually converted to ASCII (by, for example, dropping an
accent) or deleted or replaced with \f(CW\*(C`<?>\*(C'\fR if there was no conversion.
.PP
Tested on 2022\-09\-25. Many thanks to the GCC Compile Farm project for access
to testing hosts.
.PP
.Vb 12
\& OS                   UTF\-8    groff
\& \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-   \-\-\-\-\-\-\-  \-\-\-\-\-\-\-
\& AIX 7.1              no [1]   no [2]
\& Alpine 3.15.0        yes      yes
\& CentOS 7.9           yes      yes
\& Debian 7             yes      yes
\& FreeBSD 13.0         yes      yes
\& NetBSD 9.2           yes      yes
\& OpenBSD 7.1          yes      yes
\& openSUSE Leap 15.4   yes      yes
\& Solaris 10           yes      no [2]
\& Solaris 11           no [3]   no [3]
.Ve
.PP
I did not have access to a macOS system for testing, but since it uses
\&\fBmandoc\fR, its behavior is probably the same as the BSD hosts.
.PP
Notes:
.IP [1] 4
.IX Item "[1]"
Unicode characters were converted to one or two random ASCII characters
unrelated to the original character.
.IP [2] 4
.IX Item "[2]"
Unicode characters were shown as the body of the groff escape rather than the
indicated character (in other words, text like \f(CW\*(C`[u00EF]\*(C'\fR).
.IP [3] 4
.IX Item "[3]"
Unicode characters were deleted entirely, as if they weren't there. Using
\&\f(CW\*(C`nroff \-man\*(C'\fR instead of \fBman\fR to format the page showed the same results as
Solaris 10. Using \f(CW\*(C`groff \-k \-man \-Tutf8\*(C'\fR to format the page produced the
correct output.
.PP
PostScript and PDF output using groff on a Debian 12 system do not support
combining accent marks or SMP characters due to a lack of support in the
default output font.
.PP
Testing on additional platforms is welcome. Please let the author know if you
have additional results.
.SH DIAGNOSTICS
.IX Header "DIAGNOSTICS"
.IP "roff font should be 1 or 2 chars, not ""%s""" 4
.IX Item "roff font should be 1 or 2 chars, not ""%s"""
(F) You specified a *roff font (using \f(CW\*(C`fixed\*(C'\fR, \f(CW\*(C`fixedbold\*(C'\fR, etc.) that
wasn't either one or two characters. Pod::Man doesn't support *roff fonts
longer than two characters, although some *roff extensions do (the
canonical versions of \fBnroff\fR and \fBtroff\fR don't either).
.IP "Invalid errors setting ""%s""" 4
.IX Item "Invalid errors setting ""%s"""
(F) The \f(CW\*(C`errors\*(C'\fR parameter to the constructor was set to an unknown value.
.IP "Invalid quote specification ""%s""" 4
.IX Item "Invalid quote specification ""%s"""
(F) The quote specification given (the \f(CW\*(C`quotes\*(C'\fR option to the
constructor) was invalid. A quote specification must be either one
character long or an even number (greater than one) of characters long.
.IP "POD document had syntax errors" 4
.IX Item "POD document had syntax errors"
(F) The POD document being formatted had syntax errors and the \f(CW\*(C`errors\*(C'\fR
option was set to \f(CW\*(C`die\*(C'\fR.
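.PP
The quote specification rule behind the "Invalid quote specification"
diagnostic can be sketched as follows. This is an illustration of the
documented rule under stated assumptions, not the code Pod::Man uses; the
function name \f(CW\*(C`split_quotes\*(C'\fR is hypothetical.
.PP
.Vb 13
\& # Illustrative only; this is not the actual code of the module.
\& sub split_quotes {
\&     my ($spec) = @_;
\&     return (q{}, q{}) if $spec eq \*(Aqnone\*(Aq;
\&     my $len = length $spec;
\&     return ($spec, $spec) if $len == 1;
\&     die qq{Invalid quote specification "$spec"\en}
\&         if $len == 0 || $len % 2 != 0;
\&     my $half = $len / 2;
\&     return (substr($spec, 0, $half), substr($spec, $half));
\& }
\&
\& my ($left, $right) = split_quotes(\*(Aq<<>>\*(Aq);    # \*(Aq<<\*(Aq and \*(Aq>>\*(Aq
.Ve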
.SH ENVIRONMENT
.IX Header "ENVIRONMENT"
.IP PERL_CORE 4
.IX Item "PERL_CORE"
If set and Encode is not available, silently fall back to an encoding of
\&\f(CW\*(C`groff\*(C'\fR without complaining to standard error. This environment variable is
set during Perl core builds, which build Encode after podlators. Encode is
expected to not (yet) be available in that case.
.IP POD_MAN_DATE 4
.IX Item "POD_MAN_DATE"
If set, this will be used as the value of the left-hand footer unless the
\&\f(CW\*(C`date\*(C'\fR option is explicitly set, overriding the timestamp of the input
file or the current time. This is primarily useful to ensure reproducible
builds of the same output file given the same source and Pod::Man version,
even when file timestamps may not be consistent.
.IP SOURCE_DATE_EPOCH 4
.IX Item "SOURCE_DATE_EPOCH"
If set, and POD_MAN_DATE and the \f(CW\*(C`date\*(C'\fR options are not set, this will be
used as the modification time of the source file, overriding the timestamp of
the input file or the current time. It should be set to the desired time in
seconds since UNIX epoch. This is primarily useful to ensure reproducible
builds of the same output file given the same source and Pod::Man version,
even when file timestamps may not be consistent. See
<https://reproducible\-builds.org/specs/source\-date\-epoch/> for the full
specification.
.Sp
(Arguably, according to the specification, this variable should be used only
if the timestamp of the input file is not available and Pod::Man uses the
current time. However, for reproducible builds in Debian, results were more
reliable if this variable overrode the timestamp of the input file.)
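.PP
The precedence among the \f(CW\*(C`date\*(C'\fR option, these variables, and the input file
timestamp can be sketched as follows. This is a simplified illustration and
not the code Pod::Man uses: the function and parameter names are
hypothetical, and it treats "set" as "defined".
.PP
.Vb 12
\& use POSIX qw(strftime);
\&
\& # Illustrative only; $opt_date and $input_path are hypothetical names.
\& sub footer_date {
\&     my ($opt_date, $input_path) = @_;
\&     return $opt_date if defined $opt_date;
\&     return $ENV{POD_MAN_DATE} if defined $ENV{POD_MAN_DATE};    # verbatim
\&     my $time = $ENV{SOURCE_DATE_EPOCH};
\&     $time = (stat($input_path))[9] if !defined $time && defined $input_path;
\&     $time = time if !defined $time;
\&     return strftime(\*(Aq%Y\-%m\-%d\*(Aq, gmtime($time));
\& }
.Ve
.PP
Using \f(CW\*(C`gmtime\*(C'\fR rather than \f(CW\*(C`localtime\*(C'\fR mirrors the documented choice of UTC,
which keeps the output reproducible across time zones.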
.SH COMPATIBILITY
.IX Header "COMPATIBILITY"
Pod::Man 1.02 (based on Pod::Parser) was the first version included with
Perl, in Perl 5.6.0.
.PP
The current API based on Pod::Simple was added in Pod::Man 2.00. Pod::Man
2.04 was included in Perl 5.9.3, the first version of Perl to incorporate
those changes. This is the first version that correctly supports all modern
POD syntax. The \fBparse_from_filehandle()\fR method was re-added for backward
compatibility in Pod::Man 2.09, included in Perl 5.9.4.
.PP
Support for anchor text in L<> links of type URL was added in Pod::Man
2.23, included in Perl 5.11.5.
.PP
\&\fBparse_lines()\fR, \fBparse_string_document()\fR, and \fBparse_file()\fR set a default output
file handle of \f(CW\*(C`STDOUT\*(C'\fR if one was not already set as of Pod::Man 2.28,
included in Perl 5.19.5.
.PP
Support for SOURCE_DATE_EPOCH and POD_MAN_DATE was added in Pod::Man 4.00,
included in Perl 5.23.7, and generated dates were changed to use UTC instead
of the local time zone. This is also the first release that aligned the
module version and the version of the podlators distribution. All modules
included in podlators, and the podlators distribution itself, share the same
version number from this point forward.
.PP
Pod::Man 4.10, included in Perl 5.27.8, changed the formatting for manual page
references and function names to bold instead of italic, following the current
Linux manual page standard.
.PP
Pod::Man 5.00 changed the default output encoding to UTF\-8, overridable with
the new \f(CW\*(C`encoding\*(C'\fR option. It also fixed problems with bold or italic
extending too far when used with C<> escapes, and began converting Unicode
zero-width spaces (U+200B) to the \f(CW\*(C`\e:\*(C'\fR *roff escape. It also dropped
attempts to add subtle formatting corrections in the output that would only be
visible when typeset with \fBtroff\fR, which had previously been a significant
source of bugs.
.SH BUGS
.IX Header "BUGS"
There are numerous bugs and language-specific assumptions in the nroff
fallbacks for accented characters in the \f(CW\*(C`roff\*(C'\fR encoding. Since the point of
this encoding is backward compatibility with the output from earlier versions
of Pod::Man, and it is deprecated except when necessary to support old
systems, those bugs are unlikely to ever be fixed.
.PP
Pod::Man doesn't handle font names longer than two characters. Neither do
most \fBtroff\fR implementations, but groff does as an extension. It would be
nice to support longer font names as an option for those who want to use them.
.SH CAVEATS
.IX Header "CAVEATS"
.SS "Sentence spacing"
.IX Subsection "Sentence spacing"
Pod::Man copies the input spacing verbatim to the output *roff document. This
means your output will be affected by how \fBnroff\fR generally handles sentence
spacing.
.PP
\&\fBnroff\fR dates from an era in which it was standard to use two spaces after
sentences, and will always add two spaces after a line-ending period (or
similar punctuation) when reflowing text. For example, the following input:
.PP
.Vb 1
\& =pod
\&
\& One sentence.
\& Another sentence.
.Ve
.PP
will result in two spaces after the period when the text is reflowed. If you
use two spaces after sentences anyway, this will be consistent, although you
will have to be careful to not end a line with an abbreviation such as \f(CW\*(C`e.g.\*(C'\fR
or \f(CW\*(C`Ms.\*(C'\fR. Output will also be consistent if you use the *roff style guide
(and XKCD 1285 <https://xkcd.com/1285/>) recommendation of putting a line
break after each sentence, although that will consistently produce two spaces
after each sentence, which may not be what you want.
.PP
If you prefer one space after sentences (which is the more modern style), you
will unfortunately need to ensure that no line in the middle of a paragraph
ends in a period or similar sentence-ending punctuation. Otherwise, \fBnroff\fR
will add two spaces after that sentence when reflowing, and your output
document will have inconsistent spacing.
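.PP
As an illustrative sketch of the one-space style (the wording and line breaks
are hypothetical), the earlier example could be rewrapped so that the first
period falls in the middle of a line:
.PP
.Vb 1
\& =pod
\&
\& One sentence. Another
\& sentence.
.Ve
.PP
The only line ending in a period is now the last line of the paragraph, so no
text follows it within the paragraph and \fBnroff\fR has no place to add
extra space.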
.SS Hyphens
.IX Subsection "Hyphens"
The handling of hyphens versus dashes is somewhat fragile, and one may get
the wrong one under some circumstances. This will normally only matter for
line breaking and possibly for troff output.
.SH AUTHOR
.IX Header "AUTHOR"
Written by Russ Allbery <rra@cpan.org>, based on the original \fBpod2man\fR by
Tom Christiansen <tchrist@mox.perl.com>.
.PP
The modifications to work with Pod::Simple instead of Pod::Parser were
contributed by Sean Burke <sburke@cpan.org>, but I've since hacked them beyond
recognition and all bugs are mine.
.SH "COPYRIGHT AND LICENSE"
.IX Header "COPYRIGHT AND LICENSE"
Copyright 1999\-2010, 2012\-2020, 2022 Russ Allbery <rra@cpan.org>
.PP
Substantial contributions by Sean Burke <sburke@cpan.org>.
.PP
This program is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
Encode::Supported, Pod::Simple, \fBperlpod\fR\|(1), \fBpod2man\fR\|(1),
\&\fBnroff\fR\|(1), \fBtroff\fR\|(1), \fBman\fR\|(1), \fBman\fR\|(7)
.PP
Ossanna, Joseph F., and Brian W. Kernighan. "Troff User's Manual,"
Computing Science Technical Report No. 54, AT&T Bell Laboratories. This is
the best documentation of standard \fBnroff\fR and \fBtroff\fR. At the time of
this writing, it's available at <http://www.troff.org/54.pdf>.
.PP
The manual page documenting the man macro set may be \fBman\fR\|(5) instead of
\&\fBman\fR\|(7) on your system.
.PP
See \fBperlpodstyle\fR\|(1) for documentation on writing manual pages in POD if
you've not done it before and aren't familiar with the conventions.
.PP
The current version of this module is always available from its web site at
<https://www.eyrie.org/~eagle/software/podlators/>. It is also part of the
Perl core distribution as of 5.6.0.