The ‘g’ prefix is not used on all systems; see Invoking groff.
Unix and related operating systems distinguish standard output and standard error streams because of troff: https://minnie.tuhs.org/pipermail/tuhs/2013-December/006113.html.
See Line Layout.
Besides groff, neatroff is an exception.
The mso request does not have these limitations. See I/O.
The remainder of this chapter is based on Writing Papers with nroff using -me by Eric P. Allman, which is distributed with groff as meintro.me.
While manual pages are older, early ones used macros supplanted by the man package of Seventh Edition Unix (1979). ms shipped with Sixth Edition (1975) and was documented by Mike Lesk in a Bell Labs internal memorandum.
defined in Footnotes
Distinguish a document title from “titles”, which are what roff systems call headers and footers collectively.
This idiosyncrasy arose through feature accretion; for example, the B macro in Version 6 Unix ms (1975) accepted only one argument, the text to be set in boldface. By Version 7 (1979) it recognized a second argument; in 1990, groff ms added a “pre” argument, placing it third to avoid breaking support for older documents.
“Portable Document Format Publishing with GNU Troff”, pdfmark.ms in the groff distribution, uses this technique.
Unix Version 7 ms, its descendants, and GNU ms prior to groff version 1.23.0
You could reset it after each call to .1C, .2C, or .MC.
Typing Documents on the UNIX System: Using the -ms Macros with Troff and Nroff, M. E. Lesk, Bell Laboratories, 1978
Register values are converted to and stored as basic units. See Measurements.
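As a rough illustration of storing measurements as basic units, the following Python sketch converts a few fixed-size scaling units to integers. The resolution of 72000 units per inch, the table `UNITS`, and the helper `to_basic_units` are assumptions made for this example; the real resolution depends on the output device, and the formatter's own rounding may differ from what is shown here.

```python
# Assumption for illustration: 72000 basic units per inch.
# The actual value is a property of the output device.
RESOLUTION = 72000

# Basic units per scaling unit, for units whose size does not
# depend on the current type size or vertical spacing.
UNITS = {
    "i": RESOLUTION,        # inch
    "P": RESOLUTION // 6,   # pica (1/6 inch)
    "p": RESOLUTION // 72,  # point (1/72 inch)
    "u": 1,                 # basic unit
}


def to_basic_units(quantity, unit):
    """Convert a measurement to an integer count of basic units.

    Rounding to the nearest integer is a simplification.
    """
    return round(quantity * UNITS[unit])


if __name__ == "__main__":
    for quantity, unit in [(1, "i"), (0.5, "i"), (10, "p"), (1, "P")]:
        print(f"{quantity}{unit} = {to_basic_units(quantity, unit)}u")
```

Under the stated assumption, every measurement ends up as a plain integer, whichever unit was used to write it.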
If you redefine the ms PT macro and desire special treatment of certain page numbers (like ‘1’), you may need to handle a non-Arabic page number format, as groff ms’s PT does; see the macro package source. groff ms aliases the PN register to %.
The removal beforehand is necessary because groff ms aliases these macros to a diagnostic macro, and you want to redefine the aliased name, not its target.
See Device and Font Description Files.
Tabs and leaders also separate words. Escape sequences can function as word characters, word separators, or neither; the last simply have no effect on GNU troff’s idea of whether an input character is within a word.
We’ll discuss all of these in due course.
A well-researched jeremiad appreciated by groff contributors on both sides of the sentence-spacing debate can be found at https://web.archive.org/web/20171217060354/http://www.heracliteanriver.com/?p=324.
This statement oversimplifies; there are escape sequences whose purpose is precisely to produce glyphs on the output device, and input characters that aren’t part of escape sequences can undergo a great deal of processing before getting to the output.
The mnemonics for the special characters shown here are “dagger”, “double dagger”, “right (double) quote”, and “closing (single) quote”. See the groff_char(7) man page.
“Text lines” are defined in Requests and Macros.
“Tab” is short for “tabulation”, revealing the term’s origin as a spacing mechanism for table arrangement.
The \RET escape sequence can alter how an input line is classified; see Line Continuation.
Argument handling in macros is more flexible but also more complex. See Calling Macros.
Some escape sequences undergo interpolation as well.
GNU troff offers additional ones. See Writing Macros.
Macro files and packages frequently define registers and strings as well.
The semantics of certain punctuation code points have gotten stricter with the successive standards, a cause of some frustration among man page writers; see the groff_char(7) man page.
The DVI output device defaults to using the Computer Modern (CM) fonts; ec.tmac loads the EC fonts instead, which provide Euro ‘\[Eu]’ and per mille ‘\[%0]’ glyphs.
Emacs: fill-column: 72; Vim: textwidth=72
groff does not yet support right-to-left scripts.
groff’s terminal output devices have page offsets of zero.
Provision is made for interpreting and reporting decimal fractions in certain cases.
If that’s not enough, see the groff_tmac(5) man page for the 62bit.tmac macro package.
Control structure syntax creates an exception to this rule, but is designed to remain useful: recalling our example, ‘.if 1 .Underline this’ would underline only “this”, precisely. See Conditionals and Loops.
See Diversions.
Historically, control characters like ASCII STX, ETX, and BEL (Control+B, Control+C, and Control+G) have been observed in roff documents, particularly in macro packages employing them as delimiters with the output comparison operator to try to avoid collisions with the content of arbitrary user-supplied parameters (see Operators in Conditionals). We discourage this expedient; in GNU troff it is unnecessary (outside of compatibility mode) because delimited arguments are parsed at a different input level than the surrounding context.
See Implementation Differences.
Consider what happens when a C1 control 0x80–0x9F is necessary as a continuation byte in a UTF-8 sequence.
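To make the overlap concrete, the following Python sketch reports which characters of a string encode to UTF-8 with a continuation byte in the C1 control range. The function name `c1_continuation_bytes` is hypothetical and serves only this illustration.

```python
def c1_continuation_bytes(text):
    """Return (character, byte) pairs for each UTF-8 continuation
    byte that falls in the C1 control range 0x80-0x9F."""
    hits = []
    for ch in text:
        encoded = ch.encode("utf-8")
        # The first byte is the lead byte; the rest are
        # continuation bytes, which lie in 0x80-0xBF.
        for byte in encoded[1:]:
            if 0x80 <= byte <= 0x9F:
                hits.append((ch, byte))
    return hits


if __name__ == "__main__":
    # U+00C0 encodes as C3 80; its continuation byte is a C1 control.
    print(c1_continuation_bytes("\u00c0"))
    # U+00E9 encodes as C3 A9; its continuation byte is not.
    print(c1_continuation_bytes("\u00e9"))
```

A program that rejects or discards C1 controls byte by byte would therefore damage the encoding of a character such as U+00C0.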
Recall Identifiers.
In compatibility mode, a space is not necessary after a request or macro name of two characters’ length. Also, Plan 9 troff allows tabs to separate arguments.
\~ is fairly portable; see Other Differences.
Strictly, you can neglect to close the last quoted macro argument, relying on the end of the control line to do so. We consider this lethargic practice poor style.
The omission of spaces before the comment escape sequences is necessary; see Strings.
TeX does have such a mechanism.
This claim may be more aspirational than descriptive.
See Conditional Blocks.
Exception: auto-incrementing registers defined outside the ignored region will be modified if interpolated with \n± inside it. See Auto-increment.
A negative auto-increment can be considered an “auto-decrement”.
GNU troff dynamically allocates memory for as many registers as required.
unless diverted; see Diversions
See Line Continuation.
Recall Filling and Sentences for the definitions of word and sentence boundaries, respectively.
Whether a perfect algorithm for this application is even possible is an unsolved problem in computer science: https://tug.org/docs/liang/liang-thesis.pdf.
\% itself stops marking hyphenation points but still produces no output glyph.
“Soft” because it appears in output only where a hyphenation break is performed; a “hard” hyphen, as in “long-term”, always appears.
The mode is a vector of Booleans encoded as an integer. To a programmer, this fact is easily deduced from the exclusive use of powers of two for the configuration parameters; they are computationally easy to “mask off” and compare to zero. To almost everyone else, the arrangement seems recondite and unfriendly.
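The “mask off and compare to zero” idea can be sketched in Python as follows. The flag names and their values are illustrative assumptions chosen only to show the technique; consult the documentation of the hyphenation mode for the authoritative meanings of each value.

```python
# Illustrative flags only: each value is a distinct power of two,
# so each occupies its own bit of the integer.
FLAGS = {
    "ENABLE": 1,
    "NOT_LAST_WORD_ON_PAGE": 2,
    "NOT_LAST_TWO_CHARS": 4,
    "NOT_FIRST_TWO_CHARS": 8,
}


def decode(mode):
    """Return the names of the flags set in the integer mode."""
    # Mask off each bit in turn and compare the result to zero.
    return [name for name, bit in FLAGS.items() if mode & bit != 0]


if __name__ == "__main__":
    # Combining flags is a bitwise OR, here equivalent to 1 + 4.
    mode = FLAGS["ENABLE"] | FLAGS["NOT_LAST_TWO_CHARS"]
    print(mode, decode(mode))
```

Because no two flags share a bit, any combination of them sums to a unique integer, and each Boolean can be recovered independently.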
Hyphenation is prevented if the next page location trap is closer to the vertical drawing position than the next text baseline would be. See Page Location Traps.
For more on localization, see the groff_tmac(5) man page.
See Page Location Traps.
See Drawing Geometric Objects.
or geometric objects; see Drawing Geometric Objects
to the top-level diversion; see Diversions
Plan 9 troff uses the register .S for this purpose.
This is pronounced to rhyme with “feeder”, and refers to how the glyphs “lead” the eye across the page to the corresponding page number or other datum.
A GNU nroff program is available for convenience; it calls GNU troff to perform the formatting.
Historically, the \c escape sequence has proven challenging to characterize. Some sources say it “connects the next input text” (to the input line on which it appears); others describe it as “interrupting” text, on the grounds that a text line is interrupted without breaking, perhaps to inject a request invocation or macro call.
See Traps.
See Diversions.
Terminals and some output devices have fonts that render at only one or two sizes. As examples of the latter, take the groff lj4 device’s Lineprinter, and lbp’s Courier and Elite faces.
Font designers prepare families such that the styles share esthetic properties.
Historically, the fonts troffs dealt with were not Free Software or, as with the Graphic Systems C/A/T, did not even exist in the digital domain.
See Font Description File Format.
See DESC File Format.
Not all versions of the man program support the -T option; use the subsequent example for an alternative.
This is “Normalization Form D” as documented in Unicode Standard Annex #15 (https://unicode.org/reports/tr15/).
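As a small illustration of what Normalization Form D does, the following Python sketch uses the standard library's `unicodedata.normalize` to decompose a precomposed character into its base character and combining mark. The helper name `nfd_code_points` is an assumption for this example and is not part of groff.

```python
import unicodedata


def nfd_code_points(text):
    """Return the code points of text after canonical
    decomposition (Normalization Form D)."""
    decomposed = unicodedata.normalize("NFD", text)
    return [f"U+{ord(ch):04X}" for ch in decomposed]


if __name__ == "__main__":
    # U+00E9 (e with acute accent) decomposes into the base letter
    # U+0065 followed by the combining acute accent U+0301.
    print(nfd_code_points("\u00e9"))
```

Text that is already fully decomposed, such as plain ASCII, passes through unchanged.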
See Compatibility Mode.
Output glyphs don’t; to GNU troff, a glyph is simply a box with an index into a font, a given height above and depth below the baseline, and a width.
Opinions of this escape sequence’s name abound. “Zero-width space” is a popular misnomer: roff formatters do not treat it like a space. Ossanna called it a “non-printing, zero-width character”, but the character causes output even though it does not “print”. If no output line is pending, the dummy character starts one. Contrast an empty input document with one containing only \&. The former produces no output; the latter, a blank page.
In text fonts, the tallest glyphs are typically parentheses. Unfortunately, in many cases the actual dimensions of the glyphs in a font do not closely match its declared type size! For example, in the standard PostScript font families, 10-point Times sets better with 9-point Helvetica and 11-point Courier than if all three were used at 10 points.
Rhyme with “sledding”; mechanical typography used lead metal (Latin plumbum).
The claim appears to have been true of Ossanna troff for the C/A/T device; Kernighan made device-independent troff more flexible.
See Device and Font Description Files.
also known vulgarly as “ANSI colors”
See Copy Mode.
This refers to vtroff, a translator that would convert the C/A/T output from early-vintage AT&T troff to a form suitable for Versatec and Benson-Varian plotters.
Strictly, letters not otherwise recognized are treated as output comparison delimiters. For portability, it is wise to avoid using letters not in the list above; for example, Plan 9 troff uses ‘h’ to test a mode it calls htmlroff, and GNU troff may provide additional operators in the future.
Because formatting of the comparands takes place in a dummy environment, vertical motions within them cannot spring traps.
All of this is to say that the lists of output nodes created by formatting xxx and yyy must be identical. See gtroff Internals.
This bizarre behavior maintains compatibility with AT&T troff.
See while.
See Copy Mode.
unless you redefine it
“somewhat less” because things other than macro calls can be on the input stack
See Copy Mode.
While it is possible to define and call a macro ‘.’, you can’t use it as an end macro: during a macro definition, ‘..’ is never handled as calling ‘.’, even if ‘.de name .’ explicitly precedes it.
Its structure is adapted from, and isomorphic to, part of a solution by Tadziu Hoffman to the problem of reflowing text multiple times to find an optimal configuration for it. https://lists.gnu.org/archive/html/groff/2008-12/msg00006.html
If they were not, parameter interpolations would be similar to command-line parameters: fixed for the entire duration of a roff program’s run. The advantage of interpolating \$ escape sequences even in copy mode is that they can interpolate different contents from one call to the next, like function parameters in a procedural language. The additional escape character is the price of this power.
Compare this to the \def and \edef commands in TeX.
These are lightly adapted from the groff implementation of the ms macros.
At the grops defaults of 10-point type on 12-point vertical spacing, the difference between half a vee and half an em can be subtle: large spacings like ‘.vs .5i’ make it obvious.
See Strings, for an explanation of the trailing ‘\"’.
(hc, vc) is adjusted to the point nearest the perpendicular bisector of the arc’s chord.
A trap planted at ‘20i’ or ‘-30i’ will not be sprung on a page of length ‘11i’.
It may help to think of each trap location as maintaining a queue; wh operates on the head of the queue, and ch operates on its tail. Only the trap at the head of the queue is visible.
See Debugging.
See Diversions.
While processing an end-of-input macro, the formatter assumes that the next page break must be the last; it goes into “sudden death overtime”.
Another, taken by the groff man macros, is to intercept ne requests and wrap bp ones.
Thus, the “water” gets “higher” proceeding down the page.
The backslash is doubled. See Copy Mode.
that is, ISO 646:1991-IRV or, popularly, “US-ASCII”
They are bypassed because these parameters are not rendered as glyphs in the output; instead, they remain abstract characters—in a PDF bookmark or a URL, for example.
Recall Line Layout.
Historically, tools named nrchbar and changebar were developed for marking changes with margin characters and could be found in archives of the comp.sources.unix USENET group. Some proprietary Unices also offer(ed) a diffmk program.
Except the escape sequences \f, \F, \H, \m, \M, \R, \s, and \S, which are processed immediately if not in copy mode.
The Graphic Systems C/A/T phototypesetter (the original device target for AT&T troff) supported only a few discrete type sizes in the range 6–36 points, so Ossanna contrived a special case in the parser to do what the user must have meant. Kernighan warned of this in the 1992 revision of CSTR #54 (§2.3), and more recently, McIlroy referred to it as a “living fossil”.
DWB 3.3, Solaris, Heirloom Doctools, and Plan 9 troff all support it.
Naturally, if you’ve changed the escape character, you need to prefix the e with whatever it is, and you’ll likely get something other than a backslash in the output.
The rs special character identifier was not defined in AT&T troff’s font description files, but is in those of its lineal descendant, Heirloom Doctools troff, as of the latter’s 060716 release (July 2006).
The parser and postprocessor for intermediate output can be found in the file groff-source-dir/src/libs/libdriver/input.cpp.
Plan 9 troff has also abandoned the binary format.
800-point type is not practical for most purposes, but using it enables the quantities in the font description files to be expressed as integers.
groff requests and escape sequences interpret non-negative font names as mounting positions instead. Further, a font named ‘0’ cannot be automatically mounted by the fonts directive of a DESC file.
For typesetter devices, this directive is misnamed since it starts a list of glyphs, not characters.
that is, any integer parsable by the C standard library’s strtol(3) function