From d318611dd6f23fcfedd50e9b9e24620b102ba96a Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Mon, 15 Apr 2024 21:44:05 +0200
Subject: Adding upstream version 1.23.0.

Signed-off-by: Daniel Baumann
---
 doc/groff.html.node/Manipulating-Hyphenation.html | 580 ++++++++++++++++++++++
 1 file changed, 580 insertions(+)
 create mode 100644 doc/groff.html.node/Manipulating-Hyphenation.html

diff --git a/doc/groff.html.node/Manipulating-Hyphenation.html b/doc/groff.html.node/Manipulating-Hyphenation.html
new file mode 100644
index 0000000..6d88ea8
--- /dev/null
+++ b/doc/groff.html.node/Manipulating-Hyphenation.html
@@ -0,0 +1,580 @@
+Manipulating Hyphenation (The GNU Troff Manual)
+ +
+

5.10 Manipulating Hyphenation

+ + + + + +

When filling, GNU troff hyphenates words as needed at +user-specified and automatically determined hyphenation points. The +machine-driven determination of hyphenation points in words requires +algorithms and data, and is susceptible to conventions and preferences. +Before tackling such automatic hyphenation, let us consider how +hyphenation points can be set explicitly. +

+ + + + +

Explicitly hyphenated words such as “mother-in-law” are eligible for +breaking after each of their hyphens. Relatively few words in a +language offer such obvious break points, however, and automatic +detection of syllabic (or phonetic) boundaries for hyphenation is not +perfect, particularly for +unusual words found in technical literature. We can instruct GNU +troff how to hyphenate specific words if the need arises. +

+ +
+
Request: .hw word …
+
+

Define each hyphenation exception word with each hyphen ‘-’ +in the word indicating a hyphenation point. For example, the request +

+
+
.hw in-sa-lub-rious alpha
+
+ +

marks potential hyphenation points in “insalubrious”, and prevents +“alpha” from being hyphenated at all. +

+

Besides the space character, any character whose hyphenation code is +zero can be used to separate the arguments of hw (see the +hcode request below). In addition, this request can be used more +than once. +

+ +

Hyphenation points specified with hw are not subject to the +within-word placement restrictions imposed by the hy request (see +below). +

+

Hyphenation exceptions specified with the hw request are +associated with the hyphenation language (see the hla request +below) and environment (see Environments); invoking the hw +request in the absence of a hyphenation language is an error. +

+

The request is ignored if there are no parameters. +

+ +

These are known as hyphenation exceptions in the expectation +that most users will avail themselves of automatic hyphenation; these +exceptions override any rules that would normally apply to a word +matching a hyphenation exception defined with hw. +

+

Situations also arise when only a specific occurrence of a word needs +its hyphenation altered or suppressed, or when a URL or similar string +needs to be breakable in sensible places without hyphenation. +

+
+
Escape sequence: \%
+
+
Escape sequence: \:
+
+ + + + +

To tell GNU troff how to hyphenate words as they occur in input, +use the \% escape sequence; it is the default hyphenation +character. Each instance within a word indicates to GNU troff +that the word may be hyphenated at that point, while prefixing a word +with this escape sequence prevents it from being otherwise hyphenated. +This mechanism affects only that occurrence of the word; to change the +hyphenation of a word for the remainder of input processing, use the +hw request. +
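A minimal sketch of both uses of \% described above, reusing the word from the hw example; the surrounding sentence is invented for illustration only.

```roff
.\" Mark hyphenation points for this occurrence only.
This occurrence of in\%sa\%lub\%rious may break at a marked point,
.\" A leading \% keeps this occurrence from being hyphenated.
while \%insalubrious here is kept in one piece.
```

Later occurrences of the word that carry no \% are hyphenated by the usual rules, because the escape sequence affects only the occurrence it appears in.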

+ + + +

GNU troff regards the escape sequences \X and \Y as +starting a word; that is, the \% escape sequence in, say, +‘\X'...'\%foobar’ or ‘\Y'...'\%foobar’ no longer +prevents hyphenation of ‘foobar’ but inserts a hyphenation point +just prior to it; most likely this isn’t what you want. +See Postprocessor Access. +

+ + + + + + +

\: inserts a non-printing break point; that is, a word can break +there, but the soft hyphen glyph (see below) is not written to the +output if it does. This escape sequence is an input word boundary, so +the remainder of the word is subject to hyphenation as normal. +

+

You can combine \: and \% to control breaking of a file +name or URL, or to permit hyphenation only after certain explicit +hyphens within a word. +

+
+
The \%Lethbridge-Stewart-\:\%Sackville-Baggins divorce
+was, in retrospect, inevitable once the contents of
+\%/var/log/\:\%httpd/\:\%access_log on the family web
+server came to light, revealing visitors from Hogwarts.
+
+
+ +
+
Request: .hc [char]
+
+

Change the hyphenation character to char. This character then +works as the \% escape sequence normally does, and thus no longer +appears in the output. Without an +argument, hc resets the hyphenation character to \% (the +default). The hyphenation character is associated with the environment +(see Environments). +
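The following sketch shows one way the hc request might be used. The choice of ‘^’ as the hyphenation character and the sample sentence are assumptions made for illustration.

```roff
.\" Assumption: '^' is not otherwise needed in this text.
.hc ^
The word hy^phen^a^tion may break at each caret,
and ^groff is not hyphenated at all.
.\" Restore the default hyphenation character, \%.
.hc
```

Because the hyphenation character is associated with the environment, a change like this one applies only to the environment in which hc is invoked.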

+ +
+
Request: .shc [c]
+
+ + + + + + +

Set the soft hyphen character, inserted when a word is hyphenated +automatically or at a hyphenation character, to the ordinary or special +character c. If the argument is omitted, the soft +hyphen character is set to the default, \[hy]. If no glyph for +c exists in the font in use at a potential hyphenation point, then +the line is not broken there. Neither character definitions (specified +with the char and similar requests) nor translations (specified +with the tr request) are applied to c. +

+ + + +

Several requests influence automatic hyphenation. Because conventions +vary, a variety of hyphenation modes is available to the hy +request; these determine whether hyphenation will apply to a +word prior to breaking a line at the end of a page (more or less; see +below for details), and at which positions within that word +automatically determined hyphenation points are permissible. The places +within a word that are eligible for hyphenation are determined by +language-specific data and lettercase relationships. Furthermore, +hyphenation of a word might be suppressed due to a limit on +consecutive hyphenated lines (hlm), a minimum line length +threshold (hym), or because the line can instead be adjusted with +additional inter-word space (hys). +

+ +
+
Request: .hy [mode]
+
+
Register: \n[.hy]
+
+

Set automatic hyphenation mode to mode, an integer encoding +conditions for hyphenation; if omitted, ‘1’ is implied. The +hyphenation mode is available in the read-only register ‘.hy’; it +is associated with the environment (see Environments). The default +hyphenation mode depends on the localization file loaded when GNU +troff starts up; see the hpf request below. +

+

Typesetting practice generally does not avail itself of every +opportunity for hyphenation, but the details differ by language and site +mandates. The hyphenation modes of AT&T troff were +implemented with English-language publishing practices of the 1970s in +mind, not a scrupulous enumeration of conceivable parameters. GNU +troff extends those modes such that finer-grained control is +possible, favoring compatibility with older implementations over a more +intuitive arrangement. The means of hyphenation mode control is a set +of numbers that can be added up to encode the behavior +sought. The entries in the +following table are termed values; the sum of the desired +values is the mode. +

+
+
0
+

disables hyphenation. +

+
+
1
+

enables hyphenation except after the first and before the last character +of a word. +

+
+ +

The remaining values “imply” 1; that is, they enable hyphenation +under the same conditions as ‘.hy 1’, and then apply or lift +restrictions relative to that basis. +

+
+
2
+

disables hyphenation of the last word on a page, even for explicitly hyphenated words. +

+
+
4
+

disables hyphenation before the last two characters of a word. +

+
+
8
+

disables hyphenation after the first two characters of a word. +

+
+
16
+

enables hyphenation before the last character of a word. +

+
+
32
+

enables hyphenation after the first character of a word. +

+
+ +

Apart from value 2, restrictions imposed by the hyphenation mode +are not respected for words whose hyphenations have been +specified with the hyphenation character (‘\%’ by default) or the +hw request. +

+

Nonzero values in the previous table are additive. For example, +mode 12 (4 + 8) causes GNU troff to hyphenate neither before the last two +nor after the first two characters of a word. Some values cannot be used +together because they contradict each other; for instance, values 4 and 16, +and values 8 and 32. As noted, it is superfluous to add 1 to any +nonzero even mode. +
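The arithmetic of combining values can be sketched as follows. The message text passed to tm is an arbitrary choice for illustration.

```roff
.\" 4 (no break before the last two characters)
.\" + 8 (no break after the first two characters) = 12
.hy 12
.\" Report the mode in effect on the standard error stream.
.tm hyphenation mode: \n[.hy]
```

Adding 1 to this sum would change nothing about the conditions requested, since values 4 and 8 already imply it.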

+ + +

The automatic placement of hyphens in words is determined by +pattern files, which are derived from TeX and available for +several languages. The number of characters at the beginning of a word +after which the first hyphenation point should be inserted is determined +by the patterns themselves; it can’t be reduced further without +introducing additional, invalid hyphenation points (unfortunately, this +information is not part of a pattern file—you have to know it in +advance). The same is true for the number of characters at the end of +a word before the last hyphenation point should be inserted. For +example, you can supply the following input to ‘echo $(nroff)’. +

+
+
.ll 1
+.hy 48
+splitting
+
+ +

You will get +

+
+
s- plit- t- in- g
+
+ +

instead of the correct ‘split- ting’. English patterns as distributed +with GNU troff need two characters at the beginning and three +characters at the end; this means that value 4 of hy is +mandatory. Value 8 is possible as an additional restriction, but +values 16 and 32 should be avoided, as should mode 1. +Modes 4 and 6 are typical. +

+

A table of left and right minimum character counts for hyphenation as +needed by the patterns distributed with GNU troff follows; see +the groff_tmac(5) man page for more information on GNU +troff’s language macro files. +

+ + + + + + + + + + +
language            pattern name   left min   right min
Czech               cs             2          2
English             en             2          3
French              fr             2          3
German traditional  det            2          2
German reformed     den            2          2
Italian             it             2          2
Swedish             sv             1          2
+ +

Hyphenation exceptions within pattern files (i.e., the words within a +TeX \hyphenation group) obey the hyphenation restrictions +given by hy. +

+ +
+
Request: .nh
+
+

Disable automatic hyphenation; i.e., set the hyphenation mode to 0 +(see above). The hyphenation mode of the last call to hy is not +remembered. +
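Because nh does not remember the previous mode, a document that wants to switch hyphenation off temporarily has to save the mode itself. The sketch below does so with a register whose name, savedhy, is hypothetical.

```roff
.\" Save the current mode in a (hypothetically named) register.
.nr savedhy \n[.hy]
.nh
This paragraph is set without automatic hyphenation.
.\" Restore the mode that was in effect before nh.
.hy \n[savedhy]
```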

+ +
+
Request: .hpf pattern-file
+
+
Request: .hpfa pattern-file
+
+
Request: .hpfcode a b [c d] …
+
+ + +

Read hyphenation patterns from pattern-file, which is sought +in the same way that macro files are with the mso request or the +-mname command-line option to groff. The +pattern-file should have the same format as (simple) TeX +pattern files. More specifically, the following scanning rules are +implemented. +

+
    +
  • A percent sign starts a comment (up to the end of the line) even if +preceded by a backslash. + +
  • “Digraphs” like \$ are not supported. + +
  • ^^xx (where each x is 0–9 or a–f) and +^^c (character c in the code point range 0–127 +decimal) are recognized; other uses of ^ cause an error. + +
  • No macro expansion is performed. + +
  • hpf checks for the expression \patterns{…} +(possibly with whitespace before or after the braces). Everything +between the braces is taken as hyphenation patterns. Consequently, +{ and } are not allowed in patterns. + +
  • Similarly, \hyphenation{…} gives a list of hyphenation +exceptions. + +
  • \endinput is recognized also. + +
  • For backward compatibility, if \patterns is missing, the whole +file is treated as a list of hyphenation patterns (except that the +% character is recognized as the start of a comment). +
+ +

The hpfa request appends a file of patterns to the current list. +

+

The hpfcode request defines mapping values for character codes in +pattern files. It is an older mechanism no longer used by GNU +troff’s own macro files; for its successor, see hcode +below. The hpf and hpfa requests apply the mapping after reading the +patterns but before replacing or appending to the active list of +patterns. The arguments of hpfcode are pairs of character codes (integers from 0 +to 255). The request maps character code a to +code b, code c to code d, and so on. +Character codes that would otherwise be invalid in GNU troff can +be used. By default, every code maps to itself except those for letters +‘A’ to ‘Z’, which map to those for ‘a’ to ‘z’. +
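As a hedged illustration, the fragment below extends the default uppercase-to-lowercase mapping to three accented letters. It assumes a pattern file encoded in ISO 8859-1 (Latin-1), in which those letters have the code points shown; whether a given pattern file needs such a mapping depends on its encoding.

```roff
.\" Assumption: the pattern file uses ISO 8859-1 (Latin-1) codes.
.\" Map uppercase A, O, and U with diaeresis (196, 214, 220)
.\" to their lowercase counterparts (228, 246, 252).
.hpfcode 196 228  214 246  220 252
.\" Issue this before the hpf or hpfa request that reads the file,
.\" since the mapping is applied as the patterns are read.
```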

+ + + + + + + + + + +

The set of hyphenation patterns is associated with the language set by +the hla request (see below). The hpf request is usually +invoked by a localization file loaded by the troffrc +file. +

+

A second call to hpf (for the same language) replaces the +hyphenation patterns with the new ones. Invoking hpf or +hpfa causes an error if there is no hyphenation language. If no +hpf request is specified (either in the document, in a file +loaded at startup, or in a macro package), GNU troff won’t +automatically hyphenate at all. +
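The order of operations described here can be sketched as follows. Both file names are hypothetical; the language code fr is taken from the table of pattern names above.

```roff
.\" Select a hyphenation language first; hpf and hpfa
.\" cause an error without one.
.hla fr
.\" Hypothetical file name, sought like a macro file.
.\" This replaces any patterns already loaded for "fr".
.hpf mypatterns.fr
.\" Append a second (hypothetical) file to the same list.
.hpfa morepatterns.fr
```

Repeating the hpf request with another file would discard the patterns loaded so far for that language, whereas hpfa keeps them.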

+ +
+
Request: .hcode c1 code1 [c2 code2] …
+
+ + +

Set the hyphenation code of character c1 to code1, that of +c2 to code2, and so on. A hyphenation code must be an +ordinary character (not a special character escape sequence) other than +a digit or a space. The request is ignored if given no arguments. +

+

For hyphenation to work, hyphenation codes must be set up. At +startup, GNU troff assigns hyphenation codes to the letters +‘a’–‘z’ (mapped to themselves), to the letters +‘A’–‘Z’ (mapped to ‘a’–‘z’), and zero to all other +characters. Normally, hyphenation patterns contain only lowercase +letters, yet they should be applied regardless of lettercase. In other words, +they assume that the words ‘FOO’ and ‘Foo’ should be hyphenated exactly +as ‘foo’ is. The hcode request extends this principle to letters +outside the Unicode basic Latin alphabet; without it, words containing +such letters won’t be hyphenated properly even if the corresponding +hyphenation patterns contain them. +

+

For example, the following hcode requests are necessary to assign +hyphenation codes to the letters ‘ÄäÖöÜüß’, needed for German. +

+
+
.hcode ä ä  Ä ä
+.hcode ö ö  Ö ö
+.hcode ü ü  Ü ü
+.hcode ß ß
+
+ +

Without these assignments, GNU troff treats the German word +‘Kindergärten’ (the plural form of ‘kindergarten’) as two words +‘kinderg’ and ‘rten’ because the hyphenation code of the +umlaut a is zero by default, just like a space. There is a German +hyphenation pattern that covers ‘kinder’, so GNU troff finds +the hyphenation ‘kin-der’. The other two hyphenation points +(‘kin-der-gär-ten’) are missed. +

+ +
+
Request: .hla lang
+
+
Register: \n[.hla]
+
+ + + + +

Set the hyphenation language to lang. Hyphenation exceptions +specified with the hw request and hyphenation patterns and +exceptions specified with the hpf and hpfa requests are +associated with the hyphenation language. The hla request is +usually invoked by a localization file, which is in turn loaded by the +troffrc or troffrc-end file; see the hpf request +above. +

+ +

The hyphenation language is available in the read-only string-valued +register ‘.hla’; it is associated with the environment +(see Environments). +
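A small sketch of how the hyphenation language ties the pieces together. The language code en comes from the table of pattern names above, while the exception word and the message text are arbitrary choices for illustration.

```roff
.hla en
.\" This exception is associated with the language "en".
.hw data-base
.\" Report the hyphenation language on the standard error stream.
.tm hyphenation language: \n[.hla]
```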

+ +
+
Request: .hlm [n]
+
+
Register: \n[.hlm]
+
+
Register: \n[.hlc]
+
+ + + + + +

Set the maximum quantity of consecutive hyphenated lines to n. If +n is negative, there is no maximum. If omitted, n +is −1. This value is associated with the environment +(see Environments). Only lines output from a given environment +count toward the maximum associated with that environment. Hyphens +resulting from \% are counted; explicit hyphens are not. +

+ + +

The .hlm read-only register stores this maximum. The count of +immediately preceding consecutive hyphenated lines is available in the +read-only register .hlc. +
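For instance, a document might cap runs of hyphenated lines and later lift the cap again. The limit of 2 is an arbitrary value chosen for illustration.

```roff
.\" Permit at most two consecutive hyphenated lines.
.hlm 2
.\" Remove the limit; a negative argument means no maximum.
.hlm -1
```

Invoking hlm without an argument has the same effect as the second request, since the omitted argument is taken to be −1.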

+ +
+
Request: .hym [length]
+
+
Register: \n[.hym]
+
+ + + +

Set the (right) hyphenation margin to length. If the adjustment +mode is not ‘b’ or ‘n’, the line is not hyphenated if it is +shorter than length. Without an argument, the hyphenation margin +is reset to its default value, 0. The default scaling unit is ‘m’. +The hyphenation margin is associated with the environment +(see Environments). +

+

A negative argument resets the hyphenation margin to zero, emitting a +warning in category ‘range’. +

+ +

The hyphenation margin is available in the .hym read-only +register. +
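A sketch of setting a hyphenation margin, assuming left-aligned text so that the adjustment mode is neither ‘b’ nor ‘n’; the margin of 2m is an arbitrary example value.

```roff
.\" Adjustment mode l is neither b nor n, so hym takes effect.
.ad l
.\" Set the hyphenation margin to 2 ems.
.hym 2m
.\" Reset the margin to its default, 0.
.hym
```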

+ +
+
Request: .hys [hyphenation-space]
+
+
Register: \n[.hys]
+
+ + + +

Suppress hyphenation of the line in adjustment modes ‘b’ or +‘n’ if it can be justified by adding no more than +hyphenation-space extra space to each inter-word space. Without +an argument, the hyphenation space adjustment threshold is set to its +default value, 0. The default scaling unit is ‘m’. The +hyphenation space adjustment threshold is associated with the +environment (see Environments). +

+

A negative argument resets the hyphenation space adjustment threshold to +zero, emitting a warning in category ‘range’. +

+ +

The hyphenation space adjustment threshold is available in the +.hys read-only register. +
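The counterpart for adjusted text might look like this. The threshold of 0.5m is an arbitrary example value.

```roff
.\" Adjust both margins; hys applies in modes b and n.
.ad b
.\" Accept up to 0.5 em of extra space per inter-word space
.\" rather than hyphenating the line.
.hys 0.5m
.\" Reset the threshold to its default, 0.
.hys
```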

+ + + +
+
+ + + + + + -- cgit v1.2.3