summaryrefslogtreecommitdiffstats
path: root/man7/locale.7
blob: 49aa36708946317753b2dc901208dd64f325de3b (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
.\" Copyright (c) 1993 by Thomas Koenig (ig25@rz.uni-karlsruhe.de)
.\" and Copyright (C) 2014 Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
.\" Modified Sat Jul 24 17:28:34 1993 by Rik Faith <faith@cs.unc.edu>
.\" Modified Sun Jun 01 17:16:34 1997 by Jochen Hein
.\"   <jochen.hein@delphi.central.de>
.\" Modified Thu Apr 25 00:43:19 2002 by Bruno Haible <bruno@clisp.org>
.\"
.TH locale 7 2023-05-03 "Linux man-pages 6.05.01"
.SH NAME
locale \- description of multilanguage support
.SH SYNOPSIS
.nf
.B #include <locale.h>
.fi
.SH DESCRIPTION
A locale is a set of language and cultural rules.
These cover aspects
such as language for messages, different character sets, lexicographic
conventions, and so on.
A program needs to be able to determine its locale
and act accordingly to be portable to different cultures.
.PP
The header
.I <locale.h>
declares data types, functions, and macros which are useful in this
task.
.PP
The functions it declares are
.BR setlocale (3)
to set the current locale, and
.BR localeconv (3)
to get information about number formatting.
.PP
There are different categories for locale information a program might
need; they are declared as macros.
Using them as the first argument
to the
.BR setlocale (3)
function, it is possible to set one of these to the desired locale:
.TP
.BR LC_ADDRESS " (GNU extension, since glibc 2.2)"
.\" See ISO/IEC Technical Report 14652
Change settings that describe the formats (e.g., postal addresses)
used to describe locations and geography-related items.
Applications that need this information can use
.BR nl_langinfo (3)
to retrieve nonstandard elements, such as
.B _NL_ADDRESS_COUNTRY_NAME
(country name, in the language of the locale)
and
.B _NL_ADDRESS_LANG_NAME
(language name, in the language of the locale),
which return strings such as "Deutschland" and "Deutsch"
(for German-language locales).
(Other element names are listed in
.IR <langinfo.h> .)
.TP
.B LC_COLLATE
This category governs the collation rules used for
sorting and regular expressions,
including character equivalence classes and
multicharacter collating elements.
This locale category changes the behavior of the functions
.BR strcoll (3)
and
.BR strxfrm (3),
which are used to compare strings in the local alphabet.
For example,
the German sharp s is sorted as "ss".
.TP
.B LC_CTYPE
This category determines the interpretation of byte sequences as characters
(e.g., single versus multibyte characters), character classifications
(e.g., alphabetic or digit), and the behavior of character classes.
On glibc systems, this category also determines
the character transliteration rules for
.BR iconv (1)
and
.BR iconv (3).
It changes the behavior of the character handling and
classification functions, such as
.BR isupper (3)
and
.BR toupper (3),
and the multibyte character functions such as
.BR mblen (3)
or
.BR wctomb (3).
.TP
.BR LC_IDENTIFICATION " (GNU extension, since glibc 2.2)"
.\" See ISO/IEC Technical Report 14652
Change settings that relate to the metadata for the locale.
Applications that need this information can use
.BR nl_langinfo (3)
to retrieve nonstandard elements, such as
.B _NL_IDENTIFICATION_TITLE
(title of this locale document)
and
.B _NL_IDENTIFICATION_TERRITORY
(geographical territory to which this locale document applies),
which might return strings such as "English locale for the USA"
and "USA".
(Other element names are listed in
.IR <langinfo.h> .)
.TP
.B LC_MONETARY
This category determines the formatting used for
monetary-related numeric values.
This changes the information returned by
.BR localeconv (3),
which describes the way numbers are usually printed, with details such
as decimal point versus decimal comma.
This information is internally
used by the function
.BR strfmon (3).
.TP
.B LC_MESSAGES
This category affects the language in which messages are displayed
and what an affirmative or negative answer looks like.
The GNU C library contains the
.BR gettext (3),
.BR ngettext (3),
and
.BR rpmatch (3)
functions to ease the use of this information.
The GNU gettext family of
functions also obey the environment variable
.B LANGUAGE
(containing a colon-separated list of locales)
if the category is set to a valid locale other than
.BR """C""" .
This category also affects the behavior of
.BR catopen (3).
.TP
.BR LC_MEASUREMENT " (GNU extension, since glibc 2.2)"
Change the settings relating to the measurement system in the locale
(i.e., metric versus US customary units).
Applications can use
.BR nl_langinfo (3)
to retrieve the nonstandard
.B _NL_MEASUREMENT_MEASUREMENT
element, which returns a pointer to a character
that has the value 1 (metric) or 2 (US customary units).
.TP
.BR LC_NAME " (GNU extension, since glibc 2.2)"
.\" See ISO/IEC Technical Report 14652
Change settings that describe the formats used to address persons.
Applications that need this information can use
.BR nl_langinfo (3)
to retrieve nonstandard elements, such as
.B _NL_NAME_NAME_MR
(general salutation for men)
and
.B _NL_NAME_NAME_MS
(general salutation for women)
elements, which return strings such as "Herr" and "Frau"
(for German-language locales).
(Other element names are listed in
.IR <langinfo.h> .)
.TP
.B LC_NUMERIC
This category determines the formatting rules used for nonmonetary
numeric values\[em]for example,
the thousands separator and the radix character
(a period in most English-speaking countries,
but a comma in many other regions).
It affects functions such as
.BR printf (3),
.BR scanf (3),
and
.BR strtod (3).
This information can also be read with the
.BR localeconv (3)
function.
.TP
.BR LC_PAPER " (GNU extension, since glibc 2.2)"
.\" See ISO/IEC Technical Report 14652
Change the settings relating to the dimensions of the standard paper size
(e.g., US letter versus A4).
Applications that need the dimensions can obtain them by using
.BR nl_langinfo (3)
to retrieve the nonstandard
.B _NL_PAPER_WIDTH
and
.B _NL_PAPER_HEIGHT
elements, which return
.I int
values specifying the dimensions in millimeters.
.TP
.BR LC_TELEPHONE " (GNU extension, since glibc 2.2)"
.\" See ISO/IEC Technical Report 14652
Change settings that describe the formats to be used with telephone services.
Applications that need this information can use
.BR nl_langinfo (3)
to retrieve nonstandard elements, such as
.B _NL_TELEPHONE_INT_PREFIX
(international prefix used to call numbers in this locale),
which returns a string such as "49" (for Germany).
(Other element names are listed in
.IR <langinfo.h> .)
.TP
.B LC_TIME
This category governs the formatting used for date and time values.
For example, most of Europe uses a 24-hour clock versus the
12-hour clock used in the United States.
The setting of this category affects the behavior of functions such as
.BR strftime (3)
and
.BR strptime (3).
.TP
.B LC_ALL
All of the above.
.PP
If the second argument to
.BR setlocale (3)
is an empty string,
.IR \[dq]\[dq] ,
for the default locale, it is determined using the following steps:
.IP (1) 5
If there is a non-null environment variable
.BR LC_ALL ,
the value of
.B LC_ALL
is used.
.IP (2)
If an environment variable with the same name as one of the categories
above exists and is non-null, its value is used for that category.
.IP (3)
If there is a non-null environment variable
.BR LANG ,
the value of
.B LANG
is used.
.PP
Values about local numeric formatting is made available in a
.I struct lconv
returned by the
.BR localeconv (3)
function, which has the following declaration:
.PP
.in +4n
.EX
struct lconv {
\&
    /* Numeric (nonmonetary) information */
\&
    char *decimal_point;     /* Radix character */
    char *thousands_sep;     /* Separator for digit groups to left
                                of radix character */
    char *grouping;     /* Each element is the number of digits in
                           a group; elements with higher indices
                           are further left.  An element with value
                           CHAR_MAX means that no further grouping
                           is done.  An element with value 0 means
                           that the previous element is used for
                           all groups further left. */
\&
    /* Remaining fields are for monetary information */
\&
    char *int_curr_symbol;   /* First three chars are a currency
                                symbol from ISO 4217.  Fourth char
                                is the separator.  Fifth char
                                is \[aq]\e0\[aq]. */
    char *currency_symbol;   /* Local currency symbol */
    char *mon_decimal_point; /* Radix character */
    char *mon_thousands_sep; /* Like \fIthousands_sep\fP above */
    char *mon_grouping;      /* Like \fIgrouping\fP above */
    char *positive_sign;     /* Sign for positive values */
    char *negative_sign;     /* Sign for negative values */
    char  int_frac_digits;   /* International fractional digits */
    char  frac_digits;       /* Local fractional digits */
    char  p_cs_precedes;     /* 1 if currency_symbol precedes a
                                positive value, 0 if succeeds */
    char  p_sep_by_space;    /* 1 if a space separates
                                currency_symbol from a positive
                                value */
    char  n_cs_precedes;     /* 1 if currency_symbol precedes a
                                negative value, 0 if succeeds */
    char  n_sep_by_space;    /* 1 if a space separates
                                currency_symbol from a negative
                                value */
    /* Positive and negative sign positions:
       0 Parentheses surround the quantity and currency_symbol.
       1 The sign string precedes the quantity and currency_symbol.
       2 The sign string succeeds the quantity and currency_symbol.
       3 The sign string immediately precedes the currency_symbol.
       4 The sign string immediately succeeds the currency_symbol. */
    char  p_sign_posn;
    char  n_sign_posn;
};
.EE
.in
.SS POSIX.1-2008 extensions to the locale API
POSIX.1-2008 standardized a number of extensions to the locale API,
based on implementations that first appeared in glibc 2.3.
These extensions are designed to address the problem that
the traditional locale APIs do not mix well with multithreaded applications
and with applications that must deal with multiple locales.
.PP
The extensions take the form of new functions for creating and
manipulating locale objects
.RB ( newlocale (3),
.BR freelocale (3),
.BR duplocale (3),
and
.BR uselocale (3))
and various new library functions with the suffix "_l" (e.g.,
.BR toupper_l (3))
that extend the traditional locale-dependent APIs (e.g.,
.BR toupper (3))
to allow the specification of a locale object that should apply when
executing the function.
.SH ENVIRONMENT
The following environment variable is used by
.BR newlocale (3)
and
.BR setlocale (3),
and thus affects all unprivileged localized programs:
.TP
.B LOCPATH
A list of pathnames, separated by colons (\[aq]:\[aq]),
that should be used to find locale data.
If this variable is set,
only the individual compiled locale data files from
.B LOCPATH
and the system default locale data path are used;
any available locale archives are not used (see
.BR localedef (1)).
The individual compiled locale data files are searched for under
subdirectories which depend on the currently used locale.
For example, when
.I en_GB.UTF\-8
is used for a category, the following subdirectories are searched for,
in this order:
.IR en_GB.UTF\-8 ,
.IR en_GB.utf8 ,
.IR en_GB ,
.IR en.UTF\-8 ,
.IR en.utf8 ,
and
.IR en .
.SH FILES
.TP
.I /usr/lib/locale/locale\-archive
Usual default locale archive location.
.TP
.I /usr/lib/locale
Usual default path for compiled individual locale files.
.SH STANDARDS
POSIX.1-2001.
.\"
.\" The GNU gettext functions are specified in LI18NUX2000.
.SH SEE ALSO
.BR iconv (1),
.BR locale (1),
.BR localedef (1),
.BR catopen (3),
.BR gettext (3),
.BR iconv (3),
.BR localeconv (3),
.BR mbstowcs (3),
.BR newlocale (3),
.BR ngettext (3),
.BR nl_langinfo (3),
.BR rpmatch (3),
.BR setlocale (3),
.BR strcoll (3),
.BR strfmon (3),
.BR strftime (3),
.BR strxfrm (3),
.BR uselocale (3),
.BR wcstombs (3),
.BR locale (5),
.BR charsets (7),
.BR unicode (7),
.BR utf\-8 (7)