'\" t
.\" Copyright (c) Bruno Haible <haible@clisp.cons.org>
.\"
.\" SPDX-License-Identifier: GPL-2.0-or-later
.\"
.\" References consulted:
.\" GNU glibc-2 source code and manual
.\" Dinkumware C library reference http://www.dinkumware.com/
.\" OpenGroup's Single UNIX specification http://www.UNIX-systems.org/online.html
.\" ISO/IEC 9899:1999
.\"
.TH mbsinit 3 2023-07-20 "Linux man-pages 6.05.01"
.SH NAME
mbsinit \- test for initial shift state
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.B #include <wchar.h>
.PP
.BI "int mbsinit(const mbstate_t *" ps );
.fi
.SH DESCRIPTION
Character conversion between the multibyte representation and the wide
character representation uses conversion state, of type
.IR mbstate_t .
Conversion of a string uses a finite-state machine; when it is interrupted
after the complete conversion of a number of characters, it may need to
save a state for processing the remaining characters.
Such a conversion state is needed for stateful encodings such as
ISO-2022 and UTF-7,
in which shift sequences change how the following bytes are interpreted.
.PP
The initial state is the state at the beginning of conversion of a string.
There are two kinds of state: the one used by multibyte to wide character
conversion functions, such as
.BR mbsrtowcs (3),
and the one used by wide
character to multibyte conversion functions, such as
.BR wcsrtombs (3),
but they both fit in a
.IR mbstate_t ,
and they both have the same
representation for an initial state.
.PP
For 8-bit encodings, all states are equivalent to the initial state.
For multibyte encodings like UTF-8, EUC-*, BIG5, or SJIS, the wide character
to multibyte conversion functions never produce non-initial states, but the
multibyte to wide-character conversion functions like
.BR mbrtowc (3)
do
produce non-initial states when interrupted in the middle of a character.
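.PP
The following sketch illustrates the last point.
It assumes that a UTF-8 locale named "C.UTF-8" or "en_US.UTF-8" is installed;
the function name
.I partial_char_demo
is hypothetical and not part of any library.
Only the first byte of a two-byte UTF-8 sequence is passed to
.BR mbrtowc (3),
which should leave the state non-initial until the second byte arrives.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, for illustration only.
   Returns -1 if no UTF-8 locale could be selected (assumed names below),
   1 if the state was non-initial after half a character and initial
   again once the character was completed, and 0 otherwise. */
int partial_char_demo(void)
{
    /* UTF-8 encoding of U+00E9, written without escape sequences */
    const char e_acute[2] = { (char) 0xC3, (char) 0xA9 };
    mbstate_t state;
    wchar_t wc;
    size_t n;
    int mid, end;

    /* Assumption: one of these locale names exists on the system */
    if (setlocale(LC_CTYPE, "C.UTF-8") == NULL &&
        setlocale(LC_CTYPE, "en_US.UTF-8") == NULL)
        return -1;

    memset(&state, 0, sizeof(state));

    /* Feed only the first byte: the character is incomplete */
    n = mbrtowc(&wc, e_acute, 1, &state);
    if (n != (size_t) -2)
        return 0;
    mid = mbsinit(&state);      /* should be 0: mid-character */

    /* Feed the remaining byte: the character is now complete */
    n = mbrtowc(&wc, e_acute + 1, 1, &state);
    if (n == (size_t) -1 || n == (size_t) -2)
        return 0;
    end = mbsinit(&state);      /* should be nonzero again */

    return mid == 0 && end != 0;
}
```

.PP
A caller would treat a return value of \-1 as "no UTF-8 locale available"
rather than as a failure of the conversion itself.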
.PP
One possible way to create an
.I mbstate_t
in initial state is to set it to zero:
.PP
.in +4n
.EX
mbstate_t state;
memset(&state, 0, sizeof(state));
.EE
.in
.PP
On Linux, the following works as well, but might generate compiler warnings:
.PP
.in +4n
.EX
mbstate_t state = { 0 };
.EE
.in
.PP
The function
.BR mbsinit ()
tests whether
.I *ps
corresponds to an
initial state.
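.PP
One common use of this test is to detect input that ends in the middle of
a character.
The sketch below is an illustration under stated assumptions, not a
reference implementation:
.I count_chars
is a hypothetical name, and a null character is assumed to occupy one byte.
It decodes a buffer with
.BR mbrtowc (3)
in the current locale and then asks
.BR mbsinit ()
whether the conversion stopped on a character boundary.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, for illustration only.
   Counts the characters in buf[0..len) using the current locale.
   On success, sets *complete to nonzero if the input ended on a
   character boundary and to 0 if it ended inside a character.
   Returns (size_t) -1 on an invalid sequence; *complete is then
   left unchanged. */
size_t count_chars(const char *buf, size_t len, int *complete)
{
    mbstate_t state;
    size_t count = 0;

    memset(&state, 0, sizeof(state));   /* start in the initial state */

    while (len > 0) {
        size_t n = mbrtowc(NULL, buf, len, &state);

        if (n == (size_t) -1)
            return (size_t) -1;         /* invalid multibyte sequence */
        if (n == (size_t) -2)
            break;                      /* input ran out mid-character */
        if (n == 0)
            n = 1;                      /* assumption: null is one byte */

        buf += n;
        len -= n;
        count++;
    }

    *complete = mbsinit(&state);
    return count;
}
```

.PP
If
.I *complete
is 0 after the call, the caller can keep the unconsumed tail of the buffer
and retry once more input has arrived.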
.SH RETURN VALUE
.BR mbsinit ()
returns nonzero if
.I *ps
is an initial state, or if
.I ps
is NULL.
Otherwise, it returns 0.
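.PP
The two nonzero cases can be checked directly.
This is a minimal sketch; the function names are hypothetical and exist
only for illustration.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if a freshly zeroed mbstate_t
   is reported as an initial state, and 0 otherwise. */
int zeroed_state_is_initial(void)
{
    mbstate_t state;

    memset(&state, 0, sizeof(state));
    return mbsinit(&state) != 0;
}

/* Hypothetical helper: returns 1 if mbsinit() reports nonzero
   for a null pointer, and 0 otherwise. */
int null_state_is_initial(void)
{
    return mbsinit(NULL) != 0;
}
```

.PP
Because only "nonzero" is specified, the helpers compare the result against 0
instead of testing for the value 1.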
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
.TS
allbox;
lbx lb lb
l l l.
Interface	Attribute	Value
T{
.na
.nh
.BR mbsinit ()
T}	Thread safety	MT-Safe
.TE
.sp 1
.SH STANDARDS
C11, POSIX.1-2008.
.SH HISTORY
POSIX.1-2001, C99.
.SH NOTES
The behavior of
.BR mbsinit ()
depends on the
.B LC_CTYPE
category of the
current locale.
.SH SEE ALSO
.BR mbrlen (3),
.BR mbrtowc (3),
.BR mbsrtowcs (3),
.BR wcrtomb (3),
.BR wcsrtombs (3)