1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
|
'\" t
.\" Copyright (c) Bruno Haible <haible@clisp.cons.org>
.\"
.\" SPDX-License-Identifier: GPL-2.0-or-later
.\"
.\" References consulted:
.\" GNU glibc-2 source code and manual
.\" Dinkumware C library reference http://www.dinkumware.com/
.\" OpenGroup's Single UNIX specification http://www.UNIX-systems.org/online.html
.\" ISO/IEC 9899:1999
.\"
.TH mblen 3 2023-03-30 "Linux man-pages 6.04"
.SH NAME
mblen \- determine number of bytes in next multibyte character
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
.SH SYNOPSIS
.nf
.B #include <stdlib.h>
.PP
.BI "int mblen(const char " s [. n "], size_t " n );
.fi
.SH DESCRIPTION
If
.I s
is not NULL, the
.BR mblen ()
function inspects at most
.I n
bytes of the multibyte string starting at
.I s
and extracts the
next complete multibyte character.
It uses a static anonymous shift state known only to the
.BR mblen ()
function.
If the multibyte character is not the null wide
character, it returns the number of bytes that were consumed from
.IR s .
If the multibyte character is the null wide character, it returns 0.
.PP
If the
.I n
bytes starting at
.I s
do not contain a complete multibyte
character,
.BR mblen ()
returns \-1.
This can happen even if
.I n
is greater than or equal to
.IR MB_CUR_MAX ,
if the multibyte string contains redundant shift sequences.
.PP
If the multibyte string starting at
.I s
contains an invalid multibyte
sequence before the next complete character,
.BR mblen ()
also returns \-1.
.PP
If
.I s
is NULL, the
.BR mblen ()
function
.\" The Dinkumware doc and the Single UNIX specification say this, but
.\" glibc doesn't implement this.
resets the shift state, known to only this function, to the initial state, and
returns nonzero if the encoding has nontrivial shift state, or zero if the
encoding is stateless.
.SH RETURN VALUE
The
.BR mblen ()
function returns the number of
bytes parsed from the multibyte
sequence starting at
.IR s ,
if a non-null wide character was recognized.
It returns 0, if a null wide character was recognized.
It returns \-1, if an
invalid multibyte sequence was encountered or if it couldn't parse a complete
multibyte character.
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
.ad l
.nh
.TS
allbox;
lbx lb lb
l l l.
Interface Attribute Value
T{
.BR mblen ()
T} Thread safety MT-Unsafe race
.TE
.hy
.ad
.sp 1
.SH VERSIONS
The function
.BR mbrlen (3)
provides a better interface to the same
functionality.
.SH STANDARDS
C11, POSIX.1-2008.
.SH HISTORY
POSIX.1-2001, C99.
.SH NOTES
The behavior of
.BR mblen ()
depends on the
.B LC_CTYPE
category of the
current locale.
.SH SEE ALSO
.BR mbrlen (3)
|