.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
. ds C` ""
. ds C' ""
'br\}
.el\{\
. ds C`
. ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
. if \nF \{\
. de IX
. tm Index:\\$1\t\\n%\t"\\$2"
..
. if !\nF==2 \{\
. nr % 0
. nr F 2
. \}
. \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "PERLCLIB 1"
.TH PERLCLIB 1 2023-11-28 "perl v5.38.2" "Perl Programmers Reference Guide"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
perlclib \- Internal replacements for standard C library functions
.SH DESCRIPTION
.IX Header "DESCRIPTION"
One thing Perl porters should note is that \fIperl\fR doesn't tend to use that
much of the C standard library internally; you'll see very little use of,
for example, the \fIctype.h\fR functions in there. This is because Perl
tends to reimplement or abstract standard library functions, so that we
know exactly how they're going to operate.
.PP
This is a reference card for people who are familiar with the C library
and who want to do things the Perl way; it tells them which functions
to use instead of the usual C functions.
.SS Conventions
.IX Subsection "Conventions"
In the following tables:
.ie n .IP """t""" 3
.el .IP \f(CWt\fR 3
.IX Item "t"
is a type.
.ie n .IP """p""" 3
.el .IP \f(CWp\fR 3
.IX Item "p"
is a pointer.
.ie n .IP """n""" 3
.el .IP \f(CWn\fR 3
.IX Item "n"
is a number.
.ie n .IP """s""" 3
.el .IP \f(CWs\fR 3
.IX Item "s"
is a string.
.PP
\&\f(CW\*(C`sv\*(C'\fR, \f(CW\*(C`av\*(C'\fR, \f(CW\*(C`hv\*(C'\fR, etc. represent variables of their respective types.
.SS "File Operations"
.IX Subsection "File Operations"
Instead of the \fIstdio.h\fR functions, you should use the Perl abstraction
layer. Instead of \f(CW\*(C`FILE*\*(C'\fR types, you need to be handling \f(CW\*(C`PerlIO*\*(C'\fR
types. Don't forget that with the new PerlIO layered I/O abstraction
\&\f(CW\*(C`FILE*\*(C'\fR types may not even be available. See also the \f(CW\*(C`perlapio\*(C'\fR
documentation for more information about the following functions:
.PP
.Vb 1
\&    Instead Of:                 Use:
\&
\&    stdin                       PerlIO_stdin()
\&    stdout                      PerlIO_stdout()
\&    stderr                      PerlIO_stderr()
\&
\&    fopen(fn, mode)             PerlIO_open(fn, mode)
\&    freopen(fn, mode, stream)   PerlIO_reopen(fn, mode, perlio)
\&                                (Deprecated)
\&    fflush(stream)              PerlIO_flush(perlio)
\&    fclose(stream)              PerlIO_close(perlio)
.Ve
.SS "File Input and Output"
.IX Subsection "File Input and Output"
.Vb 1
\&    Instead Of:                 Use:
\&
\&    fprintf(stream, fmt, ...)   PerlIO_printf(perlio, fmt, ...)
\&
\&    [f]getc(stream)             PerlIO_getc(perlio)
\&    [f]putc(stream, n)          PerlIO_putc(perlio, n)
\&    ungetc(n, stream)           PerlIO_ungetc(perlio, n)
.Ve
.PP
Note that the PerlIO equivalents of \f(CW\*(C`fread\*(C'\fR and \f(CW\*(C`fwrite\*(C'\fR are slightly
different from their C library counterparts:
.PP
.Vb 2
\&    fread(p, size, n, stream)   PerlIO_read(perlio, buf, numbytes)
\&    fwrite(p, size, n, stream)  PerlIO_write(perlio, buf, numbytes)
\&
\&    fputs(s, stream)            PerlIO_puts(perlio, s)
.Ve
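.PP
Two differences are visible in the table: the handle moves to the first
argument, and the item size and item count collapse into a single byte
count. The sketch below shows that change of shape using only \fIstdio.h\fR.
\&\f(CW\*(C`read_bytes\*(C'\fR is a hypothetical wrapper written for this illustration,
not \f(CW\*(C`PerlIO_read\*(C'\fR, and its error convention is an assumption:
.PP
.Vb 15
```c
#include <stdio.h>

/* Hypothetical byte-oriented wrapper over fread(), shaped like the
 * (handle, buf, numbytes) signature in the table above. It is NOT
 * PerlIO_read(); it only shows the item-count to byte-count change.
 * Returns the number of bytes read, or -1 if nothing was read and the
 * stream reports an error (an assumed convention). */
static long read_bytes(FILE *stream, void *buf, size_t numbytes)
{
    size_t got = fread(buf, 1, numbytes, stream);  /* size 1: bytes */
    if (got == 0 && ferror(stream))
        return -1;
    return (long)got;
}
```
.Ve
.PP
See \f(CW\*(C`perlapio\*(C'\fR for the actual return values of the PerlIO calls.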
.PP
There is no equivalent to \f(CW\*(C`fgets\*(C'\fR; one should use \f(CW\*(C`sv_gets\*(C'\fR instead:
.PP
.Vb 1
\&    fgets(s, n, stream)         sv_gets(sv, perlio, append)
.Ve
.SS "File Positioning"
.IX Subsection "File Positioning"
.Vb 1
\&    Instead Of:                 Use:
\&
\&    feof(stream)                PerlIO_eof(perlio)
\&    fseek(stream, n, whence)    PerlIO_seek(perlio, n, whence)
\&    rewind(stream)              PerlIO_rewind(perlio)
\&
\&    fgetpos(stream, p)          PerlIO_getpos(perlio, sv)
\&    fsetpos(stream, p)          PerlIO_setpos(perlio, sv)
\&
\&    ferror(stream)              PerlIO_error(perlio)
\&    clearerr(stream)            PerlIO_clearerr(perlio)
.Ve
.SS "Memory Management and String Handling"
.IX Subsection "Memory Management and String Handling"
.Vb 1
\&    Instead Of:                    Use:
\&
\&    t* p = malloc(n)               Newx(p, n, t)
\&    t* p = calloc(n, s)            Newxz(p, n, t)
\&    p = realloc(p, n)              Renew(p, n, t)
\&    memcpy(dst, src, n)            Copy(src, dst, n, t)
\&    memmove(dst, src, n)           Move(src, dst, n, t)
\&    memcpy(dst, src, sizeof(t))    StructCopy(src, dst, t)
\&    memset(dst, 0, n * sizeof(t))  Zero(dst, n, t)
\&    memzero(dst, 0)                Zero(dst, n, char)
\&    free(p)                        Safefree(p)
\&
\&    strdup(p)                      savepv(p)
\&    strndup(p, n)                  savepvn(p, n) (Hey, strndup doesn\*(Aqt
\&                                                 exist!)
\&
\&    strstr(big, little)            instr(big, little)
\&    strcmp(s1, s2)                 strLE(s1, s2) / strEQ(s1, s2)
\&                                                 / strGT(s1, s2)
\&    strncmp(s1, s2, n)             strnNE(s1, s2, n) / strnEQ(s1, s2, n)
\&
\&    memcmp(p1, p2, n)              memNE(p1, p2, n)
\&    !memcmp(p1, p2, n)             memEQ(p1, p2, n)
.Ve
.PP
Notice that \f(CW\*(C`Copy\*(C'\fR and \f(CW\*(C`Move\*(C'\fR take their source and destination
arguments in the opposite order from \f(CW\*(C`memcpy\*(C'\fR and \f(CW\*(C`memmove\*(C'\fR.
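.PP
The argument order can be seen in isolation with a minimal sketch that
compiles without the perl headers. \f(CW\*(C`MY_COPY\*(C'\fR and \f(CW\*(C`MY_MOVE\*(C'\fR are
hypothetical stand\-ins defined here for illustration; they only mirror
the (src, dst, n, t) argument shape shown in the table and are not
Perl's own definitions:
.PP
.Vb 21
```c
#include <string.h>

/* Hypothetical stand-ins, NOT Perl's definitions: they only mirror the
 * (src, dst, count, type) argument shape of Copy() and Move(). */
#define MY_COPY(src, dst, n, t) memcpy((dst), (src), (n) * sizeof(t))
#define MY_MOVE(src, dst, n, t) memmove((dst), (src), (n) * sizeof(t))

/* Copies n ints from src to dst; the count is in elements, not bytes. */
static void copy_ints(const int *src, int *dst, size_t n)
{
    MY_COPY(src, dst, n, int);      /* source first, destination second */
}

/* Shifts the first n ints of buf up by one slot. The regions overlap,
 * so the memmove-style stand-in is the one to use. */
static void shift_up(int *buf, size_t n)
{
    MY_MOVE(buf, buf + 1, n, int);
}
```
.Ve
.PP
In real XS or core code you would include the perl headers and call
\&\f(CW\*(C`Copy\*(C'\fR and \f(CW\*(C`Move\*(C'\fR directly.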
.PP
Most of the time, though, you'll want to be dealing with SVs internally
instead of raw \f(CW\*(C`char *\*(C'\fR strings:
.PP
.Vb 6
\&    strlen(s)                   sv_len(sv)
\&    strcpy(dt, src)             sv_setpv(sv, s)
\&    strncpy(dt, src, n)         sv_setpvn(sv, s, n)
\&    strcat(dt, src)             sv_catpv(sv, s)
\&    strncat(dt, src, n)         sv_catpvn(sv, s, n)
\&    sprintf(s, fmt, ...)        sv_setpvf(sv, fmt, ...)
.Ve
.PP
Note also the existence of \f(CW\*(C`sv_catpvf\*(C'\fR and \f(CW\*(C`sv_vcatpvfn\*(C'\fR, combining
concatenation with formatting.
.PP
Sometimes instead of zeroing the allocated heap by using \fBNewxz()\fR you
should consider "poisoning" the data. This means writing a bit
pattern into it that should be illegal as pointers (and floating point
numbers), and also hopefully surprising enough as integers, so that
any code attempting to use the data without forethought will break
sooner rather than later. Poisoning can be done using the \fBPoison()\fR
macros, which have similar arguments to \fBZero()\fR:
.PP
.Vb 4
\&    PoisonWith(dst, n, t, b)    scribble memory with byte b
\&    PoisonNew(dst, n, t)        equal to PoisonWith(dst, n, t, 0xAB)
\&    PoisonFree(dst, n, t)       equal to PoisonWith(dst, n, t, 0xEF)
\&    Poison(dst, n, t)           equal to PoisonFree(dst, n, t)
.Ve
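.PP
The effect of poisoning can be sketched in plain C. In the example below
\&\f(CW\*(C`MY_POISON_WITH\*(C'\fR is a hypothetical stand\-in that mirrors the
\&\f(CW\*(C`PoisonWith(dst, n, t, b)\*(C'\fR row above using \f(CW\*(C`memset\*(C'\fR; it is not
Perl's definition, and the two helper functions are likewise invented
for illustration:
.PP
.Vb 28
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in, NOT Perl's definition: fills n elements of
 * type t at dst with the byte b, like the PoisonWith() row above. */
#define MY_POISON_WITH(dst, n, t, b) memset((dst), (b), (n) * sizeof(t))

/* Allocates n ints and scribbles 0xAB over them instead of zeroing,
 * so a read before initialization yields an obviously bogus value. */
static int *alloc_poisoned_ints(size_t n)
{
    int *p = malloc(n * sizeof(int));
    if (p != NULL)
        MY_POISON_WITH(p, n, int, 0xAB);
    return p;
}

/* Returns 1 if every byte of the n ints at p still holds byte b. */
static int all_bytes_are(const int *p, size_t n, unsigned char b)
{
    const unsigned char *bytes = (const unsigned char *)p;
    size_t i;
    for (i = 0; i < n * sizeof(int); i++)
        if (bytes[i] != b)
            return 0;
    return 1;
}
```
.Ve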
.SS "Character Class Tests"
.IX Subsection "Character Class Tests"
There are several types of character class tests that Perl implements.
The only ones described here are those that directly correspond to C
library functions that operate on 8\-bit characters, but there are
equivalents that operate on wide characters, and UTF\-8 encoded strings.
All are more fully described in "Character classification" in perlapi and
"Character case changing" in perlapi.
.PP
The C library routines listed in the table below return values based on
the current locale. Use the entries in the final column for that
functionality. The other two columns always assume a POSIX (or C)
locale. The entries in the ASCII column are only meaningful for ASCII
inputs, returning FALSE for anything else. Use these only when you
\&\fBknow\fR that is what you want. The entries in the Latin1 column assume
that the non-ASCII 8\-bit characters are as Unicode defines them, the
same as ISO\-8859\-1, often called Latin 1.
.PP
.Vb 1
\&    Instead Of: Use for ASCII:    Use for Latin1:      Use for locale:
\&
\&    isalnum(c)  isALPHANUMERIC(c) isALPHANUMERIC_L1(c) isALPHANUMERIC_LC(c)
\&    isalpha(c)  isALPHA(c)        isALPHA_L1(c)        isALPHA_LC(c)
\&    isascii(c)  isASCII(c)                             isASCII_LC(c)
\&    isblank(c)  isBLANK(c)        isBLANK_L1(c)        isBLANK_LC(c)
\&    iscntrl(c)  isCNTRL(c)        isCNTRL_L1(c)        isCNTRL_LC(c)
\&    isdigit(c)  isDIGIT(c)        isDIGIT_L1(c)        isDIGIT_LC(c)
\&    isgraph(c)  isGRAPH(c)        isGRAPH_L1(c)        isGRAPH_LC(c)
\&    islower(c)  isLOWER(c)        isLOWER_L1(c)        isLOWER_LC(c)
\&    isprint(c)  isPRINT(c)        isPRINT_L1(c)        isPRINT_LC(c)
\&    ispunct(c)  isPUNCT(c)        isPUNCT_L1(c)        isPUNCT_LC(c)
\&    isspace(c)  isSPACE(c)        isSPACE_L1(c)        isSPACE_LC(c)
\&    isupper(c)  isUPPER(c)        isUPPER_L1(c)        isUPPER_LC(c)
\&    isxdigit(c) isXDIGIT(c)       isXDIGIT_L1(c)       isXDIGIT_LC(c)
\&
\&    tolower(c)  toLOWER(c)        toLOWER_L1(c)
\&    toupper(c)  toUPPER(c)
.Ve
.PP
To emphasize that you are operating only on ASCII characters, you can
append \f(CW\*(C`_A\*(C'\fR to each of the macros in the ASCII column: \f(CW\*(C`isALPHA_A\*(C'\fR,
\&\f(CW\*(C`isDIGIT_A\*(C'\fR, and so on.
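.PP
The behavior described for the ASCII column, a test that ignores the
locale and is false for every non-ASCII input, can be sketched as
follows. \f(CW\*(C`my_is_alpha_ascii\*(C'\fR and \f(CW\*(C`count_ascii_letters\*(C'\fR are hypothetical
names used only here; this is not how \f(CW\*(C`isALPHA\*(C'\fR is implemented, and the
sketch assumes an ASCII platform:
.PP
.Vb 21
```c
/* Hypothetical stand-in, NOT Perl's isALPHA(): true only for the
 * ASCII letters, whatever the current locale is. Assumes an ASCII
 * platform, where A-Z and a-z are contiguous ranges. */
static int my_is_alpha_ascii(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Counts the ASCII letters in the first n bytes of s. Bytes above
 * 0x7F are never counted, matching the "FALSE for anything else" rule
 * described for the ASCII column. */
static unsigned count_ascii_letters(const unsigned char *s, unsigned n)
{
    unsigned i, count = 0;
    for (i = 0; i < n; i++)
        if (my_is_alpha_ascii(s[i]))
            count++;
    return count;
}
```
.Ve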
.PP
(There is no entry in the Latin1 column for \f(CW\*(C`isascii\*(C'\fR even though there
is an \f(CW\*(C`isASCII_L1\*(C'\fR, which is identical to \f(CW\*(C`isASCII\*(C'\fR; the
latter name is clearer. There is no entry in the Latin1 column for
\&\f(CW\*(C`toupper\*(C'\fR because the result can be non\-Latin1. You have to use
\&\f(CW\*(C`toUPPER_uvchr\*(C'\fR, as described in "Character case changing" in perlapi.)
.SS "\fIstdlib.h\fP functions"
.IX Subsection "stdlib.h functions"
.Vb 1
\&    Instead Of:                 Use:
\&
\&    atof(s)                     Atof(s)
\&    atoi(s)                     grok_atoUV(s, &uv, &e)
\&    atol(s)                     grok_atoUV(s, &uv, &e)
\&    strtod(s, &p)               Strtod(s, &p)
\&    strtol(s, &p, n)            Strtol(s, &p, b)
\&    strtoul(s, &p, n)           Strtoul(s, &p, b)
.Ve
.PP
Typical use is to do range checks on \f(CW\*(C`uv\*(C'\fR before casting:
.PP
.Vb 9
\&    int i; UV uv;
\&    const char* end_ptr = input_end;
\&    if (grok_atoUV(input, &uv, &end_ptr)
\&        && uv <= INT_MAX) {
\&      i = (int)uv;
\&      ... /* continue parsing from end_ptr */
\&    } else {
\&      ... /* parse error: not a decimal integer in range 0 .. INT_MAX */
\&    }
.Ve
.PP
Notice also the \f(CW\*(C`grok_bin\*(C'\fR, \f(CW\*(C`grok_hex\*(C'\fR, and \f(CW\*(C`grok_oct\*(C'\fR functions in
\&\fInumeric.c\fR for converting strings representing numbers in the respective
bases into \f(CW\*(C`NV\*(C'\fRs. Note that \fBgrok_atoUV()\fR doesn't handle negative inputs,
or leading whitespace (being purposefully strict).
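.PP
The strict style of parsing, followed by a range check before narrowing,
can be sketched in portable C. \f(CW\*(C`strict_atou\*(C'\fR and \f(CW\*(C`parse_int\*(C'\fR are
hypothetical functions written for this illustration; they are not
\&\fBgrok_atoUV()\fR and do not claim to match all of its rules, only the two
named above (no sign, no leading whitespace):
.PP
.Vb 38
```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in, NOT grok_atoUV(): accepts only a run of
 * decimal digits (no sign, no leading whitespace), stores the value
 * in *out and the first unparsed position in *end. Returns 1 on
 * success, 0 on a non-digit start or on overflow. */
static int strict_atou(const char *s, unsigned long *out, const char **end)
{
    unsigned long v = 0;
    const char *p = s;
    if (*p < '0' || *p > '9')
        return 0;
    while (*p >= '0' && *p <= '9') {
        unsigned long d = (unsigned long)(*p - '0');
        if (v > (ULONG_MAX - d) / 10)
            return 0;                 /* would overflow */
        v = v * 10 + d;
        p++;
    }
    *out = v;
    *end = p;
    return 1;
}

/* Range-checks the unsigned result before narrowing it to int, and
 * rejects trailing characters after the digits. */
static int parse_int(const char *s, int *result)
{
    unsigned long v;
    const char *end;
    if (strict_atou(s, &v, &end) && *end == 0 && v <= INT_MAX) {
        *result = (int)v;
        return 1;
    }
    return 0;
}
```
.Ve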
.PP
Note that \fBstrtol()\fR and \fBstrtoul()\fR may be disguised as \fBStrtol()\fR, \fBStrtoul()\fR,
\&\fBAtol()\fR, \fBAtoul()\fR. Avoid those, too.
.PP
In theory \f(CW\*(C`Strtol\*(C'\fR and \f(CW\*(C`Strtoul\*(C'\fR may not be defined if the machine perl is
built on doesn't actually have strtol and strtoul. But as those two
functions are part of the 1989 ANSI C spec we suspect you'll find them
everywhere by now.
.PP
.Vb 3
\&    int rand()                  double Drand01()
\&    srand(n)                    { seedDrand01((Rand_seed_t)n);
\&                                  PL_srand_called = TRUE; }
\&
\&    exit(n)                     my_exit(n)
\&    system(s)                   Don\*(Aqt. Look at pp_system or use my_popen.
\&
\&    getenv(s)                   PerlEnv_getenv(s)
\&    setenv(s, val)              my_setenv(s, val)
.Ve
.SS "Miscellaneous functions"
.IX Subsection "Miscellaneous functions"
You should not even \fBwant\fR to use \fIsetjmp.h\fR functions, but if you
think you do, use the \f(CW\*(C`JMPENV\*(C'\fR stack in \fIscope.h\fR instead.
.PP
For \f(CW\*(C`signal\*(C'\fR/\f(CW\*(C`sigaction\*(C'\fR, use \f(CW\*(C`rsignal(signo, handler)\*(C'\fR.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
perlapi, perlapio, perlguts