Diffstat (limited to '')
-rw-r--r-- | upstream/mageia-cauldron/man1/perlopentut.1 | 472 |
1 files changed, 472 insertions, 0 deletions
diff --git a/upstream/mageia-cauldron/man1/perlopentut.1 b/upstream/mageia-cauldron/man1/perlopentut.1 new file mode 100644 index 00000000..2a4fe8c0 --- /dev/null +++ b/upstream/mageia-cauldron/man1/perlopentut.1 @@ -0,0 +1,472 @@ +.\" -*- mode: troff; coding: utf-8 -*- +.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43) +.\" +.\" Standard preamble: +.\" ======================================================================== +.de Sp \" Vertical space (when we can't use .PP) +.if t .sp .5v +.if n .sp +.. +.de Vb \" Begin verbatim text +.ft CW +.nf +.ne \\$1 +.. +.de Ve \" End verbatim text +.ft R +.fi +.. +.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>. +.ie n \{\ +. ds C` "" +. ds C' "" +'br\} +.el\{\ +. ds C` +. ds C' +'br\} +.\" +.\" Escape single quotes in literal strings from groff's Unicode transform. +.ie \n(.g .ds Aq \(aq +.el .ds Aq ' +.\" +.\" If the F register is >0, we'll generate index entries on stderr for +.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index +.\" entries marked with X<> in POD. Of course, you'll have to process the +.\" output yourself in some meaningful fashion. +.\" +.\" Avoid warning from groff about undefined register 'F'. +.de IX +.. +.nr rF 0 +.if \n(.g .if rF .nr rF 1 +.if (\n(rF:(\n(.g==0)) \{\ +. if \nF \{\ +. de IX +. tm Index:\\$1\t\\n%\t"\\$2" +.. +. if !\nF==2 \{\ +. nr % 0 +. nr F 2 +. \} +. \} +.\} +.rr rF +.\" ======================================================================== +.\" +.IX Title "PERLOPENTUT 1" +.TH PERLOPENTUT 1 2023-11-28 "perl v5.38.2" "Perl Programmers Reference Guide" +.\" For nroff, turn off justification. Always turn off hyphenation; it makes +.\" way too many mistakes in technical documents. +.if n .ad l +.nh +.SH NAME +perlopentut \- simple recipes for opening files and pipes in Perl +.SH DESCRIPTION +.IX Header "DESCRIPTION" +Whenever you do I/O on a file in Perl, you do so through what in Perl is +called a \fBfilehandle\fR. 
A filehandle is an internal name for an external +file. It is the job of the \f(CW\*(C`open\*(C'\fR function to make the association +between the internal name and the external name, and it is the job +of the \f(CW\*(C`close\*(C'\fR function to break that association. +.PP +For your convenience, Perl sets up a few special filehandles that are +already open when you run. These include \f(CW\*(C`STDIN\*(C'\fR, \f(CW\*(C`STDOUT\*(C'\fR, \f(CW\*(C`STDERR\*(C'\fR, +and \f(CW\*(C`ARGV\*(C'\fR. Since those are pre-opened, you can use them right away +without having to go to the trouble of opening them yourself: +.PP +.Vb 1 +\& print STDERR "This is a debugging message.\en"; +\& +\& print STDOUT "Please enter something: "; +\& $response = <STDIN> // die "how come no input?"; +\& print STDOUT "Thank you!\en"; +\& +\& while (<ARGV>) { ... } +.Ve +.PP +As you see from those examples, \f(CW\*(C`STDOUT\*(C'\fR and \f(CW\*(C`STDERR\*(C'\fR are output +handles, and \f(CW\*(C`STDIN\*(C'\fR and \f(CW\*(C`ARGV\*(C'\fR are input handles. They are +in all capital letters because they are reserved to Perl, much +like the \f(CW@ARGV\fR array and the \f(CW%ENV\fR hash are. Their external +associations were set up by your shell. +.PP +You will need to open every other filehandle on your own. 
Although there +are many variants, the most common way to call Perl's \fBopen()\fR function +is with three arguments and one return value: +.PP +\&\f(CW\*(C` \fR\f(CIOK\fR\f(CW = open(\fR\f(CIHANDLE\fR\f(CW, \fR\f(CIMODE\fR\f(CW, \fR\f(CIPATHNAME\fR\f(CW)\*(C'\fR +.PP +Where: +.IP \fIOK\fR 4 +.IX Item "OK" +will be some defined value if the open succeeds, but +\&\f(CW\*(C`undef\*(C'\fR if it fails; +.IP \fIHANDLE\fR 4 +.IX Item "HANDLE" +should be an undefined scalar variable to be filled in by the +\&\f(CW\*(C`open\*(C'\fR function if it succeeds; +.IP \fIMODE\fR 4 +.IX Item "MODE" +is the access mode and the encoding format to open the file with; +.IP \fIPATHNAME\fR 4 +.IX Item "PATHNAME" +is the external name of the file you want opened. +.PP +Most of the complexity of the \f(CW\*(C`open\*(C'\fR function lies in the many +possible values that the \fIMODE\fR parameter can take on. +.PP +One last thing before we show you how to open files: opening +files does not (usually) automatically lock them in Perl. See +perlfaq5 for how to lock. +.SH "Opening Text Files" +.IX Header "Opening Text Files" +.SS "Opening Text Files for Reading" +.IX Subsection "Opening Text Files for Reading" +If you want to read from a text file, first open it in +read-only mode like this: +.PP +.Vb 3 +\& my $filename = "/some/path/to/a/textfile/goes/here"; +\& my $encoding = ":encoding(UTF\-8)"; +\& my $handle = undef; # this will be filled in on success +\& +\& open($handle, "< $encoding", $filename) +\& || die "$0: can\*(Aqt open $filename for reading: $!"; +.Ve +.PP +As with the shell, in Perl the \f(CW"<"\fR is used to open the file in +read-only mode. If it succeeds, Perl allocates a brand new filehandle for +you and fills in your previously undefined \f(CW$handle\fR argument with a +reference to that handle. +.PP +Now you may use functions like \f(CW\*(C`readline\*(C'\fR, \f(CW\*(C`read\*(C'\fR, \f(CW\*(C`getc\*(C'\fR, and +\&\f(CW\*(C`sysread\*(C'\fR on that handle. 
Probably the most common input function +is the one that looks like an operator: +.PP +.Vb 2 +\& $line = readline($handle); +\& $line = <$handle>; # same thing +.Ve +.PP +Because the \f(CW\*(C`readline\*(C'\fR function returns \f(CW\*(C`undef\*(C'\fR at end of file or +upon error, you will sometimes see it used this way: +.PP +.Vb 7 +\& $line = <$handle>; +\& if (defined $line) { +\& # do something with $line +\& } +\& else { +\& # $line is not valid, so skip it +\& } +.Ve +.PP +You can also just quickly \f(CW\*(C`die\*(C'\fR on an undefined value this way: +.PP +.Vb 1 +\& $line = <$handle> // die "no input found"; +.Ve +.PP +However, if hitting EOF is an expected and normal event, you do not want to +exit simply because you have run out of input. Instead, you probably just want +to exit an input loop. You can then test to see if an actual error has caused +the loop to terminate, and act accordingly: +.PP +.Vb 6 +\& while (<$handle>) { +\& # do something with data in $_ +\& } +\& if ($!) { +\& die "unexpected error while reading from $filename: $!"; +\& } +.Ve +.PP +\&\fBA Note on Encodings\fR: Having to specify the text encoding every time +might seem a bit of a bother. To set up a default encoding for \f(CW\*(C`open\*(C'\fR so +that you don't have to supply it each time, you can use the \f(CW\*(C`open\*(C'\fR pragma: +.PP +.Vb 1 +\& use open qw< :encoding(UTF\-8) >; +.Ve +.PP +Once you've done that, you can safely omit the encoding part of the +open mode: +.PP +.Vb 2 +\& open($handle, "<", $filename) +\& || die "$0: can\*(Aqt open $filename for reading: $!"; +.Ve +.PP +But never use the bare \f(CW"<"\fR without having set up a default encoding +first. Otherwise, Perl cannot know which of the many, many, many possible +flavors of text file you have, and Perl will have no idea how to correctly +map the data in your file into actual characters it can work with. 
Other +common encoding formats include \f(CW"ASCII"\fR, \f(CW"ISO\-8859\-1"\fR, +\&\f(CW"ISO\-8859\-15"\fR, \f(CW"Windows\-1252"\fR, \f(CW"MacRoman"\fR, and even \f(CW"UTF\-16LE"\fR. +See perlunitut for more about encodings. +.SS "Opening Text Files for Writing" +.IX Subsection "Opening Text Files for Writing" +When you want to write to a file, you first have to decide what to do about +any existing contents of that file. You have two basic choices here: to +preserve or to clobber. +.PP +If you want to preserve any existing contents, then you want to open the file +in append mode. As in the shell, in Perl you use \f(CW">>"\fR to open an +existing file in append mode. \f(CW">>"\fR creates the file if it does not +already exist. +.PP +.Vb 3 +\& my $handle = undef; +\& my $filename = "/some/path/to/a/textfile/goes/here"; +\& my $encoding = ":encoding(UTF\-8)"; +\& +\& open($handle, ">> $encoding", $filename) +\& || die "$0: can\*(Aqt open $filename for appending: $!"; +.Ve +.PP +Now you can write to that filehandle using any of \f(CW\*(C`print\*(C'\fR, \f(CW\*(C`printf\*(C'\fR, +\&\f(CW\*(C`say\*(C'\fR, \f(CW\*(C`write\*(C'\fR, or \f(CW\*(C`syswrite\*(C'\fR. +.PP +As noted above, if the file does not already exist, then the append-mode open +will create it for you. But if the file does already exist, its contents are +safe from harm because you will be adding your new text past the end of the +old text. +.PP +On the other hand, sometimes you want to clobber whatever might already be +there. To empty out a file before you start writing to it, you can open it +in write-only mode: +.PP +.Vb 3 +\& my $handle = undef; +\& my $filename = "/some/path/to/a/textfile/goes/here"; +\& my $encoding = ":encoding(UTF\-8)"; +\& +\& open($handle, "> $encoding", $filename) +\& || die "$0: can\*(Aqt open $filename in write\-open mode: $!"; +.Ve +.PP +Here again Perl works just like the shell in that the \f(CW">"\fR clobbers +an existing file. 
+.PP +As with the append mode, when you open a file in write-only mode, +you can now write to that filehandle using any of \f(CW\*(C`print\*(C'\fR, \f(CW\*(C`printf\*(C'\fR, +\&\f(CW\*(C`say\*(C'\fR, \f(CW\*(C`write\*(C'\fR, or \f(CW\*(C`syswrite\*(C'\fR. +.PP +What about read-write mode? You should probably pretend it doesn't exist, +because opening text files in read-write mode is unlikely to do what you +would like. See perlfaq5 for details. +.SH "Opening Binary Files" +.IX Header "Opening Binary Files" +If the file to be opened contains binary data instead of text characters, +then the \f(CW\*(C`MODE\*(C'\fR argument to \f(CW\*(C`open\*(C'\fR is a little different. Instead of +specifying the encoding, you tell Perl that your data are in raw bytes. +.PP +.Vb 3 +\& my $filename = "/some/path/to/a/binary/file/goes/here"; +\& my $encoding = ":raw :bytes"; +\& my $handle = undef; # this will be filled in on success +.Ve +.PP +And then open as before, choosing \f(CW"<"\fR, \f(CW">>"\fR, or +\&\f(CW">"\fR as needed: +.PP +.Vb 2 +\& open($handle, "< $encoding", $filename) +\& || die "$0: can\*(Aqt open $filename for reading: $!"; +\& +\& open($handle, ">> $encoding", $filename) +\& || die "$0: can\*(Aqt open $filename for appending: $!"; +\& +\& open($handle, "> $encoding", $filename) +\& || die "$0: can\*(Aqt open $filename in write\-open mode: $!"; +.Ve +.PP +Alternately, you can change to binary mode on an existing handle this way: +.PP +.Vb 1 +\& binmode($handle) || die "cannot binmode handle"; +.Ve +.PP +This is especially handy for the handles that Perl has already opened for you. +.PP +.Vb 2 +\& binmode(STDIN) || die "cannot binmode STDIN"; +\& binmode(STDOUT) || die "cannot binmode STDOUT"; +.Ve +.PP +You can also pass \f(CW\*(C`binmode\*(C'\fR an explicit encoding to change it on the fly. 
+This isn't exactly "binary" mode, but we still use \f(CW\*(C`binmode\*(C'\fR to do it: +.PP +.Vb 2 +\& binmode(STDIN, ":encoding(MacRoman)") || die "cannot binmode STDIN"; +\& binmode(STDOUT, ":encoding(UTF\-8)") || die "cannot binmode STDOUT"; +.Ve +.PP +Once you have your binary file properly opened in the right mode, you can +use all the same Perl I/O functions as you used on text files. However, +you may wish to use the fixed-size \f(CW\*(C`read\*(C'\fR instead of the variable-sized +\&\f(CW\*(C`readline\*(C'\fR for your input. +.PP +Here's an example of how to copy a binary file: +.PP +.Vb 3 +\& my $BUFSIZ = 64 * (2 ** 10); +\& my $name_in = "/some/input/file"; +\& my $name_out = "/some/output/file"; +\& +\& my($in_fh, $out_fh, $buffer); +\& +\& open($in_fh, "<", $name_in) +\& || die "$0: cannot open $name_in for reading: $!"; +\& open($out_fh, ">", $name_out) +\& || die "$0: cannot open $name_out for writing: $!"; +\& +\& for my $fh ($in_fh, $out_fh) { +\& binmode($fh) || die "binmode failed"; +\& } +\& +\& while (read($in_fh, $buffer, $BUFSIZ)) { +\& unless (print $out_fh $buffer) { +\& die "couldn\*(Aqt write to $name_out: $!"; +\& } +\& } +\& +\& close($in_fh) || die "couldn\*(Aqt close $name_in: $!"; +\& close($out_fh) || die "couldn\*(Aqt close $name_out: $!"; +.Ve +.SH "Opening Pipes" +.IX Header "Opening Pipes" +Perl also lets you open a filehandle into an external program or shell +command rather than into a file. You can do this in order to pass data +from your Perl program to an external command for further processing, or +to receive data from another program for your own Perl program to +process. +.PP +Filehandles into commands are also known as \fIpipes\fR, since they work on +similar inter-process communication principles as Unix pipelines. 
Such a +filehandle has an active program instead of a static file on its +external end, but in every other sense it works just like a more typical +file-based filehandle, with all the techniques discussed earlier in this +article just as applicable. +.PP +As such, you open a pipe using the same \f(CW\*(C`open\*(C'\fR call that you use for +opening files, setting the second (\f(CW\*(C`MODE\*(C'\fR) argument to special +characters that indicate either an input or an output pipe. Use \f(CW"\-|"\fR for a +filehandle that will let your Perl program read data from an external +program, and \f(CW"|\-"\fR for a filehandle that will send data to that +program instead. +.SS "Opening a pipe for reading" +.IX Subsection "Opening a pipe for reading" +Let's say you'd like your Perl program to process data stored in a nearby +directory called \f(CW\*(C`unsorted\*(C'\fR, which contains a number of textfiles. +You'd also like your program to sort all the contents from these files +into a single, alphabetically sorted list of unique lines before it +starts processing them. +.PP +You could do this through opening an ordinary filehandle into each of +those files, gradually building up an in-memory array of all the file +contents you load this way, and finally sorting and filtering that array +when you've run out of files to load. \fIOr\fR, you could offload all that +merging and sorting into your operating system's own \f(CW\*(C`sort\*(C'\fR command by +opening a pipe directly into its output, and get to work that much +faster. +.PP +Here's how that might look: +.PP +.Vb 2 +\& open(my $sort_fh, \*(Aq\-|\*(Aq, \*(Aqsort \-u unsorted/*.txt\*(Aq) +\& or die "Couldn\*(Aqt open a pipe into sort: $!"; +\& +\& # And right away, we can start reading sorted lines: +\& while (my $line = <$sort_fh>) { +\& # +\& # ... Do something interesting with each $line here ... 
+\& # +\& } +.Ve +.PP +The second argument to \f(CW\*(C`open\*(C'\fR, \f(CW"\-|"\fR, makes it a read-pipe into a +separate program, rather than an ordinary filehandle into a file. +.PP +Note that the third argument to \f(CW\*(C`open\*(C'\fR is a string containing the +program name (\f(CW\*(C`sort\*(C'\fR) plus all its arguments: in this case, \f(CW\*(C`\-u\*(C'\fR to +specify unique sort, and then a fileglob specifying the files to sort. +The resulting filehandle \f(CW$sort_fh\fR works just like a read-only (\f(CW"<"\fR) filehandle, and your program can subsequently read data +from it as if it were opened onto an ordinary, single file. +.SS "Opening a pipe for writing" +.IX Subsection "Opening a pipe for writing" +Continuing the previous example, let's say that your program has +completed its processing, and the results sit in an array called +\&\f(CW@processed\fR. You want to print these lines to a file called +\&\f(CW\*(C`numbered.txt\*(C'\fR with a neatly formatted column of line-numbers. +.PP +Certainly you could write your own code to do this, or, once again, +you could kick that work over to another program. In this case, \f(CW\*(C`cat\*(C'\fR, +running with its own \f(CW\*(C`\-n\*(C'\fR option to activate line numbering, should do +the trick: +.PP +.Vb 2 +\& open(my $cat_fh, \*(Aq|\-\*(Aq, \*(Aqcat \-n > numbered.txt\*(Aq) +\& or die "Couldn\*(Aqt open a pipe into cat: $!"; +\& +\& for my $line (@processed) { +\& print $cat_fh $line; +\& } +.Ve +.PP +Here, we use a second \f(CW\*(C`open\*(C'\fR argument of \f(CW"|\-"\fR, signifying that the +filehandle assigned to \f(CW$cat_fh\fR should be a write-pipe. We can then +use it just as we would a write-only ordinary filehandle, including the +basic function of \f(CW\*(C`print\*(C'\fR\-ing data to it. +.PP +Note that the third argument, specifying the command that we wish to +pipe to, sets up \f(CW\*(C`cat\*(C'\fR to redirect its output via that \f(CW">"\fR +symbol into the file \f(CW\*(C`numbered.txt\*(C'\fR. 
This can start to look a little +tricky, because that same symbol would have meant something +entirely different had it shown up in the second argument to \f(CW\*(C`open\*(C'\fR! +But here in the third argument, it's simply part of the shell command that +Perl will open the pipe into, and Perl itself doesn't attach any special +meaning to it. +.SS "Expressing the command as a list" +.IX Subsection "Expressing the command as a list" +For opening pipes, Perl offers the option to call \f(CW\*(C`open\*(C'\fR with a list +comprising the desired command and all its own arguments as separate +elements, rather than combining them into a single string as in the +examples above. For instance, we could have phrased the \f(CW\*(C`open\*(C'\fR call in +the first example like this: +.PP +.Vb 2 +\& open(my $sort_fh, \*(Aq\-|\*(Aq, \*(Aqsort\*(Aq, \*(Aq\-u\*(Aq, glob(\*(Aqunsorted/*.txt\*(Aq)) +\& or die "Couldn\*(Aqt open a pipe into sort: $!"; +.Ve +.PP +When you call \f(CW\*(C`open\*(C'\fR this way, Perl invokes the given command directly, +bypassing the shell. As such, the shell won't try to interpret any +special characters within the command's argument list, which might +otherwise have unwanted effects. This can make for safer, less +error-prone \f(CW\*(C`open\*(C'\fR calls, useful in cases such as passing in variables +as arguments, or even just referring to filenames with spaces in them. +.PP +However, when you \fIdo\fR want to pass a meaningful metacharacter to the +shell, such as the \f(CW"*"\fR inside that final \f(CW\*(C`unsorted/*.txt\*(C'\fR argument +here, you can't use this alternate syntax. In this case, we have worked +around it via Perl's handy \f(CW\*(C`glob\*(C'\fR built-in function, which evaluates +its argument into a list of filenames, and we can safely pass that +resulting list right into \f(CW\*(C`open\*(C'\fR, as shown above. +.PP +Note also that representing piped-command arguments in list form like +this doesn't work on every platform. 
It will work on any Unix-based OS +that provides a real \f(CW\*(C`fork\*(C'\fR function (e.g. macOS or Linux), as well as +on Windows when running Perl 5.22 or later. +.SH "SEE ALSO" +.IX Header "SEE ALSO" +The full documentation for \f(CW\*(C`open\*(C'\fR +provides a thorough reference to this function, beyond the best-practice +basics covered here. +.SH "AUTHOR and COPYRIGHT" +.IX Header "AUTHOR and COPYRIGHT" +Copyright 2013 Tom Christiansen; now maintained by Perl5 Porters +.PP +This documentation is free; you can redistribute it and/or modify it under +the same terms as Perl itself.