Diffstat (limited to 'upstream/mageia-cauldron/man1/perlref.1')
-rw-r--r-- | upstream/mageia-cauldron/man1/perlref.1 | 1123 |
1 files changed, 1123 insertions, 0 deletions
diff --git a/upstream/mageia-cauldron/man1/perlref.1 b/upstream/mageia-cauldron/man1/perlref.1 new file mode 100644 index 00000000..3ad05236 --- /dev/null +++ b/upstream/mageia-cauldron/man1/perlref.1 @@ -0,0 +1,1123 @@ +.\" -*- mode: troff; coding: utf-8 -*- +.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43) +.\" +.\" Standard preamble: +.\" ======================================================================== +.de Sp \" Vertical space (when we can't use .PP) +.if t .sp .5v +.if n .sp +.. +.de Vb \" Begin verbatim text +.ft CW +.nf +.ne \\$1 +.. +.de Ve \" End verbatim text +.ft R +.fi +.. +.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>. +.ie n \{\ +. ds C` "" +. ds C' "" +'br\} +.el\{\ +. ds C` +. ds C' +'br\} +.\" +.\" Escape single quotes in literal strings from groff's Unicode transform. +.ie \n(.g .ds Aq \(aq +.el .ds Aq ' +.\" +.\" If the F register is >0, we'll generate index entries on stderr for +.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index +.\" entries marked with X<> in POD. Of course, you'll have to process the +.\" output yourself in some meaningful fashion. +.\" +.\" Avoid warning from groff about undefined register 'F'. +.de IX +.. +.nr rF 0 +.if \n(.g .if rF .nr rF 1 +.if (\n(rF:(\n(.g==0)) \{\ +. if \nF \{\ +. de IX +. tm Index:\\$1\t\\n%\t"\\$2" +.. +. if !\nF==2 \{\ +. nr % 0 +. nr F 2 +. \} +. \} +.\} +.rr rF +.\" ======================================================================== +.\" +.IX Title "PERLREF 1" +.TH PERLREF 1 2023-11-28 "perl v5.38.2" "Perl Programmers Reference Guide" +.\" For nroff, turn off justification. Always turn off hyphenation; it makes +.\" way too many mistakes in technical documents. +.if n .ad l +.nh +.SH NAME +perlref \- Perl references and nested data structures +.IX Xref "reference pointer data structure structure struct" +.SH NOTE +.IX Header "NOTE" +This is complete documentation about all aspects of references. 
+For a shorter, tutorial introduction to just the essential features, +see perlreftut. +.SH DESCRIPTION +.IX Header "DESCRIPTION" +Before release 5 of Perl it was difficult to represent complex data +structures, because all references had to be symbolic\-\-and even then +it was difficult to refer to a variable instead of a symbol table entry. +Perl now not only makes it easier to use symbolic references to variables, +but also lets you have "hard" references to any piece of data or code. +Any scalar may hold a hard reference. Because arrays and hashes contain +scalars, you can now easily build arrays of arrays, arrays of hashes, +hashes of arrays, arrays of hashes of functions, and so on. +.PP +Hard references are smart\-\-they keep track of reference counts for you, +automatically freeing the thing referred to when its reference count goes +to zero. (Reference counts for values in self-referential or +cyclic data structures may not go to zero without a little help; see +"Circular References" for a detailed explanation.) +If that thing happens to be an object, the object is destructed. See +perlobj for more about objects. (In a sense, everything in Perl is an +object, but we usually reserve the word for references to objects that +have been officially "blessed" into a class package.) +.PP +Symbolic references are names of variables or other objects, just as a +symbolic link in a Unix filesystem contains merely the name of a file. +The \f(CW*glob\fR notation is something of a symbolic reference. (Symbolic +references are sometimes called "soft references", but please don't call +them that; references are confusing enough without useless synonyms.) +.IX Xref "reference, symbolic reference, soft symbolic reference soft reference" +.PP +In contrast, hard references are more like hard links in a Unix file +system: They are used to access an underlying object without concern for +what its (other) name is. 
When the word "reference" is used without an +adjective, as in the following paragraph, it is usually talking about a +hard reference. +.IX Xref "reference, hard hard reference" +.PP +References are easy to use in Perl. There is just one overriding +principle: in general, Perl does no implicit referencing or dereferencing. +When a scalar is holding a reference, it always behaves as a simple scalar. +It doesn't magically start being an array or hash or subroutine; you have to +tell it explicitly to do so, by dereferencing it. +.SS "Making References" +.IX Xref "reference, creation referencing" +.IX Subsection "Making References" +References can be created in several ways. +.PP +\fIBackslash Operator\fR +.IX Xref "\\ backslash" +.IX Subsection "Backslash Operator" +.PP +By using the backslash operator on a variable, subroutine, or value. +(This works much like the & (address-of) operator in C.) +This typically creates \fIanother\fR reference to a variable, because +there's already a reference to the variable in the symbol table. But +the symbol table reference might go away, and you'll still have the +reference that the backslash returned. Here are some examples: +.PP +.Vb 5 +\& $scalarref = \e$foo; +\& $arrayref = \e@ARGV; +\& $hashref = \e%ENV; +\& $coderef = \e&handler; +\& $globref = \e*foo; +.Ve +.PP +It isn't possible to create a true reference to an IO handle (filehandle +or dirhandle) using the backslash operator. The most you can get is a +reference to a typeglob, which is actually a complete symbol table entry. +But see the explanation of the \f(CW*foo{THING}\fR syntax below. However, +you can still use type globs and globrefs as though they were IO handles. 
+.PP +\fISquare Brackets\fR +.IX Xref "array, anonymous [ [] square bracket bracket, square arrayref array reference reference, array" +.IX Subsection "Square Brackets" +.PP +A reference to an anonymous array can be created using square +brackets: +.PP +.Vb 1 +\& $arrayref = [1, 2, [\*(Aqa\*(Aq, \*(Aqb\*(Aq, \*(Aqc\*(Aq]]; +.Ve +.PP +Here we've created a reference to an anonymous array of three elements +whose final element is itself a reference to another anonymous array of three +elements. (The multidimensional syntax described later can be used to +access this. For example, after the above, \f(CW\*(C`$arrayref\->[2][1]\*(C'\fR would have +the value "b".) +.PP +Taking a reference to an enumerated list is not the same +as using square brackets\-\-instead it's the same as creating +a list of references! +.PP +.Vb 2 +\& @list = (\e$a, \e@b, \e%c); +\& @list = \e($a, @b, %c); # same thing! +.Ve +.PP +As a special case, \f(CW\*(C`\e(@foo)\*(C'\fR returns a list of references to the contents +of \f(CW@foo\fR, not a reference to \f(CW@foo\fR itself. Likewise for \f(CW%foo\fR, +except that the key references are to copies (since the keys are just +strings rather than full-fledged scalars). +.PP +\fICurly Brackets\fR +.IX Xref "hash, anonymous { {} curly bracket bracket, curly brace hashref hash reference reference, hash" +.IX Subsection "Curly Brackets" +.PP +A reference to an anonymous hash can be created using curly +brackets: +.PP +.Vb 4 +\& $hashref = { +\& \*(AqAdam\*(Aq => \*(AqEve\*(Aq, +\& \*(AqClyde\*(Aq => \*(AqBonnie\*(Aq, +\& }; +.Ve +.PP +Anonymous hash and array composers like these can be intermixed freely to +produce as complicated a structure as you want. The multidimensional +syntax described below works for these too. The values above are +literals, but variables and expressions would work just as well, because +assignment operators in Perl (even within \fBlocal()\fR or \fBmy()\fR) are executable +statements, not compile-time declarations. 
+.PP +Because curly brackets (braces) are used for several other things +including BLOCKs, you may occasionally have to disambiguate braces at the +beginning of a statement by putting a \f(CW\*(C`+\*(C'\fR or a \f(CW\*(C`return\*(C'\fR in front so +that Perl realizes the opening brace isn't starting a BLOCK. The economy and +mnemonic value of using curlies is deemed worth this occasional extra +hassle. +.PP +For example, if you wanted a function to make a new hash and return a +reference to it, you have these options: +.PP +.Vb 3 +\& sub hashem { { @_ } } # silently wrong +\& sub hashem { +{ @_ } } # ok +\& sub hashem { return { @_ } } # ok +.Ve +.PP +On the other hand, if you want the other meaning, you can do this: +.PP +.Vb 4 +\& sub showem { { @_ } } # ambiguous (currently ok, +\& # but may change) +\& sub showem { {; @_ } } # ok +\& sub showem { { return @_ } } # ok +.Ve +.PP +The leading \f(CW\*(C`+{\*(C'\fR and \f(CW\*(C`{;\*(C'\fR always serve to disambiguate +the expression to mean either the HASH reference, or the BLOCK. +.PP +\fIAnonymous Subroutines\fR +.IX Xref "subroutine, anonymous subroutine, reference reference, subroutine scope, lexical closure lexical lexical scope" +.IX Subsection "Anonymous Subroutines" +.PP +A reference to an anonymous subroutine can be created by using +\&\f(CW\*(C`sub\*(C'\fR without a subname: +.PP +.Vb 1 +\& $coderef = sub { print "Boink!\en" }; +.Ve +.PP +Note the semicolon. Except for the code +inside not being immediately executed, a \f(CW\*(C`sub {}\*(C'\fR is not so much a +declaration as it is an operator, like \f(CW\*(C`do{}\*(C'\fR or \f(CW\*(C`eval{}\*(C'\fR. (However, no +matter how many times you execute that particular line (unless you're in an +\&\f(CWeval("...")\fR), \f(CW$coderef\fR will still have a reference to the \fIsame\fR +anonymous subroutine.) +.PP +Anonymous subroutines act as closures with respect to \fBmy()\fR variables, +that is, variables lexically visible within the current scope. 
Closure +is a notion out of the Lisp world that says if you define an anonymous +function in a particular lexical context, it pretends to run in that +context even when it's called outside the context. +.PP +In human terms, it's a funny way of passing arguments to a subroutine when +you define it as well as when you call it. It's useful for setting up +little bits of code to run later, such as callbacks. You can even +do object-oriented stuff with it, though Perl already provides a different +mechanism to do that\-\-see perlobj. +.PP +You might also think of closure as a way to write a subroutine +template without using \fBeval()\fR. Here's a small example of how +closures work: +.PP +.Vb 6 +\& sub newprint { +\& my $x = shift; +\& return sub { my $y = shift; print "$x, $y!\en"; }; +\& } +\& $h = newprint("Howdy"); +\& $g = newprint("Greetings"); +\& +\& # Time passes... +\& +\& &$h("world"); +\& &$g("earthlings"); +.Ve +.PP +This prints +.PP +.Vb 2 +\& Howdy, world! +\& Greetings, earthlings! +.Ve +.PP +Note particularly that \f(CW$x\fR continues to refer to the value passed +into \fBnewprint()\fR \fIdespite\fR "my \f(CW$x\fR" having gone out of scope by the +time the anonymous subroutine runs. That's what a closure is all +about. +.PP +This applies only to lexical variables, by the way. Dynamic variables +continue to work as they have always worked. Closure is not something +that most Perl programmers need trouble themselves about to begin with. +.PP +\fIConstructors\fR +.IX Xref "constructor new" +.IX Subsection "Constructors" +.PP +References are often returned by special subroutines called constructors. Perl +objects are just references to a special type of object that happens to know +which package it's associated with. Constructors are just special subroutines +that know how to create that association. They do so by starting with an +ordinary reference, and it remains an ordinary reference even while it's also +being an object. 
Constructors are often named \f(CWnew()\fR. You \fIcan\fR call them +indirectly: +.PP +.Vb 1 +\& $objref = new Doggie( Tail => \*(Aqshort\*(Aq, Ears => \*(Aqlong\*(Aq ); +.Ve +.PP +But that can produce ambiguous syntax in certain cases, so it's often +better to use the direct method invocation approach: +.PP +.Vb 1 +\& $objref = Doggie\->new(Tail => \*(Aqshort\*(Aq, Ears => \*(Aqlong\*(Aq); +\& +\& use Term::Cap; +\& $terminal = Term::Cap\->Tgetent( { OSPEED => 9600 }); +\& +\& use Tk; +\& $main = MainWindow\->new(); +\& $menubar = $main\->Frame(\-relief => "raised", +\& \-borderwidth => 2) +.Ve +.PP +This indirect object syntax is only available when +\&\f(CW\*(C`use feature "indirect"\*(C'\fR is in effect, +and that is not the case when \f(CW\*(C`use v5.36\*(C'\fR (or +higher) is requested, it is best to avoid indirect object syntax entirely. +.PP +\fIAutovivification\fR +.IX Xref "autovivification" +.IX Subsection "Autovivification" +.PP +References of the appropriate type can spring into existence if you +dereference them in a context that assumes they exist. Because we haven't +talked about dereferencing yet, we can't show you any examples yet. +.PP +\fITypeglob Slots\fR +.IX Xref "*foo{THING} *" +.IX Subsection "Typeglob Slots" +.PP +A reference can be created by using a special syntax, lovingly known as +the *foo{THING} syntax. *foo{THING} returns a reference to the THING +slot in *foo (which is the symbol table entry which holds everything +known as foo). +.PP +.Vb 9 +\& $scalarref = *foo{SCALAR}; +\& $arrayref = *ARGV{ARRAY}; +\& $hashref = *ENV{HASH}; +\& $coderef = *handler{CODE}; +\& $ioref = *STDIN{IO}; +\& $globref = *foo{GLOB}; +\& $formatref = *foo{FORMAT}; +\& $globname = *foo{NAME}; # "foo" +\& $pkgname = *foo{PACKAGE}; # "main" +.Ve +.PP +Most of these are self-explanatory, but \f(CW*foo{IO}\fR +deserves special attention. 
It returns +the IO handle, used for file handles ("open" in perlfunc), sockets +("socket" in perlfunc and "socketpair" in perlfunc), and directory +handles ("opendir" in perlfunc). For compatibility with previous +versions of Perl, \f(CW*foo{FILEHANDLE}\fR is a synonym for \f(CW*foo{IO}\fR, though it +is discouraged, to encourage a consistent use of one name: IO. On perls +between v5.8 and v5.22, it will issue a deprecation warning, but this +deprecation has since been rescinded. +.PP +\&\f(CW*foo{THING}\fR returns undef if that particular THING hasn't been used yet, +except in the case of scalars. \f(CW*foo{SCALAR}\fR returns a reference to an +anonymous scalar if \f(CW$foo\fR hasn't been used yet. This might change in a +future release. +.PP +\&\f(CW*foo{NAME}\fR and \f(CW*foo{PACKAGE}\fR are the exception, in that they return +strings, rather than references. These return the package and name of the +typeglob itself, rather than one that has been assigned to it. So, after +\&\f(CW\*(C`*foo=*Foo::bar\*(C'\fR, \f(CW*foo\fR will become "*Foo::bar" when used as a string, +but \f(CW*foo{PACKAGE}\fR and \f(CW*foo{NAME}\fR will continue to produce "main" and +"foo", respectively. +.PP +\&\f(CW*foo{IO}\fR is an alternative to the \f(CW*HANDLE\fR mechanism given in +"Typeglobs and Filehandles" in perldata for passing filehandles +into or out of subroutines, or storing into larger data structures. +Its disadvantage is that it won't create a new filehandle for you. +Its advantage is that you have less risk of clobbering more than +you want to with a typeglob assignment. (It still conflates file +and directory handles, though.) However, if you assign the incoming +value to a scalar instead of a typeglob as we do in the examples +below, there's no risk of that happening. 
+.PP +.Vb 2 +\& splutter(*STDOUT); # pass the whole glob +\& splutter(*STDOUT{IO}); # pass both file and dir handles +\& +\& sub splutter { +\& my $fh = shift; +\& print $fh "her um well a hmmm\en"; +\& } +\& +\& $rec = get_rec(*STDIN); # pass the whole glob +\& $rec = get_rec(*STDIN{IO}); # pass both file and dir handles +\& +\& sub get_rec { +\& my $fh = shift; +\& return scalar <$fh>; +\& } +.Ve +.SS "Using References" +.IX Xref "reference, use dereferencing dereference" +.IX Subsection "Using References" +That's it for creating references. By now you're probably dying to +know how to use references to get back to your long-lost data. There +are several basic methods. +.PP +\fISimple Scalar\fR +.IX Subsection "Simple Scalar" +.PP +Anywhere you'd put an identifier (or chain of identifiers) as part +of a variable or subroutine name, you can replace the identifier with +a simple scalar variable containing a reference of the correct type: +.PP +.Vb 6 +\& $bar = $$scalarref; +\& push(@$arrayref, $filename); +\& $$arrayref[0] = "January"; +\& $$hashref{"KEY"} = "VALUE"; +\& &$coderef(1,2,3); +\& print $globref "output\en"; +.Ve +.PP +It's important to understand that we are specifically \fInot\fR dereferencing +\&\f(CW$arrayref[0]\fR or \f(CW$hashref{"KEY"}\fR there. The dereference of the +scalar variable happens \fIbefore\fR it does any key lookups. Anything more +complicated than a simple scalar variable must use methods 2 or 3 below. +However, a "simple scalar" includes an identifier that itself uses method +1 recursively. Therefore, the following prints "howdy". +.PP +.Vb 2 +\& $refrefref = \e\e\e"howdy"; +\& print $$$$refrefref; +.Ve +.PP +\fIBlock\fR +.IX Subsection "Block" +.PP +Anywhere you'd put an identifier (or chain of identifiers) as part of a +variable or subroutine name, you can replace the identifier with a +BLOCK returning a reference of the correct type. 
In other words, the +previous examples could be written like this: +.PP +.Vb 6 +\& $bar = ${$scalarref}; +\& push(@{$arrayref}, $filename); +\& ${$arrayref}[0] = "January"; +\& ${$hashref}{"KEY"} = "VALUE"; +\& &{$coderef}(1,2,3); +\& $globref\->print("output\en"); # iff IO::Handle is loaded +.Ve +.PP +Admittedly, it's a little silly to use the curlies in this case, but +the BLOCK can contain any arbitrary expression, in particular, +subscripted expressions: +.PP +.Vb 1 +\& &{ $dispatch{$index} }(1,2,3); # call correct routine +.Ve +.PP +Because of being able to omit the curlies for the simple case of \f(CW$$x\fR, +people often make the mistake of viewing the dereferencing symbols as +proper operators, and wonder about their precedence. If they were, +though, you could use parentheses instead of braces. That's not the case. +Consider the difference below; case 0 is a short-hand version of case 1, +\&\fInot\fR case 2: +.PP +.Vb 4 +\& $$hashref{"KEY"} = "VALUE"; # CASE 0 +\& ${$hashref}{"KEY"} = "VALUE"; # CASE 1 +\& ${$hashref{"KEY"}} = "VALUE"; # CASE 2 +\& ${$hashref\->{"KEY"}} = "VALUE"; # CASE 3 +.Ve +.PP +Case 2 is also deceptive in that you're accessing a variable +called \f(CW%hashref\fR, not dereferencing through \f(CW$hashref\fR to the hash +it's presumably referencing. That would be case 3. +.PP +\fIArrow Notation\fR +.IX Subsection "Arrow Notation" +.PP +Subroutine calls and lookups of individual array elements arise often +enough that it gets cumbersome to use method 2. As a form of +syntactic sugar, the examples for method 2 may be written: +.PP +.Vb 3 +\& $arrayref\->[0] = "January"; # Array element +\& $hashref\->{"KEY"} = "VALUE"; # Hash element +\& $coderef\->(1,2,3); # Subroutine call +.Ve +.PP +The left side of the arrow can be any expression returning a reference, +including a previous dereference. 
Note that \f(CW$array[$x]\fR is \fInot\fR the +same thing as \f(CW\*(C`$array\->[$x]\*(C'\fR here: +.PP +.Vb 1 +\& $array[$x]\->{"foo"}\->[0] = "January"; +.Ve +.PP +This is one of the cases we mentioned earlier in which references could +spring into existence when in an lvalue context. Before this +statement, \f(CW$array[$x]\fR may have been undefined. If so, it's +automatically defined with a hash reference so that we can look up +\&\f(CW\*(C`{"foo"}\*(C'\fR in it. Likewise \f(CW\*(C`$array[$x]\->{"foo"}\*(C'\fR will automatically get +defined with an array reference so that we can look up \f(CW\*(C`[0]\*(C'\fR in it. +This process is called \fIautovivification\fR. +.PP +One more thing here. The arrow is optional \fIbetween\fR brackets +subscripts, so you can shrink the above down to +.PP +.Vb 1 +\& $array[$x]{"foo"}[0] = "January"; +.Ve +.PP +Which, in the degenerate case of using only ordinary arrays, gives you +multidimensional arrays just like C's: +.PP +.Vb 1 +\& $score[$x][$y][$z] += 42; +.Ve +.PP +Well, okay, not entirely like C's arrays, actually. C doesn't know how +to grow its arrays on demand. Perl does. +.PP +\fIObjects\fR +.IX Subsection "Objects" +.PP +If a reference happens to be a reference to an object, then there are +probably methods to access the things referred to, and you should probably +stick to those methods unless you're in the class package that defines the +object's methods. In other words, be nice, and don't violate the object's +encapsulation without a very good reason. Perl does not enforce +encapsulation. We are not totalitarians here. We do expect some basic +civility though. +.PP +\fIMiscellaneous Usage\fR +.IX Subsection "Miscellaneous Usage" +.PP +Using a string or number as a reference produces a symbolic reference, +as explained above. Using a reference as a number produces an +integer representing its storage location in memory. 
The only +useful thing to be done with this is to compare two references +numerically to see whether they refer to the same location. +.IX Xref "reference, numeric context" +.PP +.Vb 3 +\& if ($ref1 == $ref2) { # cheap numeric compare of references +\& print "refs 1 and 2 refer to the same thing\en"; +\& } +.Ve +.PP +Using a reference as a string produces both its referent's type, +including any package blessing as described in perlobj, as well +as the numeric address expressed in hex. The \fBref()\fR operator returns +just the type of thing the reference is pointing to, without the +address. See "ref" in perlfunc for details and examples of its use. +.IX Xref "reference, string context" +.PP +The \fBbless()\fR operator may be used to associate the object a reference +points to with a package functioning as an object class. See perlobj. +.PP +A typeglob may be dereferenced the same way a reference can, because +the dereference syntax always indicates the type of reference desired. +So \f(CW\*(C`${*foo}\*(C'\fR and \f(CW\*(C`${\e$foo}\*(C'\fR both indicate the same scalar variable. +.PP +Here's a trick for interpolating a subroutine call into a string: +.PP +.Vb 1 +\& print "My sub returned @{[mysub(1,2,3)]} that time.\en"; +.Ve +.PP +The way it works is that when the \f(CW\*(C`@{...}\*(C'\fR is seen in the double-quoted +string, it's evaluated as a block. The block creates a reference to an +anonymous array containing the results of the call to \f(CW\*(C`mysub(1,2,3)\*(C'\fR. So +the whole block returns a reference to an array, which is then +dereferenced by \f(CW\*(C`@{...}\*(C'\fR and stuck into the double-quoted string. This +chicanery is also useful for arbitrary expressions: +.PP +.Vb 1 +\& print "That yields @{[$n + 5]} widgets\en"; +.Ve +.PP +Similarly, an expression that returns a reference to a scalar can be +dereferenced via \f(CW\*(C`${...}\*(C'\fR. 
Thus, the above expression may be written +as: +.PP +.Vb 1 +\& print "That yields ${\e($n + 5)} widgets\en"; +.Ve +.SS "Circular References" +.IX Xref "circular reference reference, circular" +.IX Subsection "Circular References" +It is possible to create a "circular reference" in Perl, which can lead +to memory leaks. A circular reference occurs when two references +contain a reference to each other, like this: +.PP +.Vb 3 +\& my $foo = {}; +\& my $bar = { foo => $foo }; +\& $foo\->{bar} = $bar; +.Ve +.PP +You can also create a circular reference with a single variable: +.PP +.Vb 2 +\& my $foo; +\& $foo = \e$foo; +.Ve +.PP +In this case, the reference count for the variables will never reach 0, +and the references will never be garbage-collected. This can lead to +memory leaks. +.PP +Because objects in Perl are implemented as references, it's possible to +have circular references with objects as well. Imagine a TreeNode class +where each node references its parent and child nodes. Any node with a +parent will be part of a circular reference. +.PP +You can break circular references by creating a "weak reference". A +weak reference does not increment the reference count for a variable, +which means that the object can go out of scope and be destroyed. You +can weaken a reference with the \f(CW\*(C`weaken\*(C'\fR function exported by the +Scalar::Util module, or available as \f(CW\*(C`builtin::weaken\*(C'\fR directly in +Perl version 5.35.7 or later. +.PP +Here's how we can make the first example safer: +.PP +.Vb 1 +\& use Scalar::Util \*(Aqweaken\*(Aq; +\& +\& my $foo = {}; +\& my $bar = { foo => $foo }; +\& $foo\->{bar} = $bar; +\& +\& weaken $foo\->{bar}; +.Ve +.PP +The reference from \f(CW$foo\fR to \f(CW$bar\fR has been weakened. When the +\&\f(CW$bar\fR variable goes out of scope, it will be garbage-collected. The +next time you look at the value of the \f(CW\*(C`$foo\->{bar}\*(C'\fR key, it will +be \f(CW\*(C`undef\*(C'\fR. 
+.PP +This action at a distance can be confusing, so you should be careful +with your use of weaken. You should weaken the reference in the +variable that will go out of scope \fIfirst\fR. That way, the longer-lived +variable will contain the expected reference until it goes out of +scope. +.SS "Symbolic references" +.IX Xref "reference, symbolic reference, soft symbolic reference soft reference" +.IX Subsection "Symbolic references" +We said that references spring into existence as necessary if they are +undefined, but we didn't say what happens if a value used as a +reference is already defined, but \fIisn't\fR a hard reference. If you +use it as a reference, it'll be treated as a symbolic +reference. That is, the value of the scalar is taken to be the \fIname\fR +of a variable, rather than a direct link to a (possibly) anonymous +value. +.PP +People frequently expect it to work like this. So it does. +.PP +.Vb 9 +\& $name = "foo"; +\& $$name = 1; # Sets $foo +\& ${$name} = 2; # Sets $foo +\& ${$name x 2} = 3; # Sets $foofoo +\& $name\->[0] = 4; # Sets $foo[0] +\& @$name = (); # Clears @foo +\& &$name(); # Calls &foo() +\& $pack = "THAT"; +\& ${"${pack}::$name"} = 5; # Sets $THAT::foo without eval +.Ve +.PP +This is powerful, and slightly dangerous, in that it's possible +to intend (with the utmost sincerity) to use a hard reference, and +accidentally use a symbolic reference instead. To protect against +that, you can say +.PP +.Vb 1 +\& use strict \*(Aqrefs\*(Aq; +.Ve +.PP +and then only hard references will be allowed for the rest of the enclosing +block. An inner block may countermand that with +.PP +.Vb 1 +\& no strict \*(Aqrefs\*(Aq; +.Ve +.PP +Only package variables (globals, even if localized) are visible to +symbolic references. Lexical variables (declared with \fBmy()\fR) aren't in +a symbol table, and thus are invisible to this mechanism. 
For example: +.PP +.Vb 6 +\& local $value = 10; +\& $ref = "value"; +\& { +\& my $value = 20; +\& print $$ref; +\& } +.Ve +.PP +This will still print 10, not 20. Remember that \fBlocal()\fR affects package +variables, which are all "global" to the package. +.SS "Not-so-symbolic references" +.IX Subsection "Not-so-symbolic references" +Brackets around a symbolic reference can simply +serve to isolate an identifier or variable name from the rest of an +expression, just as they always have within a string. For example, +.PP +.Vb 2 +\& $push = "pop on "; +\& print "${push}over"; +.Ve +.PP +has always meant to print "pop on over", even though push is +a reserved word. This is generalized to work the same +without the enclosing double quotes, so that +.PP +.Vb 1 +\& print ${push} . "over"; +.Ve +.PP +and even +.PP +.Vb 1 +\& print ${ push } . "over"; +.Ve +.PP +will have the same effect. This +construct is \fInot\fR considered to be a symbolic reference when you're +using strict refs: +.PP +.Vb 3 +\& use strict \*(Aqrefs\*(Aq; +\& ${ bareword }; # Okay, means $bareword. +\& ${ "bareword" }; # Error, symbolic reference. +.Ve +.PP +Similarly, because of all the subscripting that is done using single words, +the same rule applies to any bareword that is used for subscripting a hash. +So now, instead of writing +.PP +.Vb 1 +\& $hash{ "aaa" }{ "bbb" }{ "ccc" } +.Ve +.PP +you can write just +.PP +.Vb 1 +\& $hash{ aaa }{ bbb }{ ccc } +.Ve +.PP +and not worry about whether the subscripts are reserved words. In the +rare event that you do wish to do something like +.PP +.Vb 1 +\& $hash{ shift } +.Ve +.PP +you can force interpretation as a reserved word by adding anything that +makes it more than a bareword: +.PP +.Vb 3 +\& $hash{ shift() } +\& $hash{ +shift } +\& $hash{ shift @_ } +.Ve +.PP +The \f(CW\*(C`use warnings\*(C'\fR pragma or the \fB\-w\fR switch will warn you if it +interprets a reserved word as a string. 
+But it will no longer warn you about using lowercase words, because the +string is effectively quoted. +.SS "Pseudo-hashes: Using an array as a hash" +.IX Xref "pseudo-hash pseudo hash pseudohash" +.IX Subsection "Pseudo-hashes: Using an array as a hash" +Pseudo-hashes have been removed from Perl. The 'fields' pragma +remains available. +.SS "Function Templates" +.IX Xref "scope, lexical closure lexical lexical scope subroutine, nested sub, nested subroutine, local sub, local" +.IX Subsection "Function Templates" +As explained above, an anonymous function with access to the lexical +variables visible when that function was compiled, creates a closure. It +retains access to those variables even though it doesn't get run until +later, such as in a signal handler or a Tk callback. +.PP +Using a closure as a function template allows us to generate many functions +that act similarly. Suppose you wanted functions named after the colors +that generated HTML font changes for the various colors: +.PP +.Vb 1 +\& print "Be ", red("careful"), "with that ", green("light"); +.Ve +.PP +The \fBred()\fR and \fBgreen()\fR functions would be similar. To create these, +we'll assign a closure to a typeglob of the name of the function we're +trying to build. +.PP +.Vb 5 +\& @colors = qw(red blue green yellow orange purple violet); +\& for my $name (@colors) { +\& no strict \*(Aqrefs\*(Aq; # allow symbol table manipulation +\& *$name = *{uc $name} = sub { "<FONT COLOR=\*(Aq$name\*(Aq>@_</FONT>" }; +\& } +.Ve +.PP +Now all those different functions appear to exist independently. You can +call \fBred()\fR, \fBRED()\fR, \fBblue()\fR, \fBBLUE()\fR, \fBgreen()\fR, etc. This technique saves on +both compile time and memory use, and is less error-prone as well, since +syntax checks happen at compile time. It's critical that any variables in +the anonymous subroutine be lexicals in order to create a proper closure. +That's the reasons for the \f(CW\*(C`my\*(C'\fR on the loop iteration variable. 
+.PP +This is one of the only places where giving a prototype to a closure makes +much sense. If you wanted to impose scalar context on the arguments of +these functions (probably not a wise idea for this particular example), +you could have written it this way instead: +.PP +.Vb 1 +\& *$name = sub ($) { "<FONT COLOR=\*(Aq$name\*(Aq>$_[0]</FONT>" }; +.Ve +.PP +However, since prototype checking happens at compile time, the assignment +above happens too late to be of much use. You could address this by +putting the whole loop of assignments within a BEGIN block, forcing it +to occur during compilation. +.PP +Access to lexicals that change over time\-\-like those in the \f(CW\*(C`for\*(C'\fR loop +above, basically aliases to elements from the surrounding lexical scopes\-\- +only works with anonymous subs, not with named subroutines. Generally +said, named subroutines do not nest properly and should only be declared +in the main package scope. +.PP +This is because named subroutines are created at compile time so their +lexical variables get assigned to the parent lexicals from the first +execution of the parent block. If a parent scope is entered a second +time, its lexicals are created again, while the nested subs still +reference the old ones. +.PP +Anonymous subroutines get to capture each time you execute the \f(CW\*(C`sub\*(C'\fR +operator, as they are created on the fly. If you are accustomed to using +nested subroutines in other programming languages with their own private +variables, you'll have to work at it a bit in Perl. The intuitive coding +of this type of thing incurs mysterious warnings about "will not stay +shared" due to the reasons explained above. 
+For example, this won't work: +.PP +.Vb 5 +\& sub outer { +\& my $x = $_[0] + 35; +\& sub inner { return $x * 19 } # WRONG +\& return $x + inner(); +\& } +.Ve +.PP +A work-around is the following: +.PP +.Vb 5 +\& sub outer { +\& my $x = $_[0] + 35; +\& local *inner = sub { return $x * 19 }; +\& return $x + inner(); +\& } +.Ve +.PP +Now \fBinner()\fR can only be called from within \fBouter()\fR, because of the +temporary assignment of the anonymous subroutine. But when it is called, +it has normal access to the lexical variable \f(CW$x\fR from the scope of +\&\fBouter()\fR at the time \fBouter()\fR is invoked. +.PP +This has the interesting effect of creating a function local to another +function, something not normally supported in Perl. +.SS "Postfix Dereference Syntax" +.IX Subsection "Postfix Dereference Syntax" +Beginning in v5.20.0, a postfix syntax for using references is +available. It behaves as described in "Using References", but instead +of a prefixed sigil, a postfixed sigil-and-star is used. +.PP +For example: +.PP +.Vb 2 +\& $r = \e@a; +\& @b = $r\->@*; # equivalent to @$r or @{ $r } +\& +\& $r = [ 1, [ 2, 3 ], 4 ]; +\& $r\->[1]\->@*; # equivalent to @{ $r\->[1] } +.Ve +.PP +In Perl 5.20 and 5.22, this syntax must be enabled with \f(CWuse feature +\&\*(Aqpostderef\*(Aq\fR. As of Perl 5.24, no feature declarations are required to make +it available. +.PP +Postfix dereference should work in all circumstances where block +(circumfix) dereference worked, and should be entirely equivalent. This +syntax allows dereferencing to be written and read entirely +left-to-right.
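As a small runnable sketch of that left-to-right reading, assuming Perl 5.24 or later so that no feature declaration is needed (the variable names are illustrative only):

```perl
use v5.24;      # postfix dereference needs no feature declaration from 5.24 on
use warnings;

my $r = [ 1, [ 2, 3 ], 4 ];

# Each step reads left to right: take $r, index it, then dereference.
my @inner = $r->[1]->@*;    # same as @{ $r->[1] }
my $last  = $r->$#*;        # same as $#{ $r }

print "@inner\n";
print "$last\n";
```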
The following equivalencies are defined: +.PP +.Vb 6 +\& $sref\->$*; # same as ${ $sref } +\& $aref\->@*; # same as @{ $aref } +\& $aref\->$#*; # same as $#{ $aref } +\& $href\->%*; # same as %{ $href } +\& $cref\->&*; # same as &{ $cref } +\& $gref\->**; # same as *{ $gref } +.Ve +.PP +Note especially that \f(CW\*(C`$cref\->&*\*(C'\fR is \fInot\fR equivalent to \f(CW$cref\->()\fR, and can serve different purposes. +.PP +Glob elements can be extracted through the postfix dereferencing feature: +.PP +.Vb 1 +\& $gref\->*{SCALAR}; # same as *{ $gref }{SCALAR} +.Ve +.PP +Postfix array and scalar dereferencing \fIcan\fR be used in interpolating +strings (double quotes or the \f(CW\*(C`qq\*(C'\fR operator), but only if the +\&\f(CW\*(C`postderef_qq\*(C'\fR feature is enabled. Interpolation of postfix array highest index +access (\f(CW\*(C`\->$#*\*(C'\fR) is also supported when the \f(CW\*(C`postderef_qq\*(C'\fR feature is +enabled. +.SS "Postfix Reference Slicing" +.IX Subsection "Postfix Reference Slicing" +Value slices of arrays and hashes may also be taken with postfix +dereferencing notation, with the following equivalencies: +.PP +.Vb 2 +\& $aref\->@[ ... ]; # same as @$aref[ ... ] +\& $href\->@{ ... }; # same as @$href{ ... } +.Ve +.PP +Postfix key/value pair slicing, added in 5.20.0 and documented in +the Key/Value Hash Slices section of perldata, also behaves as expected: +.PP +.Vb 2 +\& $aref\->%[ ... ]; # same as %$aref[ ... ] +\& $href\->%{ ... }; # same as %$href{ ... } +.Ve +.PP +As with postfix array, postfix value slice dereferencing \fIcan\fR be used +in interpolating strings (double quotes or the \f(CW\*(C`qq\*(C'\fR operator), but only +if the \f(CW\*(C`postderef_qq\*(C'\fR feature is enabled. +.SS "Assigning to References" +.IX Subsection "Assigning to References" +Beginning in v5.22.0, the referencing operator can be assigned to. 
It +performs an aliasing operation, so that the variable name referenced on the +left-hand side becomes an alias for the thing referenced on the right-hand +side: +.PP +.Vb 2 +\& \e$a = \e$b; # $a and $b now point to the same scalar +\& \e&foo = \e&bar; # foo() now means bar() +.Ve +.PP +This syntax must be enabled with \f(CW\*(C`use feature \*(Aqrefaliasing\*(Aq\*(C'\fR. It is +experimental, and will warn by default unless \f(CWno warnings +\&\*(Aqexperimental::refaliasing\*(Aq\fR is in effect. +.PP +These forms may be assigned to, and cause the right-hand side to be +evaluated in scalar context: +.PP +.Vb 10 +\& \e$scalar +\& \e@array +\& \e%hash +\& \e&sub +\& \emy $scalar +\& \emy @array +\& \emy %hash +\& \estate $scalar # or @array, etc. +\& \eour $scalar # etc. +\& \elocal $scalar # etc. +\& \elocal our $scalar # etc. +\& \e$some_array[$index] +\& \e$some_hash{$key} +\& \elocal $some_array[$index] +\& \elocal $some_hash{$key} +\& condition ? \e$this : \e$that[0] # etc. +.Ve +.PP +Slicing operations and parentheses cause +the right-hand side to be evaluated in +list context: +.PP +.Vb 10 +\& \e@array[5..7] +\& (\e@array[5..7]) +\& \e(@array[5..7]) +\& \e@hash{\*(Aqfoo\*(Aq,\*(Aqbar\*(Aq} +\& (\e@hash{\*(Aqfoo\*(Aq,\*(Aqbar\*(Aq}) +\& \e(@hash{\*(Aqfoo\*(Aq,\*(Aqbar\*(Aq}) +\& (\e$scalar) +\& \e($scalar) +\& \e(my $scalar) +\& \emy($scalar) +\& (\e@array) +\& (\e%hash) +\& (\e&sub) +\& \e(&sub) +\& \e($foo, @bar, %baz) +\& (\e$foo, \e@bar, \e%baz) +.Ve +.PP +Each element on the right-hand side must be a reference to a datum of the +right type. 
Parentheses immediately surrounding an array (and possibly +also \f(CW\*(C`my\*(C'\fR/\f(CW\*(C`state\*(C'\fR/\f(CW\*(C`our\*(C'\fR/\f(CW\*(C`local\*(C'\fR) will make each element of the array an +alias to the corresponding scalar referenced on the right-hand side: +.PP +.Vb 5 +\& \e(@a) = \e(@b); # @a and @b now have the same elements +\& \emy(@a) = \e(@b); # likewise +\& \e(my @a) = \e(@b); # likewise +\& push @a, 3; # but now @a has an extra element that @b lacks +\& \e(@a) = (\e$a, \e$b, \e$c); # @a now contains $a, $b, and $c +.Ve +.PP +Combining that form with \f(CW\*(C`local\*(C'\fR and putting parentheses immediately +around a hash are forbidden (because it is not clear what they should do): +.PP +.Vb 2 +\& \elocal(@array) = foo(); # WRONG +\& \e(%hash) = bar(); # WRONG +.Ve +.PP +Assignment to references and non-references may be combined in lists and +conditional ternary expressions, as long as the values on the right-hand +side are the right type for each element on the left, though this may make +for obfuscated code: +.PP +.Vb 4 +\& (my $tom, \emy $dick, \emy @harry) = (\e1, \e2, [1..3]); +\& # $tom is now \e1 +\& # $dick is now 2 (read\-only) +\& # @harry is (1,2,3) +\& +\& my $type = ref $thingy; +\& ($type ? $type eq \*(AqARRAY\*(Aq ? \e@foo : \e$bar : $baz) = $thingy; +.Ve +.PP +The \f(CW\*(C`foreach\*(C'\fR loop can also take a reference constructor for its loop +variable, though the syntax is limited to one of the following, with an +optional \f(CW\*(C`my\*(C'\fR, \f(CW\*(C`state\*(C'\fR, or \f(CW\*(C`our\*(C'\fR after the backslash: +.PP +.Vb 4 +\& \e$s +\& \e@a +\& \e%h +\& \e&c +.Ve +.PP +No parentheses are permitted. 
This feature is particularly useful for +arrays-of-arrays, or arrays-of-hashes: +.PP +.Vb 3 +\& foreach \emy @a (@array_of_arrays) { +\& frobnicate($a[0], $a[\-1]); +\& } +\& +\& foreach \emy %h (@array_of_hashes) { +\& $h{gelastic}++ if $h{type} eq \*(Aqfunny\*(Aq; +\& } +.Ve +.PP +\&\fBCAVEAT:\fR Aliasing does not work correctly with closures. If you try to +alias lexical variables from an inner subroutine or \f(CW\*(C`eval\*(C'\fR, the aliasing +will only be visible within that inner sub, and will not affect the outer +subroutine where the variables are declared. This bizarre behavior is +subject to change. +.SS "Declaring a Reference to a Variable" +.IX Subsection "Declaring a Reference to a Variable" +Beginning in v5.26.0, the referencing operator can come after \f(CW\*(C`my\*(C'\fR, +\&\f(CW\*(C`state\*(C'\fR, \f(CW\*(C`our\*(C'\fR, or \f(CW\*(C`local\*(C'\fR. This syntax must be enabled with \f(CW\*(C`use +feature \*(Aqdeclared_refs\*(Aq\*(C'\fR. It is experimental, and will warn by default +unless \f(CW\*(C`no warnings \*(Aqexperimental::declared_refs\*(Aq\*(C'\fR is in effect. +.PP +This feature makes these: +.PP +.Vb 2 +\& my \e$x; +\& our \e$y; +.Ve +.PP +equivalent to: +.PP +.Vb 2 +\& \emy $x; +\& \eour $y; +.Ve +.PP +It is intended mainly for use in assignments to references (see +"Assigning to References", above). It also allows the backslash to be +used on just some items in a list of declared variables: +.PP +.Vb 1 +\& my ($foo, \e@bar, \e%baz); # equivalent to: my $foo, \emy(@bar, %baz); +.Ve +.SH "WARNING: Don't use references as hash keys" +.IX Xref "reference, string context reference, use as hash key" +.IX Header "WARNING: Don't use references as hash keys" +You may not (usefully) use a reference as the key to a hash. It will be +converted into a string: +.PP +.Vb 1 +\& $x{ \e$a } = $a; +.Ve +.PP +If you try to dereference the key, it won't do a hard dereference, and +you won't accomplish what you're attempting.
You might want to do something +more like +.PP +.Vb 2 +\& $r = \e@a; +\& $x{ $r } = $r; +.Ve +.PP +Then at least the \fBvalues()\fR of the hash will be +real refs, even though the \fBkeys()\fR won't. +.PP +The standard Tie::RefHash module provides a convenient workaround for this. +.SH "SEE ALSO" +.IX Header "SEE ALSO" +Besides the obvious documents, source code can be instructive. +Some pathological examples of the use of references can be found +in the \fIt/op/ref.t\fR regression test in the Perl source directory. +.PP +See also perldsc and perllol for how to use references to create +complex data structures, and perlootut and perlobj +for how to use them to create objects.
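Returning to the warning about references as hash keys, the following sketch contrasts a plain hash with one tied to Tie::RefHash. The variable names are assumptions chosen for this illustration; the only library call used is the documented tie interface of Tie::RefHash.

```perl
use strict;
use warnings;
use Tie::RefHash;

my @a = (1, 2, 3);
my $r = \@a;

# Plain hash: the key is stringified, the value stays a reference.
my %plain = ($r => $r);
my ($plain_key) = keys %plain;
my $plain_kind = ref($plain_key) ? 'reference' : 'plain string';
print "plain hash key is a $plain_kind\n";

# Tied hash: Tie::RefHash hands back the original reference as the key.
tie my %by_ref, 'Tie::RefHash';
$by_ref{$r} = 'seen';
my ($tied_key) = keys %by_ref;
my $tied_kind = ref($tied_key) ? 'reference' : 'plain string';
print "Tie::RefHash key is a $tied_kind\n";
```

Only the tied hash lets the key be dereferenced again; in the plain hash the reference survives solely as the value.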