Diffstat (limited to 'upstream/debian-bookworm/man3/Encode::Encoding.3perl')
-rw-r--r-- | upstream/debian-bookworm/man3/Encode::Encoding.3perl | 289 |
1 files changed, 289 insertions, 0 deletions
diff --git a/upstream/debian-bookworm/man3/Encode::Encoding.3perl b/upstream/debian-bookworm/man3/Encode::Encoding.3perl new file mode 100644 index 00000000..00e04d9b --- /dev/null +++ b/upstream/debian-bookworm/man3/Encode::Encoding.3perl @@ -0,0 +1,289 @@ +.\" Automatically generated by Pod::Man 4.14 (Pod::Simple 3.43) +.\" +.\" Standard preamble: +.\" ======================================================================== +.de Sp \" Vertical space (when we can't use .PP) +.if t .sp .5v +.if n .sp +.. +.de Vb \" Begin verbatim text +.ft CW +.nf +.ne \\$1 +.. +.de Ve \" End verbatim text +.ft R +.fi +.. +.\" Set up some character translations and predefined strings. \*(-- will +.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left +.\" double quote, and \*(R" will give a right double quote. \*(C+ will +.\" give a nicer C++. Capital omega is used to do unbreakable dashes and +.\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, +.\" nothing in troff, for use with C<>. +.tr \(*W- +.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' +.ie n \{\ +. ds -- \(*W- +. ds PI pi +. if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch +. if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch +. ds L" "" +. ds R" "" +. ds C` "" +. ds C' "" +'br\} +.el\{\ +. ds -- \|\(em\| +. ds PI \(*p +. ds L" `` +. ds R" '' +. ds C` +. ds C' +'br\} +.\" +.\" Escape single quotes in literal strings from groff's Unicode transform. +.ie \n(.g .ds Aq \(aq +.el .ds Aq ' +.\" +.\" If the F register is >0, we'll generate index entries on stderr for +.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index +.\" entries marked with X<> in POD. Of course, you'll have to process the +.\" output yourself in some meaningful fashion. +.\" +.\" Avoid warning from groff about undefined register 'F'. +.de IX +.. +.nr rF 0 +.if \n(.g .if rF .nr rF 1 +.if (\n(rF:(\n(.g==0)) \{\ +. if \nF \{\ +. de IX +. 
tm Index:\\$1\t\\n%\t"\\$2" +.. +. if !\nF==2 \{\ +. nr % 0 +. nr F 2 +. \} +. \} +.\} +.rr rF +.\" ======================================================================== +.\" +.IX Title "Encode::Encoding 3perl" +.TH Encode::Encoding 3perl "2023-11-25" "perl v5.36.0" "Perl Programmers Reference Guide" +.\" For nroff, turn off justification. Always turn off hyphenation; it makes +.\" way too many mistakes in technical documents. +.if n .ad l +.nh +.SH "NAME" +Encode::Encoding \- Encode Implementation Base Class +.SH "SYNOPSIS" +.IX Header "SYNOPSIS" +.Vb 2 +\& package Encode::MyEncoding; +\& use parent qw(Encode::Encoding); +\& +\& _\|_PACKAGE_\|_\->Define(qw(myCanonical myAlias)); +.Ve +.SH "DESCRIPTION" +.IX Header "DESCRIPTION" +As mentioned in Encode, encodings are (in the current +implementation at least) defined as objects. The mapping of encoding +name to object is via the \f(CW%Encode::Encoding\fR hash. Though you can +directly manipulate this hash, it is strongly encouraged to use this +base class module and add \fBencode()\fR and \fBdecode()\fR methods. +.SS "Methods you should implement" +.IX Subsection "Methods you should implement" +You are strongly encouraged to implement methods below, at least +either \fBencode()\fR or \fBdecode()\fR. +.IP "\->encode($string [,$check])" 4 +.IX Item "->encode($string [,$check])" +\&\s-1MUST\s0 return the octet sequence representing \fI\f(CI$string\fI\fR. +.RS 4 +.IP "\(bu" 2 +If \fI\f(CI$check\fI\fR is true, it \s-1SHOULD\s0 modify \fI\f(CI$string\fI\fR in place to remove +the converted part (i.e. the whole string unless there is an error). +If \fBperlio_ok()\fR is true, \s-1SHOULD\s0 becomes \s-1MUST.\s0 +.IP "\(bu" 2 +If an error occurs, it \s-1SHOULD\s0 return the octet sequence for the +fragment of string that has been converted and modify \f(CW$string\fR in-place +to remove the converted part leaving it starting with the problem +fragment. 
If \fBperlio_ok()\fR is true, \s-1SHOULD\s0 becomes \s-1MUST.\s0 +.IP "\(bu" 2 +If \fI\f(CI$check\fI\fR is false then \f(CW\*(C`encode\*(C'\fR \s-1MUST\s0 make a \*(L"best effort\*(R" to +convert the string \- for example, by using a replacement character. +.RE +.RS 4 +.RE +.IP "\->decode($octets [,$check])" 4 +.IX Item "->decode($octets [,$check])" +\&\s-1MUST\s0 return the string that \fI\f(CI$octets\fI\fR represents. +.RS 4 +.IP "\(bu" 2 +If \fI\f(CI$check\fI\fR is true, it \s-1SHOULD\s0 modify \fI\f(CI$octets\fI\fR in place to remove +the converted part (i.e. the whole sequence unless there is an +error). If \fBperlio_ok()\fR is true, \s-1SHOULD\s0 becomes \s-1MUST.\s0 +.IP "\(bu" 2 +If an error occurs, it \s-1SHOULD\s0 return the fragment of string that has +been converted and modify \f(CW$octets\fR in-place to remove the converted +part leaving it starting with the problem fragment. If \fBperlio_ok()\fR is +true, \s-1SHOULD\s0 becomes \s-1MUST.\s0 +.IP "\(bu" 2 +If \fI\f(CI$check\fI\fR is false then \f(CW\*(C`decode\*(C'\fR should make a \*(L"best effort\*(R" to +convert the string \- for example by using Unicode's \*(L"\ex{\s-1FFFD\s0}\*(R" as a +replacement character. +.RE +.RS 4 +.RE +.PP +If you want your encoding to work with encoding pragma, you should +also implement the method below. +.ie n .IP "\->cat_decode($destination, $octets, $offset, $terminator [,$check])" 4 +.el .IP "\->cat_decode($destination, \f(CW$octets\fR, \f(CW$offset\fR, \f(CW$terminator\fR [,$check])" 4 +.IX Item "->cat_decode($destination, $octets, $offset, $terminator [,$check])" +\&\s-1MUST\s0 decode \fI\f(CI$octets\fI\fR with \fI\f(CI$offset\fI\fR and concatenate it to \fI\f(CI$destination\fI\fR. +Decoding will terminate when \f(CW$terminator\fR (a string) appears in output. +\&\fI\f(CI$offset\fI\fR will be modified to the last \f(CW$octets\fR position at end of decode. +Returns true if \f(CW$terminator\fR appears output, else returns false. 
+.SS "Other methods defined in Encode::Encodings" +.IX Subsection "Other methods defined in Encode::Encodings" +You do not have to override methods shown below unless you have to. +.IP "\->name" 4 +.IX Item "->name" +Predefined As: +.Sp +.Vb 1 +\& sub name { return shift\->{\*(AqName\*(Aq} } +.Ve +.Sp +\&\s-1MUST\s0 return the string representing the canonical name of the encoding. +.IP "\->mime_name" 4 +.IX Item "->mime_name" +Predefined As: +.Sp +.Vb 3 +\& sub mime_name{ +\& return Encode::MIME::Name::get_mime_name(shift\->name); +\& } +.Ve +.Sp +\&\s-1MUST\s0 return the string representing the \s-1IANA\s0 charset name of the encoding. +.IP "\->renew" 4 +.IX Item "->renew" +Predefined As: +.Sp +.Vb 6 +\& sub renew { +\& my $self = shift; +\& my $clone = bless { %$self } => ref($self); +\& $clone\->{renewed}++; +\& return $clone; +\& } +.Ve +.Sp +This method reconstructs the encoding object if necessary. If you need +to store the state during encoding, this is where you clone your object. +.Sp +PerlIO \s-1ALWAYS\s0 calls this method to make sure it has its own private +encoding object. +.IP "\->renewed" 4 +.IX Item "->renewed" +Predefined As: +.Sp +.Vb 1 +\& sub renewed { $_[0]\->{renewed} || 0 } +.Ve +.Sp +Tells whether the object is renewed (and how many times). Some +modules emit \f(CW\*(C`Use of uninitialized value in null operation\*(C'\fR warning +unless the value is numeric so return 0 for false. +.IP "\->\fBperlio_ok()\fR" 4 +.IX Item "->perlio_ok()" +Predefined As: +.Sp +.Vb 3 +\& sub perlio_ok { +\& return eval { require PerlIO::encoding } ? 1 : 0; +\& } +.Ve +.Sp +If your encoding does not support PerlIO for some reasons, just; +.Sp +.Vb 1 +\& sub perlio_ok { 0 } +.Ve +.IP "\->\fBneeds_lines()\fR" 4 +.IX Item "->needs_lines()" +Predefined As: +.Sp +.Vb 1 +\& sub needs_lines { 0 }; +.Ve +.Sp +If your encoding can work with PerlIO but needs line buffering, you +\&\s-1MUST\s0 define this method so it returns true. 
7bit \s-1ISO\-2022\s0 encodings +are one example that needs this. When this method is missing, false +is assumed. +.SS "Example: Encode::ROT13" +.IX Subsection "Example: Encode::ROT13" +.Vb 3 +\& package Encode::ROT13; +\& use strict; +\& use parent qw(Encode::Encoding); +\& +\& _\|_PACKAGE_\|_\->Define(\*(Aqrot13\*(Aq); +\& +\& sub encode($$;$){ +\& my ($obj, $str, $chk) = @_; +\& $str =~ tr/A\-Za\-z/N\-ZA\-Mn\-za\-m/; +\& $_[1] = \*(Aq\*(Aq if $chk; # this is what in\-place edit means +\& return $str; +\& } +\& +\& # Jr pna or ynml yvxr guvf; +\& *decode = \e&encode; +\& +\& 1; +.Ve +.SH "Why the heck Encode API is different?" +.IX Header "Why the heck Encode API is different?" +It should be noted that the \fI\f(CI$check\fI\fR behaviour is different from the +outer public \s-1API.\s0 The logic is that the \*(L"unchecked\*(R" case is useful +when the encoding is part of a stream which may be reporting errors +(e.g. \s-1STDERR\s0). In such cases, it is desirable to get everything +through somehow without causing additional errors which obscure the +original one. Also, the encoding is best placed to know what the +correct replacement character is, so if that is the desired behaviour +then letting low level code do it is the most efficient. +.PP +By contrast, if \fI\f(CI$check\fI\fR is true, the scheme above allows the +encoding to do as much as it can and tell the layer above how much +that was. What is lacking at present is a mechanism to report what +went wrong. The most likely interface will be an additional method +call to the object, or perhaps (to avoid forcing per-stream objects +on otherwise stateless encodings) an additional parameter. +.PP +It is also highly desirable that encoding classes inherit from +\&\f(CW\*(C`Encode::Encoding\*(C'\fR as a base class. This allows that class to define +additional behaviour for all encoding objects. 
+.PP +.Vb 2 +\& package Encode::MyEncoding; +\& use parent qw(Encode::Encoding); +\& +\& _\|_PACKAGE_\|_\->Define(qw(myCanonical myAlias)); +.Ve +.PP +to create an object with \f(CW\*(C`bless {Name => ...}, $class\*(C'\fR, and call +define_encoding. They inherit their \f(CW\*(C`name\*(C'\fR method from +\&\f(CW\*(C`Encode::Encoding\*(C'\fR. +.SS "Compiled Encodings" +.IX Subsection "Compiled Encodings" +For the sake of speed and efficiency, most of the encodings are now +supported via a \fIcompiled form\fR: \s-1XS\s0 modules generated from \s-1UCM\s0 +files. Encode provides the enc2xs tool to achieve that. Please see +enc2xs for more details. +.SH "SEE ALSO" +.IX Header "SEE ALSO" +perlmod, enc2xs |