Diffstat (limited to 'upstream/debian-unstable/man3/Encode::Encoding.3perl')
-rw-r--r-- | upstream/debian-unstable/man3/Encode::Encoding.3perl | 273 |
1 files changed, 273 insertions, 0 deletions
diff --git a/upstream/debian-unstable/man3/Encode::Encoding.3perl b/upstream/debian-unstable/man3/Encode::Encoding.3perl
new file mode 100644
index 00000000..6371ae14
--- /dev/null
+++ b/upstream/debian-unstable/man3/Encode::Encoding.3perl
@@ -0,0 +1,273 @@
+.\" -*- mode: troff; coding: utf-8 -*-
+.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
+.\"
+.\" Standard preamble:
+.\" ========================================================================
+.de Sp \" Vertical space (when we can't use .PP)
+.if t .sp .5v
+.if n .sp
+..
+.de Vb \" Begin verbatim text
+.ft CW
+.nf
+.ne \\$1
+..
+.de Ve \" End verbatim text
+.ft R
+.fi
+..
+.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
+.ie n \{\
+. ds C` ""
+. ds C' ""
+'br\}
+.el\{\
+. ds C`
+. ds C'
+'br\}
+.\"
+.\" Escape single quotes in literal strings from groff's Unicode transform.
+.ie \n(.g .ds Aq \(aq
+.el .ds Aq '
+.\"
+.\" If the F register is >0, we'll generate index entries on stderr for
+.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
+.\" entries marked with X<> in POD. Of course, you'll have to process the
+.\" output yourself in some meaningful fashion.
+.\"
+.\" Avoid warning from groff about undefined register 'F'.
+.de IX
+..
+.nr rF 0
+.if \n(.g .if rF .nr rF 1
+.if (\n(rF:(\n(.g==0)) \{\
+. if \nF \{\
+. de IX
+. tm Index:\\$1\t\\n%\t"\\$2"
+..
+. if !\nF==2 \{\
+. nr % 0
+. nr F 2
+. \}
+. \}
+.\}
+.rr rF
+.\" ========================================================================
+.\"
+.IX Title "Encode::Encoding 3perl"
+.TH Encode::Encoding 3perl 2024-01-12 "perl v5.38.2" "Perl Programmers Reference Guide"
+.\" For nroff, turn off justification. Always turn off hyphenation; it makes
+.\" way too many mistakes in technical documents.
+.if n .ad l
+.nh
+.SH NAME
+Encode::Encoding \- Encode Implementation Base Class
+.SH SYNOPSIS
+.IX Header "SYNOPSIS"
+.Vb 2
+\& package Encode::MyEncoding;
+\& use parent qw(Encode::Encoding);
+\&
+\& _\|_PACKAGE_\|_\->Define(qw(myCanonical myAlias));
+.Ve
+.SH DESCRIPTION
+.IX Header "DESCRIPTION"
+As mentioned in Encode, encodings are (in the current
+implementation at least) defined as objects. The mapping of encoding
+name to object is via the \f(CW%Encode::Encoding\fR hash. Though you can
+directly manipulate this hash, it is strongly encouraged to use this
+base class module and add \fBencode()\fR and \fBdecode()\fR methods.
+.SS "Methods you should implement"
+.IX Subsection "Methods you should implement"
+You are strongly encouraged to implement methods below, at least
+either \fBencode()\fR or \fBdecode()\fR.
+.IP "\->encode($string [,$check])" 4
+.IX Item "->encode($string [,$check])"
+MUST return the octet sequence representing \fR\f(CI$string\fR\fI\fR.
+.RS 4
+.IP \(bu 2
+If \fR\f(CI$check\fR\fI\fR is true, it SHOULD modify \fI\fR\f(CI$string\fR\fI\fR in place to remove
+the converted part (i.e. the whole string unless there is an error).
+If \fBperlio_ok()\fR is true, SHOULD becomes MUST.
+.IP \(bu 2
+If an error occurs, it SHOULD return the octet sequence for the
+fragment of string that has been converted and modify \f(CW$string\fR in-place
+to remove the converted part leaving it starting with the problem
+fragment. If \fBperlio_ok()\fR is true, SHOULD becomes MUST.
+.IP \(bu 2
+If \fR\f(CI$check\fR\fI\fR is false then \f(CW\*(C`encode\*(C'\fR MUST make a "best effort" to
+convert the string \- for example, by using a replacement character.
+.RE
+.RS 4
+.RE
+.IP "\->decode($octets [,$check])" 4
+.IX Item "->decode($octets [,$check])"
+MUST return the string that \fR\f(CI$octets\fR\fI\fR represents.
+.RS 4
+.IP \(bu 2
+If \fR\f(CI$check\fR\fI\fR is true, it SHOULD modify \fI\fR\f(CI$octets\fR\fI\fR in place to remove
+the converted part (i.e. the whole sequence unless there is an
+error). If \fBperlio_ok()\fR is true, SHOULD becomes MUST.
+.IP \(bu 2
+If an error occurs, it SHOULD return the fragment of string that has
+been converted and modify \f(CW$octets\fR in-place to remove the converted
+part leaving it starting with the problem fragment. If \fBperlio_ok()\fR is
+true, SHOULD becomes MUST.
+.IP \(bu 2
+If \fR\f(CI$check\fR\fI\fR is false then \f(CW\*(C`decode\*(C'\fR should make a "best effort" to
+convert the string \- for example by using Unicode's "\ex{FFFD}" as a
+replacement character.
+.RE
+.RS 4
+.RE
+.PP
+If you want your encoding to work with encoding pragma, you should
+also implement the method below.
+.ie n .IP "\->cat_decode($destination, $octets, $offset, $terminator [,$check])" 4
+.el .IP "\->cat_decode($destination, \f(CW$octets\fR, \f(CW$offset\fR, \f(CW$terminator\fR [,$check])" 4
+.IX Item "->cat_decode($destination, $octets, $offset, $terminator [,$check])"
+MUST decode \fR\f(CI$octets\fR\fI\fR with \fI\fR\f(CI$offset\fR\fI\fR and concatenate it to \fI\fR\f(CI$destination\fR\fI\fR.
+Decoding will terminate when \f(CW$terminator\fR (a string) appears in output.
+\&\fI\fR\f(CI$offset\fR\fI\fR will be modified to the last \f(CW$octets\fR position at end of decode.
+Returns true if \f(CW$terminator\fR appears output, else returns false.
+.SS "Other methods defined in Encode::Encodings"
+.IX Subsection "Other methods defined in Encode::Encodings"
+You do not have to override methods shown below unless you have to.
+.IP \->name 4
+.IX Item "->name"
+Predefined As:
+.Sp
+.Vb 1
+\& sub name { return shift\->{\*(AqName\*(Aq} }
+.Ve
+.Sp
+MUST return the string representing the canonical name of the encoding.
+.IP \->mime_name 4
+.IX Item "->mime_name"
+Predefined As:
+.Sp
+.Vb 3
+\& sub mime_name{
+\& return Encode::MIME::Name::get_mime_name(shift\->name);
+\& }
+.Ve
+.Sp
+MUST return the string representing the IANA charset name of the encoding.
+.IP \->renew 4
+.IX Item "->renew"
+Predefined As:
+.Sp
+.Vb 6
+\& sub renew {
+\& my $self = shift;
+\& my $clone = bless { %$self } => ref($self);
+\& $clone\->{renewed}++;
+\& return $clone;
+\& }
+.Ve
+.Sp
+This method reconstructs the encoding object if necessary. If you need
+to store the state during encoding, this is where you clone your object.
+.Sp
+PerlIO ALWAYS calls this method to make sure it has its own private
+encoding object.
+.IP \->renewed 4
+.IX Item "->renewed"
+Predefined As:
+.Sp
+.Vb 1
+\& sub renewed { $_[0]\->{renewed} || 0 }
+.Ve
+.Sp
+Tells whether the object is renewed (and how many times). Some
+modules emit \f(CW\*(C`Use of uninitialized value in null operation\*(C'\fR warning
+unless the value is numeric so return 0 for false.
+.IP \->\fBperlio_ok()\fR 4
+.IX Item "->perlio_ok()"
+Predefined As:
+.Sp
+.Vb 3
+\& sub perlio_ok {
+\& return eval { require PerlIO::encoding } ? 1 : 0;
+\& }
+.Ve
+.Sp
+If your encoding does not support PerlIO for some reasons, just;
+.Sp
+.Vb 1
+\& sub perlio_ok { 0 }
+.Ve
+.IP \->\fBneeds_lines()\fR 4
+.IX Item "->needs_lines()"
+Predefined As:
+.Sp
+.Vb 1
+\& sub needs_lines { 0 };
+.Ve
+.Sp
+If your encoding can work with PerlIO but needs line buffering, you
+MUST define this method so it returns true. 7bit ISO\-2022 encodings
+are one example that needs this. When this method is missing, false
+is assumed.
+.SS "Example: Encode::ROT13"
+.IX Subsection "Example: Encode::ROT13"
+.Vb 3
+\& package Encode::ROT13;
+\& use strict;
+\& use parent qw(Encode::Encoding);
+\&
+\& _\|_PACKAGE_\|_\->Define(\*(Aqrot13\*(Aq);
+\&
+\& sub encode($$;$){
+\& my ($obj, $str, $chk) = @_;
+\& $str =~ tr/A\-Za\-z/N\-ZA\-Mn\-za\-m/;
+\& $_[1] = \*(Aq\*(Aq if $chk; # this is what in\-place edit means
+\& return $str;
+\& }
+\&
+\& # Jr pna or ynml yvxr guvf;
+\& *decode = \e&encode;
+\&
+\& 1;
+.Ve
+.SH "Why the heck Encode API is different?"
+.IX Header "Why the heck Encode API is different?"
+It should be noted that the \fR\f(CI$check\fR\fI\fR behaviour is different from the
+outer public API. The logic is that the "unchecked" case is useful
+when the encoding is part of a stream which may be reporting errors
+(e.g. STDERR). In such cases, it is desirable to get everything
+through somehow without causing additional errors which obscure the
+original one. Also, the encoding is best placed to know what the
+correct replacement character is, so if that is the desired behaviour
+then letting low level code do it is the most efficient.
+.PP
+By contrast, if \fR\f(CI$check\fR\fI\fR is true, the scheme above allows the
+encoding to do as much as it can and tell the layer above how much
+that was. What is lacking at present is a mechanism to report what
+went wrong. The most likely interface will be an additional method
+call to the object, or perhaps (to avoid forcing per-stream objects
+on otherwise stateless encodings) an additional parameter.
+.PP
+It is also highly desirable that encoding classes inherit from
+\&\f(CW\*(C`Encode::Encoding\*(C'\fR as a base class. This allows that class to define
+additional behaviour for all encoding objects.
+.PP
+.Vb 2
+\& package Encode::MyEncoding;
+\& use parent qw(Encode::Encoding);
+\&
+\& _\|_PACKAGE_\|_\->Define(qw(myCanonical myAlias));
+.Ve
+.PP
+to create an object with \f(CW\*(C`bless {Name => ...}, $class\*(C'\fR, and call
+define_encoding. They inherit their \f(CW\*(C`name\*(C'\fR method from
+\&\f(CW\*(C`Encode::Encoding\*(C'\fR.
+.SS "Compiled Encodings"
+.IX Subsection "Compiled Encodings"
+For the sake of speed and efficiency, most of the encodings are now
+supported via a \fIcompiled form\fR: XS modules generated from UCM
+files. Encode provides the enc2xs tool to achieve that. Please see
+enc2xs for more details.
+.SH "SEE ALSO"
+.IX Header "SEE ALSO"
+perlmod, enc2xs