Diffstat (limited to 'upstream/debian-bookworm/man3/Encode::Encoding.3perl')
-rw-r--r-- upstream/debian-bookworm/man3/Encode::Encoding.3perl 289
1 files changed, 289 insertions, 0 deletions
diff --git a/upstream/debian-bookworm/man3/Encode::Encoding.3perl b/upstream/debian-bookworm/man3/Encode::Encoding.3perl
new file mode 100644
index 00000000..00e04d9b
--- /dev/null
+++ b/upstream/debian-bookworm/man3/Encode::Encoding.3perl
@@ -0,0 +1,289 @@
+.\" Automatically generated by Pod::Man 4.14 (Pod::Simple 3.43)
+.\"
+.\" Standard preamble:
+.\" ========================================================================
+.de Sp \" Vertical space (when we can't use .PP)
+.if t .sp .5v
+.if n .sp
+..
+.de Vb \" Begin verbatim text
+.ft CW
+.nf
+.ne \\$1
+..
+.de Ve \" End verbatim text
+.ft R
+.fi
+..
+.\" Set up some character translations and predefined strings. \*(-- will
+.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
+.\" double quote, and \*(R" will give a right double quote. \*(C+ will
+.\" give a nicer C++. Capital omega is used to do unbreakable dashes and
+.\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff,
+.\" nothing in troff, for use with C<>.
+.tr \(*W-
+.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
+.ie n \{\
+. ds -- \(*W-
+. ds PI pi
+. if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
+. if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
+. ds L" ""
+. ds R" ""
+. ds C` ""
+. ds C' ""
+'br\}
+.el\{\
+. ds -- \|\(em\|
+. ds PI \(*p
+. ds L" ``
+. ds R" ''
+. ds C`
+. ds C'
+'br\}
+.\"
+.\" Escape single quotes in literal strings from groff's Unicode transform.
+.ie \n(.g .ds Aq \(aq
+.el .ds Aq '
+.\"
+.\" If the F register is >0, we'll generate index entries on stderr for
+.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
+.\" entries marked with X<> in POD. Of course, you'll have to process the
+.\" output yourself in some meaningful fashion.
+.\"
+.\" Avoid warning from groff about undefined register 'F'.
+.de IX
+..
+.nr rF 0
+.if \n(.g .if rF .nr rF 1
+.if (\n(rF:(\n(.g==0)) \{\
+. if \nF \{\
+. de IX
+. tm Index:\\$1\t\\n%\t"\\$2"
+..
+. if !\nF==2 \{\
+. nr % 0
+. nr F 2
+. \}
+. \}
+.\}
+.rr rF
+.\" ========================================================================
+.\"
+.IX Title "Encode::Encoding 3perl"
+.TH Encode::Encoding 3perl "2023-11-25" "perl v5.36.0" "Perl Programmers Reference Guide"
+.\" For nroff, turn off justification. Always turn off hyphenation; it makes
+.\" way too many mistakes in technical documents.
+.if n .ad l
+.nh
+.SH "NAME"
+Encode::Encoding \- Encode Implementation Base Class
+.SH "SYNOPSIS"
+.IX Header "SYNOPSIS"
+.Vb 2
+\& package Encode::MyEncoding;
+\& use parent qw(Encode::Encoding);
+\&
+\& _\|_PACKAGE_\|_\->Define(qw(myCanonical myAlias));
+.Ve
+.SH "DESCRIPTION"
+.IX Header "DESCRIPTION"
+As mentioned in Encode, encodings are (in the current
+implementation at least) defined as objects. The mapping of encoding
+name to object is via the \f(CW%Encode::Encoding\fR hash. Though you can
+directly manipulate this hash, it is strongly encouraged to use this
+base class module and add \fBencode()\fR and \fBdecode()\fR methods.
+.SS "Methods you should implement"
+.IX Subsection "Methods you should implement"
+You are strongly encouraged to implement the methods below, or at
+least one of \fBencode()\fR and \fBdecode()\fR.
+.IP "\->encode($string [,$check])" 4
+.IX Item "->encode($string [,$check])"
+\&\s-1MUST\s0 return the octet sequence representing \fI\f(CI$string\fI\fR.
+.RS 4
+.IP "\(bu" 2
+If \fI\f(CI$check\fI\fR is true, it \s-1SHOULD\s0 modify \fI\f(CI$string\fI\fR in place to remove
+the converted part (i.e. the whole string unless there is an error).
+If \fBperlio_ok()\fR is true, \s-1SHOULD\s0 becomes \s-1MUST.\s0
+.IP "\(bu" 2
+If an error occurs, it \s-1SHOULD\s0 return the octet sequence for the
+fragment of string that has been converted and modify \f(CW$string\fR in-place
+to remove the converted part leaving it starting with the problem
+fragment. If \fBperlio_ok()\fR is true, \s-1SHOULD\s0 becomes \s-1MUST.\s0
+.IP "\(bu" 2
+If \fI\f(CI$check\fI\fR is false then \f(CW\*(C`encode\*(C'\fR \s-1MUST\s0 make a \*(L"best effort\*(R" to
+convert the string \- for example, by using a replacement character.
+.RE
+.RS 4
+.RE
+.IP "\->decode($octets [,$check])" 4
+.IX Item "->decode($octets [,$check])"
+\&\s-1MUST\s0 return the string that \fI\f(CI$octets\fI\fR represents.
+.RS 4
+.IP "\(bu" 2
+If \fI\f(CI$check\fI\fR is true, it \s-1SHOULD\s0 modify \fI\f(CI$octets\fI\fR in place to remove
+the converted part (i.e. the whole sequence unless there is an
+error). If \fBperlio_ok()\fR is true, \s-1SHOULD\s0 becomes \s-1MUST.\s0
+.IP "\(bu" 2
+If an error occurs, it \s-1SHOULD\s0 return the fragment of string that has
+been converted and modify \f(CW$octets\fR in-place to remove the converted
+part leaving it starting with the problem fragment. If \fBperlio_ok()\fR is
+true, \s-1SHOULD\s0 becomes \s-1MUST.\s0
+.IP "\(bu" 2
+If \fI\f(CI$check\fI\fR is false then \f(CW\*(C`decode\*(C'\fR should make a \*(L"best effort\*(R" to
+convert the string \- for example by using Unicode's \*(L"\ex{\s-1FFFD\s0}\*(R" as a
+replacement character.
+.RE
+.RS 4
+.RE
+.PP
+If you want your encoding to work with the encoding pragma, you should
+also implement the method below.
+.ie n .IP "\->cat_decode($destination, $octets, $offset, $terminator [,$check])" 4
+.el .IP "\->cat_decode($destination, \f(CW$octets\fR, \f(CW$offset\fR, \f(CW$terminator\fR [,$check])" 4
+.IX Item "->cat_decode($destination, $octets, $offset, $terminator [,$check])"
+\&\s-1MUST\s0 decode \fI\f(CI$octets\fI\fR with \fI\f(CI$offset\fI\fR and concatenate it to \fI\f(CI$destination\fI\fR.
+Decoding will terminate when \f(CW$terminator\fR (a string) appears in the output.
+\&\fI\f(CI$offset\fI\fR will be modified to the last \f(CW$octets\fR position at the end of decoding.
+Returns true if \f(CW$terminator\fR appears in the output; otherwise returns false.
+.SS "Other methods defined in Encode::Encoding"
+.IX Subsection "Other methods defined in Encode::Encoding"
+You do not need to override the methods shown below unless your encoding requires it.
+.IP "\->name" 4
+.IX Item "->name"
+Predefined As:
+.Sp
+.Vb 1
+\& sub name { return shift\->{\*(AqName\*(Aq} }
+.Ve
+.Sp
+\&\s-1MUST\s0 return the string representing the canonical name of the encoding.
+.IP "\->mime_name" 4
+.IX Item "->mime_name"
+Predefined As:
+.Sp
+.Vb 3
+\& sub mime_name{
+\& return Encode::MIME::Name::get_mime_name(shift\->name);
+\& }
+.Ve
+.Sp
+\&\s-1MUST\s0 return the string representing the \s-1IANA\s0 charset name of the encoding.
+.IP "\->renew" 4
+.IX Item "->renew"
+Predefined As:
+.Sp
+.Vb 6
+\& sub renew {
+\& my $self = shift;
+\& my $clone = bless { %$self } => ref($self);
+\& $clone\->{renewed}++;
+\& return $clone;
+\& }
+.Ve
+.Sp
+This method reconstructs the encoding object if necessary. If you need
+to store the state during encoding, this is where you clone your object.
+.Sp
+PerlIO \s-1ALWAYS\s0 calls this method to make sure it has its own private
+encoding object.
+.IP "\->renewed" 4
+.IX Item "->renewed"
+Predefined As:
+.Sp
+.Vb 1
+\& sub renewed { $_[0]\->{renewed} || 0 }
+.Ve
+.Sp
+Tells whether the object is renewed (and how many times). Some
+modules emit a \f(CW\*(C`Use of uninitialized value in null operation\*(C'\fR warning
+unless the value is numeric, so this method returns 0 for false.
+.IP "\->\fBperlio_ok()\fR" 4
+.IX Item "->perlio_ok()"
+Predefined As:
+.Sp
+.Vb 3
+\& sub perlio_ok {
+\& return eval { require PerlIO::encoding } ? 1 : 0;
+\& }
+.Ve
+.Sp
+If your encoding does not support PerlIO for some reason, define:
+.Sp
+.Vb 1
+\& sub perlio_ok { 0 }
+.Ve
+.IP "\->\fBneeds_lines()\fR" 4
+.IX Item "->needs_lines()"
+Predefined As:
+.Sp
+.Vb 1
+\& sub needs_lines { 0 };
+.Ve
+.Sp
+If your encoding can work with PerlIO but needs line buffering, you
+\&\s-1MUST\s0 define this method so it returns true. 7bit \s-1ISO\-2022\s0 encodings
+are one example that needs this. When this method is missing, false
+is assumed.
+.SS "Example: Encode::ROT13"
+.IX Subsection "Example: Encode::ROT13"
+.Vb 3
+\& package Encode::ROT13;
+\& use strict;
+\& use parent qw(Encode::Encoding);
+\&
+\& _\|_PACKAGE_\|_\->Define(\*(Aqrot13\*(Aq);
+\&
+\& sub encode($$;$){
+\& my ($obj, $str, $chk) = @_;
+\& $str =~ tr/A\-Za\-z/N\-ZA\-Mn\-za\-m/;
+\& $_[1] = \*(Aq\*(Aq if $chk; # this is what in\-place edit means
+\& return $str;
+\& }
+\&
+\& # Jr pna or ynml yvxr guvf;
+\& *decode = \e&encode;
+\&
+\& 1;
+.Ve
+.SH "Why the heck is the Encode API different?"
+.IX Header "Why the heck is the Encode API different?"
+It should be noted that the \fI\f(CI$check\fI\fR behaviour is different from the
+outer public \s-1API.\s0 The logic is that the \*(L"unchecked\*(R" case is useful
+when the encoding is part of a stream which may be reporting errors
+(e.g. \s-1STDERR\s0). In such cases, it is desirable to get everything
+through somehow without causing additional errors which obscure the
+original one. Also, the encoding is best placed to know what the
+correct replacement character is, so if that is the desired behaviour
+then letting low-level code do it is the most efficient approach.
+.PP
+By contrast, if \fI\f(CI$check\fI\fR is true, the scheme above allows the
+encoding to do as much as it can and tell the layer above how much
+that was. What is lacking at present is a mechanism to report what
+went wrong. The most likely interface will be an additional method
+call to the object, or perhaps (to avoid forcing per-stream objects
+on otherwise stateless encodings) an additional parameter.
+.PP
+It is also highly desirable that encoding classes inherit from
+\&\f(CW\*(C`Encode::Encoding\*(C'\fR as a base class. This allows that class to define
+additional behaviour for all encoding objects.
+.PP
+.Vb 2
+\& package Encode::MyEncoding;
+\& use parent qw(Encode::Encoding);
+\&
+\& _\|_PACKAGE_\|_\->Define(qw(myCanonical myAlias));
+.Ve
+.PP
+\&\f(CW\*(C`Define\*(C'\fR creates an object with \f(CW\*(C`bless {Name => ...}, $class\*(C'\fR and calls
+define_encoding. Such classes inherit their \f(CW\*(C`name\*(C'\fR method from
+\&\f(CW\*(C`Encode::Encoding\*(C'\fR.
+.SS "Compiled Encodings"
+.IX Subsection "Compiled Encodings"
+For the sake of speed and efficiency, most of the encodings are now
+supported via a \fIcompiled form\fR: \s-1XS\s0 modules generated from \s-1UCM\s0
+files. Encode provides the enc2xs tool to achieve that. Please see
+enc2xs for more details.
+.SH "SEE ALSO"
+.IX Header "SEE ALSO"
+perlmod, enc2xs