Diffstat (limited to 'upstream/debian-unstable/man3/Encode::Encoding.3perl')
-rw-r--r-- upstream/debian-unstable/man3/Encode::Encoding.3perl | 273
1 files changed, 273 insertions, 0 deletions
diff --git a/upstream/debian-unstable/man3/Encode::Encoding.3perl b/upstream/debian-unstable/man3/Encode::Encoding.3perl
new file mode 100644
index 00000000..6371ae14
--- /dev/null
+++ b/upstream/debian-unstable/man3/Encode::Encoding.3perl
@@ -0,0 +1,273 @@
+.\" -*- mode: troff; coding: utf-8 -*-
+.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
+.\"
+.\" Standard preamble:
+.\" ========================================================================
+.de Sp \" Vertical space (when we can't use .PP)
+.if t .sp .5v
+.if n .sp
+..
+.de Vb \" Begin verbatim text
+.ft CW
+.nf
+.ne \\$1
+..
+.de Ve \" End verbatim text
+.ft R
+.fi
+..
+.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
+.ie n \{\
+. ds C` ""
+. ds C' ""
+'br\}
+.el\{\
+. ds C`
+. ds C'
+'br\}
+.\"
+.\" Escape single quotes in literal strings from groff's Unicode transform.
+.ie \n(.g .ds Aq \(aq
+.el .ds Aq '
+.\"
+.\" If the F register is >0, we'll generate index entries on stderr for
+.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
+.\" entries marked with X<> in POD. Of course, you'll have to process the
+.\" output yourself in some meaningful fashion.
+.\"
+.\" Avoid warning from groff about undefined register 'F'.
+.de IX
+..
+.nr rF 0
+.if \n(.g .if rF .nr rF 1
+.if (\n(rF:(\n(.g==0)) \{\
+. if \nF \{\
+. de IX
+. tm Index:\\$1\t\\n%\t"\\$2"
+..
+. if !\nF==2 \{\
+. nr % 0
+. nr F 2
+. \}
+. \}
+.\}
+.rr rF
+.\" ========================================================================
+.\"
+.IX Title "Encode::Encoding 3perl"
+.TH Encode::Encoding 3perl 2024-01-12 "perl v5.38.2" "Perl Programmers Reference Guide"
+.\" For nroff, turn off justification. Always turn off hyphenation; it makes
+.\" way too many mistakes in technical documents.
+.if n .ad l
+.nh
+.SH NAME
+Encode::Encoding \- Encode Implementation Base Class
+.SH SYNOPSIS
+.IX Header "SYNOPSIS"
+.Vb 2
+\& package Encode::MyEncoding;
+\& use parent qw(Encode::Encoding);
+\&
+\& _\|_PACKAGE_\|_\->Define(qw(myCanonical myAlias));
+.Ve
+.SH DESCRIPTION
+.IX Header "DESCRIPTION"
+As mentioned in Encode, encodings are (in the current
+implementation at least) defined as objects. The mapping of encoding
+name to object is via the \f(CW%Encode::Encoding\fR hash. Though you can
+directly manipulate this hash, you are strongly encouraged to use this
+base class module and add \fBencode()\fR and \fBdecode()\fR methods.
+.SS "Methods you should implement"
+.IX Subsection "Methods you should implement"
+You are strongly encouraged to implement the methods below, or at least
+one of \fBencode()\fR and \fBdecode()\fR.
+.IP "\->encode($string [,$check])" 4
+.IX Item "->encode($string [,$check])"
+MUST return the octet sequence representing \fR\f(CI$string\fR\fI\fR.
+.RS 4
+.IP \(bu 2
+If \fR\f(CI$check\fR\fI\fR is true, it SHOULD modify \fI\fR\f(CI$string\fR\fI\fR in place to remove
+the converted part (i.e. the whole string unless there is an error).
+If \fBperlio_ok()\fR is true, SHOULD becomes MUST.
+.IP \(bu 2
+If an error occurs, it SHOULD return the octet sequence for the
+fragment of string that has been converted and modify \f(CW$string\fR in-place
+to remove the converted part leaving it starting with the problem
+fragment. If \fBperlio_ok()\fR is true, SHOULD becomes MUST.
+.IP \(bu 2
+If \fR\f(CI$check\fR\fI\fR is false then \f(CW\*(C`encode\*(C'\fR MUST make a "best effort" to
+convert the string \- for example, by using a replacement character.
+.RE
+.RS 4
+.RE
+.IP "\->decode($octets [,$check])" 4
+.IX Item "->decode($octets [,$check])"
+MUST return the string that \fR\f(CI$octets\fR\fI\fR represents.
+.RS 4
+.IP \(bu 2
+If \fR\f(CI$check\fR\fI\fR is true, it SHOULD modify \fI\fR\f(CI$octets\fR\fI\fR in place to remove
+the converted part (i.e. the whole sequence unless there is an
+error). If \fBperlio_ok()\fR is true, SHOULD becomes MUST.
+.IP \(bu 2
+If an error occurs, it SHOULD return the fragment of string that has
+been converted and modify \f(CW$octets\fR in-place to remove the converted
+part leaving it starting with the problem fragment. If \fBperlio_ok()\fR is
+true, SHOULD becomes MUST.
+.IP \(bu 2
+If \fR\f(CI$check\fR\fI\fR is false then \f(CW\*(C`decode\*(C'\fR should make a "best effort" to
+convert the string \- for example by using Unicode's "\ex{FFFD}" as a
+replacement character.
+.RE
+.RS 4
+.RE
+.PP
+If you want your encoding to work with the encoding pragma, you should
+also implement the method below.
+.ie n .IP "\->cat_decode($destination, $octets, $offset, $terminator [,$check])" 4
+.el .IP "\->cat_decode($destination, \f(CW$octets\fR, \f(CW$offset\fR, \f(CW$terminator\fR [,$check])" 4
+.IX Item "->cat_decode($destination, $octets, $offset, $terminator [,$check])"
+MUST decode \fR\f(CI$octets\fR\fI\fR starting at \fI\fR\f(CI$offset\fR\fI\fR and concatenate the result to \fI\fR\f(CI$destination\fR\fI\fR.
+Decoding will terminate when \f(CW$terminator\fR (a string) appears in the output.
+\&\fI\fR\f(CI$offset\fR\fI\fR will be modified to the last \f(CW$octets\fR position at the end of decoding.
+Returns true if \f(CW$terminator\fR appears in the output, else returns false.
+.SS "Other methods defined in Encode::Encoding"
+.IX Subsection "Other methods defined in Encode::Encoding"
+You need not override the methods below unless your encoding requires different behaviour.
+.IP \->name 4
+.IX Item "->name"
+Predefined As:
+.Sp
+.Vb 1
+\& sub name { return shift\->{\*(AqName\*(Aq} }
+.Ve
+.Sp
+MUST return the string representing the canonical name of the encoding.
+.IP \->mime_name 4
+.IX Item "->mime_name"
+Predefined As:
+.Sp
+.Vb 3
+\& sub mime_name{
+\& return Encode::MIME::Name::get_mime_name(shift\->name);
+\& }
+.Ve
+.Sp
+MUST return the string representing the IANA charset name of the encoding.
+.IP \->renew 4
+.IX Item "->renew"
+Predefined As:
+.Sp
+.Vb 6
+\& sub renew {
+\& my $self = shift;
+\& my $clone = bless { %$self } => ref($self);
+\& $clone\->{renewed}++;
+\& return $clone;
+\& }
+.Ve
+.Sp
+This method reconstructs the encoding object if necessary. If you need
+to store the state during encoding, this is where you clone your object.
+.Sp
+PerlIO ALWAYS calls this method to make sure it has its own private
+encoding object.
+.IP \->renewed 4
+.IX Item "->renewed"
+Predefined As:
+.Sp
+.Vb 1
+\& sub renewed { $_[0]\->{renewed} || 0 }
+.Ve
+.Sp
+Tells whether the object is renewed (and how many times). Some
+modules emit a \f(CW\*(C`Use of uninitialized value in null operation\*(C'\fR warning
+unless the value is numeric, so this method returns 0 for false.
+.IP \->\fBperlio_ok()\fR 4
+.IX Item "->perlio_ok()"
+Predefined As:
+.Sp
+.Vb 3
+\& sub perlio_ok {
+\& return eval { require PerlIO::encoding } ? 1 : 0;
+\& }
+.Ve
+.Sp
+If your encoding does not support PerlIO for some reason, just define:
+.Sp
+.Vb 1
+\& sub perlio_ok { 0 }
+.Ve
+.IP \->\fBneeds_lines()\fR 4
+.IX Item "->needs_lines()"
+Predefined As:
+.Sp
+.Vb 1
+\& sub needs_lines { 0 };
+.Ve
+.Sp
+If your encoding can work with PerlIO but needs line buffering, you
+MUST define this method so it returns true. 7bit ISO\-2022 encodings
+are one example that needs this. When this method is missing, false
+is assumed.
+.SS "Example: Encode::ROT13"
+.IX Subsection "Example: Encode::ROT13"
+.Vb 3
+\& package Encode::ROT13;
+\& use strict;
+\& use parent qw(Encode::Encoding);
+\&
+\& _\|_PACKAGE_\|_\->Define(\*(Aqrot13\*(Aq);
+\&
+\& sub encode($$;$){
+\& my ($obj, $str, $chk) = @_;
+\& $str =~ tr/A\-Za\-z/N\-ZA\-Mn\-za\-m/;
+\& $_[1] = \*(Aq\*(Aq if $chk; # this is what in\-place edit means
+\& return $str;
+\& }
+\&
+\& # Jr pna or ynml yvxr guvf;
+\& *decode = \e&encode;
+\&
+\& 1;
+.Ve
+.SH "Why the heck is the Encode API different?"
+.IX Header "Why the heck is the Encode API different?"
+It should be noted that the \fR\f(CI$check\fR\fI\fR behaviour is different from the
+outer public API. The logic is that the "unchecked" case is useful
+when the encoding is part of a stream which may be reporting errors
+(e.g. STDERR). In such cases, it is desirable to get everything
+through somehow without causing additional errors which obscure the
+original one. Also, the encoding is best placed to know what the
+correct replacement character is, so if that is the desired behaviour
+then letting low level code do it is the most efficient.
+.PP
+By contrast, if \fR\f(CI$check\fR\fI\fR is true, the scheme above allows the
+encoding to do as much as it can and tell the layer above how much
+that was. What is lacking at present is a mechanism to report what
+went wrong. The most likely interface will be an additional method
+call to the object, or perhaps (to avoid forcing per-stream objects
+on otherwise stateless encodings) an additional parameter.
+.PP
+It is also highly desirable that encoding classes inherit from
+\&\f(CW\*(C`Encode::Encoding\*(C'\fR as a base class. This allows that class to define
+additional behaviour for all encoding objects.
+.PP
+.Vb 2
+\& package Encode::MyEncoding;
+\& use parent qw(Encode::Encoding);
+\&
+\& _\|_PACKAGE_\|_\->Define(qw(myCanonical myAlias));
+.Ve
+.PP
+\&\f(CW\*(C`Define\*(C'\fR creates an object with \f(CW\*(C`bless {Name => ...}, $class\*(C'\fR and calls
+\&\f(CW\*(C`define_encoding\*(C'\fR. Such classes inherit their \f(CW\*(C`name\*(C'\fR method from
+\&\f(CW\*(C`Encode::Encoding\*(C'\fR.
+.SS "Compiled Encodings"
+.IX Subsection "Compiled Encodings"
+For the sake of speed and efficiency, most of the encodings are now
+supported via a \fIcompiled form\fR: XS modules generated from UCM
+files. Encode provides the enc2xs tool to achieve that. Please see
+enc2xs for more details.
+.SH "SEE ALSO"
+.IX Header "SEE ALSO"
+perlmod, enc2xs
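
For readers of this patch, the in-place $check contract described in the added page can be tried with a single-file sketch. It mirrors the Encode::ROT13 example from the page, but the package name My::DemoROT13 and the encoding name x-demo-rot13 are hypothetical, chosen only to avoid clashing with anything installed. Only encode() is implemented, and the methods are called directly on the encoding object rather than through the public Encode API.

```perl
use strict;
use warnings;

# Hypothetical package and encoding names (not part of the patch);
# the body follows the Encode::ROT13 example from the man page.
package My::DemoROT13;
use parent qw(Encode::Encoding);

# Registers an object of this class under the canonical name.
__PACKAGE__->Define('x-demo-rot13');

sub encode {
    my ($obj, $str, $chk) = @_;
    $str =~ tr/A-Za-z/N-ZA-Mn-za-m/;
    $_[1] = '' if $chk;    # consume the caller's buffer when $check is true
    return $str;
}

package main;
use Encode ();

my $enc = Encode::find_encoding('x-demo-rot13');

my $checked = 'Hello';
my $out     = $enc->encode($checked, 1);    # $check true: buffer is emptied
print "encoded: $out, remaining: '$checked'\n";

my $unchecked = 'Hello';
my $out2      = $enc->encode($unchecked);   # $check false: buffer untouched
print "encoded: $out2, remaining: '$unchecked'\n";
```

With a true $check the caller's variable ends up empty because ROT13 never fails, so the whole string counts as the converted part. An encoding that can fail would instead leave the unconverted tail in the variable.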