author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-15 19:43:11 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-15 19:43:11 +0000 |
commit | fc22b3d6507c6745911b9dfcc68f1e665ae13dbc (patch) | |
tree | ce1e3bce06471410239a6f41282e328770aa404a /upstream/mageia-cauldron/man3pm/encoding::warnings.3pm | |
parent | Initial commit. | |
Adding upstream version 4.22.0. (upstream/4.22.0)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'upstream/mageia-cauldron/man3pm/encoding::warnings.3pm')
-rw-r--r-- | upstream/mageia-cauldron/man3pm/encoding::warnings.3pm | 226 |
1 files changed, 226 insertions, 0 deletions
diff --git a/upstream/mageia-cauldron/man3pm/encoding::warnings.3pm b/upstream/mageia-cauldron/man3pm/encoding::warnings.3pm
new file mode 100644
index 00000000..a16c72ec
--- /dev/null
+++ b/upstream/mageia-cauldron/man3pm/encoding::warnings.3pm
@@ -0,0 +1,226 @@
+.\" -*- mode: troff; coding: utf-8 -*-
+.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
+.\"
+.\" Standard preamble:
+.\" ========================================================================
+.de Sp \" Vertical space (when we can't use .PP)
+.if t .sp .5v
+.if n .sp
+..
+.de Vb \" Begin verbatim text
+.ft CW
+.nf
+.ne \\$1
+..
+.de Ve \" End verbatim text
+.ft R
+.fi
+..
+.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
+.ie n \{\
+. ds C` ""
+. ds C' ""
+'br\}
+.el\{\
+. ds C`
+. ds C'
+'br\}
+.\"
+.\" Escape single quotes in literal strings from groff's Unicode transform.
+.ie \n(.g .ds Aq \(aq
+.el .ds Aq '
+.\"
+.\" If the F register is >0, we'll generate index entries on stderr for
+.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
+.\" entries marked with X<> in POD. Of course, you'll have to process the
+.\" output yourself in some meaningful fashion.
+.\"
+.\" Avoid warning from groff about undefined register 'F'.
+.de IX
+..
+.nr rF 0
+.if \n(.g .if rF .nr rF 1
+.if (\n(rF:(\n(.g==0)) \{\
+. if \nF \{\
+. de IX
+. tm Index:\\$1\t\\n%\t"\\$2"
+..
+. if !\nF==2 \{\
+. nr % 0
+. nr F 2
+. \}
+. \}
+.\}
+.rr rF
+.\" ========================================================================
+.\"
+.IX Title "encoding::warnings 3pm"
+.TH encoding::warnings 3pm 2023-11-28 "perl v5.38.2" "Perl Programmers Reference Guide"
+.\" For nroff, turn off justification. Always turn off hyphenation; it makes
+.\" way too many mistakes in technical documents.
+.if n .ad l
+.nh
+.SH NAME
+encoding::warnings \- Warn on implicit encoding conversions
+.SH VERSION
+.IX Header "VERSION"
+This document describes version 0.13 of encoding::warnings, released
+June 20, 2016.
+.SH NOTICE
+.IX Header "NOTICE"
+As of Perl 5.26.0, this module has no effect. The internal Perl feature
+that was used to implement this module has been removed. In recent years,
+much work has been done on the Perl core to eliminate discrepancies in the
+treatment of upgraded versus downgraded strings. In addition, the
+encoding pragma, which caused many of the problems, is no longer
+supported. Thus, the warnings this module produced are no longer
+necessary.
+.PP
+Hence, if you load this module on Perl 5.26.0, you will get one warning
+that the module is no longer supported; and the module will do nothing
+thereafter.
+.SH SYNOPSIS
+.IX Header "SYNOPSIS"
+.Vb 1
+\& use encoding::warnings; # or \*(AqFATAL\*(Aq to raise fatal exceptions
+\&
+\& utf8::encode($a = chr(20000)); # a byte\-string (raw bytes)
+\& $b = chr(20000); # a unicode\-string (wide characters)
+\&
+\& # "Bytes implicitly upgraded into wide characters as iso\-8859\-1"
+\& $c = $a . $b;
+.Ve
+.SH DESCRIPTION
+.IX Header "DESCRIPTION"
+.SS "Overview of the problem"
+.IX Subsection "Overview of the problem"
+By default, there is a fundamental asymmetry in Perl's unicode model:
+implicit upgrading from byte-strings to unicode-strings assumes that
+they were encoded in \fIISO 8859\-1 (Latin\-1)\fR, but unicode-strings are
+downgraded with UTF\-8 encoding. This happens because the first 256
+codepoints in Unicode happens to agree with Latin\-1.
+.PP
+However, this silent upgrading can easily cause problems, if you happen
+to mix unicode strings with non\-Latin1 data \-\- i.e. byte-strings encoded
+in UTF\-8 or other encodings. The error will not manifest until the
+combined string is written to output, at which time it would be impossible
+to see where did the silent upgrading occur.
+.SS "Detecting the problem"
+.IX Subsection "Detecting the problem"
+This module simplifies the process of diagnosing such problems. Just put
+this line on top of your main program:
+.PP
+.Vb 1
+\& use encoding::warnings;
+.Ve
+.PP
+Afterwards, implicit upgrading of high-bit bytes will raise a warning.
+Ex.: \f(CW\*(C`Bytes implicitly upgraded into wide characters as iso\-8859\-1 at
+\&\- line 7\*(C'\fR.
+.PP
+However, strings composed purely of ASCII code points (\f(CW0x00\fR..\f(CW0x7F\fR)
+will \fInot\fR trigger this warning.
+.PP
+You can also make the warnings fatal by importing this module as:
+.PP
+.Vb 1
+\& use encoding::warnings \*(AqFATAL\*(Aq;
+.Ve
+.SS "Solving the problem"
+.IX Subsection "Solving the problem"
+Most of the time, this warning occurs when a byte-string is concatenated
+with a unicode-string. There are a number of ways to solve it:
+.IP \(bu 4
+Upgrade both sides to unicode-strings
+.Sp
+If your program does not need compatibility for Perl 5.6 and earlier,
+the recommended approach is to apply appropriate IO disciplines, so all
+data in your program become unicode-strings. See encoding, open and
+"binmode" in perlfunc for how.
+.IP \(bu 4
+Downgrade both sides to byte-strings
+.Sp
+The other way works too, especially if you are sure that all your data
+are under the same encoding, or if compatibility with older versions
+of Perl is desired.
+.Sp
+You may downgrade strings with \f(CW\*(C`Encode::encode\*(C'\fR and \f(CW\*(C`utf8::encode\*(C'\fR.
+See Encode and utf8 for details.
+.IP \(bu 4
+Specify the encoding for implicit byte-string upgrading
+.Sp
+If you are confident that all byte-strings will be in a specific
+encoding like UTF\-8, \fIand\fR need not support older versions of Perl,
+use the \f(CW\*(C`encoding\*(C'\fR pragma:
+.Sp
+.Vb 1
+\& use encoding \*(Aqutf8\*(Aq;
+.Ve
+.Sp
+Similarly, this will silence warnings from this module, and preserve the
+default behaviour:
+.Sp
+.Vb 1
+\& use encoding \*(Aqiso\-8859\-1\*(Aq;
+.Ve
+.Sp
+However, note that \f(CW\*(C`use encoding\*(C'\fR actually had three distinct effects:
+.RS 4
+.IP \(bu 4
+PerlIO layers for \fBSTDIN\fR and \fBSTDOUT\fR
+.Sp
+This is similar to what open pragma does.
+.IP \(bu 4
+Literal conversions
+.Sp
+This turns \fIall\fR literal string in your program into unicode-strings
+(equivalent to a \f(CW\*(C`use utf8\*(C'\fR), by decoding them using the specified
+encoding.
+.IP \(bu 4
+Implicit upgrading for byte-strings
+.Sp
+This will silence warnings from this module, as shown above.
+.RE
+.RS 4
+.Sp
+Because literal conversions also work on empty strings, it may surprise
+some people:
+.Sp
+.Vb 1
+\& use encoding \*(Aqbig5\*(Aq;
+\&
+\& my $byte_string = pack("C*", 0xA4, 0x40);
+\& print length $a; # 2 here.
+\& $a .= ""; # concatenating with a unicode string...
+\& print length $a; # 1 here!
+.Ve
+.Sp
+In other words, do not \f(CW\*(C`use encoding\*(C'\fR unless you are certain that the
+program will not deal with any raw, 8\-bit binary data at all.
+.Sp
+However, the \f(CW\*(C`Filter => 1\*(C'\fR flavor of \f(CW\*(C`use encoding\*(C'\fR will \fInot\fR
+affect implicit upgrading for byte-strings, and is thus incapable of
+silencing warnings from this module. See encoding for more details.
+.RE
+.SH CAVEATS
+.IX Header "CAVEATS"
+For Perl 5.9.4 or later, this module's effect is lexical.
+.PP
+For Perl versions prior to 5.9.4, this module affects the whole script,
+instead of inside its lexical block.
+.SH "SEE ALSO"
+.IX Header "SEE ALSO"
+perlunicode, perluniintro
+.PP
+open, utf8, encoding, Encode
+.SH AUTHORS
+.IX Header "AUTHORS"
+Audrey Tang
+.SH COPYRIGHT
+.IX Header "COPYRIGHT"
+Copyright 2004, 2005, 2006, 2007 by Audrey Tang <cpan@audreyt.org>.
+.PP
+This program is free software; you can redistribute it and/or modify it
+under the same terms as Perl itself.
+.PP
+See <http://www.perl.com/perl/misc/Artistic.html>
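
For readers of the added page, here is a minimal Perl sketch of the byte-string/unicode-string mismatch it describes, together with the "upgrade both sides" remedy. It is an illustration only and not part of the patch: the variable names ($bytes, $chars, $mixed, $fixed) are hypothetical, it assumes a Perl with the core Encode module, and it deliberately does not load encoding::warnings, since the page states that the module has no effect as of Perl 5.26.0.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# A byte-string holding the UTF-8 encoding of U+4E20 (three bytes), and a
# unicode-string holding the same code point as one wide character.
my $bytes = encode('UTF-8', chr(20000));
my $chars = chr(20000);

# Concatenation implicitly upgrades $bytes as if it were Latin-1, so each
# of its three bytes becomes a separate character in the combined string.
my $mixed = $bytes . $chars;

# Decoding explicitly first keeps both sides as unicode-strings.
my $fixed = decode('UTF-8', $bytes) . $chars;

printf "mixed: %d characters\n", length $mixed;
printf "fixed: %d characters\n", length $fixed;
```

Under these assumptions the sketch should report 4 characters for the mixed string (three mis-upgraded bytes plus one wide character) and 2 for the explicitly decoded one.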