Diffstat (limited to 'third_party/rust/encoding_rs/doc')
-rw-r--r-- third_party/rust/encoding_rs/doc/Big5.txt | 16
-rw-r--r-- third_party/rust/encoding_rs/doc/EUC-JP.txt | 12
-rw-r--r-- third_party/rust/encoding_rs/doc/EUC-KR.txt | 10
-rw-r--r-- third_party/rust/encoding_rs/doc/GBK.txt | 16
-rw-r--r-- third_party/rust/encoding_rs/doc/IBM866.txt | 8
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-2022-JP.txt | 10
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-10.txt | 8
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-13.txt | 8
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-14.txt | 8
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-15.txt | 7
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-16.txt | 8
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-2.txt | 6
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-3.txt | 6
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-4.txt | 6
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-5.txt | 6
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-6.txt | 7
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-7.txt | 11
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-8-I.txt | 9
-rw-r--r-- third_party/rust/encoding_rs/doc/ISO-8859-8.txt | 9
-rw-r--r-- third_party/rust/encoding_rs/doc/KOI8-R.txt | 6
-rw-r--r-- third_party/rust/encoding_rs/doc/KOI8-U.txt | 6
-rw-r--r-- third_party/rust/encoding_rs/doc/Shift_JIS.txt | 8
-rw-r--r-- third_party/rust/encoding_rs/doc/UTF-16BE.txt | 8
-rw-r--r-- third_party/rust/encoding_rs/doc/UTF-16LE.txt | 8
-rw-r--r-- third_party/rust/encoding_rs/doc/UTF-8.txt | 5
-rw-r--r-- third_party/rust/encoding_rs/doc/gb18030.txt | 9
-rw-r--r-- third_party/rust/encoding_rs/doc/macintosh.txt | 7
-rw-r--r-- third_party/rust/encoding_rs/doc/replacement.txt | 10
-rw-r--r-- third_party/rust/encoding_rs/doc/windows-1250.txt | 6
-rw-r--r-- third_party/rust/encoding_rs/doc/windows-1251.txt | 6
-rw-r--r-- third_party/rust/encoding_rs/doc/windows-1252.txt | 7
-rw-r--r-- third_party/rust/encoding_rs/doc/windows-1253.txt | 8
-rw-r--r-- third_party/rust/encoding_rs/doc/windows-1254.txt | 7
-rw-r--r-- third_party/rust/encoding_rs/doc/windows-1255.txt | 8
-rw-r--r-- third_party/rust/encoding_rs/doc/windows-1256.txt | 6
-rw-r--r-- third_party/rust/encoding_rs/doc/windows-1257.txt | 7
-rw-r--r-- third_party/rust/encoding_rs/doc/windows-1258.txt | 11
-rw-r--r-- third_party/rust/encoding_rs/doc/windows-874.txt | 7
-rw-r--r-- third_party/rust/encoding_rs/doc/x-mac-cyrillic.txt | 6
-rw-r--r-- third_party/rust/encoding_rs/doc/x-user-defined.txt | 6
40 files changed, 323 insertions, 0 deletions
diff --git a/third_party/rust/encoding_rs/doc/Big5.txt b/third_party/rust/encoding_rs/doc/Big5.txt
new file mode 100644
index 0000000000..61e8fd5801
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/Big5.txt
@@ -0,0 +1,16 @@
+/// This is Big5 with HKSCS with mappings to more recent Unicode assignments
+/// instead of the Private Use Area code points that have been used historically.
+/// It is believed to be able to decode existing Web content in a way that makes
+/// sense.
+///
+/// To avoid form submissions generating data that Web servers don't understand,
+/// the encoder doesn't use the HKSCS byte sequences that precede the unextended
+/// Big5 in the lexical order.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/big5.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/big5-bmp.html)
+///
+/// This encoding is designed to be suited for decoding the Windows code page 950
+/// and its HKSCS patched "951" variant such that the text makes sense, given
+/// assignments that Unicode has made after those encodings used Private Use
+/// Area characters.
diff --git a/third_party/rust/encoding_rs/doc/EUC-JP.txt b/third_party/rust/encoding_rs/doc/EUC-JP.txt
new file mode 100644
index 0000000000..f90a735e52
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/EUC-JP.txt
@@ -0,0 +1,12 @@
+/// This is the legacy Unix encoding for Japanese.
+///
+/// For compatibility with Web servers that don't expect three-byte sequences
+/// in form submissions, the encoder doesn't generate three-byte sequences.
+/// That is, the JIS X 0212 support is decode-only.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/euc-jp.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/euc-jp-bmp.html)
+///
+/// This encoding roughly matches the Windows code page 20932. There are error
+/// handling differences and a handful of 2-byte sequences that decode differently.
+/// Additionally, Windows doesn't support 3-byte sequences.
diff --git a/third_party/rust/encoding_rs/doc/EUC-KR.txt b/third_party/rust/encoding_rs/doc/EUC-KR.txt
new file mode 100644
index 0000000000..ef24c980e0
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/EUC-KR.txt
@@ -0,0 +1,10 @@
+/// This is the Korean encoding for Windows. It extends the Unix legacy encoding
+/// for Korean, based on KS X 1001 (which also formed the base of MacKorean on Mac OS
+/// Classic), with all the characters from the Hangul Syllables block of Unicode.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/euc-kr.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/euc-kr-bmp.html)
+///
+/// This encoding matches the Windows code page 949, except Windows decodes byte 0x80
+/// to U+0080 and some byte sequences that are errors per the Encoding Standard to
+/// the question mark or the Private Use Area.
diff --git a/third_party/rust/encoding_rs/doc/GBK.txt b/third_party/rust/encoding_rs/doc/GBK.txt
new file mode 100644
index 0000000000..2faefff45e
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/GBK.txt
@@ -0,0 +1,16 @@
+/// The decoder for this encoding is the same as the decoder for gb18030.
+/// The encoder side of this encoding is GBK with Windows code page 936 euro
+/// sign behavior. GBK extends GB2312-80 to cover the CJK Unified Ideographs
+/// Unicode block as well as a handful of ideographs from the CJK Unified
+/// Ideographs Extension A and CJK Compatibility Ideographs blocks.
+///
+/// Unlike e.g. in the case of ISO-8859-1 and windows-1252, the GBK encoder wasn't
+/// unified with the gb18030 encoder in the Encoding Standard out of concern
+/// that servers that expect GBK form submissions might not be able to handle
+/// the four-byte sequences.
+///
+/// [Index visualization for the two-byte sequences](https://encoding.spec.whatwg.org/gb18030.html),
+/// [Visualization of BMP coverage of the two-byte index](https://encoding.spec.whatwg.org/gb18030-bmp.html)
+///
+/// The encoder of this encoding roughly matches the Windows code page 936.
+/// The decoder side is a superset.
diff --git a/third_party/rust/encoding_rs/doc/IBM866.txt b/third_party/rust/encoding_rs/doc/IBM866.txt
new file mode 100644
index 0000000000..871ff42139
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/IBM866.txt
@@ -0,0 +1,8 @@
+/// This is the most notable one of the DOS Cyrillic code pages. It has the same
+/// box drawing characters as code page 437, so it can be used for decoding
+/// DOS-era ASCII + box drawing data.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/ibm866.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/ibm866-bmp.html)
+///
+/// This encoding matches the Windows code page 866.
diff --git a/third_party/rust/encoding_rs/doc/ISO-2022-JP.txt b/third_party/rust/encoding_rs/doc/ISO-2022-JP.txt
new file mode 100644
index 0000000000..65713a1e9f
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-2022-JP.txt
@@ -0,0 +1,10 @@
+/// This is the primary pre-UTF-8 encoding for Japanese email. It uses the ASCII
+/// byte range to encode non-Basic Latin characters. It's the only encoding
+/// supported by this crate whose encoder is stateful.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/jis0208.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/jis0208-bmp.html)
+///
+/// This encoding roughly matches the Windows code page 50220. Notably, Windows
+/// uses U+30FB in place of the REPLACEMENT CHARACTER and otherwise differs in
+/// error handling.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-10.txt b/third_party/rust/encoding_rs/doc/ISO-8859-10.txt
new file mode 100644
index 0000000000..8aca388e7c
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-10.txt
@@ -0,0 +1,8 @@
+/// This is the Nordic part of the ISO/IEC 8859 encoding family. This encoding
+/// is also known as Latin 6.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-10.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-10-bmp.html)
+///
+/// The Windows code page number for this encoding is 28600, but kernel32.dll
+/// does not support this encoding.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-13.txt b/third_party/rust/encoding_rs/doc/ISO-8859-13.txt
new file mode 100644
index 0000000000..20cd549673
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-13.txt
@@ -0,0 +1,8 @@
+/// This is the Baltic part of the ISO/IEC 8859 encoding family. This encoding
+/// is also known as Latin 7.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-13.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-13-bmp.html)
+///
+/// This encoding matches the Windows code page 28603, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-14.txt b/third_party/rust/encoding_rs/doc/ISO-8859-14.txt
new file mode 100644
index 0000000000..3e4833bf19
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-14.txt
@@ -0,0 +1,8 @@
+/// This is the Celtic part of the ISO/IEC 8859 encoding family. This encoding
+/// is also known as Latin 8.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-14.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-14-bmp.html)
+///
+/// The Windows code page number for this encoding is 28604, but kernel32.dll
+/// does not support this encoding.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-15.txt b/third_party/rust/encoding_rs/doc/ISO-8859-15.txt
new file mode 100644
index 0000000000..922896a882
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-15.txt
@@ -0,0 +1,7 @@
+/// This is the revised Western European part of the ISO/IEC 8859 encoding
+/// family. This encoding is also known as Latin 9.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-15.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-15-bmp.html)
+///
+/// This encoding matches the Windows code page 28605.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-16.txt b/third_party/rust/encoding_rs/doc/ISO-8859-16.txt
new file mode 100644
index 0000000000..d1ae50bf08
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-16.txt
@@ -0,0 +1,8 @@
+/// This is the South-Eastern European part of the ISO/IEC 8859 encoding
+/// family. This encoding is also known as Latin 10.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-16.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-16-bmp.html)
+///
+/// The Windows code page number for this encoding is 28606, but kernel32.dll
+/// does not support this encoding.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-2.txt b/third_party/rust/encoding_rs/doc/ISO-8859-2.txt
new file mode 100644
index 0000000000..298df0916a
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-2.txt
@@ -0,0 +1,6 @@
+/// This is the Central European part of the ISO/IEC 8859 encoding family. This encoding is also known as Latin 2.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-2.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-2-bmp.html)
+///
+/// This encoding matches the Windows code page 28592.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-3.txt b/third_party/rust/encoding_rs/doc/ISO-8859-3.txt
new file mode 100644
index 0000000000..c462ce8f7f
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-3.txt
@@ -0,0 +1,6 @@
+/// This is the South European part of the ISO/IEC 8859 encoding family. This encoding is also known as Latin 3.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-3.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-3-bmp.html)
+///
+/// This encoding matches the Windows code page 28593.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-4.txt b/third_party/rust/encoding_rs/doc/ISO-8859-4.txt
new file mode 100644
index 0000000000..40449c4398
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-4.txt
@@ -0,0 +1,6 @@
+/// This is the North European part of the ISO/IEC 8859 encoding family. This encoding is also known as Latin 4.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-4.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-4-bmp.html)
+///
+/// This encoding matches the Windows code page 28594.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-5.txt b/third_party/rust/encoding_rs/doc/ISO-8859-5.txt
new file mode 100644
index 0000000000..41774ec542
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-5.txt
@@ -0,0 +1,6 @@
+/// This is the Cyrillic part of the ISO/IEC 8859 encoding family.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-5.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-5-bmp.html)
+///
+/// This encoding matches the Windows code page 28595.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-6.txt b/third_party/rust/encoding_rs/doc/ISO-8859-6.txt
new file mode 100644
index 0000000000..4c70c22583
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-6.txt
@@ -0,0 +1,7 @@
+/// This is the Arabic part of the ISO/IEC 8859 encoding family.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-6.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-6-bmp.html)
+///
+/// This encoding matches the Windows code page 28596, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-7.txt b/third_party/rust/encoding_rs/doc/ISO-8859-7.txt
new file mode 100644
index 0000000000..b78ed38e41
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-7.txt
@@ -0,0 +1,11 @@
+/// This is the Greek part of the ISO/IEC 8859 encoding family.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-7.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-7-bmp.html)
+///
+/// This encoding roughly matches the Windows code page 28597. Windows decodes
+/// unassigned code points, the currency signs at 0xA4 and 0xA5 as well as
+/// 0xAA, which should be U+037A GREEK YPOGEGRAMMENI, to the Private Use Area
+/// of Unicode. Windows decodes 0xA1 to U+02BD MODIFIER LETTER REVERSED COMMA
+/// instead of U+2018 LEFT SINGLE QUOTATION MARK and 0xA2 to U+02BC MODIFIER
+/// LETTER APOSTROPHE instead of U+2019 RIGHT SINGLE QUOTATION MARK.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-8-I.txt b/third_party/rust/encoding_rs/doc/ISO-8859-8-I.txt
new file mode 100644
index 0000000000..b73e572e15
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-8-I.txt
@@ -0,0 +1,9 @@
+/// This is the Hebrew part of the ISO/IEC 8859 encoding family in logical order.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-8.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-8-bmp.html)
+///
+/// This encoding roughly matches the Windows code page 38598. Windows decodes
+/// 0xAF to OVERLINE instead of MACRON and 0xFD and 0xFE to the Private Use
+/// Area instead of LRM and RLM. Windows decodes unassigned code points to
+/// the Private Use Area.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-8.txt b/third_party/rust/encoding_rs/doc/ISO-8859-8.txt
new file mode 100644
index 0000000000..c5600e38fe
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-8.txt
@@ -0,0 +1,9 @@
+/// This is the Hebrew part of the ISO/IEC 8859 encoding family in visual order.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-8.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-8-bmp.html)
+///
+/// This encoding roughly matches the Windows code page 28598. Windows decodes
+/// 0xAF to OVERLINE instead of MACRON and 0xFD and 0xFE to the Private Use
+/// Area instead of LRM and RLM. Windows decodes unassigned code points to
+/// the Private Use Area.
diff --git a/third_party/rust/encoding_rs/doc/KOI8-R.txt b/third_party/rust/encoding_rs/doc/KOI8-R.txt
new file mode 100644
index 0000000000..46dcfe7659
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/KOI8-R.txt
@@ -0,0 +1,6 @@
+/// This is an encoding for Russian from [RFC 1489](https://tools.ietf.org/html/rfc1489).
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/koi8-r.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/koi8-r-bmp.html)
+///
+/// This encoding matches the Windows code page 20866.
diff --git a/third_party/rust/encoding_rs/doc/KOI8-U.txt b/third_party/rust/encoding_rs/doc/KOI8-U.txt
new file mode 100644
index 0000000000..a263745ef1
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/KOI8-U.txt
@@ -0,0 +1,6 @@
+/// This is an encoding for Ukrainian adapted from KOI8-R.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/koi8-u.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/koi8-u-bmp.html)
+///
+/// This encoding matches the Windows code page 21866.
diff --git a/third_party/rust/encoding_rs/doc/Shift_JIS.txt b/third_party/rust/encoding_rs/doc/Shift_JIS.txt
new file mode 100644
index 0000000000..b982ab5b3e
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/Shift_JIS.txt
@@ -0,0 +1,8 @@
+/// This is the Japanese encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/shift_jis.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/shift_jis-bmp.html)
+///
+/// This encoding matches the Windows code page 932, except Windows decodes some byte
+/// sequences that are errors per the Encoding Standard to the question mark or the
+/// Private Use Area and generally uses U+30FB in place of the REPLACEMENT CHARACTER.
diff --git a/third_party/rust/encoding_rs/doc/UTF-16BE.txt b/third_party/rust/encoding_rs/doc/UTF-16BE.txt
new file mode 100644
index 0000000000..0a7df99a4f
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/UTF-16BE.txt
@@ -0,0 +1,8 @@
+/// This decode-only encoding uses 16-bit code units due to Unicode originally
+/// having been designed as a 16-bit repertoire. In the absence of a byte order
+/// mark the big endian byte order is assumed.
+///
+/// There is no corresponding encoder in this crate or in the Encoding
+/// Standard. The output encoding of this encoding is UTF-8.
+///
+/// This encoding matches the Windows code page 1201.
diff --git a/third_party/rust/encoding_rs/doc/UTF-16LE.txt b/third_party/rust/encoding_rs/doc/UTF-16LE.txt
new file mode 100644
index 0000000000..3a98e8b986
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/UTF-16LE.txt
@@ -0,0 +1,8 @@
+/// This decode-only encoding uses 16-bit code units due to Unicode originally
+/// having been designed as a 16-bit repertoire. In the absence of a byte order
+/// mark the little endian byte order is assumed.
+///
+/// There is no corresponding encoder in this crate or in the Encoding
+/// Standard. The output encoding of this encoding is UTF-8.
+///
+/// This encoding matches the Windows code page 1200.
diff --git a/third_party/rust/encoding_rs/doc/UTF-8.txt b/third_party/rust/encoding_rs/doc/UTF-8.txt
new file mode 100644
index 0000000000..3a93e67dce
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/UTF-8.txt
@@ -0,0 +1,5 @@
+/// This is the encoding that should be used for all new development. It can
+/// represent all of Unicode.
+///
+/// This encoding matches the Windows code page 65001, except Windows differs
+/// in the number of errors generated for some erroneous byte sequences.
diff --git a/third_party/rust/encoding_rs/doc/gb18030.txt b/third_party/rust/encoding_rs/doc/gb18030.txt
new file mode 100644
index 0000000000..572a593d08
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/gb18030.txt
@@ -0,0 +1,9 @@
+/// This encoding matches GB18030-2005 except the two-byte sequence 0xA3 0xA0
+/// maps to U+3000 for compatibility with existing Web content. As a result,
+/// this encoding can represent all of Unicode except for the private-use
+/// character U+E5E5.
+///
+/// [Index visualization for the two-byte sequences](https://encoding.spec.whatwg.org/gb18030.html),
+/// [Visualization of BMP coverage of the two-byte index](https://encoding.spec.whatwg.org/gb18030-bmp.html)
+///
+/// This encoding matches the Windows code page 54936.
diff --git a/third_party/rust/encoding_rs/doc/macintosh.txt b/third_party/rust/encoding_rs/doc/macintosh.txt
new file mode 100644
index 0000000000..d00fece7c8
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/macintosh.txt
@@ -0,0 +1,7 @@
+/// This is the MacRoman encoding from Mac OS Classic.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/macintosh.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/macintosh-bmp.html)
+///
+/// This encoding matches the Windows code page 10000, except Windows decodes
+/// 0xBD to U+2126 OHM SIGN instead of U+03A9 GREEK CAPITAL LETTER OMEGA.
diff --git a/third_party/rust/encoding_rs/doc/replacement.txt b/third_party/rust/encoding_rs/doc/replacement.txt
new file mode 100644
index 0000000000..2398df06d4
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/replacement.txt
@@ -0,0 +1,10 @@
+/// This decode-only encoding decodes all non-zero-length streams to a single
+/// REPLACEMENT CHARACTER. Its purpose is to avoid the use of an
+/// ASCII-compatible fallback encoding (typically windows-1252) for some
+/// encodings that are no longer supported by the Web Platform and that
+/// would be dangerous to treat as ASCII-compatible.
+///
+/// There is no corresponding encoder. The output encoding of this encoding
+/// is UTF-8.
+///
+/// This encoding does not have a Windows code page number.
diff --git a/third_party/rust/encoding_rs/doc/windows-1250.txt b/third_party/rust/encoding_rs/doc/windows-1250.txt
new file mode 100644
index 0000000000..96e38ef4a0
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1250.txt
@@ -0,0 +1,6 @@
+/// This is the Central European encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1250.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1250-bmp.html)
+///
+/// This encoding matches the Windows code page 1250.
diff --git a/third_party/rust/encoding_rs/doc/windows-1251.txt b/third_party/rust/encoding_rs/doc/windows-1251.txt
new file mode 100644
index 0000000000..9645611e23
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1251.txt
@@ -0,0 +1,6 @@
+/// This is the Cyrillic encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1251.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1251-bmp.html)
+///
+/// This encoding matches the Windows code page 1251.
diff --git a/third_party/rust/encoding_rs/doc/windows-1252.txt b/third_party/rust/encoding_rs/doc/windows-1252.txt
new file mode 100644
index 0000000000..d613fbe25c
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1252.txt
@@ -0,0 +1,7 @@
+/// This is the Western encoding for Windows. It is an extension of ISO-8859-1,
+/// which is known as Latin 1.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1252.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1252-bmp.html)
+///
+/// This encoding matches the Windows code page 1252.
diff --git a/third_party/rust/encoding_rs/doc/windows-1253.txt b/third_party/rust/encoding_rs/doc/windows-1253.txt
new file mode 100644
index 0000000000..edcacd9037
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1253.txt
@@ -0,0 +1,8 @@
+/// This is the Greek encoding for Windows. It is mostly an extension of
+/// ISO-8859-7, but U+0386 is mapped to a different byte.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1253.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1253-bmp.html)
+///
+/// This encoding matches the Windows code page 1253, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/windows-1254.txt b/third_party/rust/encoding_rs/doc/windows-1254.txt
new file mode 100644
index 0000000000..26491a93a4
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1254.txt
@@ -0,0 +1,7 @@
+/// This is the Turkish encoding for Windows. It is an extension of ISO-8859-9,
+/// which is known as Latin 5.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1254.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1254-bmp.html)
+///
+/// This encoding matches the Windows code page 1254.
diff --git a/third_party/rust/encoding_rs/doc/windows-1255.txt b/third_party/rust/encoding_rs/doc/windows-1255.txt
new file mode 100644
index 0000000000..cbcf86dc1c
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1255.txt
@@ -0,0 +1,8 @@
+/// This is the Hebrew encoding for Windows. It is an extension of ISO-8859-8-I,
+/// except for a currency sign swap.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1255.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1255-bmp.html)
+///
+/// This encoding matches the Windows code page 1255, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/windows-1256.txt b/third_party/rust/encoding_rs/doc/windows-1256.txt
new file mode 100644
index 0000000000..38bf2ef4e6
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1256.txt
@@ -0,0 +1,6 @@
+/// This is the Arabic encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1256.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1256-bmp.html)
+///
+/// This encoding matches the Windows code page 1256.
diff --git a/third_party/rust/encoding_rs/doc/windows-1257.txt b/third_party/rust/encoding_rs/doc/windows-1257.txt
new file mode 100644
index 0000000000..fc3fad21d4
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1257.txt
@@ -0,0 +1,7 @@
+/// This is the Baltic encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1257.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1257-bmp.html)
+///
+/// This encoding matches the Windows code page 1257, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/windows-1258.txt b/third_party/rust/encoding_rs/doc/windows-1258.txt
new file mode 100644
index 0000000000..1ae5bbb12c
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1258.txt
@@ -0,0 +1,11 @@
+/// This is the Vietnamese encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1258.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1258-bmp.html)
+///
+/// This encoding matches the Windows code page 1258 when used in the
+/// non-normalizing mode. Unlike with the other single-byte encodings, the
+/// result of decoding is not necessarily in Normalization Form C. On the
+/// other hand, input in the Normalization Form C is not encoded without
+/// replacement. In general, it's a bad idea to encode to encodings other
+/// than UTF-8, but this encoding is especially hazardous to encode to.
diff --git a/third_party/rust/encoding_rs/doc/windows-874.txt b/third_party/rust/encoding_rs/doc/windows-874.txt
new file mode 100644
index 0000000000..ddbc71143f
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-874.txt
@@ -0,0 +1,7 @@
+/// This is the Thai encoding for Windows. It is an extension of TIS-620 / ISO-8859-11.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-874.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-874-bmp.html)
+///
+/// This encoding matches the Windows code page 874, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/x-mac-cyrillic.txt b/third_party/rust/encoding_rs/doc/x-mac-cyrillic.txt
new file mode 100644
index 0000000000..b5519a122c
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/x-mac-cyrillic.txt
@@ -0,0 +1,6 @@
+/// This is the MacUkrainian encoding from Mac OS Classic.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/x-mac-cyrillic.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/x-mac-cyrillic-bmp.html)
+///
+/// This encoding matches the Windows code page 10017.
diff --git a/third_party/rust/encoding_rs/doc/x-user-defined.txt b/third_party/rust/encoding_rs/doc/x-user-defined.txt
new file mode 100644
index 0000000000..e00ddc662e
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/x-user-defined.txt
@@ -0,0 +1,6 @@
+/// This encoding offsets the non-ASCII bytes by `0xF700` thereby decoding
+/// them to the Private Use Area of Unicode. It was used for loading binary
+/// data into a JavaScript string using `XMLHttpRequest` before XHR supported
+/// the `"arraybuffer"` response type.
+///
+/// This encoding does not have a Windows code page number.
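
The `0xF700` offset described in x-user-defined.txt can be illustrated with a minimal standalone sketch. This is not the encoding_rs implementation or its API; the function name `decode_x_user_defined` is hypothetical and only the standard library is used. It assumes ASCII bytes pass through unchanged and that each byte in 0x80..=0xFF maps to U+F780..=U+F7FF.

```rust
/// Hypothetical helper (not part of encoding_rs): decodes x-user-defined
/// bytes by passing ASCII through and adding 0xF700 to non-ASCII bytes.
fn decode_x_user_defined(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|&b| {
            let code_point = if b < 0x80 {
                u32::from(b)
            } else {
                0xF700 + u32::from(b)
            };
            // U+F780..=U+F7FF lies in the Private Use Area, so every value
            // produced here is a valid Unicode scalar value.
            char::from_u32(code_point).expect("valid scalar value")
        })
        .collect()
}

fn main() {
    let decoded = decode_x_user_defined(&[0x41, 0x80, 0xFF]);
    assert_eq!(decoded, "A\u{F780}\u{F7FF}");
    println!("{:?}", decoded);
}
```

Because the mapping is a fixed offset, the original byte can be recovered from each decoded character by masking its code point with `0xFF`, which is what made the encoding usable for carrying binary data in strings.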