Diffstat (limited to 'third_party/rust/encoding_rs/doc/GBK.txt')
-rw-r--r-- third_party/rust/encoding_rs/doc/GBK.txt 16
1 files changed, 16 insertions, 0 deletions
diff --git a/third_party/rust/encoding_rs/doc/GBK.txt b/third_party/rust/encoding_rs/doc/GBK.txt
new file mode 100644
index 0000000000..2faefff45e
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/GBK.txt
@@ -0,0 +1,16 @@
+/// The decoder for this encoding is the same as the decoder for gb18030.
+/// The encoder side of this encoding is GBK with Windows code page 936 euro
+/// sign behavior. GBK extends GB2312-80 to cover the CJK Unified Ideographs
+/// Unicode block as well as a handful of ideographs from the CJK Unified
+/// Ideographs Extension A and CJK Compatibility Ideographs blocks.
+///
+/// Unlike in the case of, e.g., ISO-8859-1 and windows-1252, the GBK encoder
+/// wasn't unified with the gb18030 encoder in the Encoding Standard, out of
+/// concern that servers that expect GBK form submissions might not be able to
+/// handle the four-byte sequences.
+///
+/// [Index visualization for the two-byte sequences](https://encoding.spec.whatwg.org/gb18030.html),
+/// [Visualization of BMP coverage of the two-byte index](https://encoding.spec.whatwg.org/gb18030-bmp.html)
+///
+/// The encoder of this encoding roughly matches Windows code page 936.
+/// The decoder side is a superset.
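
The encoder/decoder asymmetry described in the doc comment can be illustrated with a small standalone sketch. This is not encoding_rs code: `Variant`, `encode_euro`, `two_byte_pointer`, `pointer_to_bytes`, and `is_four_byte_shape` are hypothetical names introduced here, and the byte values and pointer arithmetic follow my reading of the Encoding Standard's gb18030 encoder and decoder, so treat them as assumptions to check against the spec.

```rust
// Illustrative sketch only; every name below is hypothetical and none of
// this is taken from encoding_rs itself.

/// Which encoder variant is in use.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Variant {
    Gbk,
    Gb18030,
}

/// Encodes U+20AC EURO SIGN. The GBK encoder uses the single byte 0x80
/// (Windows code page 936 behavior); the gb18030 encoder uses the two-byte
/// sequence 0xA2 0xE3 from the shared two-byte index.
pub fn encode_euro(variant: Variant) -> Vec<u8> {
    match variant {
        Variant::Gbk => vec![0x80],
        Variant::Gb18030 => vec![0xA2, 0xE3],
    }
}

/// Maps a two-byte sequence to its pointer into the two-byte index,
/// or returns `None` when either byte is outside the valid ranges.
pub fn two_byte_pointer(lead: u8, trail: u8) -> Option<u16> {
    if !(0x81..=0xFE).contains(&lead) {
        return None;
    }
    // Trail bytes skip 0x7F, hence the two different offsets.
    let offset: u16 = match trail {
        0x40..=0x7E => 0x40,
        0x80..=0xFE => 0x41,
        _ => return None,
    };
    Some((u16::from(lead) - 0x81) * 190 + (u16::from(trail) - offset))
}

/// Inverse of `two_byte_pointer`: turns an index pointer back into bytes.
pub fn pointer_to_bytes(pointer: u16) -> Option<(u8, u8)> {
    // 126 lead bytes times 190 trail bytes.
    if pointer >= 126 * 190 {
        return None;
    }
    let lead = pointer / 190 + 0x81;
    let trail = pointer % 190;
    let offset = if trail < 0x3F { 0x40 } else { 0x41 };
    Some((lead as u8, (trail + offset) as u8))
}

/// Returns true if `bytes` has the shape of a gb18030 four-byte sequence.
/// The GBK encoder never emits these; the shared decoder accepts them.
pub fn is_four_byte_shape(bytes: &[u8]) -> bool {
    matches!(
        bytes,
        [0x81..=0xFE, 0x30..=0x39, 0x81..=0xFE, 0x30..=0x39]
    )
}
```

With the real crate, the same distinction should be observable by comparing the output of `encoding_rs::GBK.encode("€")` with that of `encoding_rs::GB18030.encode("€")`, while both decode either byte form to U+20AC.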