summaryrefslogtreecommitdiffstats
path: root/docs/README.chartrans
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-07 16:37:15 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-07 16:37:15 +0000
commitae5d181b854d3ccb373b6bc01b4869e44ff4d87a (patch)
tree91f59efb48c56a84cc798e012fccb667b63d3fee /docs/README.chartrans
parentInitial commit. (diff)
downloadlynx-ae5d181b854d3ccb373b6bc01b4869e44ff4d87a.tar.xz
lynx-ae5d181b854d3ccb373b6bc01b4869e44ff4d87a.zip
Adding upstream version 2.9.0dev.12.upstream/2.9.0dev.12upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'docs/README.chartrans')
-rw-r--r--docs/README.chartrans164
1 files changed, 164 insertions, 0 deletions
diff --git a/docs/README.chartrans b/docs/README.chartrans
new file mode 100644
index 0000000..a13ef36
--- /dev/null
+++ b/docs/README.chartrans
@@ -0,0 +1,164 @@
+Lynx CHARTRANS
+
+ Features (in addition to those which Lynx 2.7.1 already has):
+
+ - Can (attempt to) translate from any document charset to any display
+ character set, *IF* the document charset is known by a translation
+ table (compiled in at installation).
+
+ - New method to define character sets: used for input charset as well
+ as display character set, translation tables compiled in from
+ separate files (one per charset). One table is designated as default
+ and can be used for fallback translation to 7-bit replacements for
+ display.
+
+ - New method for specifying translations of SGML entities.
+
+ - Unicode (UTF-8) support: can (attempt to) decode and translate UTF-8 to
+ display character set, or pass through UTF to display (if terminal
+ or console understands UTF-8). [raw display of UTF only tested with Slang
+ so far, does not always position everything correctly on screen]
+
+ - Support for CHARSET attribute on A tag (and sometimes LINK), as in HTML
+ i18n RFC 2070 and W3C HTML 4.0 drafts. A link can suggest the target's
+ charset in this way.
+
+ - Support for ACCEPT-CHARSET attribute of FORM tags.
+
+ - EXPERIMENTAL, currently enabled only for Linux console:
+ can (attempt to) automatically switch terminal mode and load new
+ code pages on change of display character set.
+
+ - some minor changes: sometimes invalid characters were displayed in a hex
+ notation Uxxxx (helps debugging, but I also regard it as at least not
+ worse than showing the wrong char without warning), now they are not
+ displayed to reduce garbage.
+
+Additions/changes to user interface:
+
+ - many new Display Character Sets are available on O)ptions screen.
+ (One can use arrow keys, HOME, END etc. for cycling through the list
+ or use selection from popup box, as for other options.)
+
+ - new command line flags:
+ -assume_charset=... assume this as charset for documents that don't
+ specify a charset parameter in HTTP headers
+ -assume_local_charset=... assume this as charset of local file
+ -assume_unrec_charset=... in case a charset parameter is not recognized;
+ docs also available as ASSUME_CHARSET etc. in lynx.cfg
+ In "Advanced User" mode, ASSUME_CHARSET can be changed during a session
+ from the Options Screen.
+
+ - The "Raw" toggle (from -raw flag, '@' key, or Options screen)
+ o toggles the assumption "Default remote charset is same as Display
+ Character Set" on or off.
+ Toggling of the assumed charset is between Display Character Set and
+ the specified ASSUME_CHARSET or, if they are the same, between the
+ specified ASSUME_CHARSET and ISO-8859-1.
+ o The default for raw mode now depends on the Display Character Set as
+ well as on the specified ASSUME_CHARSET value.
+ o should work as before for CJK charsets (turning CJK-mode on or off).
+ o If the effective ASSUME_CHARSET and the Display Character Set are
+ unchanged from the ISO-8859-1 default, toggling "Raw" may have some
+ additional effect for characters that can't be translated.
+ (Try the "Transparent" Display Character Set for more "rawness".)
+
+
+Requirements: same as for Lynx in general :)
+
+The chartrans code is now merged with Wayne Buttle's changes for
+32-bit MS Windows and DOS/DJGPP, with Thomas Dickey's and Jim Spath's
+emerging auto-configure mechanism, and with BUGFIXES from Foteos
+Macrides. See the accompanying file CHANGES for the current
+status.
+
+
+A warning:
+In some cases undisplayable bytes may still get sent to the terminal
+which are then interpreted as control chars, there is no protection
+against if strange things are defined in the table files.
+
+
+HOW TO INSTALL:
+
+(4) before compiling:
+
+ Check top level makefile or Makefile and userdefs.h as usual.
+
+ NOTE that there is a new "#define" in userdefs.h for MAX_CHARSETS
+ near the end (in "Section 3.").
+
+(5) Building Lynx:
+
+ Compiling the chartrans code is now integrated into the normal
+ installation procedures for UNIX (configure script) and other
+ platforms.
+
+ What's supposed to happen (in addition to the usual things when
+ building Lynx): in the new subdirectory src/chrtrans, make should
+ first compile the auxiliary program `makeuctb', then invoke that
+ program to create xxxxx_yyy.h files from the provided xxxxx_yyy.tab
+ translation table files. (See README.* files in src/chrtrans for
+ more info.)
+
+ If all goes well, just invoking make from the top-level Lynx dir
+ as usual should do everything automatically. If not, the makefiles
+ may need some tweaking... or:
+
+(6) Some things to look at if compilation fails:
+
+ In src/chrtrans/UCkd.h there is a typedef for an unsigned 16bit
+ numeric type which may need to be changed for your system.
+ See comment near top there.
+
+ For recompiling Lynx, `make clean' should not be necessary if only
+ files in src/chrtrans have been changed. On the other hand
+ may not propagate to the src/chrtrans directory (depending how things
+ are going with auto-config), you may have to cd to that directory
+ and `make clean' there to really clean up there.
+
+(7) To customize (add/change translation tables etc.):
+
+ See README.* files in src/chrtrans.
+ Make the necessary changes there, then recompile.
+ (A general `make clean' should not be necessary, but make sure
+ the ...uni.h file in src/chrtrans gets regenerated.)
+
+ Note that definition of new character entities (if e.g., you want
+ Lynx to recognize &Zcaron;) are not covered by these table files,
+ they have to be listed in entities.h.
+
+ _If you are on a Linux system_ and using Lynx on the console (i.e.
+ not xterm, not a dialup *into* the Linux box), you can compile
+ with -DEXP_CHARTRANS_AUTOSWITCH. This is very useful for testing
+ the various Display Character Sets, Lynx will try to automatically
+ change the console state. You need to have the Linux kbd package
+ installed, with a working `setfont' command executable by the user,
+ and the right font files - check the source in src/UCAuto.c for
+ the files used and/or to change them!
+ NOTE that with this enabled,
+ - Lynx currently will not clean up the console state at exit,
+ it will probably left like the last Display Character Set you used.
+ - Loading a font is global across _all_ virtual text consoles, so
+ using Lynx (compiled with this flag) may change the appearance of
+ text on other consoles (if that text contains characters
+ beyond US-ASCII).
+
+(8) Some suggested Web pages for testing:
+
+ <URL: http://www.tezcat.com/~kweide/lynx-chartrans/test/>
+
+ <URL: http://www.isoc.org:8080/>,
+ especially
+ <URL: http://www.isoc.org:8080/liste_ml.htm>.
+
+ <URL: http://www.accentsoft.com/un/un-all.htm>
+
+(9) Please report bugs, unexpected behavior, etc.
+ to <lynx-dev@nongnu.org>.
+
+ Suggestions for improvement would be welcome, as well as
+ contributed translation tables (for stuff that is not available
+ at ftp://dkuug.dk or ftp://ftp.unicode.org).
+
+KW 1997-11-06