diff options
Diffstat (limited to 'docs/README.TRST')
-rw-r--r-- | docs/README.TRST | 158 |
1 files changed, 158 insertions, 0 deletions
diff --git a/docs/README.TRST b/docs/README.TRST new file mode 100644 index 0000000..68b27af --- /dev/null +++ b/docs/README.TRST @@ -0,0 +1,158 @@ +Tabular Support for Simple Tables +================================= +Some definitions first: + +* NO table support + What it says. :) Table related tags are treated like other + completely unrecognized tags. + Only listed for completeness, this does not describe Lynx. + +* MINIMAL table support + Table related tags are recognized, and are used to separate + the contents of different cells (by at least a space) and rows + (by a line break) visibly from each other. + +* LYNX minimal table support (LMTS) + The minimal table support as implemented by Lynx up to this point, + also includes the way ALIGN attributes are handled on TABLE, TR + and other specific tweaks (e.g. handle TABLE within PRE specially). + LMTS formatting is briefly described in the Lynx User Guide, see + the section "Lynx and HTML Tables" there. (The Users Guide has not + yet been updated for tabular support.) + +* TABULAR support for tables + Support for tables that really arranges table cells in tabular form. + +* Tabular Rendering for SIMPLE Tables (TRST) + Tabular support for some tables that are 'simple' enough; what this + code change provides. + +One basic idea behind providing TRST is that correct tabular support +for all tables is complex, doesn't fit well into the overwhelmingly +one-pass way in which Lynx does things, and may in the end not give +pleasant results anyway for pages that (ab-)use more complex table +structures for display formatting purposes (especially in view of Lynx +limitations such as fixed character cell size and lack of horizontal +scrolling; see also emacs w3 mode). Full table support within Lynx +hasn't happened so far, and continues to seem unlikely to happen in the +near future. + +The other basic idea is the observation that for simple tables, as +used mostly for data that are really tabular in nature, LMTS rendering +can be transformed into TRST rendering, after parsing the TABLE element, +by two simple transformations applied line by line: +- Insert spaces in the right places. +- Shift the line as a whole. + +And that's exactly what TRST does. An implementation based on the +simple observation above is relatively straightforward, for simple +tables. On encountering the start of a TABLE element, Lynx generates +output as usual for LMTS. But it also keeps track of cell positions +and lengths in parallel. If all goes well, that additional information +is used to fix up the already formatted output lines when the TABLE +ends. If not all goes well, the table was not 'simple' enough, the +additional processing is canceled. One advantage is that we always +have a 'safe' fallback to well-understood traditional LMTS formatting: +TRST won't make more complex tables look worse than before. + +What are 'simple' tables? A table is simple enough if each of its TR +rows translates into at most one display line in LMTS formatting (excluding +leading and trailing line breaks), and the width required by each row +(before as well as after fixup) does not exceed the available screen size. +Note that this excludes all tables where some of the cells are marked up as +block elements ('paragraphs'). Tables that include nested TABLE elements +are always specifically excluded, but the inner tables may be subject to +TRST handling. Also excluded are some constructs that indicate that markup +was already optimized for Lynx (or other browsers with no or minimal table +support): TABLE in PRE, use of TAB. + +The description so far isn't completely accurate. In many cases, tables are +not simple enough according to the last paragraph, but parts of each TR row +can still benefit from some TRST treatment. Some partial treatment is done +for some tables in this grey zone, which may or may not help to a better +display, depending on how the table is used. This is an area where tweaks +in the future are most expected, and where the code's behavior is currently +not well defined. + +One possible approach: +- The table is 'simple' according to all criteria set out in the previous + paragraph, except that some cells at the beginning and/or end of TR rows + may contain block elements (or other markup that results in formatting + like separate paragraphs). +- There is at most one range of (non-empty) table cells in each row whose + contents is not paragraph-formatted, and who are rendered on one line + together by LMTS, separate from the paragraph-formatted cells. Let's + call these cells the 'core' of a row. +Fixups are then only applied to the text lines showing the 'core' cells. +The paragraph-formatted cells are effectively pulled out before/after +their row (no horizontal space is allocated to them for the purpose of +determining column widths for core line formatting). + +This is expected to be most useful for tables that are mostly +simple tabular data cells, but with the occasional longer +text thrown in. For example, a table with intended rendering: + + -------------------------------------------------------- + | date | item no. | price | remarks | + |--------|--------------|---------|----------------------| + | date-1 | item #1 | $0.00 | | + |--------|--------------|---------|----------------------| + | date-2 | item #2 | $101.99 | A longer annotation | + | | | | marked up as a block | + | | | | of text. | + |--------|--------------|---------|----------------------| + | date-3 | long item #3 | $99.00 | | + -------------------------------------------------------- + +It may now be shown by Lynx as + + ................................................. + + date item no. price remarks + date-1 item #1 $0.00 + date-2 item #2 $101.99 + + A longer annotation marked up as a block of + text. + + date-3 long item #3 $99.00 + + ................................................. + +As can be seen, this is still quite far from the intended rendering, +but it is better than without any tabular support. + +Whether the code does something sensible with "grey area" tables is up +for testing. Most of the typical tables in typical Web pages aren't +used in a way that can benefit from the TRST approach. Parts of such +tables may still end up getting shifted left or right by the TRST code +when that doesn't improve anything, but I haven't seen it make things +really worse so far (with the current code). + +TRST and Partial Display +------------------------ +[ Partial display mode is the feature which allows viewing and scrolling +of pages while they are loaded, without having to wait for a complete +transfer. ] During partial display rendering, table lines can sometimes +be shown in the original formatting, i.e. with horizontal fixups not yet +applied. This is more likely for longer tables, and depends on the state +in which partial display 'catches' the TRST code. Sometimes the display +may flicker: first the preliminary rendering of table lines is shown, then +after loading is finished it is replaced by the fixed-up version. I do +not consider this a serious problem: if you have partial display mode +enabled, presumably you want to be able to see as much data as possible, +and scroll up and down through it, as early as possible. In fact, the +approach taken keeps Lynx free from a problem that may graphical browsers +have: they often cannot render a table at all until it is received in full. + +------------------------------------------------------------------------ + +To summarize: + - TRST is a solution that works in many cases where lack of tabular support + was most annoying. + - TRST doesn't implement a full table model, and it is extremely unlikely + that it will ever be the basis for that. Keep on exploring external + solutions, or perhaps waiting for (better: working on) a more fundamental + redesign of Lynx's rendering engine. + +Klaus Weide - kweide@enteract.com 1999-10-13 |