Tabular Support for Simple Tables
=================================

Some definitions first:

* NO table support
  What it says. :)  Table-related tags are treated like other
  completely unrecognized tags.
  Only listed for completeness; this does not describe Lynx.

* MINIMAL table support
  Table-related tags are recognized, and are used to visibly separate
  the contents of different cells (by at least a space) and rows
  (by a line break) from each other.

* LYNX minimal table support (LMTS)
  The minimal table support as implemented by Lynx up to this point.
  It also includes the way ALIGN attributes are handled on TABLE and TR,
  and other specific tweaks (e.g. handling TABLE within PRE specially).
  LMTS formatting is briefly described in the Lynx Users Guide; see
  the section "Lynx and HTML Tables" there.  (The Users Guide has not
  yet been updated for tabular support.)

* TABULAR support for tables
  Support for tables that really arranges table cells in tabular form.

* Tabular Rendering for SIMPLE Tables (TRST)
  Tabular support for some tables that are 'simple' enough; this is
  what this code change provides.

One basic idea behind providing TRST is that correct tabular support
for all tables is complex, doesn't fit well into the overwhelmingly
one-pass way in which Lynx does things, and may in the end not give
pleasant results anyway for pages that (ab-)use more complex table
structures for display formatting purposes (especially in view of Lynx
limitations such as fixed character cell size and lack of horizontal
scrolling; see also emacs w3 mode).  Full table support within Lynx
hasn't happened so far, and continues to seem unlikely to happen in the
near future.

The other basic idea is the observation that for simple tables, as
used mostly for data that are really tabular in nature, LMTS rendering
can be transformed into TRST rendering, after parsing the TABLE element,
by two simple transformations applied line by line:
   - Insert spaces in the right places.
   - Shift the line as a whole.

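The two transformations can be illustrated with a small sketch.  This is
not the Lynx implementation (Lynx is written in C); it is Python, chosen
only for brevity.  The function name fixup_line and its parameters
cell_starts, column_starts and shift are hypothetical names made up for
this illustration.  It assumes that the offset of each cell within the
already formatted line was recorded, and that the target column offsets
were computed from the widest cell of each column.

```python
def fixup_line(line, cell_starts, column_starts, shift=0):
    """Realign one LMTS-style line into tabular form (illustration only).

    line          -- the line as already formatted, cells separated by a space
    cell_starts   -- offset in `line` where each cell begins (assumed recorded)
    column_starts -- target offset of each column, the same for every row
    shift         -- number of spaces to put in front of the whole line
    """
    pieces = []
    width = 0  # length of the rebuilt line so far
    for i, start in enumerate(cell_starts):
        end = cell_starts[i + 1] if i + 1 < len(cell_starts) else len(line)
        text = line[start:end].rstrip()
        # Transformation 1: insert spaces so the cell begins at its column.
        # Assumes column_starts leaves room for the widest cell of each column.
        pad = max(column_starts[i] - width, 0)
        pieces.append(" " * pad + text)
        width += pad + len(text)
    # Transformation 2: shift the line as a whole.
    return " " * shift + "".join(pieces)


line = "date-1 item #1 $0.00"
print(fixup_line(line, [0, 7, 15], [0, 7, 20], shift=2))
# prints "  date-1 item #1      $0.00"
```

Note that the text of the cells is never changed; only spaces are added,
which is what makes a fixup of already formatted lines feasible.
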
And that's exactly what TRST does.  An implementation based on the
simple observation above is relatively straightforward, for simple
tables.  On encountering the start of a TABLE element, Lynx generates
output as usual for LMTS.  But it also keeps track of cell positions
and lengths in parallel.  If all goes well, that additional information
is used to fix up the already formatted output lines when the TABLE
ends.  If not all goes well, i.e. the table was not 'simple' enough, the
additional processing is canceled.  One advantage is that we always
have a 'safe' fallback to well-understood traditional LMTS formatting:
TRST won't make more complex tables look worse than before.

What are 'simple' tables?  A table is simple enough if each of its TR
rows translates into at most one display line in LMTS formatting (excluding
leading and trailing line breaks), and the width required by each row
(before as well as after fixup) does not exceed the available screen size.
Note that this excludes all tables where some of the cells are marked up as
block elements ('paragraphs').  Tables that include nested TABLE elements
are always specifically excluded, but the inner tables may be subject to
TRST handling.  Also excluded are some constructs that indicate that markup
was already optimized for Lynx (or other browsers with no or minimal table
support): TABLE in PRE, use of TAB.

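The 'simple enough' test and the fallback to LMTS can be sketched as
follows.  Again this is only an illustration in Python, not the actual
Lynx code.  The function render_rows, the default screen_width of 80,
and the convention that a cell containing a newline stands for a cell
marked up as a block element are all assumptions made for this example.
Nested tables, TABLE in PRE and TAB are not modeled.

```python
def render_rows(rows, screen_width=80):
    """Return display lines for a table given as lists of cell strings.

    Falls back to plain LMTS-style lines (cells separated by one space)
    whenever the table is not 'simple' in the sense sketched here.
    """
    # LMTS-style rendering, always produced first.
    lmts = [" ".join(cells) for cells in rows]

    # Track the width each column needs, in parallel.
    ncols = max((len(cells) for cells in rows), default=0)
    widths = [0] * ncols
    for cells in rows:
        for i, cell in enumerate(cells):
            if "\n" in cell:
                # Models a block element: the row needs more than one line.
                return lmts
            widths[i] = max(widths[i], len(cell))

    # Width before fixup (each LMTS line) and after fixup (aligned row).
    fixed_width = sum(widths) + max(ncols - 1, 0)
    if any(len(l) > screen_width for l in lmts) or fixed_width > screen_width:
        return lmts  # cancel: keep the traditional formatting

    # All went well: pad every cell to its column width.
    return [" ".join(c.ljust(w) for c, w in zip(cells, widths)).rstrip()
            for cells in rows]


rows = [["date", "item no.", "price"],
        ["date-1", "item #1", "$0.00"],
        ["date-3", "long item #3", "$99.00"]]
for out in render_rows(rows):
    print(out)
```

With screen_width=20 the same call returns the unaligned LMTS lines,
since the aligned rows would need 26 columns.
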
The description so far isn't completely accurate.  In many cases, tables are
not simple enough according to the preceding paragraph, but parts of each TR
row can still benefit from some TRST treatment.  Some partial treatment is
done for some tables in this grey zone, which may or may not lead to a better
display, depending on how the table is used.  This is an area where future
tweaks are most likely, and where the code's behavior is currently not well
defined.

One possible approach:
- The table is 'simple' according to all criteria set out above,
  except that some cells at the beginning and/or end of TR rows
  may contain block elements (or other markup that results in formatting
  like separate paragraphs).
- There is at most one range of (non-empty) table cells in each row whose
  contents are not paragraph-formatted, and which are rendered on one line
  together by LMTS, separate from the paragraph-formatted cells.  Let's
  call these cells the 'core' of a row.
Fixups are then only applied to the text lines showing the 'core' cells.
The paragraph-formatted cells are effectively pulled out before/after
their row (no horizontal space is allocated to them for the purpose of
determining column widths for core line formatting).

This is expected to be most useful for tables that consist mostly of
simple tabular data cells, but with the occasional longer
text thrown in.  For example, a table with intended rendering:

   --------------------------------------------------------
  | date   | item no.     | price   | remarks              |
  |--------|--------------|---------|----------------------|
  | date-1 | item #1      | $0.00   |                      |
  |--------|--------------|---------|----------------------|
  | date-2 | item #2      | $101.99 | A longer annotation  |
  |        |              |         | marked up as a block |
  |        |              |         | of text.             |
  |--------|--------------|---------|----------------------|
  | date-3 | long item #3 | $99.00  |                      |
   --------------------------------------------------------

It may now be shown by Lynx as

  .................................................

   date   item no.     price   remarks
   date-1 item #1      $0.00
   date-2 item #2      $101.99

   A longer annotation marked up as a block of
   text.

   date-3 long item #3 $99.00

  .................................................

As can be seen, this is still quite far from the intended rendering,
but it is better than without any tabular support.

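The 'core' handling can be sketched in the same spirit.  The following
Python fragment is an illustration under stated assumptions, not the
Lynx code: the function fixup_core_lines and the ("core", cells) and
("para", text) tagging of display lines are hypothetical.  Column widths
are determined from core lines only, and paragraph-formatted lines are
passed through untouched.

```python
def fixup_core_lines(lines):
    """Align only the 'core' lines; leave paragraph lines as they are.

    lines -- list of ("core", [cell, ...]) or ("para", text) entries,
             in display order (a made-up representation for this sketch)
    """
    core_rows = [content for kind, content in lines if kind == "core"]

    # Paragraph-formatted cells get no horizontal space allocated:
    # only core cells contribute to the column widths.
    ncols = max((len(cells) for cells in core_rows), default=0)
    widths = [0] * ncols
    for cells in core_rows:
        for i, cell in enumerate(cells):
            widths[i] = max(widths[i], len(cell))

    out = []
    for kind, content in lines:
        if kind == "core":
            padded = (c.ljust(w) for c, w in zip(content, widths))
            out.append(" ".join(padded).rstrip())
        else:
            out.append(content)  # pulled out of the row, not realigned
    return out


table = [
    ("core", ["date", "item no.", "price", "remarks"]),
    ("core", ["date-1", "item #1", "$0.00"]),
    ("core", ["date-2", "item #2", "$101.99"]),
    ("para", "A longer annotation marked up as a block of text."),
    ("core", ["date-3", "long item #3", "$99.00"]),
]
for out in fixup_core_lines(table):
    print(out)
```

The long annotation does not widen the 'remarks' column, so the core
lines stay narrow enough to be aligned.
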
Whether the code does something sensible with "grey area" tables is up
for testing.  Most of the typical tables in typical Web pages aren't
used in a way that can benefit from the TRST approach.  Parts of such
tables may still end up getting shifted left or right by the TRST code
when that doesn't improve anything, but I haven't seen it make things
really worse so far (with the current code).

TRST and Partial Display
------------------------
[ Partial display mode is the feature which allows viewing and scrolling
of pages while they are loaded, without having to wait for a complete
transfer. ]  During partial display rendering, table lines can sometimes
be shown in the original formatting, i.e. with horizontal fixups not yet
applied.  This is more likely for longer tables, and depends on the state
in which partial display 'catches' the TRST code.  Sometimes the display
may flicker: first the preliminary rendering of table lines is shown, then
after loading is finished it is replaced by the fixed-up version.  I do
not consider this a serious problem: if you have partial display mode
enabled, presumably you want to be able to see as much data as possible,
and scroll up and down through it, as early as possible.  In fact, the
approach taken keeps Lynx free from a problem that many graphical browsers
have: they often cannot render a table at all until it is received in full.

------------------------------------------------------------------------

To summarize:
- TRST is a solution that works in many cases where lack of tabular support
  was most annoying.
- TRST doesn't implement a full table model, and it is extremely unlikely
  that it will ever be the basis for that.  Keep on exploring external
  solutions, or perhaps waiting for (better: working on) a more fundamental
  redesign of Lynx's rendering engine.

Klaus Weide - kweide@enteract.com                            1999-10-13