Diffstat (limited to 'src/3rdparty/libcroco/docs/design')
-rw-r--r--  src/3rdparty/libcroco/docs/design/parser-architecture.txt | 146
-rw-r--r--  src/3rdparty/libcroco/docs/design/sel-instr.txt           |  64
2 files changed, 210 insertions, 0 deletions
diff --git a/src/3rdparty/libcroco/docs/design/parser-architecture.txt b/src/3rdparty/libcroco/docs/design/parser-architecture.txt
new file mode 100644
index 0000000..67b7713
--- /dev/null
+++ b/src/3rdparty/libcroco/docs/design/parser-architecture.txt
@@ -0,0 +1,146 @@
+Libcroco parser architecture
+-----------------------------
+
+Author: Dodji Seketeli <dodji@seketeli.org>
+
+$Id$
+
+I) Forethoughts.
+===================
+
+Libcroco's parser is a simple recursive descent parser.
+The major design focus has been simplicity, reliability and
+conformance.
+
+Simplicity
+-----------
+We want the code to be maintainable by anyone who knows the CSS spec
+and who knows how to code in C. Therefore, we avoid overusing
+C preprocessor magic and all the tricks that tend to turn C into
+a maintenance nightmare.
+
+We also try to adhere to the Gnome coding guidelines specified
+at http://developer.gnome.org/doc/guides/programming-guidelines.
+
+
+Reliability
+-----------
+No function of the libcroco library should ever crash,
+whatever arguments it is given.
+As a consequence, we tend to be paranoid about, for example, checking
+pointer values before dereferencing them.
+
+Conformance
+-----------
+We try to stick to the CSS spec. We know this is almost impossible to achieve
+given the resources we have, but we think it is a sane target to chase.
+
+II) Overall architecture
+=========================
+The parser is organized around four main classes:
+
+1/ CRInput
+2/ CRTknzr (Tokenizer or lexer)
+3/ CRParser
+4/ CROMParser
+
+II.1 The CRInput class
+-----------------------
+The CRInput class provides the abstraction of
+a UTF-8 encoded character stream.
+
+Ideally, it should abstract local data sources
+(local files and in-memory buffers)
+and remote data sources (sockets, url-identified resources) but for the
+moment, it can only abstract local data sources.
+
+Adding a new type of data source should be transparent to the
+classes that already use CRInput. After all, this is what abstraction is about :)
+
+
+II.2 The CRTknzr class
+----------------------
+The main job of the tokenizer (or lexer) is to
+provide a get_next_token() method.
+This method returns the next CSS token found in the input stream.
+(Note that the input stream here is an instance of CRInput).
+
+This lets the parser work on tokens instead of raw characters.
+
+II.3 The CRParser class
+-------------------------
+The core of the parser.
+
+The main job of this class is to provide a cr_parser_parse_stylesheet()
+method. During the parsing (the execution of cr_parser_parse_stylesheet())
+the parser sends events to notify the application when it encounters
+notable CSS constructs. This is the SAC (Simple API for CSS) API model.
+
+To achieve that task, almost every production of the CSS grammar
+has a matching parsing function (or method) in this class.
+
+For example, the following production named "ruleset" (specified in the
+CSS2 spec in appendix D.1):
+
+ruleset : selector [ ',' S* selector ]*
+          '{' S* declaration [ ';' S* declaration ]* '}' S*
+
+is "implemented" by the cr_parser_parse_ruleset() method.
+
+The same thing applies to the "selector" production:
+
+selector : simple_selector [ combinator simple_selector ]*
+
+which is implemented by the cr_parser_parse_selector() method... and so on
+and so forth.
+
+II.3.1 Structure of a parsing method.
+-------------------------------------
+A parsing method (e.g. cr_parser_parse_ruleset()) is there
+to:
+
+  * try to recognize a substring of the incoming character string
+    as something that matches a given CSS grammar production.
+
+    e.g.: the job of cr_parser_parse_ruleset() is to try
+    to recognize whether "what" comes next in the input stream
+    is a CSS2 "ruleset".
+
+  * build a basic abstract data structure to
+    store the information encountered
+    during the parsing of the current character string.
+
+    e.g.: cr_parser_parse_declaration() has the following prototype:
+
+        enum CRStatus
+        cr_parser_parse_declaration (CRParser *a_this, GString **a_property,
+                                     CRTerm **a_value) ;
+
+    In case of successful parsing, this method returns
+    (via its parameters) the property _and_ the
+    value of the CSS2 declaration.
+    Note that a CSS2 declaration is specified as follows:
+
+        declaration : property ':' S* expr prio?
+                    | /* empty */
+
+  * after completion, report whether the parsing has succeeded or not.
+
+    e.g.: cr_parser_parse_declaration() returns CR_OK if the
+    parsing has succeeded, and an error code otherwise. Obviously,
+    the out parameters "a_property" and "a_value" are valid if and only
+    if the return value is CR_OK.
+
+  * whenever the function is parsing a construct that must
+    be notified to the user as part of the SAC API spec, notify
+    the user by calling the right SAC callback.
+
+  * if the parsing failed, leave the position in the stream unchanged.
+    That is, the position in the character stream should be as if
+    the parsing function had not been called at all.
+
+
+II.4 The selection Engine.
+--------------------------
+
+Hmmh, I should kick my ass to write this down ...
\ No newline at end of file
diff --git a/src/3rdparty/libcroco/docs/design/sel-instr.txt b/src/3rdparty/libcroco/docs/design/sel-instr.txt
new file mode 100644
index 0000000..6b19389
--- /dev/null
+++ b/src/3rdparty/libcroco/docs/design/sel-instr.txt
@@ -0,0 +1,64 @@
+Draft of the libcroco selector internal instruction set.
+*********************************************************
+
+READERS SHOULD READ CHAPTER 5 OF THE CSS2 SPEC, ENTITLED
+"Selectors", FIRST.
+
+I) Introduction
+''''''''''''''''''''
+This is the instruction set understood by the libcroco
+sel-eng.c module (Selection engine).
+
+The purpose of the selection engine is basically to say whether a given
+xml node is matched by a given css2 selector or not.
+
+II) Rationale
+''''''''''''''''''''
+For the sake of performance (mostly processing speed) each CSS2
+selector is compiled into a sequence of atomic selection instructions
+that are easily executable by the selection engine.
+
+III) Selection instruction set overview
+''''''''''''''''''''''''''''''''''''''''
+
+Each selection instruction returns a boolean value (TRUE or FALSE).
+The execution of a sequence of selection instructions stops at the
+first instruction that returns a FALSE value and the selection engine
+returns the value FALSE to say that the current xml node
+is not matched by the CSS2 selection expression being evaluated.
+
+Note that during the evaluation of a CSS2 selection expression,
+all the contextual information is stored in an evaluation context.
+For example, the context will hold a pointer to the xml node the
+selection engine is trying to match.
+
+III.1) The instruction set.
+'''''''''''''''''''''''''''
+
+set-cur-node 'a_node'
+----------------------
+a_node: an xml node
+Sets the current xml node (in the context) to a_node.
+
+match-n-ancestor 'a_n' 'a_parent'
+----------------------------------
+a_parent: a string.
+a_n: a number. The depth of the ancestor.
+
+Returns true if the current xml node has an ancestor
+located at depth 'a_n' (going upward from the current node)
+and named 'a_parent'. An ancestor located at depth '0' designates
+the current xml node. An ancestor located at depth '1' designates
+the parent of the current xml node, etc.
+
+match-any
+---------
+Always returns true.
+
+match-first-child 'a_name'
+--------------------------
+Returns true if the current xml element's name equals 'a_name' and
+if the current xml element is the first child of its parent.
+
+TODO: continue reading the chapter 5 of the css2 spec and finish
+the design of this instruction set.
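
The evaluation rule in section III of sel-instr.txt (run the instructions in order, stop at the first one that returns FALSE) could be sketched roughly as follows. This is only an illustration under stated assumptions: Node, EvalCtxt, Instr, exec_instr() and run_selector() are hypothetical names that do not come from libcroco, the node type stands in for a real xml node, and the draft does not say what set-cur-node returns, so the sketch assumes TRUE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an xml node (not a libcroco type). */
typedef struct Node {
    const char *name;
    struct Node *parent;
    struct Node *first_child;
} Node;

/* Evaluation context: holds the node the engine is trying to match. */
typedef struct {
    const Node *cur_node;
} EvalCtxt;

typedef enum {
    INSTR_SET_CUR_NODE,
    INSTR_MATCH_N_ANCESTOR,
    INSTR_MATCH_ANY,
    INSTR_MATCH_FIRST_CHILD
} InstrKind;

typedef struct {
    InstrKind kind;
    const Node *node;  /* set-cur-node: a_node */
    unsigned depth;    /* match-n-ancestor: a_n */
    const char *name;  /* a_parent or a_name, depending on the instruction */
} Instr;

static bool same_name(const char *a, const char *b)
{
    return a != NULL && b != NULL && strcmp(a, b) == 0;
}

/* Executes one instruction against the context and returns its boolean. */
static bool exec_instr(EvalCtxt *ctxt, const Instr *in)
{
    assert(ctxt != NULL && in != NULL); /* internal invariant of this sketch */
    const Node *n = ctxt->cur_node;

    switch (in->kind) {
    case INSTR_SET_CUR_NODE:
        ctxt->cur_node = in->node;
        return true; /* assumption: the draft leaves this return value open */
    case INSTR_MATCH_N_ANCESTOR:
        /* Depth 0 is the current node, depth 1 its parent, and so on. */
        for (unsigned d = 0; n != NULL && d < in->depth; d++)
            n = n->parent;
        return n != NULL && same_name(n->name, in->name);
    case INSTR_MATCH_ANY:
        return true;
    case INSTR_MATCH_FIRST_CHILD:
        return n != NULL && same_name(n->name, in->name)
            && n->parent != NULL && n->parent->first_child == n;
    }
    return false;
}

/* Runs the sequence in order; the first FALSE means "not matched". */
bool run_selector(const Instr *instrs, size_t n_instrs)
{
    EvalCtxt ctxt = { NULL };

    if (instrs == NULL)
        return false;
    for (size_t i = 0; i < n_instrs; i++) {
        if (!exec_instr(&ctxt, &instrs[i]))
            return false;
    }
    return true;
}
```

Under these assumptions, a selector such as "p that is the first child of body" would compile to set-cur-node, match-first-child 'p', match-n-ancestor 1 'body', and run_selector() would return false as soon as one of those three steps fails.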