Diffstat (limited to 'src/3rdparty/libcroco/docs/design')
-rw-r--r--  src/3rdparty/libcroco/docs/design/parser-architecture.txt  146
-rw-r--r--  src/3rdparty/libcroco/docs/design/sel-instr.txt  64
2 files changed, 210 insertions, 0 deletions
diff --git a/src/3rdparty/libcroco/docs/design/parser-architecture.txt b/src/3rdparty/libcroco/docs/design/parser-architecture.txt
new file mode 100644
index 0000000..67b7713
--- /dev/null
+++ b/src/3rdparty/libcroco/docs/design/parser-architecture.txt
@@ -0,0 +1,146 @@
+Libcroco parser architecture
+-----------------------------
+
+Author: Dodji Seketeli <dodji@seketeli.org>
+
+$Id$
+
+I) Forethoughts.
+===================
+
+Libcroco's parser is a simple recursive descent parser.
+The major design focus has been simplicity, reliability and
+conformance.
+
+Simplicity
+-----------
+We want the code to be maintainable by anyone who knows the CSS spec
+and who knows how to code in C. Therefore, we avoid overusing
+C preprocessor magic and all the tricks that tend to turn C into
+a maintenance nightmare.
+
+We also try to adhere to the Gnome coding guidelines specified
+at http://developer.gnome.org/doc/guides/programming-guidelines.
+
+
+Reliability
+-----------
+No function in the libcroco library should ever crash,
+whatever arguments it is given.
+As a consequence, we tend to be paranoid about, for example, checking
+pointer values before dereferencing them.
+
+Conformance
+-----------
+We try to stick to the CSS spec. We know this is almost impossible to achieve
+given the resources we have but we think it is a sane target to chase.
+
+II) Overall architecture
+=========================
+The parser is organized around several main classes:
+
+1/ CRInput
+2/ CRTknzr (Tokenizer or lexer)
+3/ CRParser
+4/ CROMParser
+
+II.1 The CRInput class
+-----------------------
+The CRInput class provides the abstraction of
+a UTF-8-encoded character stream.
+
+Ideally, it should abstract local data sources
+(local files and in-memory buffers)
+and remote data sources (sockets, url-identified resources) but for the
+moment, it can only abstract local data sources.
+
+Adding a new type of data source should be transparent to the
+classes that already use CRInput. After all, this is what abstraction is about :)
+
+
+II.2 The CRTknzr class
+----------------------
+The main job of the tokenizer (or lexer) is to
+provide a get_next_token() method.
+This method returns the next CSS token found in the input stream.
+(Note that the input stream here is an instance of CRInput).
+
+This provides an extremely useful facility to the parser.
+
+II.3 The CRParser class
+-------------------------
+The core of the parser.
+
+The main job of this class is to provide a cr_parser_parse_stylesheet()
+method. During parsing (the execution of cr_parser_parse_stylesheet()),
+the parser sends events to notify the application when it encounters
+notable CSS constructs. This is the SAC (Simple API for CSS) API model.
+
+To achieve that task, almost every production of the CSS grammar
+has a matching parsing function (or method) in this class.
+
+For example, the following production named "ruleset" (specified in the
+CSS2 spec in appendix D.1):
+
+ruleset : selector [ ',' S* selector ]*
+ '{' S* declaration [ ';' S* declaration ]* '}' S*
+
+is "implemented" by the cr_parser_parse_ruleset() method.
+
+The same thing applies to the "selector" production:
+
+selector : simple_selector [ combinator simple_selector ]*
+
+which is implemented by the cr_parser_parse_selector() method... and so on
+and so forth.
+
+II.3.1 Structure of a parsing method.
+-------------------------------------
+A parsing method (e.g. cr_parser_parse_ruleset()) is expected
+to:
+
+ * try to recognize a substring of the incoming character string
+ as something that matches a given CSS grammar production.
+
+ e.g.: the job of cr_parser_parse_ruleset() is to try
+ to recognize whether what comes next in the input stream
+ is a CSS2 "ruleset".
+
+ * build a basic abstract data structure to
+ store the information encountered
+ during the parsing of the current character string.
+
+ e.g.: cr_parser_parse_declaration() has the following prototype:
+
+ enum CRStatus
+ cr_parser_parse_declaration (CRParser *a_this, GString **a_property,
+ CRTerm **a_value) ;
+
+ In case of successful parsing, this method returns
+ (via its parameters) the property _and_ the
+ value of the CSS2 declaration.
+ Note that a CSS2 declaration is specified as follows:
+
+ declaration : property ':' S* expr prio?
+ | /* empty */
+
+ * after completion, say whether the parsing has succeeded or not.
+
+ e.g.: cr_parser_parse_declaration() returns CR_OK if the
+ parsing has succeeded, and an error code otherwise. Obviously,
+ the out parameters "a_property" and "a_value" are valid if and only
+ if the return value is CR_OK.
+
+ * whenever the function is parsing a construct that must
+ be notified to the user as part of the SAC API spec, notify
+ the user by calling the right SAC callback.
+
+ * if the parsing failed, leave the position in the stream unchanged.
+ That is, the position in the character stream should be as if
+ the parsing function had not been called at all.
+
+
+II.4 The selection Engine.
+--------------------------
+
+Hmmh, I should kick my ass to write this down ...
\ No newline at end of file
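The contract for a parsing method listed in section II.3.1 of parser-architecture.txt (check arguments, recognize a production, return results through out parameters, report a status, and restore the stream position on failure) could be sketched in C roughly as follows. This is only an illustration under stated assumptions: SketchInput, SketchStatus, sketch_read_word() and sketch_parse_declaration() are hypothetical names that do not exist in libcroco, the grammar is reduced to "property ':' S* value", and plain character buffers stand in for GString and CRTerm.

```c
#include <stddef.h>
#include <ctype.h>

/* Hypothetical stand-ins for illustration only; these are not libcroco types. */
enum SketchStatus { SKETCH_OK, SKETCH_BAD_PARAM, SKETCH_PARSE_ERROR };

typedef struct {
    const char *buf;  /* NUL-terminated input; pos must never exceed its length */
    size_t pos;       /* current read offset into buf */
} SketchInput;

#define SKETCH_MAX 64 /* size of the caller-provided output buffers */

/* Copy a run of [A-Za-z0-9-] characters (at most out_size - 1 of them)
 * into out, advance the input, and return the number of characters copied. */
static size_t
sketch_read_word (SketchInput *in, char *out, size_t out_size)
{
    size_t n = 0;

    while (n + 1 < out_size) {
        unsigned char c = (unsigned char) in->buf[in->pos];
        if (!isalnum (c) && c != '-')
            break;
        out[n++] = (char) c;
        in->pos++;
    }
    out[n] = '\0';
    return n;
}

/* Simplified "declaration : property ':' S* value" (no prio, no empty case).
 * property and value must each point to at least SKETCH_MAX bytes and are
 * meaningful only when SKETCH_OK is returned. */
enum SketchStatus
sketch_parse_declaration (SketchInput *in, char *property, char *value)
{
    size_t saved_pos;

    /* Be paranoid about pointers before dereferencing them. */
    if (in == NULL || in->buf == NULL || property == NULL || value == NULL)
        return SKETCH_BAD_PARAM;

    saved_pos = in->pos;  /* remembered so that a failure can be undone */

    if (sketch_read_word (in, property, SKETCH_MAX) == 0)
        goto error;
    if (in->buf[in->pos] != ':')
        goto error;
    in->pos++;
    while (in->buf[in->pos] == ' ')
        in->pos++;
    if (sketch_read_word (in, value, SKETCH_MAX) == 0)
        goto error;
    return SKETCH_OK;

error:
    /* Leave the stream as if this function had not been called. */
    in->pos = saved_pos;
    return SKETCH_PARSE_ERROR;
}
```

With the input "color: red;" the sketch returns SKETCH_OK, fills in "color" and "red", and leaves the position on the ';'. With "color red" it returns SKETCH_PARSE_ERROR and the position is back at 0, so a caller could go on to try another production.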
diff --git a/src/3rdparty/libcroco/docs/design/sel-instr.txt b/src/3rdparty/libcroco/docs/design/sel-instr.txt
new file mode 100644
index 0000000..6b19389
--- /dev/null
+++ b/src/3rdparty/libcroco/docs/design/sel-instr.txt
@@ -0,0 +1,64 @@
+Draft of the libcroco selector internal instruction set.
+*********************************************************
+
+READERS SHOULD FIRST READ CHAPTER 5 OF THE CSS2 SPEC, ENTITLED
+"Selectors".
+
+I) Introduction
+''''''''''''''''''''
+This is the instruction set understood by the libcroco
+sel-eng.c module (the selection engine).
+
+The purpose of the selection engine is basically to say whether a given
+XML node is matched by a given CSS2 selector.
+
+II) Rationale
+''''''''''''''''''''
+For the sake of performance (mostly processing speed), each CSS2
+selector is compiled into a sequence of atomic selection instructions
+that are easily executable by the selection engine.
+
+III) Selection instruction set overview
+''''''''''''''''''''''''''''''''''''''''
+
+Each selection instruction returns a boolean value (TRUE or FALSE).
+The execution of a sequence of selection instructions stops at the
+first instruction that returns a FALSE value, and the selection engine
+then returns the value FALSE to say that the current XML node
+is not matched by the CSS2 selection expression being evaluated.
+
+Note that during the evaluation of a CSS2 selection expression,
+all the contextual information is stored in an evaluation context.
+For example, the context will hold a pointer to the XML node the
+selection engine is trying to match.
+
+III.1) The instruction set.
+'''''''''''''''''''''''''''
+
+set-cur-node 'a_node'
+----------------------
+a_node: an xml node
+Sets the current xml node (in the context) to a_node.
+
+match-n-ancestor 'a_n' 'a_parent'
+----------------------------------
+a_parent: a string.
+a_n: a number. The depth of the ancestor
+
+Returns true if the current XML node has an ancestor
+located at depth 'a_n' (going upward from the current node)
+and named 'a_parent'. An ancestor located at depth '0' designates
+the current XML node. An ancestor located at depth '1' designates
+the parent of the current XML node, etc.
+
+match-any
+---------
+Always returns true.
+
+match-first-child 'a_name'
+--------------------------
+Returns true if the current XML element's name equals 'a_name' and
+if the current XML element is the first child of its parent.
+
+TODO: continue reading chapter 5 of the CSS2 spec and finish
+the design of this instruction set.
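
The evaluation model drafted in sel-instr.txt (a sequence of boolean instructions run against an evaluation context, stopping at the first FALSE) might look roughly like the C sketch below. Everything here is an assumption made for illustration: SketchNode, SketchInstr, SketchContext and the sketch_* functions are hypothetical and are not taken from sel-eng.c. The draft also does not say what set-cur-node returns or how an empty sequence evaluates; this sketch assumes set-cur-node succeeds for a non-NULL node and that an empty sequence matches.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal tree node; a real engine would use its XML library's node type. */
typedef struct SketchNode {
    const char *name;
    struct SketchNode *parent;
    struct SketchNode *first_child;
} SketchNode;

enum SketchOp {
    OP_SET_CUR_NODE,
    OP_MATCH_N_ANCESTOR,
    OP_MATCH_ANY,
    OP_MATCH_FIRST_CHILD
};

typedef struct {
    enum SketchOp op;
    SketchNode *node;   /* operand of set-cur-node */
    unsigned depth;     /* 'a_n' of match-n-ancestor */
    const char *name;   /* 'a_parent' or 'a_name' */
} SketchInstr;

/* Evaluation context: holds the node currently being matched. */
typedef struct {
    SketchNode *cur_node;
} SketchContext;

static int
sketch_name_equals (const SketchNode *n, const char *name)
{
    return n != NULL && n->name != NULL && name != NULL
        && strcmp (n->name, name) == 0;
}

/* Execute one instruction; returns 1 (TRUE) or 0 (FALSE). */
static int
sketch_exec (SketchContext *ctx, const SketchInstr *in)
{
    SketchNode *n = ctx->cur_node;
    unsigned d;

    switch (in->op) {
    case OP_SET_CUR_NODE:
        ctx->cur_node = in->node;
        return in->node != NULL;  /* assumed behaviour, not in the draft */
    case OP_MATCH_N_ANCESTOR:
        /* Depth 0 is the current node, depth 1 its parent, and so on. */
        for (d = 0; n != NULL && d < in->depth; d++)
            n = n->parent;
        return sketch_name_equals (n, in->name);
    case OP_MATCH_ANY:
        return 1;
    case OP_MATCH_FIRST_CHILD:
        return sketch_name_equals (n, in->name)
            && n->parent != NULL && n->parent->first_child == n;
    }
    return 0;
}

/* Run the whole sequence against node, stopping at the first FALSE. */
int
sketch_node_matches (SketchNode *node, const SketchInstr *prog, size_t len)
{
    SketchContext ctx;
    size_t i;

    if (prog == NULL)
        return 0;
    ctx.cur_node = node;
    for (i = 0; i < len; i++)
        if (!sketch_exec (&ctx, &prog[i]))
            return 0;
    return 1;
}
```

For example, a sequence made of match-n-ancestor 1 "body" followed by match-first-child "p" would accept only a "p" element that is the first child of a "body" element. How real CSS2 selectors would be compiled into such sequences is left open by the draft.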