Diffstat (limited to 'src/lib/eval/eval.dox')
-rw-r--r-- src/lib/eval/eval.dox 198
1 files changed, 198 insertions, 0 deletions
diff --git a/src/lib/eval/eval.dox b/src/lib/eval/eval.dox
new file mode 100644
index 0000000..e728006
--- /dev/null
+++ b/src/lib/eval/eval.dox
@@ -0,0 +1,198 @@
+// Copyright (C) 2015-2021 Internet Systems Consortium, Inc. ("ISC")
+//
+// This Source Code Form is subject to the terms of the Mozilla Public
+// License, v. 2.0. If a copy of the MPL was not distributed with this
+// file, You can obtain one at http://mozilla.org/MPL/2.0/.
+
+/**
+ @page libeval libkea-eval - Expression Evaluation and Client Classification Library
+
+ @section dhcpEvalIntroduction Introduction
+
+ The core of the libeval library is a parser for expressions such as
+ option[123].text == 'APC'. It is currently used for
+ client classification, but may also be used for other
+ applications in the future.
+
+ The external interface to the library is the @ref isc::eval::EvalContext
+ class. Once instantiated, its main method is
+ @ref isc::eval::EvalContext::parseString, which parses the specified
+ string. The parsed expression is converted to a collection of
+ tokens that are stored in Reverse Polish Notation in
+ EvalContext::expression.
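+
+ As an illustration of why Reverse Polish Notation suits evaluation, the
+ following is a minimal, self-contained sketch of evaluating a postfix token
+ list with a value stack. SketchToken, popValue and evaluateSketch are
+ hypothetical names invented for this example; they are not the libkea-eval
+ classes, and the "true"/"false" result strings are an assumption.
+

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed token; not a libkea-eval class.
struct SketchToken {
    enum Kind { STRING, EQUAL } kind;
    std::string value;  // used only when kind == STRING
};

// Pops and returns the top of the value stack.
std::string popValue(std::vector<std::string>& values) {
    assert(!values.empty());  // callers check the stack size first
    std::string top = values.back();
    values.pop_back();
    return top;
}

// Evaluates a postfix (RPN) token list and returns the final value.
// Operands are pushed; EQUAL pops two values and pushes "true" or "false".
std::string evaluateSketch(const std::vector<SketchToken>& expression) {
    std::vector<std::string> values;
    for (const SketchToken& token : expression) {
        if (token.kind == SketchToken::STRING) {
            values.push_back(token.value);
        } else {
            if (values.size() < 2) {
                throw std::runtime_error("EQUAL needs two operands");
            }
            const std::string rhs = popValue(values);
            const std::string lhs = popValue(values);
            values.push_back(lhs == rhs ? "true" : "false");
        }
    }
    if (values.size() != 1) {
        throw std::runtime_error("expression must leave exactly one value");
    }
    return values.back();
}
```

+
+ In this sketch the operands are pushed first and the operator then consumes
+ them, so no parentheses or precedence rules are needed at evaluation time.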
+
+ The @ref isc::eval::EvalContext class constructor takes two parameters:
+ the universe, which selects DHCPv4 or DHCPv6 for DHCP version
+ dependent expressions, and a function that the parser uses
+ to accept only already defined or built-in client
+ class names in client class membership expressions. By default, this
+ function accepts all client class names.
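+
+ The idea of a pluggable class-name check with an accept-all default can be
+ sketched as follows. This is only an illustration under assumed names:
+ ContextSketch, CheckDefinedSketch and acceptAllSketch are hypothetical and
+ do not reproduce the actual EvalContext declaration.
+

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical callback type: returns true if the class name is acceptable.
using CheckDefinedSketch = std::function<bool(const std::string&)>;

// Default behaviour described above: every client class name is accepted.
inline bool acceptAllSketch(const std::string&) {
    return true;
}

// Hypothetical context holding the two constructor parameters.
class ContextSketch {
public:
    enum Universe { V4, V6 };

    explicit ContextSketch(Universe universe,
                           CheckDefinedSketch check = acceptAllSketch)
        : universe_(universe), check_(std::move(check)) {
        assert(check_);  // an empty std::function would throw when called
    }

    // Would be consulted when a membership expression names a class.
    bool isClassAccepted(const std::string& name) const {
        return check_(name);
    }

    Universe universe() const { return universe_; }

private:
    Universe universe_;
    CheckDefinedSketch check_;
};
```

+
+ A caller that wants to reject unknown class names would pass its own
+ callable, for example a lambda that looks the name up in the set of
+ configured classes; a caller that omits the argument accepts every name.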
+
+ Internally, the parser code is generated by flex and bison. These two
+ tools convert lexer.ll and parser.yy files into a number of .cc and .hh files.
+ So that building Kea does not depend on the presence of flex and bison, the
+ generated files are checked into the git repository and are
+ distributed in the tarballs.
+
+ @section dhcpEvalLexer Lexer generation using flex
+
+ Flex is used to generate the lexer, a piece of code that converts input
+ data into a series of tokens. Its input file, lexer.ll, contains a small
+ number of directives, but most of it consists of token definitions. These
+ definitions are regular expressions that describe the various tokens, e.g.
+ strings, numbers, parentheses, etc. Once a regular expression is matched,
+ the associated action is executed. In most cases the action calls a
+ generator method from @ref isc::eval::EvalParser, which returns a newly
+ created bison token. The purpose of the lexer is to generate a stream
+ of tokens that are consumed by the parser.
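+
+ To make the lexer's job concrete, here is a hand-written sketch that splits
+ a tiny subset of the expression syntax into lexemes. It is not the
+ flex-generated code and does not use regular expressions; Lexeme and
+ tokenizeSketch are hypothetical names, and the kind strings are assumptions
+ chosen to resemble the token names used in the grammar.
+

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical lexeme type; the generated lexer returns bison tokens instead.
struct Lexeme {
    std::string kind;  // e.g. "OPTION", "INTEGER", "STRING", "EQUAL"
    std::string text;  // matched characters (quotes stripped for STRING)
};

// Splits the input into lexemes for a tiny subset of the expression syntax.
std::vector<Lexeme> tokenizeSketch(const std::string& input) {
    std::vector<Lexeme> out;
    std::size_t i = 0;
    while (i < input.size()) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isspace(c)) {
            ++i;  // whitespace separates lexemes but produces none
        } else if (std::isdigit(c)) {
            std::size_t j = i;
            while (j < input.size() &&
                   std::isdigit(static_cast<unsigned char>(input[j]))) {
                ++j;
            }
            assert(j > i);  // at least one digit was consumed
            out.push_back({"INTEGER", input.substr(i, j - i)});
            i = j;
        } else if (std::isalpha(c)) {
            std::size_t j = i;
            while (j < input.size() &&
                   std::isalpha(static_cast<unsigned char>(input[j]))) {
                ++j;
            }
            const std::string word = input.substr(i, j - i);
            if (word == "option") {
                out.push_back({"OPTION", word});
            } else if (word == "text") {
                out.push_back({"TEXT", word});
            } else if (word == "hex") {
                out.push_back({"HEX", word});
            } else {
                throw std::runtime_error("unknown keyword: " + word);
            }
            i = j;
        } else if (c == '\'') {
            const std::size_t end = input.find('\'', i + 1);
            if (end == std::string::npos) {
                throw std::runtime_error("unterminated string");
            }
            out.push_back({"STRING", input.substr(i + 1, end - i - 1)});
            i = end + 1;
        } else if (input.compare(i, 2, "==") == 0) {
            out.push_back({"EQUAL", "=="});
            i += 2;
        } else if (c == '[' || c == ']' || c == '.') {
            const std::string symbol(1, input[i]);
            const std::string kind = (c == '.') ? std::string("DOT") : symbol;
            out.push_back({kind, symbol});
            ++i;
        } else {
            throw std::runtime_error("unexpected character");
        }
    }
    return out;
}
```

+
+ For the input option[123].text == 'APC' this sketch yields eight lexemes:
+ OPTION, [, INTEGER, ], DOT, TEXT, EQUAL and STRING.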
+
+ lexer.cc and lexer.hh must not be edited. If there is a need
+ to introduce changes, lexer.ll must be updated and the .cc and .hh files
+ regenerated.
+
+ @section dhcpEvalParser Parser generation using bison
+
+ Bison is used to generate the parser, a piece of code that consumes a
+ stream of tokens and attempts to match it against a defined grammar.
+ The bison parser is created from parser.yy. It contains
+ a number of directives, but the two most important sections are
+ the list of tokens (for each token defined there, bison generates a
+ make_NAMEOFTOKEN method in the @ref isc::eval::EvalParser class) and
+ the grammar. The grammar is a tree-like structure with possible loops.
+
+ Here is an over-simplified version of the grammar:
+
+@code
+01. %start expression;
+02.
+03. expression : token EQUAL token
+04. | token
+05. ;
+06.
+07. token : STRING
+08. {
+09. TokenPtr str(new TokenString($1));
+10. ctx.expression.push_back(str);
+11. }
+12. | HEXSTRING
+13. {
+14. TokenPtr hex(new TokenHexString($1));
+15. ctx.expression.push_back(hex);
+16. }
+17. | OPTION '[' INTEGER ']' DOT TEXT
+18. {
+19. TokenPtr opt(new TokenOption($3, TokenOption::TEXTUAL));
+20. ctx.expression.push_back(opt);
+21. }
+22. | OPTION '[' INTEGER ']' DOT HEX
+23. {
+24. TokenPtr opt(new TokenOption($3, TokenOption::HEXADECIMAL));
+25. ctx.expression.push_back(opt);
+26. }
+27. ;
+@endcode
+
+This code specifies that the grammar starts from expression (line 1).
+The definition of expression (lines 3-5) is either a
+single token or an expression "token == token" (EQUAL has been defined as
+"==" elsewhere). Token is further
+defined in lines 7-27: it may be a string (lines 7-11),
+a hex string (lines 12-16), an option in the textual format (lines 17-21)
+or an option in the hexadecimal format (lines 22-26).
+When a rule is matched, the respective C++ action
+is executed. For example, if the token is a string, the TokenString class is
+instantiated with the appropriate value and put onto the expression vector.
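+
+The way grammar actions build up the expression vector can be imitated with a
+small hand-written recursive-descent parser. This is a different technique
+from the table-driven parser that bison generates and is shown only to make
+the order of the actions visible. ParserSketch is a hypothetical class, the
+output strings are invented labels, and emitting EQUAL after both operands is
+an assumption based on the Reverse Polish Notation storage (the simplified
+grammar above does not show that action).
+

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parser over pre-split words; it emits labels in postfix order.
class ParserSketch {
public:
    explicit ParserSketch(std::vector<std::string> words)
        : words_(std::move(words)), pos_(0) {}

    // expression : token EQUAL token | token
    std::vector<std::string> parseExpression() {
        parseToken();
        if (pos_ < words_.size() && words_[pos_] == "==") {
            ++pos_;
            parseToken();
            output_.push_back("EQUAL");  // operator follows both operands
        }
        if (pos_ != words_.size()) {
            throw std::runtime_error("trailing input");
        }
        return output_;
    }

private:
    // Returns the current word and advances, or throws at end of input.
    const std::string& next() {
        assert(pos_ <= words_.size());
        if (pos_ == words_.size()) {
            throw std::runtime_error("unexpected end of input");
        }
        return words_[pos_++];
    }

    void expect(const std::string& word) {
        if (next() != word) {
            throw std::runtime_error("expected " + word);
        }
    }

    // token : STRING | OPTION '[' INTEGER ']' DOT TEXT
    //       | OPTION '[' INTEGER ']' DOT HEX
    void parseToken() {
        const std::string first = next();
        if (first.size() >= 2 && first.front() == '\'' && first.back() == '\'') {
            output_.push_back("STRING(" + first.substr(1, first.size() - 2) + ")");
            return;
        }
        if (first != "option") {
            throw std::runtime_error("unexpected word: " + first);
        }
        expect("[");
        const std::string code = next();  // not validated as an integer here
        expect("]");
        expect(".");
        const std::string repr = next();
        if (repr == "text") {
            output_.push_back("OPTION(" + code + ",TEXTUAL)");
        } else if (repr == "hex") {
            output_.push_back("OPTION(" + code + ",HEXADECIMAL)");
        } else {
            throw std::runtime_error("expected text or hex");
        }
    }

    std::vector<std::string> words_;
    std::size_t pos_;
    std::vector<std::string> output_;
};
```

+
+Each branch of parseToken plays the role of one alternative of the token rule:
+it recognizes the words and then appends one entry to the output, just as the
+C++ actions above append to ctx.expression.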
+
+@section dhcpEvalMakefile Generating parser files
+
+ In the general case, we want to avoid generating parser files, so that an
+ average user interested in just compiling Kea does not need flex or
+ bison. Therefore the generated files are already included in the
+ git repository and in the tarball releases.
+
+ However, there are cases when a developer wants
+ to modify the lexer.ll and parser.yy files and then regenerate
+ the code. For this purpose, two makefile targets are defined:
+ @code
+ make parser
+ @endcode
+ will generate the parsers and
+ @code
+ make parser-clean
+ @endcode
+ will remove the files. Removal of the generated files is also hooked
+ into the maintainer-clean target.
+
+@section dhcpEvalConfigure Configure options
+
+ Since the flex and bison tools are not necessary for a regular compilation,
+ the configure script checks for them, but their absence does not stop
+ the process. The --enable-generate-parser flag tells the configure script
+ that the parser will be generated. With this flag, the checks for flex
+ and bison are mandatory: if either tool is missing or its version is too
+ old, the configure process terminates with an error.
+
+@section dhcpEvalToken Supported tokens
+
+ A number of tokens are implemented. Each token class is derived from the
+ isc::dhcp::Token class and represents a certain expression primitive.
+ Currently supported tokens are:
+
+ - isc::dhcp::TokenString -- represents a constant string, e.g. "MSFT".
+ - isc::dhcp::TokenHexString -- represents a constant string, encoded as
+ a hex string, e.g. 0x666f6f which is actually "foo".
+ - isc::dhcp::TokenIpAddress -- represents a constant IP address, encoded as
+ a 4 or 16 byte binary string, e.g., 10.0.0.1 is 0x0a000001.
+ - isc::dhcp::TokenIpAddressToText -- represents an IP address in text format.
+ - isc::dhcp::TokenOption -- represents an option in a packet, e.g.
+ option[123].text.
+ - isc::dhcp::TokenRelay4Option -- represents a sub-option inserted by the
+ DHCPv4 relay, e.g. relay[123].text or relay[123].hex.
+ - isc::dhcp::TokenRelay6Option -- represents a sub-option inserted by
+ a DHCPv6 relay.
+ - isc::dhcp::TokenPkt -- represents DHCP packet metadata (incoming
+ interface name, source/remote or destination/local IP address, length).
+ - isc::dhcp::TokenPkt4 -- represents a DHCPv4 packet field.
+ - isc::dhcp::TokenPkt6 -- represents a DHCPv6 packet field (message type
+ or transaction id).
+ - isc::dhcp::TokenRelay6Field -- represents a DHCPv6 relay information field.
+ - isc::dhcp::TokenEqual -- represents the equal (==) operator.
+ - isc::dhcp::TokenSubstring -- represents the substring(text, start, length) operator.
+ - isc::dhcp::TokenConcat -- represents the concat operator, which
+ concatenates two other tokens.
+ - isc::dhcp::TokenIfElse -- represents the ifelse(cond, iftrue, iffalse) operator.
+ - isc::dhcp::TokenToHexString -- represents the hexstring operator which
+ converts a binary value to its hexadecimal string representation.
+ - isc::dhcp::TokenInt8ToText -- represents the signed 8 bit integer in string
+ representation.
+ - isc::dhcp::TokenInt16ToText -- represents the signed 16 bit integer in string
+ representation.
+ - isc::dhcp::TokenInt32ToText -- represents the signed 32 bit integer in string
+ representation.
+ - isc::dhcp::TokenUInt8ToText -- represents the unsigned 8 bit integer in string
+ representation.
+ - isc::dhcp::TokenUInt16ToText -- represents the unsigned 16 bit integer in string
+ representation.
+ - isc::dhcp::TokenUInt32ToText -- represents the unsigned 32 bit integer in string
+ representation.
+ - isc::dhcp::TokenNot -- the logical not operator.
+ - isc::dhcp::TokenAnd -- the logical and (strict) operator.
+ - isc::dhcp::TokenOr -- the logical or (strict) operator (strict means
+ it always evaluates its operands).
+ - isc::dhcp::TokenVendor -- represents the vendor information option's existence,
+ enterprise-id field and possible sub-options (e.g. vendor[1234].exists,
+ vendor[*].enterprise-id, vendor[1234].option[1].exists, vendor[1234].option[1].hex).
+ - isc::dhcp::TokenVendorClass -- represents the vendor class option's existence,
+ enterprise-id and included data chunks (e.g. vendor-class[1234].exists,
+ vendor-class[*].enterprise-id, vendor-class[*].data[3]).
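+
+ The binary encodings mentioned for the hex string and IP address tokens can
+ be checked with a short sketch. A dotted-quad address maps to one byte per
+ octet, so 10.0.0.1 becomes the bytes 0a 00 00 01, and the hex digits 666f6f
+ decode to the three characters "foo". The helper names below are
+ hypothetical and are not part of libkea-eval; only IPv4 is covered.
+

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Converts a dotted-quad IPv4 address to a 4-byte binary string,
// one byte per octet, most significant octet first.
std::string ipv4ToBinarySketch(const std::string& text) {
    std::istringstream in(text);
    std::string binary;
    for (int i = 0; i < 4; ++i) {
        char dot = '.';
        int octet = -1;
        if (i > 0) {
            in >> dot;  // separator between octets
        }
        in >> octet;
        if (!in || dot != '.' || octet < 0 || octet > 255) {
            throw std::invalid_argument("bad IPv4 address: " + text);
        }
        binary.push_back(static_cast<char>(octet));
    }
    if (in.peek() != std::istringstream::traits_type::eof()) {
        throw std::invalid_argument("trailing characters: " + text);
    }
    assert(binary.size() == 4);
    return binary;
}

// Returns the value of one hexadecimal digit, or throws if it is not one.
int hexDigitValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Decodes an even-length run of hex digits (without the 0x prefix).
std::string hexToBinarySketch(const std::string& hex) {
    if (hex.size() % 2 != 0) {
        throw std::invalid_argument("odd number of hex digits");
    }
    std::string binary;
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int byte = hexDigitValue(hex[i]) * 16 + hexDigitValue(hex[i + 1]);
        binary.push_back(static_cast<char>(byte));
    }
    return binary;
}

// Renders a binary string as lowercase hex digits, two per byte.
std::string toHexSketch(const std::string& binary) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char byte : binary) {
        out.push_back(digits[byte >> 4]);
        out.push_back(digits[byte & 0x0f]);
    }
    return out;
}
```

+
+ Because both encodings end up as plain byte strings, values produced this
+ way can be compared byte by byte with the equal operator.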
+
+More operators are expected to be implemented in upcoming releases.
+
+@section dhcpEvalMTConsiderations Multi-Threading Consideration for Expression Evaluation Library
+
+This library is not thread safe. For instance, @ref isc::dhcp::evaluateBool
+or @ref isc::dhcp::evaluateString must not be called in different threads
+on the same packet.
+
+*/