// Copyright (C) 2015-2021 Internet Systems Consortium, Inc. ("ISC")
//
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.

/**
  @page libeval libkea-eval - Expression Evaluation and Client Classification Library

  @section dhcpEvalIntroduction Introduction

  The core of the libeval library is a parser that is able to parse an
  expression (e.g. option[123].text == 'APC'). This is currently used for
  client classification, but in the future it may also be used for other
  applications.

  The external interface to the library is the @ref isc::eval::EvalContext
  class.  Once instantiated, its main method is
  @ref isc::eval::EvalContext::parseString, which parses the specified
  string.  Once the expression is parsed, it is converted to a collection of
  tokens that are stored in Reverse Polish Notation in
  EvalContext::expression.
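
  As a rough illustration of why the postfix order is convenient, the sketch
  below evaluates a token sequence with a value stack. The names SketchToken,
  evaluateSketch and usageRpnSketch are hypothetical stand-ins invented for
  this example; they are not the libkea-eval classes, and the real evaluation
  code may differ.

```cpp
#include <cassert>
#include <stack>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a token; not the libkea-eval Token class.
struct SketchToken {
    enum Kind { VALUE, EQUAL } kind;
    std::string value;  // used only when kind == VALUE
};

// Evaluates a postfix (RPN) token sequence with a value stack and
// returns the single string left on the stack.
std::string evaluateSketch(const std::vector<SketchToken>& expression) {
    std::stack<std::string> values;
    for (const SketchToken& token : expression) {
        if (token.kind == SketchToken::VALUE) {
            // Operands are pushed in the order they appear.
            values.push(token.value);
        } else {
            // The operator pops its two operands and pushes the result.
            if (values.size() < 2) {
                throw std::runtime_error("EQUAL needs two operands");
            }
            const std::string rhs = values.top();
            values.pop();
            const std::string lhs = values.top();
            values.pop();
            values.push(lhs == rhs ? "true" : "false");
        }
    }
    if (values.size() != 1) {
        throw std::runtime_error("malformed expression");
    }
    return values.top();
}

// Usage: 'APC' == 'APC' in postfix order is operand, operand, operator.
void usageRpnSketch() {
    const std::vector<SketchToken> expression = {
        {SketchToken::VALUE, "APC"},
        {SketchToken::VALUE, "APC"},
        {SketchToken::EQUAL, ""},
    };
    assert(evaluateSketch(expression) == "true");
}
```

  Each operand is pushed as it is encountered, and an operator pops its
  operands and pushes the result, so no parentheses or precedence rules are
  needed at evaluation time.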

  The @ref isc::eval::EvalContext class constructor takes two parameters:
  the universe, which selects between DHCPv4 and DHCPv6 for expressions
  that depend on the DHCP version, and a function that the parser uses
  to accept only already defined or built-in client class names in
  client class membership expressions. By default, this function
  accepts all client class names.
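
  The idea of a pluggable class name check with a permissive default can be
  sketched as follows. Everything here (UniverseSketch, ClassNameCheck,
  acceptAllSketch, ContextSketch) is a hypothetical stand-in invented for
  illustration and is not the actual EvalContext interface.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-in for the DHCP version selector.
enum class UniverseSketch { V4, V6 };

// Hypothetical callback type: returns true when a client class name may be
// used in a membership expression.
using ClassNameCheck = std::function<bool(const std::string&)>;

// Default check: accept every client class name.
bool acceptAllSketch(const std::string&) {
    return true;
}

// Hypothetical stand-in for a parser context holding both parameters.
class ContextSketch {
public:
    explicit ContextSketch(UniverseSketch universe,
                           ClassNameCheck check = acceptAllSketch)
        : universe_(universe), check_(check) {}

    UniverseSketch universe() const {
        return universe_;
    }

    bool isClientClassAccepted(const std::string& name) const {
        return check_(name);
    }

private:
    UniverseSketch universe_;
    ClassNameCheck check_;
};

// Usage: the default accepts anything, a custom check restricts names.
void usageClassCheckSketch() {
    ContextSketch permissive(UniverseSketch::V4);
    assert(permissive.isClientClassAccepted("anything"));

    const std::set<std::string> defined = {"gold", "silver"};
    ContextSketch strict(UniverseSketch::V4,
                         [defined](const std::string& name) {
                             return defined.count(name) > 0;
                         });
    assert(strict.isClientClassAccepted("gold"));
    assert(!strict.isClientClassAccepted("bronze"));
}
```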

  Internally, the parser code is generated by flex and bison. These two
  tools convert lexer.ll and parser.yy files into a number of .cc and .hh files.
  To avoid a build of Kea depending on the presence of flex and bison, the
  result of the generation is checked into the git repository and is
  distributed in the tarballs.

  @section dhcpEvalLexer Lexer generation using flex

  Flex is used to generate the lexer, a piece of code that converts input
  data into a series of tokens. The lexer is defined in lexer.ll, which
  contains a small number of directives, but the majority of the file
  consists of token definitions. These definitions are regular expressions
  that describe the various tokens, e.g. strings, numbers, parentheses, etc.
  Once an expression is matched, the associated action is executed. In the
  majority of cases a generator method from @ref isc::eval::EvalParser is
  called, which returns a newly created bison token. The purpose of the
  lexer is to generate a stream of tokens that are consumed by the parser.
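
  The pattern-then-action idea can be sketched in plain C++ with std::regex.
  This is only an illustration: KindSketch, LexemeSketch and tokenizeSketch
  are hypothetical names, the rule set is a tiny subset chosen for the
  example, and the sketch takes the first matching rule, whereas a
  flex-generated lexer prefers the longest match.

```cpp
#include <cassert>
#include <regex>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds for the sketch; not the bison-generated tokens.
enum class KindSketch {
    OPTION, TEXT, INTEGER, STRING, EQUAL, DOT, LBRACKET, RBRACKET
};

struct LexemeSketch {
    KindSketch kind;
    std::string text;
};

// Scans the input left to right. At each position the first rule whose
// regular expression matches at that position produces a token.
std::vector<LexemeSketch> tokenizeSketch(const std::string& input) {
    static const std::vector<std::pair<std::regex, KindSketch>> rules = {
        {std::regex("option"), KindSketch::OPTION},
        {std::regex("text"), KindSketch::TEXT},
        {std::regex("[0-9]+"), KindSketch::INTEGER},
        {std::regex("'[^']*'"), KindSketch::STRING},
        {std::regex("=="), KindSketch::EQUAL},
        {std::regex("\\."), KindSketch::DOT},
        {std::regex("\\["), KindSketch::LBRACKET},
        {std::regex("\\]"), KindSketch::RBRACKET},
    };
    std::vector<LexemeSketch> tokens;
    auto pos = input.cbegin();
    while (pos != input.cend()) {
        if (*pos == ' ') {  // whitespace separates tokens and is skipped
            ++pos;
            continue;
        }
        bool matched = false;
        for (const auto& rule : rules) {
            std::smatch match;
            // match_continuous anchors the match at the current position.
            if (std::regex_search(pos, input.cend(), match, rule.first,
                                  std::regex_constants::match_continuous)) {
                // The "action": emit a token for the matched text.
                tokens.push_back({rule.second, match.str()});
                pos += match.length();
                matched = true;
                break;
            }
        }
        if (!matched) {
            throw std::runtime_error("unexpected character in input");
        }
    }
    return tokens;
}

// Usage: the example expression from the introduction yields 8 tokens.
void usageLexerSketch() {
    const std::vector<LexemeSketch> tokens =
        tokenizeSketch("option[123].text == 'APC'");
    assert(tokens.size() == 8);
    assert(tokens[0].kind == KindSketch::OPTION);
    assert(tokens[2].text == "123");
    assert(tokens[7].kind == KindSketch::STRING);
}
```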

  lexer.cc and lexer.hh must not be edited. If there is a need
  to introduce changes, lexer.ll must be updated and the .cc and .hh files
  regenerated.

  @section dhcpEvalParser Parser generation using bison

  Bison is used to generate the parser, a piece of code that consumes a
  stream of tokens and attempts to match it against a defined grammar.
  The bison parser is created from parser.yy. It contains
  a number of directives, but the two most important sections are:
  a list of tokens (for each token defined here, bison will generate the
  make_NAMEOFTOKEN method in the @ref isc::eval::EvalParser class) and
  the grammar. The grammar is a tree-like structure with possible loops.

  Here is an over-simplified version of the grammar:

@code
01. %start expression;
02.
03. expression : token EQUAL token
04.            | token
05.            ;
06.
07. token : STRING
08.             {
09.                 TokenPtr str(new TokenString($1));
10.                 ctx.expression.push_back(str);
11.             }
12.       | HEXSTRING
13.             {
14.                 TokenPtr hex(new TokenHexString($1));
15.                 ctx.expression.push_back(hex);
16.             }
17.       | OPTION '[' INTEGER ']' DOT TEXT
18.             {
19.                 TokenPtr opt(new TokenOption($3, TokenOption::TEXTUAL));
20.                 ctx.expression.push_back(opt);
21.             }
22.       | OPTION '[' INTEGER ']' DOT HEX
23.             {
24.                 TokenPtr opt(new TokenOption($3, TokenOption::HEXADECIMAL));
25.                 ctx.expression.push_back(opt);
26.             }
27.       ;
@endcode

This code determines that the grammar starts from expression (line 1).
The actual definition of expression (lines 3-5) may either be a
single token or an expression "token == token" (EQUAL has been defined as
"==" elsewhere). Token is further
defined in lines 7-27: it may either be a string (lines 7-11),
a hex string (lines 12-16), option in the textual format (lines 17-21)
or option in a hexadecimal format (lines 22-26).
When the actual case is determined, the respective C++ action
is executed. For example, if the token is a string, the TokenString class is
instantiated with the appropriate value and put onto the expression vector.

@section dhcpEvalMakefile Generating parser files

 In the general case, we want to avoid generating parser files, so an
 average user interested in just compiling Kea would not need flex or
 bison. Therefore the generated files are already included in the
 git repository and will be included in the tarball releases.

 However, there will be cases when a developer wants to modify
 the lexer.ll and parser.yy files and then regenerate
 the code. For this purpose, two makefile targets are defined:
 @code
 make parser
 @endcode
 will generate the parsers and
 @code
 make parser-clean
 @endcode
 will remove the generated files. Their removal is also hooked
 into the maintainer-clean target.

@section dhcpEvalConfigure Configure options

 Since the flex/bison tools are not necessary for a regular compilation,
 the configure script checks for them, but the lack of flex or
 bison does not stop the process. There is a flag
 (--enable-generate-parser) that tells the configure script that the
 parser will be generated. With this flag, the checks for flex/bison
 are mandatory. If either tool is missing or its version is too old, the
 configure process will terminate with an error.

@section dhcpEvalToken Supported tokens

 There are a number of tokens implemented. Each token is derived from
 the isc::dhcp::Token class and represents a certain expression primitive.
 Currently supported tokens are:

 - isc::dhcp::TokenString -- represents a constant string, e.g. "MSFT".
 - isc::dhcp::TokenHexString -- represents a constant string, encoded as
   hex string, e.g. 0x666f6f which is actually "foo".
 - isc::dhcp::TokenIpAddress -- represents a constant IP address, encoded as
   a 4 or 16 byte binary string, e.g., 10.0.0.1 is 0x0a000001.
 - isc::dhcp::TokenIpAddressToText -- represents an IP address in text format.
 - isc::dhcp::TokenOption -- represents an option in a packet, e.g.
                    option[123].text.
 - isc::dhcp::TokenRelay4Option -- represents a sub-option inserted by the
                    DHCPv4 relay, e.g. relay[123].text or relay[123].hex.
 - isc::dhcp::TokenRelay6Option -- represents a sub-option inserted by
   a DHCPv6 relay.
 - isc::dhcp::TokenPkt -- represents DHCP packet metadata (incoming
   interface name, source/remote or destination/local IP address, length).
 - isc::dhcp::TokenPkt4 -- represents a DHCPv4 packet field.
 - isc::dhcp::TokenPkt6 -- represents a DHCPv6 packet field (message type
   or transaction id).
 - isc::dhcp::TokenRelay6Field -- represents a DHCPv6 relay information field.
 - isc::dhcp::TokenEqual -- represents the equal (==) operator.
 - isc::dhcp::TokenSubstring -- represents the substring(text, start, length) operator.
 - isc::dhcp::TokenConcat -- represents the concat operator which
   concatenates two other tokens.
 - isc::dhcp::TokenIfElse -- represents the ifelse(cond, iftrue, iffalse) operator.
 - isc::dhcp::TokenToHexString -- represents the hexstring operator which
   converts a binary value to its hexadecimal string representation.
 - isc::dhcp::TokenInt8ToText -- represents a signed 8-bit integer in string
   representation.
 - isc::dhcp::TokenInt16ToText -- represents a signed 16-bit integer in string
   representation.
 - isc::dhcp::TokenInt32ToText -- represents a signed 32-bit integer in string
   representation.
 - isc::dhcp::TokenUInt8ToText -- represents an unsigned 8-bit integer in string
   representation.
 - isc::dhcp::TokenUInt16ToText -- represents an unsigned 16-bit integer in string
   representation.
 - isc::dhcp::TokenUInt32ToText -- represents an unsigned 32-bit integer in string
   representation.
 - isc::dhcp::TokenNot -- the logical not operator.
 - isc::dhcp::TokenAnd -- the logical and (strict) operator.
 - isc::dhcp::TokenOr -- the logical or (strict) operator (strict means
   it always evaluates its operands).
 - isc::dhcp::TokenVendor -- represents vendor information option's existence,
   enterprise-id field and possible sub-options. (e.g. vendor[1234].exists,
   vendor[*].enterprise-id, vendor[1234].option[1].exists, vendor[1234].option[1].hex)
 - isc::dhcp::TokenVendorClass -- represents vendor class option's existence,
   enterprise-id and included data chunks. (e.g. vendor-class[1234].exists,
   vendor-class[*].enterprise-id, vendor-class[*].data[3])
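
Two of the constant encodings in the list above can be made concrete with a
small sketch: a hex string such as 0x666f6f decodes to the bytes of "foo", and
the IPv4 address 10.0.0.1 becomes the four bytes 0a 00 00 01. The helpers
decodeHexSketch and ipv4ToBytesSketch are hypothetical and written only for
this illustration; they are not the code used by the token classes.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes "0x666f6f" into the raw bytes "foo".
// Assumes a "0x" prefix followed by an even number of hex digits.
std::string decodeHexSketch(const std::string& hex) {
    if (hex.size() < 2 || hex.compare(0, 2, "0x") != 0 ||
        hex.size() % 2 != 0) {
        throw std::invalid_argument("expected 0x followed by digit pairs");
    }
    std::string bytes;
    for (std::size_t i = 2; i < hex.size(); i += 2) {
        if (!std::isxdigit(static_cast<unsigned char>(hex[i])) ||
            !std::isxdigit(static_cast<unsigned char>(hex[i + 1]))) {
            throw std::invalid_argument("non-hexadecimal digit");
        }
        // Each pair of hex digits becomes one byte.
        bytes.push_back(
            static_cast<char>(std::stoi(hex.substr(i, 2), nullptr, 16)));
    }
    return bytes;
}

// Hypothetical helper: converts dotted-quad text into a 4 byte binary
// string in network order.
std::string ipv4ToBytesSketch(const std::string& address) {
    std::istringstream in(address);
    std::string bytes;
    for (int i = 0; i < 4; ++i) {
        char dot = '.';
        int octet = -1;
        if (i > 0) {
            in >> dot;  // separator between octets
        }
        if (!in || dot != '.' || !(in >> octet) ||
            octet < 0 || octet > 255) {
            throw std::invalid_argument("malformed IPv4 address");
        }
        bytes.push_back(static_cast<char>(octet));
    }
    if (in.peek() != std::char_traits<char>::eof()) {
        throw std::invalid_argument("trailing characters after address");
    }
    return bytes;
}

// Usage: the two examples from the token list.
void usageEncodingSketch() {
    assert(decodeHexSketch("0x666f6f") == "foo");
    assert(ipv4ToBytesSketch("10.0.0.1") ==
           std::string("\x0a\x00\x00\x01", 4));
}
```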

More operators are expected to be implemented in upcoming releases.

@section dhcpEvalMTConsiderations Multi-Threading Consideration for Expression Evaluation Library

This library is not thread safe. For instance, @ref isc::dhcp::evaluateBool
or @ref isc::dhcp::evaluateString must not be called in different threads
on the same packet.

*/