diff options
Diffstat (limited to 'doc/userguide/devguide')
37 files changed, 2563 insertions, 0 deletions
diff --git a/doc/userguide/devguide/README.md b/doc/userguide/devguide/README.md new file mode 100644 index 0000000..c5d80eb --- /dev/null +++ b/doc/userguide/devguide/README.md @@ -0,0 +1,9 @@ +# Suricata Developer Guide + +This directory contains the Suricata Developer's Guide. It is built as part of the Suricata Userguide. + +The Sequence Diagrams seen in the Transactions documentation are generated with Mscgen. Mscgen is a small program to parse Message Sequence Charts that can be represented as text and can then converted to image. + +If you need to update the diagrams, please edit the ``.msc`` files present in the diagrams directory (extending/app-layer/diagrams). Once those have been changed, in the ``scripts`` directory (in the main Suricata dir) there's a scrip that will generate images for all files: ``generate-images.sh`` (you'll have to install Mscgen for that to work). + +More info about Mscgen can be found at: https://www.mcternan.me.uk/mscgen/ diff --git a/doc/userguide/devguide/codebase/code-style.rst b/doc/userguide/devguide/codebase/code-style.rst new file mode 100644 index 0000000..1d1d009 --- /dev/null +++ b/doc/userguide/devguide/codebase/code-style.rst @@ -0,0 +1,756 @@ +.. _Coding Style: + +============ +Coding Style +============ + +Suricata uses a fairly strict coding style. This document describes it. + +Formatting +~~~~~~~~~~ + +clang-format +^^^^^^^^^^^^ +``clang-format`` is configured to help you with formatting C code. + +.. note:: + + The ``.clang-format`` script requires clang 9 or newer. + +Format your Changes +******************* + +Before opening a pull request, please also try to ensure it is formatted +properly. We use ``clang-format`` for this, which has git integration through the +``git-clang-format`` script to only format your changes. + +On some systems, it may already be installed (or be installable via your package +manager). If so, you can simply run it. + +It is recommended to format each commit as you go. However, you can always +reformat your whole branch after the fact. + +.. note:: + + Depending on your installation, you might have to use the version-specific + ``git clang-format`` in the commands below, e.g. ``git clang-format-9``, + and possibly even provide the ``clang-format`` binary with + ``--binary clang-format-9``. + + As an alternative, you can use the provided ``scripts/clang-format.sh`` + that isolates you from the different versions. + +Formatting the most recent commit only +"""""""""""""""""""""""""""""""""""""" +The following command will format only the code changed in the most recent commit: + +.. code-block:: bash + + $ git clang-format HEAD^ + # Or with script: + $ scripts/clang-format.sh commit + +Note that this modifies the files, but doesn't commit them. If the changes are +trivial, you’ll likely want to run + +.. code-block:: bash + + $ git commit --amend -a + +in order to update the last commit with all pending changes. + +For bigger formatting changes, we do ask you to add them to separate, dedicated +commits. + +Formatting code in staging +"""""""""""""""""""""""""" +The following command will format the changes in staging, i.e. files you +``git add``-ed: + +.. code-block:: bash + + $ git clang-format + # Or with script: + $ scripts/clang-format.sh cached + +If you also want to change the unstaged changes, do: + +.. code-block:: bash + + $ git clang-format --force + # Or with script: + $ scripts/clang-format.sh cached --force + +Formatting your branch's commits +"""""""""""""""""""""""""""""""" +In case you have multiple commits on your branch already and forgot to +format them you can fix that up as well. + +The following command will format every commit in your branch off master and +rewrite history using the existing commit metadata. + +Tip: Create a new version of your branch first and run this off the new version. + +.. code-block:: bash + + # In a new version of your pull request: + $ scripts/clang-format.sh rewrite-branch + +Note that the above should only be used for rather minimal formatting changes. +As mentioned, we prefer that you add such changes to a dedicated commit for +formatting changes: + +.. code-block:: bash + + # Format all changes by commits on your branch: + $ git clang-format first_commit_on_your_branch^ + # Or with script: + $ scripts/clang-format.sh branch + +Note the usage of ``first_commit_on_your_branch^``, not ``master``, to avoid picking up +new commits on ``master`` in case you've updated master since you've branched. + +Check formatting +"""""""""""""""" +Check if your branch changes' formatting is correct with: + +.. code-block:: bash + + $ scripts/clang-format.sh check-branch + +Add the ``--diffstat`` parameter if you want to see the files needing formatting. +Add the ``--diff`` parameter if you want to see the actual diff of the formatting +change. + +Formatting a whole file +""""""""""""""""""""""" + ++--------------------------------------------------------------------+ +| **Note** | +| | +| Do not reformat whole files by default, i.e. do not use | +| ``clang-format`` proper in general. | ++--------------------------------------------------------------------+ + +If you were ever to do so, formatting changes of existing code with clang-format +shall be a different commit and must not be mixed with actual code changes. + +.. code-block:: bash + + $ clang-format -i {file} + +Disabling clang-format +********************** + +There might be times, where the clang-format's formatting might not please. +This might mostly happen with macros, arrays (single or multi-dimensional ones), +struct initialization, or where one manually formatted code. + +You can always disable clang-format. + +.. code-block:: c + + /* clang-format off */ + #define APP_LAYER_INCOMPLETE(c, n) (AppLayerResult){1, (c), (n)} + /* clang-format on */ + +Installing clang-format and git-clang-format +******************************************** +clang-format 9 or newer is required. + +On ubuntu 18.04: + +- It is sufficient to only install clang-format, e.g. + + .. code-block:: bash + + $ sudo apt-get install clang-format-9 + +- See http://apt.llvm.org for other releases in case the clang-format version + is not found in the default repos. + +On fedora: + +- Install the ``clang`` and ``git-clang-format`` packages with + + .. code-block:: bash + + $ sudo dnf install clang git-clang-format + + +Line length +^^^^^^^^^^^ + +Limit line lengths to 100 characters. + +When wrapping lines that are too long, they should be indented at least 8 +spaces from previous line. You should attempt to wrap the minimal portion of +the line to meet the 100 character limit. + +clang-format: + - ColumnLimit: 100 + - ContinuationIndentWidth: 8 + - ReflowComments: true + + +Indent +^^^^^^ + +We use 4 space indentation. + +.. code-block:: c + + int DecodeEthernet(ThreadVars *tv, DecodeThreadVars *dtv, Packet *p, + uint8_t *pkt, uint16_t len, PacketQueue *pq) + { + SCPerfCounterIncr(dtv->counter_eth, tv->sc_perf_pca); + + if (unlikely(len < ETHERNET_HEADER_LEN)) { + ENGINE_SET_INVALID_EVENT(p, ETHERNET_PKT_TOO_SMALL); + return TM_ECODE_FAILED; + } + + ... + + DecodeNetworkLayer(tv, dtv, SCNtohs(p->ethh->eth_type), p, + pkt + ETHERNET_HEADER_LEN, len - ETHERNET_HEADER_LEN); + + return TM_ECODE_OK; + } + +Use 8 space indentation when wrapping function parameters, loops and if statements. + +Use 4 space indentation when wrapping variable definitions. + +.. code-block:: c + + const SCPlugin PluginSpec = { + .name = OUTPUT_NAME, + .author = "Some Developer", + .license = "GPLv2", + .Init = TemplateInit, + }; + + +clang-format: + - AlignAfterOpenBracket: DontAlign + - Cpp11BracedListStyle: false + - IndentWidth: 4 + - TabWidth: 8 [llvm]_ + - UseTab: Never [llvm]_ + +Braces +^^^^^^ + +Functions should have the opening brace on a newline: + +.. code-block:: c + + int SomeFunction(void) + { + DoSomething(); + } + +Note: you may encounter non-compliant code. + +Control and loop statements should have the opening brace on the same line: + +.. code-block:: c + + if (unlikely(len < ETHERNET_HEADER_LEN)) { + ENGINE_SET_INVALID_EVENT(p, ETHERNET_PKT_TOO_SMALL); + return TM_ECODE_FAILED; + } + + for (ascii_code = 0; ascii_code < 256; ascii_code++) { + ctx->goto_table[ctx->state_count][ascii_code] = SC_AC_FAIL; + } + + while (funcs != NULL) { + temp = funcs; + funcs = funcs->next; + SCFree(temp); + } + +Opening and closing braces go on the same line as as the _else_ (also known as a "cuddled else"). + +.. code-block:: c + + if (this) { + DoThis(); + } else { + DoThat(); + } + +Structs, unions and enums should have the opening brace on the same line: + +.. code-block:: c + + union { + TCPVars tcpvars; + ICMPV4Vars icmpv4vars; + ICMPV6Vars icmpv6vars; + } l4vars; + + struct { + uint8_t type; + uint8_t code; + } icmp_s; + + enum { + DETECT_TAG_TYPE_SESSION, + DETECT_TAG_TYPE_HOST, + DETECT_TAG_TYPE_MAX + }; + +clang-format: + - BreakBeforeBraces: Custom [breakbeforebraces]_ + - BraceWrapping: + + - AfterClass: true + - AfterControlStatement: false + - AfterEnum: false + - AfterFunction: true + - AfterStruct: false + - AfterUnion: false + - AfterExternBlock: true + - BeforeElse: false + - IndentBraces: false + +Flow +~~~~ + +Don't use conditions and statements on the same line. E.g. + +.. code-block:: c + + if (a) b = a; // <- wrong + + if (a) + b = a; // <- right + + for (int i = 0; i < 32; ++i) f(i); // <- wrong + + for (int i = 0; i < 32; ++i) + f(i); // <- right + +Don't put short or empty functions and structs on one line. + +.. code-block:: c + + void empty_function(void) + { + } + + int short_function(void) + { + return 1; + } + +Don't use unnecessary branching. E.g.: + +.. code-block:: c + + if (error) { + goto error; + } else { + a = b; + } + + +Can be written as: + +.. code-block:: c + + if (error) { + goto error; + } + a = b; + +clang-format: + - AllowShortBlocksOnASingleLine: false [llvm]_ + - AllowShortBlocksOnASingleLine: Never [llvm]_ (breaking change in clang 10!) [clang10]_ + - AllowShortEnumsOnASingleLine: false [clang11]_ + - AllowShortFunctionsOnASingleLine: None + - AllowShortIfStatementsOnASingleLine: Never [llvm]_ + - AllowShortLoopsOnASingleLine: false [llvm]_ + - BreakBeforeBraces: Custom [breakbeforebraces]_ + - BraceWrapping: + + - SplitEmptyFunction: true + - SplitEmptyRecord: true + +Alignment +~~~~~~~~~ + +Pointers +^^^^^^^^ +Pointers shall be right aligned. + +.. code-block:: c + + void *ptr; + void f(int *a, const char *b); + void (*foo)(int *); + +clang-format: + - PointerAlignment: Right + - DerivePointerAlignment: false + +Declarations and Comments +^^^^^^^^^^^^^^^^^^^^^^^^^ +Trailing comments should be aligned for consecutive lines. + +.. code-block:: c + + struct bla { + int a; /* comment */ + unsigned bb; /* comment */ + int *ccc; /* comment */ + }; + + void alignment() + { + // multiple consecutive vars + int a = 13; /* comment */ + int32_t abc = 1312; /* comment */ + int abcdefghikl = 13; /* comment */ + } + +clang-format: + - AlignConsecutiveAssignments: false + - AlignConsecutiveDeclarations: false + - AlignTrailingComments: true + +Functions +~~~~~~~~~ + +parameter names +^^^^^^^^^^^^^^^ + +TODO + +Function names +^^^^^^^^^^^^^^ + +Function names are NamedLikeThis(). + +.. code-block:: c + + static ConfNode *ConfGetNodeOrCreate(char *name, int final) + +static vs non-static +^^^^^^^^^^^^^^^^^^^^ + +Functions should be declared static whenever possible. + +inline +^^^^^^ + +The inlining of functions should be used only in critical paths. + +Variables +~~~~~~~~~ + +Names +^^^^^ + +A variable is ``named_like_this`` in all lowercase. + +.. code-block:: c + + ConfNode *parent_node = root; + +Generally, use descriptive variable names. + +In loop vars, make sure ``i`` is a signed int type. + +Scope +^^^^^ + +TODO + +Macros +~~~~~~ + +Macro names are ALL_CAPS_WITH_UNDERSCORES. +Enclose parameters in parens on each usage inside the macro. + +Align macro values on consecutive lines. + +.. code-block:: c + + #define ACTION_ALERT 0x01 + #define ACTION_DROP 0x02 + #define ACTION_REJECT 0x04 + #define ACTION_REJECT_DST 0x08 + #define ACTION_REJECT_BOTH 0x10 + #define ACTION_PASS 0x20 + +Align escape for multi-line macros right-most at ColumnLimit. + +.. code-block:: c + + #define MULTILINE_DEF(a, b) \ + if ((a) > 2) { \ + auto temp = (b) / 2; \ + (b) += 10; \ + someFunctionCall((a), (b)); \ + } + +clang-format: + - AlignConsecutiveMacros: true [clang9]_ + - AlignEscapedNewlines: Right + +Comments +~~~~~~~~ + +TODO + +Function comments +^^^^^^^^^^^^^^^^^ + +We use Doxygen, functions are documented using Doxygen notation: + +.. code-block:: c + + /** + * \brief Helper function to get a node, creating it if it does not + * exist. + * + * This function exits on memory failure as creating configuration + * nodes is usually part of application initialization. + * + * \param name The name of the configuration node to get. + * \param final Flag to set created nodes as final or not. + * + * \retval The existing configuration node if it exists, or a newly + * created node for the provided name. On error, NULL will be returned. + */ + static ConfNode *ConfGetNodeOrCreate(char *name, int final) + +General comments +^^^^^^^^^^^^^^^^ + +We use ``/* foobar */`` style and try to avoid ``//`` style. + +File names +~~~~~~~~~~ + +File names are all lowercase and have a .c. .h or .rs (Rust) extension. + +Most files have a _subsystem_ prefix, e.g. ``detect-dsize.c, util-ip.c`` + +Some cases have a multi-layer prefix, e.g. ``util-mpm-ac.c`` + +Enums +~~~~~ + +Use a common prefix for all enum values. Value names are ALL_CAPS_WITH_UNDERSCORES. + +Put each enum values on a separate line. +Tip: Add a trailing comma to the last element to force "one-value-per-line" +formatting in clang-format. + +.. code-block:: c + + enum { VALUE_ONE, VALUE_TWO }; // <- wrong + + // right + enum { + VALUE_ONE, + VALUE_TWO, // <- force one-value-per-line + }; + +clang-format: + - AllowShortEnumsOnASingleLine: false [clang11]_ + +Structures and typedefs +~~~~~~~~~~~~~~~~~~~~~~~ + +TODO + +switch statements +~~~~~~~~~~~~~~~~~ + +Switch statements are indented like in the following example, so the 'case' is indented from the switch: + +.. code-block:: c + + switch (ntohs(p->ethh->eth_type)) { + case ETHERNET_TYPE_IP: + DecodeIPV4(tv, dtv, p, pkt + ETHERNET_HEADER_LEN, + len - ETHERNET_HEADER_LEN, pq); + break; + +Fall through cases will be commented with ``/* fall through */``. E.g.: + +.. code-block:: c + + switch (suri->run_mode) { + case RUNMODE_PCAP_DEV: + case RUNMODE_AFP_DEV: + case RUNMODE_PFRING: + /* find payload for interface and use it */ + default_packet_size = GetIfaceMaxPacketSize(suri->pcap_dev); + if (default_packet_size) + break; + /* fall through */ + default: + default_packet_size = DEFAULT_PACKET_SIZE; + + +Do not put short case labels on one line. +Put opening brace on same line as case statement. + +.. code-block:: c + + switch (a) { + case 13: { + int a = bla(); + break; + } + case 15: + blu(); + break; + default: + gugus(); + } + + +clang-format: + - IndentCaseLabels: true + - IndentCaseBlocks: false [clang11]_ + - AllowShortCaseLabelsOnASingleLine: false [llvm]_ + - BreakBeforeBraces: Custom [breakbeforebraces]_ + - BraceWrapping: + + - AfterCaseLabel: false (default) + +const +~~~~~ + +TODO + +goto +~~~~ + +Goto statements should be used with care. Generally, we use it primarily for error handling. E.g.: + +.. code-block:: c + + static DetectFileextData *DetectFileextParse (char *str) + { + DetectFileextData *fileext = NULL; + + fileext = SCMalloc(sizeof(DetectFileextData)); + if (unlikely(fileext == NULL)) + goto error; + + memset(fileext, 0x00, sizeof(DetectFileextData)); + + if (DetectContentDataParse("fileext", str, &fileext->ext, &fileext->len, &fileext->flags) == -1) { + goto error; + } + + return fileext; + + error: + if (fileext != NULL) + DetectFileextFree(fileext); + return NULL; + } + +Put goto labels at brace level. + +.. code-block:: c + + int goto_style_nested() + { + if (foo()) { + label1: + bar(); + } + + label2: + return 1; + } + +clang-format: + - IndentGotoLabels: true (default) [clang10]_ + +Includes +~~~~~~~~ + +TODO + +A .c file shall include it's own header first. + +clang-format: + - SortIncludes: false + +Unittests +~~~~~~~~~ + +When writing unittests that use a data array containing a protocol message, please put an explanatory comment that contain the readable content of the message + +So instead of: + +.. code-block:: c + + int SMTPProcessDataChunkTest02(void) + { + char mimemsg[] = {0x4D, 0x49, 0x4D, 0x45, 0x2D, 0x56, 0x65, 0x72, + +you should have something like: + +.. code-block:: c + + int SMTPParserTest14(void) + { + /* 220 mx.google.com ESMTP d15sm986283wfl.6<CR><LF> */ + static uint8_t welcome_reply[] = { 0x32, 0x32, 0x30, 0x20, + +Banned functions +~~~~~~~~~~~~~~~~ + ++------------+---------------+-----------+ +| function | replacement | reason | ++============+===============+===========+ +| strtok | strtok_r | | ++------------+---------------+-----------+ +| sprintf | snprintf | unsafe | ++------------+---------------+-----------+ +| strcat | strlcat | unsafe | ++------------+---------------+-----------+ +| strcpy | strlcpy | unsafe | ++------------+---------------+-----------+ +| strncpy | strlcat | | ++------------+---------------+-----------+ +| strncat | strlcpy | | ++------------+---------------+-----------+ +| strndup | |OS specific| ++------------+---------------+-----------+ +| strchrnul | | | ++------------+---------------+-----------+ +| rand | | | ++------------+---------------+-----------+ +| rand_r | | | ++------------+---------------+-----------+ +| index | | | ++------------+---------------+-----------+ +| rindex | | | ++------------+---------------+-----------+ +| bzero | memset | | ++------------+---------------+-----------+ + +Also, check the existing code. If yours is wildly different, it's wrong. +Example: https://github.com/oisf/suricata/blob/master/src/decode-ethernet.c + +.. rubric:: Footnotes + +.. [llvm] Default LLVM clang-format Style +.. [clang9] Requires clang 9 +.. [clang10] Requires clang 10 +.. [clang11] Requires clang 11 +.. [breakbeforebraces] BreakBeforeBraces: Mozilla is closest, but does not split empty functions/structs diff --git a/doc/userguide/devguide/codebase/contributing/code-submission-process.rst b/doc/userguide/devguide/codebase/contributing/code-submission-process.rst new file mode 100644 index 0000000..22bf160 --- /dev/null +++ b/doc/userguide/devguide/codebase/contributing/code-submission-process.rst @@ -0,0 +1,68 @@ +Code Submission Process +======================= + +.. _commits: + +Commits +~~~~~~~ + +#. Commits need to be logically separated. Don't fix unrelated things in one commit. +#. Don't add unnecessary commits, if commit 2 fixes commit 1 merge them together (squash) +#. Commits need to have proper messages, explaining anything that is non-trivial +#. Commits should not at the same time change, rename and/or move code. Use separate commits + for each of this, e.g, a commit to rename files, then a commit to change the code. +#. Documentation updates should be in their own commit (not mixed with code commits) +#. Commit messages need to be properly formatted: + * Meaningful and short (50 chars max) subject line followed by an empty line + * Naming convention: prefix message with sub-system ("rule parsing: fixing foobar"). If + you're not sure what to use, look at past commits to the file(s) in your PR. + * Description, wrapped at ~72 characters +#. Commits should be individually compilable, starting with the oldest commit. Make sure that + each commit can be built if it and the preceding commits in the PR are used. +#. Commits should be authored with the format: "FirstName LastName <name@example.com>" + +Information that needs to be part of a commit (if applicable): + +#. Ticket it fixes. E.g. "Fixes Bug #123." +#. Compiler warnings addressed. +#. Coverity Scan issues addressed. +#. Static analyzer error it fixes (cppcheck/scan-build/etc) + +When in doubt, check our git history for other messages or changes done to the +same module your're working on. This is a good example of a `commit message +<https://github.com/OISF/suricata/commit/33fca4d4db112b75ffa22eb2e6f14f038cbcc1f9>`_:: + + pcap/file: normalize file timestamps + + Normalize the timestamps that are too far in the past to epoch. + + Bug: #6240. + +.. _pull-requests-criteria: + +Pull Requests +~~~~~~~~~~~~~ + +A github pull request is actually just a pointer to a branch in your tree. GitHub provides a review interface that we use. + +#. A branch can only be used in for an individual PR. +#. A branch should not be updated after the pull request +#. A pull request always needs a good description (link to issue tracker if related to a ticket). +#. Incremental pull requests need to link to the prior iteration +#. Incremental pull requests need to describe changes since the last PR +#. Link to the ticket(s) that are addressed to it. +#. When fixing an issue, update the issue status to ``In Review`` after submitting the PR. +#. Pull requests are automatically tested using github actions (https://github.com/OISF/suricata/blob/master/.github/workflows/builds.yml). + Failing builds won't be considered and should be closed immediately. +#. Pull requests that change, or add a feature should include a documentation update commit + +Tests and QA +~~~~~~~~~~~~ + +As much as possible, new functionality should be easy to QA. + +#. Add ``suricata-verify`` tests for verification. See https://github.com/OISF/suricata-verify +#. Add unittests if a ``suricata-verify`` test isn't possible. +#. Provide pcaps that reproduce the problem. Try to trim as much as possible to the pcap includes the minimal + set of packets that demonstrate the problem. +#. Provide example rules if the code added new keywords or new options to existing keywords diff --git a/doc/userguide/devguide/codebase/contributing/contribution-process.rst b/doc/userguide/devguide/codebase/contributing/contribution-process.rst new file mode 100644 index 0000000..198ff14 --- /dev/null +++ b/doc/userguide/devguide/codebase/contributing/contribution-process.rst @@ -0,0 +1,271 @@ +************************ +Contributing to Suricata +************************ + +This guide describes what steps to take if you want to contribute a patch or +patchset to Suricata. + +Essentially, these are: + +#. Agree to and sign our :ref:`Contribution Agreement<contribution-agreement>` +#. Communicate early, and use the :ref:`preferred channels <communication-channels>` +#. :ref:`claim-ticket` +#. :ref:`Fork from master <what-branch-to-work-on>` +#. Follow our :ref:`Coding Style` +#. Use our :ref:`documentation-style` +#. Stick to our :ref:`commit guidelines<commits>` +#. Add version numbers to your :ref:`Pull Requests <send-a-pull-request>` +#. Incorporate :ref:`feedback` into new PRs +#. [Work merged] :ref:`Wrap up! <wrap-up>` + +The rest of this document will cover those in detail. + +.. _contribution-agreement: + +.. note:: Important! + + Before contributing, please review and sign our `Contribution Agreement + <https://suricata.io/contribution-agreements/>`_. + +.. _communication-channels: + +Communication is Key! +===================== + +To clarify questions, discuss or suggest new features, talk about bugs and +optimizations, and/or ask for help, it is important to communicate. + +These are our main channels: + +- `Suricata's issue tracker <https://redmine.openinfosecfoundation.org/ + projects/suricata/issues>`_ +- `Suricata's forum <https://forum.suricata.io/c/developers/8>`_ +- `Suricata's Discord server <https://discord.com/invite/t3rV2x7MrG>`_ + + +.. _claim-ticket: + +Claim (or open) a ticket +======================== + +For features and bugs we need `tickets <https://redmine.openinfosecfoundation. +org/projects/suricata/issues>`_. Tickets help us keep track of the work done, +indicate when changes need backports etc. + +They are also important if you would like to see your new feature officially +added to our tool: the ticket documents your ideas so we can analyze how do they +fit in our plans for Suricata, and, if the feature is accepted, we can properly +track progress etc. + +.. note:: If you want to add new functionalities (e.g. a new application layer + protocol), please ask us first whether we see that being merged into + Suricata or not. This helps both sides understand how the new feature will + fit in our roadmap, and prevents wasting time and motivation with + contributions that we may not accept. Therefore, `before` starting any code + related to a new feature, do request comments from the team about it. + +For really trivial fixes or cleanups we won't need that. + +Once work on the issue has been agreed upon: + +Assign the ticket to yourself. For this, you will need to have the "developer" +role. You can ask for that directly on the ticket you want to claim or mention +that you are interested in working on `ticket number` on our `Developer's +channel on Discord <https://discord.com/channels/864648830553292840/ +888087709002891324>`_. + +If a ticket is already assigned to someone, please reach out on the ticket or +ask the person first. + +You can reach out to other community members via `Suricata's Discord server +<https://discord.com/invite/t3rV2x7MrG>`_. + + +Expectations +============ + +If you submit a new feature that is not part of Suricata's core functionalities, +it will have the `community supported`_ status. This means we would expect some +commitment from you, or the organization who is sponsoring your work, before we +could approve the new feature, as the Suricata development team is pretty lean +(and many times overworked). + +This means we expect that: + + * the new contribution comes with a set of Suricata-verify tests (and + possibly unit tests, where those apply), before we can approve it; + * proof of compatibility with existing keywords/features is provided, + when the contribution is for replacing an existing feature; + * you would maintain the feature once it is approved - or some other + community member would do that, in case you cannot. + +.. note:: + + Regardless of contribution size or complexity, we expect that you respect + our guidelines and processes. We appreciate community contributors: + Suricata wouldn't be what it is without them; and the value of our tool and + community also comes from how seriously we take all this, so we ask that + our contributors do the same! + +.. _community supported: + +What does "community supported" and "supporting a feature" mean? +----------------------------------------------------------------- + +If a feature is *community supported*, the Suricata team will try to spend +minimal time on it - to be able to focus on the core functionalities. If for any +reason you're not willing or able to commit to supporting a feature, please +indicate this. + +The team and/or community members can then consider offering help. It is best +to indicate this prior to doing the actual work, because we will reject features +if no one steps up. + +It is also important to note that *community supported* features will be +disabled by default, and if it brings in new dependencies (libraries or Rust +crates) those will also be optional and disabled by default. + +**Supporting a feature** means to actually *maintain* it: + +* fixing bugs +* writing documentation +* keeping it up to date +* offering end-user support via forum and/or Discord chat + +.. _stale-tickets-policy: + +Stale tickets policy +==================== + +We understand that people's availability and interested to volunteer their time +to our project may change. Therefore, to prevent tickets going stale (not worked +on), and issues going unsolved for a long time, we have a policy to unclaim +tickets if there are no contribution updates within 6 months. + +If you claim a ticket and later on find out that you won't be able to work on +it, it is also appreciated if you inform that to us in the ticket and unclaim +it, so everyone knows that work is still open and waiting to be done. + +.. _what-branch-to-work-on: + +What branch to work on +====================== + +There are 2 or 3 active branches: + + * master-x.x.x (e.g. master-6.x.y) + * master + +The former is the stable branch. The latter the development branch. + +The stable branch should only be worked on for important bug fixes. Those are +mainly expected from more experienced contributors. + +Development of new features or large scale redesign is done in the development +branch. New development and new contributors should work with ``master`` except +in very special cases - which should and would be discussed with us first. + +If in doubt, please reach out to us via :ref:`Redmine, Discord or +forum <communication-channels>`. + +.. _create-your-own-branch: + +Create your own branch +====================== + +It's useful to create descriptive branch names. You're working on ticket 123 to +improve GeoIP? Name your branch "geoip-feature-123-v1". The "-v1" addition is +for feedback. When incorporating feedback you will have to create a new branch +for each pull request. So, when you address the first feedback, you will work in +"geoip-feature-123-v2" and so on. + +For more details check: `Creating a branch to do your changes <https://redmine. +openinfosecfoundation.org/projects/suricata/wiki/GitHub_work_flow#Creating-a- +branch-to-do-your-changes>`_ + + +Coding Style +============ + +We have a :ref:`Coding Style` that must be followed. + +.. _documentation-style: + +Documentation Style +=================== + +For documenting *code*, please follow Rust documentation and/or Doxygen +guidelines, according to what your contribution is using (Rust or C). + +If you are writing or updating *documentation pages*, please: + +* wrap up lines at 79 (80 at most) characters; +* when adding diagrams or images, we prefer alternatives that can be generated + automatically, if possible; +* bear in mind that our documentation is published on `Read the Docs <https:/ + /docs.suricata.io/en/latest/#suricata-user-guide>`_ and can also be + built to pdf, so it is important that it looks good in such formats. + + +Commit History matters +====================== + +Please consider our :ref:`Commit guidelines <commits>` before submitting your PR. + +.. _send-a-pull-request: + +Send a Pull Request +=================== + +The pull request is used to request inclusion of your patches into the main +repository. Before it is merged, it will be reviewed and pushed through a QA +process. + +Please consider our :ref:`Pull Requests Criteria <pull-requests-criteria>` when +submitting. + +We have 'GitHub-CI' integration enabled. This means some automated build check, +suricata-verity and unit tests are performed on the pull request. Generally, +this is ready after a few minutes. If the test fails, the pull request won't be +considered. So please, when you submit something, keep an eye on the checks, +and address any failures - if you do not understand what they are, it is fine to +ask about them on the failing PR itself. + +Before merge, we also perform other integration tests in our private QA-lab. If +those fail, we may request further changes, even if the GitHub-CI has passed. + +.. _feedback: + +Feedback +======== + +You'll likely get some feedback. Even our most experienced devs do, so don't +feel bad about it. + +After discussing what needs to be changed (usually on the PR itself), it's time +to go back to ":ref:`create-your-own-branch`" and do it all again. This process +can iterate quite a few times, as the contribution is refined. + +.. _wrap-up: + +Wrapping up +=========== + +Merged! Cleanup +--------------- + +Congrats! Your change has been merged into the main repository. Many thanks! + +We strongly suggest cleaning up: delete your related branches, both locally and +on GitHub - this helps you in keeping things organized when you want to make new +contributions. + +.. _update-ticket: + +Update ticket +------------- + +You can now put the URL of the *merged* pull request in the Redmine ticket. +Next, mark the ticket as "Closed" or "Resolved". + +Well done! You are all set now. diff --git a/doc/userguide/devguide/codebase/contributing/github-pr-workflow.rst b/doc/userguide/devguide/codebase/contributing/github-pr-workflow.rst new file mode 100644 index 0000000..618c966 --- /dev/null +++ b/doc/userguide/devguide/codebase/contributing/github-pr-workflow.rst @@ -0,0 +1,46 @@ +GitHub Pull Request Workflow +============================ + +Draft Pull Requests +~~~~~~~~~~~~~~~~~~~ + +A Pull Request (PR) should be marked as `draft` if it is not intended to be merged as is, +but is waiting for some sort of feedback. +The author of the PR should be explicit with what kind of feedback is expected +(CI/QA run, discussion on the code, etc...) + +GitHub filter is ``is:pr is:open draft:true sort:updated-asc`` + +A draft may be closed if it has not been updated in two months. + +Mergeable Pull Requests +~~~~~~~~~~~~~~~~~~~~~~~ + +When a Pull Request is intended to be merged as is, the workflow is the following: + 1. get reviewed, and either request changes or get approved + 2. if approved, get staged in a next branch (with other PRs), wait for CI validation + (and eventually request changes if CI finds anything) + 3. get merged and closed + +A newly created PR should match the filter +``is:pr is:open draft:false review:none sort:updated-asc no:assignee`` +The whole team is responsible to assign a PR to someone precise within 2 weeks. + +When someone gets assigned a PR, the PR should get a review status within 2 weeks: +either changes requested, approved, or assigned to someone else if more +expertise is needed. + +GitHub filter for changes-requested PRs is ``is:pr is:open draft:false sort: +updated-asc review:changes-requested`` + +Such a PR may be closed if it has not been updated in two months. +It is expected that the author creates a new PR with a new version of the patch +as described in :ref:`Pull Requests Criteria <pull-requests-criteria>`. + +Command to get approved PRs is ``gh pr list --json number,reviewDecision --search +"state:open type:pr -review:none" | jq '.[] | select(.reviewDecision=="")'`` + +Web UI filter does not work cf https://github.com/orgs/community/discussions/55826 + +Once in approved state, the PRs are in the responsibility of the merger, along +with the next branches/PRs. diff --git a/doc/userguide/devguide/codebase/contributing/index.rst b/doc/userguide/devguide/codebase/contributing/index.rst new file mode 100644 index 0000000..e0d2912 --- /dev/null +++ b/doc/userguide/devguide/codebase/contributing/index.rst @@ -0,0 +1,9 @@ +Contributing +============ + +.. toctree:: + :maxdepth: 2 + + contribution-process + code-submission-process + github-pr-workflow diff --git a/doc/userguide/devguide/codebase/fuzz-testing.rst b/doc/userguide/devguide/codebase/fuzz-testing.rst new file mode 100644 index 0000000..cd8b6ed --- /dev/null +++ b/doc/userguide/devguide/codebase/fuzz-testing.rst @@ -0,0 +1,31 @@ +Fuzz Testing +============ + +To enable fuzz targets compilation, add `--enable-fuzztargets` to configure. + +.. note:: This changes various parts of Suricata making the `suricata` binary + unsafe for production use. + +The targets can be used with libFuzzer, AFL and other fuzz platforms. + + +Running the Fuzzers +------------------- + +TODO. For now see src/tests/fuzz/README + +Reproducing issues +^^^^^^^^^^^^^^^^^^ + + +Extending Coverage +------------------ + +Adding Fuzz Targets +------------------- + + +Oss-Fuzz +-------- + +Suricata is continuously fuzz tested in Oss-Fuzz. See https://github.com/google/oss-fuzz/tree/master/projects/suricata diff --git a/doc/userguide/devguide/codebase/img/InputCaptureExample.png b/doc/userguide/devguide/codebase/img/InputCaptureExample.png Binary files differnew file mode 100644 index 0000000..7fc3519 --- /dev/null +++ b/doc/userguide/devguide/codebase/img/InputCaptureExample.png diff --git a/doc/userguide/devguide/codebase/index.rst b/doc/userguide/devguide/codebase/index.rst new file mode 100644 index 0000000..f6cb955 --- /dev/null +++ b/doc/userguide/devguide/codebase/index.rst @@ -0,0 +1,13 @@ +Working with the Codebase +========================= + +.. toctree:: + :maxdepth: 2 + + contributing/index.rst + installation-from-git + code-style + fuzz-testing + testing + unittests-c + unittests-rust diff --git a/doc/userguide/devguide/codebase/installation-from-git.rst b/doc/userguide/devguide/codebase/installation-from-git.rst new file mode 100644 index 0000000..9d7a45a --- /dev/null +++ b/doc/userguide/devguide/codebase/installation-from-git.rst @@ -0,0 +1,154 @@ +.. _Installation from GIT: + +Installation from GIT +===================== + +Ubuntu Installation from GIT +---------------------------- + +This document will explain how to install and use the most recent code of +Suricata on Ubuntu. Installing from GIT on other operating systems is +basically the same, except that some commands are Ubuntu-specific +(like sudo and apt-get). In case you are using another operating system, +you should replace those commands with your OS-specific commands. + +.. note:: + + These instructions were tested on Ubuntu 22.04. + +Pre-installation requirements +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Before you can build Suricata for your system, run the following command +to ensure that you have everything you need for the installation. + +.. code-block:: bash + + sudo apt-get -y install libpcre2-dev build-essential autoconf \ + automake libtool libpcap-dev libnet1-dev libyaml-0-2 libyaml-dev \ + pkg-config zlib1g zlib1g-dev libcap-ng-dev libcap-ng0 make \ + libmagic-dev libjansson-dev rustc cargo jq git-core + +Add ``${HOME}/.cargo/bin`` to your path: + +.. code-block:: bash + + export PATH=$PATH:${HOME}/.cargo/bin + cargo install --force cbindgen + +Depending on the current status of your system, it may take a while to +complete this process. + +**IPS** + +By default, Suricata works as an IDS. If you want to use it as an IDS and IPS +program, enter: + +.. code-block:: bash + + sudo apt-get -y install libnetfilter-queue-dev libnetfilter-queue1 \ + libnfnetlink-dev libnfnetlink0 + +Suricata +~~~~~~~~ + +First, it is convenient to create a directory for Suricata. +Name it 'suricata' or 'oisf', for example. Open the terminal and enter: + +.. code-block:: bash + + mkdir suricata # mkdir oisf + +Followed by: + +.. code-block:: bash + + cd suricata # cd oisf + +Next, enter the following line in the terminal: + +.. code-block:: bash + + git clone https://github.com/OISF/suricata.git + cd suricata + +Libhtp and suricata-update are not bundled. Get them by doing: + +.. code-block:: bash + + ./scripts/bundle.sh + +Followed by: + +.. code-block:: bash + + ./autogen.sh + +To configure, please enter: + +.. code-block:: bash + + ./configure + +To compile, please enter: + +.. code-block:: bash + + make + +To install Suricata, enter: + +.. code-block:: bash + + sudo make install + sudo ldconfig + +Auto-setup +~~~~~~~~~~ + +You can also use the available auto-setup features of Suricata. Ex: + +.. code-block:: bash + + ./configure && make && sudo make install-conf + +*make install-conf* +would do the regular "make install" and then it would automatically +create/setup all the necessary directories and ``suricata.yaml`` for you. + +.. code-block:: bash + + ./configure && make && make install-rules + +*make install-rules* +would do the regular "make install" and then it would automatically download +and set-up the latest ruleset from Emerging Threats available for Suricata. + +.. code-block:: bash + + ./configure && make && make install-full + +*make install-full* +would combine everything mentioned above (install-conf and install-rules) - +and will present you with a ready to run (configured and set-up) Suricata. + +Post installation +~~~~~~~~~~~~~~~~~ + +Please continue with :ref:`Basic setup`. + +In case you have already created your Suricata directory and cloned the +repository in it, if you want to update your local repository with the +most recent code, please run: + +.. code-block:: bash + + cd suricata/suricata + +next, enter: + +.. code-block:: bash + + git pull + +After that, you should run *./autogen.sh* again. diff --git a/doc/userguide/devguide/codebase/testing.rst b/doc/userguide/devguide/codebase/testing.rst new file mode 100644 index 0000000..c712e90 --- /dev/null +++ b/doc/userguide/devguide/codebase/testing.rst @@ -0,0 +1,158 @@ +**************** +Testing Suricata +**************** + +.. contents:: Table of Contents + +General Concepts +================ + +There are a few ways of testing Suricata: + +- **Unit tests**: for independently checking specific functions or portions of code. This guide has specific sections to + further explain those, for C and Rust; +- `Suricata-Verify <https://github.com/OISF/suricata-verify>`_: those are used to check more complex behavior, like the log output or the alert counts for a given input, where that input is usually comprised of several packets; +- **Static and dynamic analysis tools**: to help in finding bugs, memory leaks and other issues (like `scan-build <https://clang-analyzer.llvm.org/scan-build.html#scanbuild_basicusage>`_, from `clang`, which is also used for our C formatting checks; or ASAN, which checks for memory issues); +- **Fuzz testing**: especially good for uncovering existing, often non-trivial bugs. For more on how to fuzz test Suricata, check :doc:`fuzz-testing`; +- **CI checks**: each PR submitted to the project's public repositories will be run against a suit of Continuous Integration + workflows, as part of our QA process. Those cover: formatting and commit checks; fuzz tests (CI Fuzz), and several builds. See our `github workflows <https://github.com/OISF/suricata/tree/master/.github/workflows>`_ for details and those in + action at `<https://github.com/OISF/suricata/actions>`_. + + .. note:: If you can run unit tests or other checks and report failures in our `issue tracker <https://redmine.openinfosecfoundation.org/projects/suricata/issues>`_, that is rather useful and appreciated! + +The focus of this document are Unit tests and Suricata-Verify tests, especially on offering some guidance regarding when to use each type of test, and how to prepare input +for them. + +Unit tests +========== + +Use these to check that specific functions behave as expected, in success and in failure scenarios. Specially useful +during development, for nom parsers in the Rust codebase, for instance, or for checking that messages +or message parts of a protocol/stream are processed as they should. + +To execute all unit tests (both from C and Rust code), as well as ``libhtp`` ones, from the Suricata main directory, run:: + + make check + +Check the Suricata Devguide on :doc:`unittests-c` or :doc:`unittests-rust` for more on how to write and run unit tests, +given that the way to do so differs, depending on the language. + +Code Examples +^^^^^^^^^^^^^ + +An example from the `DNS parser <https://github.com/OISF/suricata/blob/master/rust/src/dns/parser.rs#L417>`_. This +checks that the given raw input (note the comments indicating what it means), once processed by ``dns_parse_name`` yields +the expected result, including the unparsed portion. + +.. code-block:: rust + + /// Parse a simple name with no pointers. + #[test] + fn test_dns_parse_name() { + let buf: &[u8] = &[ + 0x09, 0x63, /* .......c */ + 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x2d, 0x63, 0x66, /* lient-cf */ + 0x07, 0x64, 0x72, 0x6f, 0x70, 0x62, 0x6f, 0x78, /* .dropbox */ + 0x03, 0x63, 0x6f, 0x6d, 0x00, 0x00, 0x01, 0x00, /* .com.... */ + ]; + let expected_remainder: &[u8] = &[0x00, 0x01, 0x00]; + let (remainder,name) = dns_parse_name(buf, buf).unwrap(); + assert_eq!("client-cf.dropbox.com".as_bytes(), &name[..]); + assert_eq!(remainder, expected_remainder); + } + +From the C side, ``decode-ethernet.c`` offers an good example: + +.. code-block:: c + + /** + * Test a DCE ethernet frame that is too small. + */ + static int DecodeEthernetTestDceTooSmall(void) + { + uint8_t raw_eth[] = { + 0x00, 0x10, 0x94, 0x55, 0x00, 0x01, 0x00, 0x10, + 0x94, 0x56, 0x00, 0x01, 0x89, 0x03, + }; + + Packet *p = PacketGetFromAlloc(); + FAIL_IF_NULL(p); + ThreadVars tv; + DecodeThreadVars dtv; + + memset(&dtv, 0, sizeof(DecodeThreadVars)); + memset(&tv, 0, sizeof(ThreadVars)); + + DecodeEthernet(&tv, &dtv, p, raw_eth, sizeof(raw_eth)); + + FAIL_IF_NOT(ENGINE_ISSET_EVENT(p, DCE_PKT_TOO_SMALL)); + + PacketFree(p); + PASS; + } + +Suricata-Verify +=============== + +As mentioned above, these tests are used to check more complex behavior that involve a complete flow, with exchange of requests and responses. This can be done in an easier and more straightforward way, +since one doesn't have to simulate the network traffic and Suricata engine mechanics - one simply runs it, with the desired input packet capture, +configuration and checks. + +A Suricata-verify test can help to ensure that code refactoring doesn't affect protocol logs, or signature detection, +for instance, as this could have a major impact to Suricata users and integrators. + +For simpler tests, providing the pcap input is enough. But it is also possible to provide Suricata rules to be +inspected, and have Suricata Verify match for alerts and specific events. + +Refer to the `Suricata Verify readme <https://github.com/OISF/suricata-verify#readme>`_ for details on how to create +this type of test. It suffices to have a packet capture representative of the behavior one wants to test, and then +follow the steps described there. + +The Git repository for the Suricata Verify tests is a great source for examples, like the `app-layer-template <https://github.com/OISF/suricata-verify/tree/master/tests/app-layer-template>`_ one. + +Generating Input +================ + +Using real traffic +^^^^^^^^^^^^^^^^^^ + +Having a packet capture for the desired protocol you want to test, open it in `Wireshark <https://www.wireshark.org/>`_, and select the specific +packet chosen for the test input, then use the Wireshark option ``Follow [TCP/UDP/HTTP/HTTP2/QUIC] Stream``. This allows for inspecting the whole network traffic stream in a different window. +There, it's possible to choose to ``Show and save data as`` ``C Arrays``, as well as to select if one wants to see the whole conversation or just **client** or **server** packets. +It is also possible to reach the same effect by accessing the **Analyze->Follow->TCP Stream** top menu in Wireshark. +(There are other stream options, the available one will depend on the type of network traffic captured). + +This option will show the packet data as hexadecimal compatible with C-array style, and easily adapted for Rust, +as well. As shown in the image: + +.. image:: img/InputCaptureExample.png + +Wireshark can be also used to `capture sample network traffic <https://gitlab.com/wireshark/wireshark/-/wikis/CaptureSetup>`_ and generate pcap files. + +Crafting input samples with Scapy +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +It is also possible to use Scapy to create specific traffic: `Scapy usage +<https://scapy.readthedocs.io/en/latest/usage.html>`_ + +Suricata-verify tests have several examples of pcaps generated in such a way. Look for Python scripts like the one used +for the `dce-udp-scapy +<https://github.com/OISF/suricata-verify/blob/master/tests/dcerpc/dcerpc-udp-scapy/dcerpc_udp_scapy.py>`_. + +Other examples from our Suricata-Verify tests: +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Going through Suricata-Verify tests `readme` files it is also possible to find an assorted collection of pcap generation possibilities, some with explanation on the how-tos. To list a few: + +- `http2-range <https://github.com/OISF/suricata-verify/blob/master/tests/http2-range/README.md>`_ +- `http-range <https://github.com/OISF/suricata-verify/blob/master/tests/http-range/README.md>`_ +- `smb2-delete <https://github.com/OISF/suricata-verify/blob/master/tests/smb2-delete/README.md>`_ +- `smtp-rset <https://github.com/OISF/suricata-verify/blob/master/tests/smtp-rset/README.md>`_ +- `http-auth-unrecognized <https://github.com/OISF/suricata-verify/blob/master/tests/http-auth-unrecognized/README.md>`_ + +Finding Capture Samples +^^^^^^^^^^^^^^^^^^^^^^^ + +If you can't capture traffic for the desired protocol from live traffic, or craft something up, you can try finding the type of traffic you +are interested in in public data sets. There's a thread for `Sharing good sources of sample captures +<https://forum.suricata.io/t/sharing-good-sources-of-sample-captures/1766/4>`_ in our forum. diff --git a/doc/userguide/devguide/codebase/unittests-c.rst b/doc/userguide/devguide/codebase/unittests-c.rst new file mode 100644 index 0000000..0ae7bdf --- /dev/null +++ b/doc/userguide/devguide/codebase/unittests-c.rst @@ -0,0 +1,148 @@ +************** +Unit Tests - C +************** + +Unit tests are a great way to create tests that can check the internal state +of parsers, structures and other objects. + +Tests should: + +- use ``FAIL``/``PASS`` macros +- be deterministic +- not leak memory on ``PASS`` +- not use conditions + +Unit tests are used by developers of Suricata and advanced users who would like to contribute by debugging and testing the engine. +Unit tests are small pieces (units) of code which check certain code functionalities in Suricata. If Suricata's code is modified, developers can run unit tests to see if there are any unforeseen effects on other parts of the engine's code. +Unit tests will not be compiled with Suricata by default. +If you would like to compile Suricata with unit tests, enter the following during the configure-stage:: + + ./configure --enable-unittests + +The unit tests specific command line options can be found at `Command Line Options <https://docs.suricata.io/en/suricata-6.0.3/command-line-options.html#unit-tests>`_. + +Example: +You can run tests specifically on flowbits. This is how you should do that:: + + suricata -u -U flowbit + +It is highly appreciated if you would run unit tests and report failing tests in our `issue tracker +<https://redmine.openinfosecfoundation.org/projects/suricata/issues>`_. + +If you want more info about the unittests, regular debug mode can help. This is enabled by adding the configure option:: + + --enable-debug + +Then, set the debug level from the command-line:: + + SC_LOG_LEVEL=Debug suricata -u + +This will be very verbose. You can also add the ``SC_LOG_OP_FILTER`` to limit the output, it is grep-like:: + + SC_LOG_LEVEL=Debug SC_LOG_OP_FILTER="(something|somethingelse)" suricata -u + +This example will show all lines (debug, info, and all other levels) that contain either something or something else. +Keep in mind the `log level <https://docs.suricata.io/en/latest/manpages/suricata.html#id1>`_ precedence: if you choose *Info* level, for instance, Suricata won't show messages from the other levels. + +Writing Unit Tests - C codebase +=============================== + +Suricata unit tests are somewhat different in C and in Rust. In C, they are comprised of a function with no arguments and returning 0 for failure or 1 for success. Instead of explicitly returning a value, FAIL_* and PASS macros should be used. For example: + +.. code-block:: c + + void MyUnitTest(void) + { + int n = 1; + void *p = NULL; + + FAIL_IF(n != 1); + FAIL_IF_NOT(n == 1); + FAIL_IF_NOT_NULL(p); + FAIL_IF_NULL(p); + + PASS; + } + +Each unit test needs to be registered with ``UtRegisterTest()``. Example:: + + UtRegisterTest("MyUnitTest", MyUnitTest); + +where the first argument is the name of the test, and the second argument is the function. Existing modules should already have a function that registers its unit tests. Otherwise the unit tests will need to be registered. Look for a module similar to your new module to see how best to register the unit tests or ask the development team for help. + +Examples +-------- + +From ``conf-yaml-loader.c``: + +.. code-block:: c + + /** + * Test that a configuration section is overridden but subsequent + * occurrences. + */ + static int + ConfYamlOverrideTest(void) + { + char config[] = + "%YAML 1.1\n" + "---\n" + "some-log-dir: /var/log\n" + "some-log-dir: /tmp\n" + "\n" + "parent:\n" + " child0:\n" + " key: value\n" + "parent:\n" + " child1:\n" + " key: value\n" + ; + const char *value; + + ConfCreateContextBackup(); + ConfInit(); + + FAIL_IF(ConfYamlLoadString(config, strlen(config)) != 0); + FAIL_IF_NOT(ConfGet("some-log-dir", &value)); + FAIL_IF(strcmp(value, "/tmp") != 0); + + /* Test that parent.child0 does not exist, but child1 does. */ + FAIL_IF_NOT_NULL(ConfGetNode("parent.child0")); + FAIL_IF_NOT(ConfGet("parent.child1.key", &value)); + FAIL_IF(strcmp(value, "value") != 0); + + ConfDeInit(); + ConfRestoreContextBackup(); + + PASS; + } + +In ``detect-ike-chosen-sa.c``, it is possible to see the freeing of resources (``DetectIkeChosenSaFree``) and the +function that should group all the ``UtRegisterTest`` calls: + +.. code-block:: c + + #ifdef UNITTESTS + . + . + . + static int IKEChosenSaParserTest(void) + { + DetectIkeChosenSaData *de = NULL; + de = DetectIkeChosenSaParse("alg_hash=2"); + + FAIL_IF_NULL(de); + FAIL_IF(de->sa_value != 2); + FAIL_IF(strcmp(de->sa_type, "alg_hash") != 0); + + DetectIkeChosenSaFree(NULL, de); + PASS; + } + + #endif /* UNITTESTS */ + + void IKEChosenSaRegisterTests(void) + { + #ifdef UNITTESTS + UtRegisterTest("IKEChosenSaParserTest", IKEChosenSaParserTest); + #endif /* UNITTESTS */ diff --git a/doc/userguide/devguide/codebase/unittests-rust.rst b/doc/userguide/devguide/codebase/unittests-rust.rst new file mode 100644 index 0000000..c11d6e4 --- /dev/null +++ b/doc/userguide/devguide/codebase/unittests-rust.rst @@ -0,0 +1,91 @@ +***************** +Unit tests - Rust +***************** + +Rust tests with Cargo check +=========================== + +Rust offers a built-in tool for running unit and integration tests. To do so, one makes usage of: + +.. code-block:: rust + + cargo test [options][testname][-- test-options] + +`The Cargo Book <https://doc.rust-lang.org/cargo/commands/cargo-test.html>`_ explains all options in more detail. + +For testing a specific Rust module from Suricata, it suffices to go to the ``rust`` directory and run the above command, +specifying the desired module (like ``http2``). + +.. code-block:: rust + + cargo test http2 + +The line above will make *rustc* compile the Rust side of Suricata and run unit tests in the ``http2`` rust module. + +For running all Suricata unit tests from our Rust codebase, just run ``cargo test``. + +Adding unit tests +================= + + .. note:: If you want to understand *when* to use a unit test, please read the devguide section on :doc:`testing`. + +In general, it is preferable to have the unit tests in the same file that they test. At the end of the file, after all other functions. Add a ``tests`` module, if there isn't one yet, and add the ``#[test]`` attribute before the unit test +function. It is also necessary to import (``use``) the module to test, as well as any other modules used. As seen in the example below: + +Example +------- + +From ``nfs > rpc_records.rs``: + +.. code-block:: rust + + mod tests { + use crate::nfs::rpc_records::*; + use nom::Err::Incomplete; + use nom::Needed::Size; + + #[test] + fn test_partial_input_ok() { + let buf: &[u8] = &[ + 0x80, 0x00, 0x00, 0x9c, // flags + 0x8e, 0x28, 0x02, 0x7e, // xid + 0x00, 0x00, 0x00, 0x01, // msgtype + 0x00, 0x00, 0x00, 0x02, // rpcver + 0x00, 0x00, 0x00, 0x03, // program + 0x00, 0x00, 0x00, 0x04, // progver + 0x00, 0x00, 0x00, 0x05, // procedure + ]; + let expected = RpcRequestPacketPartial { + hdr: RpcPacketHeader { + frag_is_last: true, + frag_len: 156, + xid: 2384986750, + msgtype: 1 + }, + rpcver: 2, + program: 3, + progver: 4, + procedure: 5 + }; + let r = parse_rpc_request_partial(buf); + match r { + Ok((rem, hdr)) => { + assert_eq!(rem.len(), 0); + assert_eq!(hdr, expected); + }, + _ => { panic!("failed {:?}",r); } + } + } + } + +Once that is done, Rust should recognize the new test. If you want to check a single test, run:: + + cargo test module::file_name::tests::test_name + +Where ``tests`` refers to ``mod tests``. If you know the test name is unique, you can even run:: + + cargo test test_name + +Following the same idea, it is also possible to test specific modules or submodules. For instance:: + + cargo test nfs::rpc_records diff --git a/doc/userguide/devguide/extending/app-layer/app-layer-frames.rst b/doc/userguide/devguide/extending/app-layer/app-layer-frames.rst new file mode 100644 index 0000000..a810f55 --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/app-layer-frames.rst @@ -0,0 +1,223 @@ +******************************* +Application Layer Frame Support +******************************* + +.. contents:: Table of Contents + +Baseline +======== + +- `Suricata rules format <https://docs.suricata.io/en/latest/rules/intro.html>`_ + +General Concepts +================ + +Frame support was introduced with Suricata 7.0. Up until 6.0.x, Suricata's architecture and state of parsers meant that the network traffic available to the detection engine was just a stream of data, without detail about higher level parsers. + +.. note:: For Suricata, *Frame* is a generic term that can represent any unit of network data we are interested in, which could be comprised of one or several records of other, lower level protocol(s). Frames work as "stream annotations", allowing Suricata to tell the detection engine what type of record exists at a specific offset in the stream. + +The normal pipeline of detection in Suricata implied that: + +- Certain rules could be quite costly performance-wise. This happened because the same stream could be inspected several times for different rules, since for certain signatures the detection is done when Suricata is still inspecting a lower level stream, not the application layer protocol (e.g., *TCP* traffic, in place of *SMB* one); +- Rules could be difficult and tedious to write (and read), requiring that writers went in byte-detail to express matching on specific payload patterns. + +What the Frame support offers is the ability to "point" to a specific portion of stream and identify what type of traffic Suricata is looking at. Then, as the engine reassembles the stream, one can have "read access" to that portion of the stream, aggregating concepts like what type of application layer protocol that is, and differentiating between ``header``, ``data`` or even protocol versions (*SMB1*, *SMB2*...). + +The goal of the stream *Frame* is to expose application layer protocol `PDUs <https://en.wikipedia.org/wiki/Protocol_data_unit>`_ and other such arbitrary elements to the detection engine directly, instead of relying on Transactions. The main purpose is to bring *TCP data* processing times down by specialising/ filtering down traffic detection. + +Adding Frame Support to a Parser +================================ + +The application layer parser exposes frames it supports to the detect engine, by tagging them as they're parsed. The rest works automatically. + +In order to allow the engine to identify frames for records of a given application layer parser, thought must be given as to which frames make sense for the specific protocol you are handling. Some parsers may have clear ``header`` and ``data`` fields that form its *protocol data unit* (pdu). For others, the distinction might be between ``request`` and ``response``, only. Whereas for others it may make sense to have specific types of data. This is better understood by seeing the different types of frame keywords, which vary on a per-protocol basis. + +It is also important to keep follow naming conventions when defining Frame Types. While a protocol may have strong naming standards for certain structures, do compare those with what Suricata already has registered: + +- ``hdr``: used for the record header portion +- ``data``: is used for the record data portion +- ``pdu``: unless documented otherwise, means the whole record, comprising ``hdr`` and ``data`` +- ``request``: a message from a client to a server +- ``response``: a message from a server to a client + +Basic steps +~~~~~~~~~~~ + +Once the frame types that make sense for a given protocol are defined, the basic steps for adding them are: + +- create an enum with the frame types; +- identify the parsing function(s) where application layer records are parsed; +- identify the correct moment to register the frames; +- use the Frame API calls directly or build upon them and use your functions to register the frames; +- register the relevant frame callbacks when registering the parser. + +Once these are done, you can enable frame eve-output to confirm that your frames are being properly registered. It is important to notice that some hard coded limits could influence what you see on the logs (max size of log output; type of logging for the payload, cf. https://redmine.openinfosecfoundation.org/issues/4988). + +If all the steps are successfully followed, you should be able to write a rule using the *frame* keyword and the frame types you have registered with the application layer parser. + +Using the *SMB* parser as example, before frame support, a rule would look like:: + + alert tcp ... flow:to_server; content:"|ff|SMB"; content:"some smb 1 issue"; + +With frame support, one is able to do:: + + alert smb ... flow:to_server; frame:smb1.data; content:"some smb 1 issue"; + +Implementation Examples & API Callbacks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Though the steps are the same, there are a few differences when implementing frame support in Rust or in C. The following sections elaborate on that, as well as on the process itself. (Note that the code snippets have omitted portions of code that weren't so relevant to this document). + +Rust +---- + +This section shows how Frame support is added in Rust, using examples from the `SIP parser <https://github.com/OISF/suricata/blob/master/rust/src/sip/sip.rs>`_, and the `telnet parser <https://github.com/OISF/suricata/blob/master/rust/src/telnet/telnet.rs>`_. + +**Define the frame types**. The frame types are defined as an enum. In Rust, make sure to derive from the ``AppLayerFrameType``: + +.. literalinclude:: ../../../../../rust/src/sip/sip.rs + :caption: rust/src/sip/sip.rs + :language: rust + :start-after: // app-layer-frame-documentation tag start: FrameType enum + :end-before: // app-layer-frame-documentation tag end: FrameType enum + +**Frame registering**. Some understanding of the parser will be needed in order to find where the frames should be registered. It makes sense that it will happen when the input stream is being parsed into records. See when some pdu and request frames are created for SIP: + +.. literalinclude:: ../../../../../rust/src/sip/sip.rs + :caption: rust/src/sip/sip.rs + :language: rust + :start-after: // app-layer-frame-documentation tag start: parse_request + :end-before: // app-layer-frame-documentation tag end: parse_request + :dedent: 4 + +.. note:: when to create PDU frames + + The standard approach we follow for frame registration is that a frame ``pdu`` will always be created, regardless of parser status (in practice, before the parser is called). The other frames are then created when and if only the parser succeeds. + +**Use the Frame API or build upon them as needed**. These are the frame registration functions highlighted above: + +.. literalinclude:: ../../../../../rust/src/sip/sip.rs + :caption: rust/src/sip/sip.rs + :language: rust + :start-after: // app-layer-frame-documentation tag start: function to add frames + :end-before: // app-layer-frame-documentation tag end: function to add frames + +**Register relevant frame callbacks.** As these are inferred from the ``#[derive(AppLayerFrameType)]`` statement, all that is needed is: + +.. literalinclude:: ../../../../../rust/src/sip/sip.rs + :caption: rust/src/sip/sip.rs + :language: rust + :start-at: get_frame_id_by_name + :end-at: ffi_name_from_id), + :dedent: 8 + +.. note:: on frame_len + + For protocols which search for an end of frame char, like telnet, indicate unknown length by passing ``-1``. Once the length is known, it must be updated. For those where length is a field in the record (e.g. *SIP*), the frame is set to match said length, even if that is bigger than the current input + +The telnet parser has examples of using the Frame API directly for registering telnet frames, and also illustrates how that is done when length is not yet known: + +.. literalinclude:: ../../../../../rust/src/telnet/telnet.rs + :caption: rust/src/telnet/telnet.rs + :language: rust + :start-after: // app-layer-frame-documentation tag start: parse_request + :end-before: // app-layer-frame-documentation tag end: parse_request + :lines: 1-3, 22-49 + :dedent: 4 + +We then update length later on (note especially lines 3 and 10): + +.. literalinclude:: ../../../../../rust/src/telnet/telnet.rs + :caption: rust/src/telnet/telnet.rs + :language: rust + :start-after: // app-layer-frame-documentation tag start: update frame_len + :end-before: // app-layer-frame-documentation tag end: update frame_len + :linenos: + :dedent: 12 + +The Frame API calls parameters represent: + +- ``flow``: dedicated data type, carries specific flow-related data +- ``stream_slice``: dedicated data type, carries stream data, shown further bellow +- ``frame_start``: a pointer to the start of the frame buffer in the stream (``cur_i`` in the SMB code snippet) +- ``frame_len``: what we expect the frame length to be (the engine may need to wait until it has enough data. See what is done in the telnet snippet request frames registering) +- ``frame_type``: type of frame it's being registering (defined in an enum, as shown further above) + +``StreamSlice`` contains the input data to the parser, alongside other Stream-related data important in parsing context. Definition is found in *applayer.rs*: + +.. literalinclude:: ../../../../../rust/src/applayer.rs + :caption: rust/src/applayer.rs + :language: rust + :start-at: pub struct StreamSlice + :end-before: impl StreamSlice + + +C code +------ + +Implementing Frame support in C involves a bit more manual work, as one cannot make use of the Rust derives. Code snippets from the *HTTP* parser: + +Defining the frame types with the enum means: + +.. literalinclude:: ../../../../../src/app-layer-htp.c + :caption: src/app-layer-htp.c + :start-after: /* app-layer-frame-documentation tag start: HttpFrameTypes + :end-before: /* app-layer-frame-documentation tag end: HttpFrameTypes + :lines: 1-16 + +The HTTP parser uses the Frame registration functions from the C API (``app-layer-frames.c``) directly for registering request Frames. Here we also don't know the length yet. The ``0`` indicates flow direction: ``toserver``, and ``1`` would be used for ``toclient``: + +.. literalinclude:: ../../../../../src/app-layer-htp.c + :caption: src/app-layer-htp.c + :start-after: /* app-layer-frame-documentation tag start: frame registration http request + :end-before: /* app-layer-frame-documentation tag end: frame registration http request + :dedent: 4 + +Updating ``frame->len`` later: + +.. literalinclude:: ../../../../../src/app-layer-htp.c + :caption: src/app-layer-htp.c + :start-after: /* app-layer-frame-documentation tag start: updating frame->len + :end-before: /* app-layer-frame-documentation tag end: updating frame->len + :dedent: 4 + +Register relevant callbacks (note that the actual functions will also have to be written, for C): + +.. literalinclude:: ../../../../../src/app-layer-htp.c + :caption: src/app-layer-htp.c + :language: c + :start-after: /* app-layer-frame-documentation tag start: registering relevant callbacks + :end-before: /* app-layer-frame-documentation tag end: registering relevant callbacks + :dedent: 8 + +.. note:: The ``GetFrameIdByName`` functions can be "probed", so they should not generate any output or that could be misleading (for instance, Suricata generating a log message stating that a valid frame type is unknown). + +Visual context +============== + +``input`` and ``input_len`` are used to calculate the proper offset, for storing the frame. The stream buffer slides forward, so frame offsets/frames have to be updated. The `relative offset` (``rel_offset``) reflects that: + +.. code-block:: c + + Start: + [ stream ] + [ frame ...........] + rel_offset: 2 + len: 19 + + Slide: + [ stream ] + [ frame .... .] + rel_offset: -10 + len: 19 + + Slide: + [ stream ] + [ frame ........... ] + rel_offset: -16 + len: 19 + +The way the engine handles stream frames can be illustrated as follows: + +.. image:: img/StreamFrames.png + :scale: 80 diff --git a/doc/userguide/devguide/extending/app-layer/diagrams/DnsUnidirectionalTransactions.msc b/doc/userguide/devguide/extending/app-layer/diagrams/DnsUnidirectionalTransactions.msc new file mode 100644 index 0000000..43dd653 --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/diagrams/DnsUnidirectionalTransactions.msc @@ -0,0 +1,19 @@ +# MSC Sequence Diagram Example: DNS Query Transaction + +msc { + # Chart Options + arcgradient = "10"; + + # Entities + a [ label = "Client" ], b [ label = "Server" ]; + + # Message Flow + a =>> b [ label = "DNS Request" ]; + --- [ label = "Transaction 1 Completed" ]; + |||; + b =>> a [ label = "DNS Response" ]; + --- [ label = "Transaction 2 Completed" ]; + + |||; + ||| [label="[ generated with Mscgen ]", textcolor="gray"]; +} diff --git a/doc/userguide/devguide/extending/app-layer/diagrams/DnsUnidirectionalTransactions.png b/doc/userguide/devguide/extending/app-layer/diagrams/DnsUnidirectionalTransactions.png Binary files differnew file mode 100644 index 0000000..611ae31 --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/diagrams/DnsUnidirectionalTransactions.png diff --git a/doc/userguide/devguide/extending/app-layer/diagrams/HTTP2BidirectionalTransaction.msc b/doc/userguide/devguide/extending/app-layer/diagrams/HTTP2BidirectionalTransaction.msc new file mode 100644 index 0000000..3c3484f --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/diagrams/HTTP2BidirectionalTransaction.msc @@ -0,0 +1,20 @@ +# MSC Sequence Diagram for an HTTP2 Transaction, which is bidirectional in Suricata + +msc { + + # Chart options + arcgradient = "10"; + + # Entities + a [ label = "Client" ], b [ label = "Server" ]; + + # Message flow + a =>> b [ label = "Request" ]; + b =>> a [ label = "Response" ]; + |||; + --- [ label = "Transaction Completed" ]; + |||; + ||| [label="[ generated with Mscgen ]", textcolor="gray"]; +} + +# Reference: https://tools.ietf.org/html/rfc7540#section-8.1 diff --git a/doc/userguide/devguide/extending/app-layer/diagrams/HTTP2BidirectionalTransaction.png b/doc/userguide/devguide/extending/app-layer/diagrams/HTTP2BidirectionalTransaction.png Binary files differnew file mode 100644 index 0000000..02de5dc --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/diagrams/HTTP2BidirectionalTransaction.png diff --git a/doc/userguide/devguide/extending/app-layer/diagrams/TemplateTransaction.msc b/doc/userguide/devguide/extending/app-layer/diagrams/TemplateTransaction.msc new file mode 100644 index 0000000..3a5a308 --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/diagrams/TemplateTransaction.msc @@ -0,0 +1,18 @@ +# MSC Sequence Diagram Example: Template transaction + +msc { + # Chart Options + arcgradient = "10"; + + # Entities + a [ label = "Client" ], b [ label = "Server" ]; + + # Message Flow + a =>> b [ label = "Request ('12:HelloWorld!')" ]; + b =>> a [ label = "Response ('3:Bye')" ]; + |||; + --- [ label = "Transaction Completed" ]; + + |||; + ||| [label="[ generated with Mscgen ]", textcolor="gray"]; +} diff --git a/doc/userguide/devguide/extending/app-layer/diagrams/TemplateTransaction.png b/doc/userguide/devguide/extending/app-layer/diagrams/TemplateTransaction.png Binary files differnew file mode 100644 index 0000000..89daed4 --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/diagrams/TemplateTransaction.png diff --git a/doc/userguide/devguide/extending/app-layer/diagrams/TlsHandshake.msc b/doc/userguide/devguide/extending/app-layer/diagrams/TlsHandshake.msc new file mode 100644 index 0000000..34a025d --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/diagrams/TlsHandshake.msc @@ -0,0 +1,34 @@ +# MSC Sequence Diagram Example: TLS Handshake Transaction + +msc { + # Chart Options + arcgradient = "10"; + + # Entities + a [ label = "Client" ], b [ label = "Server" ]; + + # Message Flow + # TLS_STATE_IN_PROGRESS = 0, + a abox b [ label = "TLS_STATE_IN_PROGRESS" ]; + a =>> b [ label = "ClientHello" ]; + b =>> a [ label = "ServerHello" ]; + b =>> a [ label = "ServerCertificate" ]; + b =>> a [ label = "ServerHello Done" ]; + + a =>> b [ label = "ClientCertificate" ]; + # TLS_STATE_CERT_READY = 1, + a abox b [ label = "TLS_STATE_CERT_READY" ]; + a =>> b [ label = "ClientKeyExchange" ]; + + a =>> b [ label = "Finished" ]; + b =>> a [ label = "Finished" ]; + # TLS_HANDSHAKE_DONE = 2, + a abox b [ label = "TLS_HANDSHAKE_DONE" ]; + ...; + # TLS_STATE_FINISHED = 3 + a abox b [ label = "TLS_STATE_FINISHED" ]; + --- [ label = "Transaction Completed" ]; + + |||; + ||| [label="[ generated with Mscgen ]", textcolor="gray"]; +} diff --git a/doc/userguide/devguide/extending/app-layer/diagrams/TlsHandshake.png b/doc/userguide/devguide/extending/app-layer/diagrams/TlsHandshake.png Binary files differnew file mode 100644 index 0000000..4be7b04 --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/diagrams/TlsHandshake.png diff --git a/doc/userguide/devguide/extending/app-layer/img/StreamFrames.png b/doc/userguide/devguide/extending/app-layer/img/StreamFrames.png Binary files differnew file mode 100644 index 0000000..f145c98 --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/img/StreamFrames.png diff --git a/doc/userguide/devguide/extending/app-layer/index.rst b/doc/userguide/devguide/extending/app-layer/index.rst new file mode 100644 index 0000000..c392e58 --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/index.rst @@ -0,0 +1,9 @@ +App-Layer +========= + +.. toctree:: + :maxdepth: 2 + + app-layer-frames.rst + parser.rst + transactions.rst diff --git a/doc/userguide/devguide/extending/app-layer/parser.rst b/doc/userguide/devguide/extending/app-layer/parser.rst new file mode 100644 index 0000000..262ed89 --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/parser.rst @@ -0,0 +1,98 @@ +******* +Parsers +******* + +Callbacks +========= + +The API calls callbacks that are registered at the start of the program. + +The function prototype is: + +.. code-block:: c + + typedef AppLayerResult (*AppLayerParserFPtr)(Flow *f, void *protocol_state, + AppLayerParserState *pstate, + const uint8_t *buf, uint32_t buf_len, + void *local_storage, const uint8_t flags); + +Examples +-------- + +A C example: + +.. code-block:: c + + static AppLayerResult HTPHandleRequestData(Flow *f, void *htp_state, + AppLayerParserState *pstate, + const uint8_t *input, uint32_t input_len, + void *local_data, const uint8_t flags); + +In Rust, the callbacks are similar. + +.. code-block:: rust + + #[no_mangle] + pub extern "C" fn rs_dns_parse_response_tcp(_flow: *const core::Flow, + state: *mut std::os::raw::c_void, + _pstate: *mut std::os::raw::c_void, + input: *const u8, + input_len: u32, + _data: *const std::os::raw::c_void, + _flags: u8) + -> AppLayerResult + + +Return Types +============ + +Parsers return the type `AppLayerResult`. + +There are 3 possible results: + - `APP_LAYER_OK` - parser consumed the data successfully + - `APP_LAYER_ERROR` - parser encountered a unrecoverable error + - `APP_LAYER_INCOMPLETE(c,n)` - parser consumed `c` bytes, and needs `n` more before being called again + +Rust parsers follow the same logic, but can return + - `AppLayerResult::ok()` + - `AppLayerResult::err()` + - `AppLayerResult::incomplete(c,n)` + +For `i32` and `bool`, Rust parsers can also use `.into()`. + +APP_LAYER_OK / AppLayerResult::ok() +----------------------------------- + +When a parser returns "OK", it signals to the API that all data has been consumed. The parser will be called again when more data is available. + +APP_LAYER_ERROR / AppLayerResult::err() +--------------------------------------- + +Returning "ERROR" from the parser indicates to the API that the parser encountered an unrecoverable error and the processing of the protocol should stop for the rest of this flow. + +.. note:: This should not be used for recoverable errors. For those events should be set. + +APP_LAYER_INCOMPLETE / AppLayerResult::incomplete() +--------------------------------------------------- + +Using "INCOMPLETE" a parser can indicate how much more data is needed. Many protocols use records that have the size as one of the first parameters. When the parser receives a partial record, it can read this value and then tell the API to only call the parser again when enough data is available. + +`consumed` is used how much of the current data has been processed +`needed` is the number of bytes that the parser needs on top of what was consumed. + +Example:: + + [ 32 record 1 ][ 32 record 2 ][ 32 r.. ] + 0 31 32 63 64 72 + ^ ^ + consumed: 64 ---------------/ | + needed: 32 -------------------/ + +.. note:: "INCOMPLETE" is only supported for TCP + +The parser will be called again when the `needed` data is available OR when the stream ends. In the latter case the data will be incomplete. It's up to the parser to decide what to do with it in this case. + +Supporting incomplete data +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In some cases it may be preferable to actually support dealing with incomplete records. For example protocols like SMB and NFS can use very large records during file transfers. Completely queuing these before processing could be a waste of resources. In such cases the "INCOMPLETE" logic could be used for just the record header, while the record data is streamed into the parser. diff --git a/doc/userguide/devguide/extending/app-layer/transactions.rst b/doc/userguide/devguide/extending/app-layer/transactions.rst new file mode 100644 index 0000000..357bdcd --- /dev/null +++ b/doc/userguide/devguide/extending/app-layer/transactions.rst @@ -0,0 +1,321 @@ +************ +Transactions +************ + +.. contents:: Table of Contents + +General Concepts +================ + +For Suricata, transactions are an abstraction that help with detecting and logging. An example of a complete transaction is +a pair of messages in the form of a request (from client to server) and a response (from server to client) in HTTP. + +In order to know when to log an event for a given protocol, the engine tracks the progress of each transaction - that +is, when is it complete, or when it reaches a key intermediate state. They aid during the detection phase, +when dealing with protocols that can have large PDUs (protocol data units), like TCP, in controlling state for partial rule matching -- in case of rules that mention more than one field. + +Transactions are implemented and stored in the per-flow state. The engine interacts with them using a set of callbacks the parser registers. + +How the engine uses transactions +================================ + +Logging +~~~~~~~ + +Suricata controls when logging should happen based on transaction completeness. For simpler protocols, such as ``dns`` +or ``ntp``, that will most +likely happen once per transaction, by the time of its completion. In other cases, like with HTTP, this may happen at intermediary states. + +In ``OutputTxLog``, the engine will compare current state with the value defined for the logging to happen, per flow +direction (``logger->tc_log_progress``, ``logger->ts_log_progress``). If state is less than that value, the engine skips to +the next logger. Code snippet from: suricata/src/output-tx.c: + +.. code-block:: c + + static TmEcode OutputTxLog(ThreadVars *tv, Packet *p, void *thread_data) + { + . + . + . + if ((ts_eof && tc_eof) || last_pseudo) { + SCLogDebug("EOF, so log now"); + } else { + if (logger->LogCondition) { + int r = logger->LogCondition(tv, p, alstate, tx, tx_id); + if (r == FALSE) { + SCLogDebug("conditions not met, not logging"); + goto next_logger; + } + } else { + if (tx_progress_tc < logger->tc_log_progress) { + SCLogDebug("progress not far enough, not logging"); + goto next_logger; + } + + if (tx_progress_ts < logger->ts_log_progress) { + SCLogDebug("progress not far enough, not logging"); + goto next_logger; + } + } + } + . + . + . + } + +Rule Matching +~~~~~~~~~~~~~ + +Transaction progress is also used for certain keywords to know what is the minimum state before we can expect a match: until that, Suricata won't even try to look for the patterns. + +As seen in ``DetectAppLayerMpmRegister2`` that has ``int progress`` as parameter, and ``DetectAppLayerInspectEngineRegister2``, which expects ``int tx_min_progress``, for instance. In the code snippet, +``HTTP2StateDataClient``, ``HTTP2StateDataServer`` and ``0`` are the values passed to the functions - in the last +example, for ``FTPDATA``, +the existence of a transaction implies that a file is being transferred. Hence the ``0`` value. + + +.. code-block:: c + + void DetectFiledataRegister(void) + { + . + . + DetectAppLayerMpmRegister2("file_data", SIG_FLAG_TOSERVER, 2, + PrefilterMpmFiledataRegister, NULL, + ALPROTO_HTTP2, HTTP2StateDataClient); + DetectAppLayerMpmRegister2("file_data", SIG_FLAG_TOCLIENT, 2, + PrefilterMpmFiledataRegister, NULL, + ALPROTO_HTTP2, HTTP2StateDataServer); + . + . + DetectAppLayerInspectEngineRegister2("file_data", + ALPROTO_HTTP2, SIG_FLAG_TOCLIENT, HTTP2StateDataServer, + DetectEngineInspectFiledata, NULL); + DetectAppLayerInspectEngineRegister2( + "file_data", ALPROTO_FTPDATA, SIG_FLAG_TOSERVER, 0, DetectEngineInspectFiledata, NULL); + . + . + } + +Progress Tracking +================= + +As a rule of thumb, transactions will follow a request-response model: if a transaction has had a request and a response, it is complete. + +But if a protocol has situations where a request or response won’t expect or generate a message from its counterpart, +it is also possible to have uni-directional transactions. In such cases, transaction is set to complete at the moment of +creation. + +For example, DNS responses may be considered as completed transactions, because they also contain the request data, so +all information needed for logging and detection can be found in the response. + +In addition, for file transfer protocols, or similar ones where there may be several messages before the file exchange +is completed (NFS, SMB), it is possible to create a level of abstraction to handle such complexity. This could be achieved by adding phases to the model implemented by the protocol (e.g., protocol negotiation phase (SMB), request parsed (HTTP), and so on). + +This is controlled by implementing progress states. In Suricata, those will be enums that are incremented as the parsing +progresses. A state will start at 0. The higher its value, the closer the transaction would be to completion. Due to how +the engine tracks detection across states, there is an upper limit of 48 to the state progress (it must be < 48). + +The engine interacts with transactions' state using a set of callbacks the parser registers. State is defined per flow direction (``STREAM_TOSERVER`` / ``STREAM_TOCLIENT``). + +In Summary - Transactions and State +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- Initial State value: ``0``. +- Simpler scenarios: State is simply a bool. ``1`` represents transaction completion, per direction. +- Complex Transaction State in Suricata: ``enum`` (Rust: ``i32``). Completion is indicated by the highest enum value (some examples are: SSH, HTTP, HTTP2, DNS, SMB). + +Examples +======== + +This section shares some examples from Suricata codebase, to help visualize how Transactions work and are handled by the engine. + +Enums +~~~~~ + +Code snippet from: rust/src/ssh/ssh.rs: + +.. code-block:: rust + + pub enum SSHConnectionState { + SshStateInProgress = 0, + SshStateBannerWaitEol = 1, + SshStateBannerDone = 2, + SshStateFinished = 3, + } + +From src/app-layer-ftp.h: + +.. code-block:: c + + enum { + FTP_STATE_IN_PROGRESS, + FTP_STATE_PORT_DONE, + FTP_STATE_FINISHED, + }; + +From src/app-layer-ssl.h: + +.. code-block:: c + + enum { + TLS_STATE_IN_PROGRESS = 0, + TLS_STATE_CERT_READY = 1, + TLS_HANDSHAKE_DONE = 2, + TLS_STATE_FINISHED = 3 + }; + +API Callbacks +~~~~~~~~~~~~~ + +In Rust, this is done via the RustParser struct. As seen in rust/src/applayer.rs: + +.. code-block:: rust + + /// Rust parser declaration + pub struct RustParser { + . + . + . + /// Progress values at which the tx is considered complete in a direction + pub tx_comp_st_ts: c_int, + pub tx_comp_st_tc: c_int, + . + . + . + } + +In C, the callback API is: + +.. code-block:: c + + void AppLayerParserRegisterStateProgressCompletionStatus( + AppProto alproto, const int ts, const int tc) + +Simple scenario described, in Rust: + +rust/src/dhcp/dhcp.rs: + +.. code-block:: rust + + tx_comp_st_ts: 1, + tx_comp_st_tc: 1, + +For SSH, this looks like this: + +rust/src/ssh/ssh.rs: + +.. code-block:: rust + + tx_comp_st_ts: SSHConnectionState::SshStateFinished as i32, + tx_comp_st_tc: SSHConnectionState::SshStateFinished as i32, + +In C, callback usage would be as follows: + +src/app-layer-dcerpc.c: + +.. code-block:: c + + AppLayerParserRegisterStateProgressCompletionStatus(ALPROTO_DCERPC, 1, 1); + +src/app-layer-ftp.c: + +.. code-block:: c + + AppLayerParserRegisterStateProgressCompletionStatus( + ALPROTO_FTP, FTP_STATE_FINISHED, FTP_STATE_FINISHED); + +Sequence Diagrams +~~~~~~~~~~~~~~~~~ + +A DNS transaction in Suricata can be considered unidirectional: + +.. image:: diagrams/DnsUnidirectionalTransactions.png + :width: 600 + :alt: A sequence diagram with two entities, Client and Server, with an arrow going from the Client to the Server, labeled "DNS Request". After that, there is a dotted line labeled "Transaction Completed". + +An HTTP2 transaction is an example of a bidirectional transaction, in Suricata (note that, while HTTP2 may have multiple streams, those are mapped to transactions in Suricata. They run in parallel, scenario not shown in this Sequence Diagram - which shows one transaction, only): + +.. image:: diagrams/HTTP2BidirectionalTransaction.png + :width: 600 + :alt: A sequence diagram with two entities, Client and Server, with an arrow going from the Client to the Server labeled "Request" and below that an arrow going from Server to Client labeled "Response". Below those arrows, a dotted line indicates that the transaction is completed. + +A TLS Handshake is a more complex example, where several messages are exchanged before the transaction is considered completed: + +.. image:: diagrams/TlsHandshake.png + :width: 600 + :alt: A sequence diagram with two entities, Client and Server, with an arrow going from the Client to the Server labeled "ClientHello" and below that an arrow going from Server to Client labeled "ServerHello". Below those arrows, several more follow from Server to Client and vice-versa, before a dotted line indicates that the transaction is finally completed. + +Template Protocol +~~~~~~~~~~~~~~~~~ + +Suricata has a template protocol for educational purposes, which has simple bidirectional transactions. + +A completed transaction for the template looks like this: + +.. image:: diagrams/TemplateTransaction.png + :width: 600 + :alt: A sequence diagram with two entities, Client and Server, with an arrow going from the Client to the Server, labeled "Request". An arrow below that first one goes from Server to Client. + +Following are the functions that check whether a transaction is considered completed, for the Template Protocol. Those are called by the Suricata API. Similar functions exist for each protocol, and may present implementation differences, based on what is considered a transaction for that given protocol. + +In C: + +.. code-block:: c + + static int TemplateGetStateProgress(void *txv, uint8_t direction) + { + TemplateTransaction *tx = txv; + + SCLogNotice("Transaction progress requested for tx ID %"PRIu64 + ", direction=0x%02x", tx->tx_id, direction); + + if (direction & STREAM_TOCLIENT && tx->response_done) { + return 1; + } + else if (direction & STREAM_TOSERVER) { + /* For the template, just the existence of the transaction means the + * request is done. */ + return 1; + } + + return 0; + } + +And in Rust: + +.. code-block:: rust + + pub extern "C" fn rs_template_tx_get_alstate_progress( + tx: *mut std::os::raw::c_void, + _direction: u8, + ) -> std::os::raw::c_int { + let tx = cast_pointer!(tx, TemplateTransaction); + + // Transaction is done if we have a response. + if tx.response.is_some() { + return 1; + } + return 0; + } + +Work In Progress changes +======================== + +Currently we are working to have files be part of the transaction instead of the per-flow state, as seen in https://redmine.openinfosecfoundation.org/issues/4444. + +Another work in progress is to limit the number of transactions per flow, to prevent Denial of Service (DoS) by quadratic complexity - a type of attack that may happen to protocols which can have multiple transactions at the same time - such as HTTP2 so-called streams (see https://redmine.openinfosecfoundation.org/issues/4530). + +Common words and abbreviations +============================== + +- al, applayer: application layer +- alproto: application layer protocol +- alstate: application layer state +- engine: refers to Suricata core detection logic +- flow: a bidirectional flow of packets with the same 5-tuple elements (protocol, source ip, destination ip, source port, destination port. Vlans can be added as well) +- PDU: Protocol Data Unit +- rs: rust +- tc: to client +- ts: to server +- tx: transaction diff --git a/doc/userguide/devguide/extending/capture/index.rst b/doc/userguide/devguide/extending/capture/index.rst new file mode 100644 index 0000000..2b84ae6 --- /dev/null +++ b/doc/userguide/devguide/extending/capture/index.rst @@ -0,0 +1,2 @@ +Packet Capture +============== diff --git a/doc/userguide/devguide/extending/decoder/index.rst b/doc/userguide/devguide/extending/decoder/index.rst new file mode 100644 index 0000000..9c26117 --- /dev/null +++ b/doc/userguide/devguide/extending/decoder/index.rst @@ -0,0 +1,2 @@ +Packet Decoder +============== diff --git a/doc/userguide/devguide/extending/detect/index.rst b/doc/userguide/devguide/extending/detect/index.rst new file mode 100644 index 0000000..1dde177 --- /dev/null +++ b/doc/userguide/devguide/extending/detect/index.rst @@ -0,0 +1,2 @@ +Detection +========= diff --git a/doc/userguide/devguide/extending/index.rst b/doc/userguide/devguide/extending/index.rst new file mode 100644 index 0000000..bcdabef --- /dev/null +++ b/doc/userguide/devguide/extending/index.rst @@ -0,0 +1,11 @@ +Extending Suricata +================== + +.. toctree:: + :maxdepth: 2 + + capture/index.rst + decoder/index.rst + app-layer/index.rst + detect/index.rst + output/index.rst diff --git a/doc/userguide/devguide/extending/output/index.rst b/doc/userguide/devguide/extending/output/index.rst new file mode 100644 index 0000000..559723e --- /dev/null +++ b/doc/userguide/devguide/extending/output/index.rst @@ -0,0 +1,7 @@ +Output +====== + +Introduction +------------ + +Extending Suricata's alert and event output. diff --git a/doc/userguide/devguide/index.rst b/doc/userguide/devguide/index.rst new file mode 100644 index 0000000..b6b5fbc --- /dev/null +++ b/doc/userguide/devguide/index.rst @@ -0,0 +1,9 @@ +Suricata Developer Guide +======================== + +.. toctree:: + :maxdepth: 2 + + codebase/index.rst + internals/index.rst + extending/index.rst diff --git a/doc/userguide/devguide/internals/datastructs/index.rst b/doc/userguide/devguide/internals/datastructs/index.rst new file mode 100644 index 0000000..cb67988 --- /dev/null +++ b/doc/userguide/devguide/internals/datastructs/index.rst @@ -0,0 +1,9 @@ +Important Data Structures +========================= + +Introduction +------------ + +This section explains the most important Suricata Data structures. + +For a complete overview, see the doxygen: https://doxygen.openinfosecfoundation.org diff --git a/doc/userguide/devguide/internals/engines/index.rst b/doc/userguide/devguide/internals/engines/index.rst new file mode 100644 index 0000000..6944a69 --- /dev/null +++ b/doc/userguide/devguide/internals/engines/index.rst @@ -0,0 +1,11 @@ +Engines +======= + +Flow +---- + +Stream +------ + +Defrag +------ diff --git a/doc/userguide/devguide/internals/index.rst b/doc/userguide/devguide/internals/index.rst new file mode 100644 index 0000000..29a8e59 --- /dev/null +++ b/doc/userguide/devguide/internals/index.rst @@ -0,0 +1,10 @@ +Suricata Internals +================== + +.. toctree:: + :maxdepth: 2 + + pipeline/index.rst + threading/index.rst + datastructs/index.rst + engines/index.rst diff --git a/doc/userguide/devguide/internals/pipeline/index.rst b/doc/userguide/devguide/internals/pipeline/index.rst new file mode 100644 index 0000000..8acd523 --- /dev/null +++ b/doc/userguide/devguide/internals/pipeline/index.rst @@ -0,0 +1,2 @@ +Packet Pipeline +=============== diff --git a/doc/userguide/devguide/internals/threading/index.rst b/doc/userguide/devguide/internals/threading/index.rst new file mode 100644 index 0000000..b89b6d0 --- /dev/null +++ b/doc/userguide/devguide/internals/threading/index.rst @@ -0,0 +1,2 @@ +Threading +========= |