summaryrefslogtreecommitdiffstats
path: root/CONTRIBUTING.md
diff options
context:
space:
mode:
Diffstat (limited to 'CONTRIBUTING.md')
-rw-r--r--CONTRIBUTING.md269
1 files changed, 236 insertions, 33 deletions
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 26f226e..215e4c6 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -36,17 +36,17 @@ and ask!
## How to add new code and to update existing code
-0) Check that there isn't a pull request already opened for the changes you
+1) Check that there isn't a pull request already opened for the changes you
intend to make.
-1) [Fork](https://help.github.com/articles/fork-a-repo/) the Tcpdump
+2) [Fork](https://help.github.com/articles/fork-a-repo/) the Tcpdump
[repository](https://github.com/the-tcpdump-group/tcpdump).
-2) The easiest way to test your changes on multiple operating systems and
+3) The easiest way to test your changes on multiple operating systems and
architectures is to let the upstream CI test your pull request (more on
this below).
-3) Setup your git working copy
+4) Setup your git working copy
```
git clone https://github.com/<username>/tcpdump.git
cd tcpdump
@@ -54,19 +54,19 @@ and ask!
git fetch upstream
```
-4) Do a `touch .devel` in your working directory.
+5) Do a `touch .devel` in your working directory.
Currently, the effect is
* add (via `configure`, in `Makefile`) some warnings options (`-Wall`,
`-Wmissing-prototypes`, `-Wstrict-prototypes`, ...) to the compiler if it
supports these options,
* have the `Makefile` support `make depend` and the `configure` script run it.
-5) Configure and build
+6) Configure and build
```
./configure && make -s && make check
```
-6) Add/update tests
+7) Add/update tests
The `tests` directory contains regression tests of the dissection of captured
packets. Those captured packets were saved running tcpdump with option
`-w sample.pcap`. Additional options, such as `-n`, are used to create relevant
@@ -96,12 +96,12 @@ and ask!
It is often useful to have test outputs with different verbosity levels
(none, `-v`, `-vv`, `-vvv`, etc.) depending on the code.
-7) Test using `make check` (current build options) and `./build_matrix.sh`
+8) Test using `make check` (current build options) and `./build_matrix.sh`
(a multitude of build options, build systems and compilers). If you can,
test on more than one operating system. Don't send a pull request until
all tests pass.
-8) Try to rebase your commits to keep the history simple.
+9) Try to rebase your commits to keep the history simple.
```
git fetch upstream
git rebase upstream/master
@@ -109,32 +109,76 @@ and ask!
(If the rebase fails and you cannot resolve, issue `git rebase --abort`
and ask for help in the pull request comment.)
-9) Once 100% happy, put your work into your forked repository using `git push`.
+10) Once 100% happy, put your work into your forked repository using `git push`.
-10) [Initiate and send](https://help.github.com/articles/using-pull-requests/)
+11) [Initiate and send](https://help.github.com/articles/using-pull-requests/)
a pull request.
This will trigger the upstream repository CI tests.
## Code style and generic remarks
-* A thorough reading of some other printers code is useful.
+1) A thorough reading of some other printers code is useful.
-* Put the normative reference if any as comments (RFC, etc.).
+2) To help learn how tcpdump works or to help debugging:
+ You can configure and build tcpdump with the instrumentation of functions:
+ ```
+ $ ./configure --enable-instrument-functions
+ $ make -s clean all
+ ```
+
+ This generates instrumentation calls for entry and exit to functions.
+ Just after function entry and just before function exit, these
+ profiling functions are called and print the function names with
+ indentation and call level.
+
+ If entering in a function, it prints also the calling function name with
+ file name and line number. There may be a small shift in the line number.
+
+ In some cases, with Clang 11, the file number is unknown (printed '??')
+ or the line number is unknown (printed '?'). In this case, use GCC.
+
+ If the environment variable INSTRUMENT is
+ - unset or set to an empty string, print nothing, like with no
+ instrumentation
+ - set to "all" or "a", print all the functions names
+ - set to "global" or "g", print only the global functions names
+
+ This allows to run:
+ ```
+ $ INSTRUMENT=a ./tcpdump ...
+ $ INSTRUMENT=g ./tcpdump ...
+ $ INSTRUMENT= ./tcpdump ...
+ ```
+ or
+ ```
+ $ export INSTRUMENT=global
+ $ ./tcpdump ...
+ ```
+
+ The library libbfd is used, therefore the binutils-dev package is required.
-* Put the format of packets/headers/options as comments if there is no
+3) Put the normative reference if any as comments (RFC, etc.).
+
+4) Put the format of packets/headers/options as comments if there is no
published normative reference.
-* The printer may receive incomplete packet in the buffer, truncated at any
+5) The printer may receive incomplete packet in the buffer, truncated at any
random position, for example by capturing with `-s size` option.
+ This means that an attempt to fetch packet data based on the expected
+ format of the packet may run the risk of overrunning the buffer.
+
+ Furthermore, if the packet is complete, but is not correctly formed,
+ that can also cause a printer to overrun the buffer, as it will be
+ fetching packet data based on the expected format of the packet.
+
+ Therefore, integral, IPv4 address, and octet sequence values should
+ be fetched using the `GET_*()` macros, which are defined in
+ `extract.h`.
+
If your code reads and decodes every byte of the protocol packet, then to
ensure proper and complete bounds checks it would be sufficient to read all
- packet data using the `GET_*()` macros, typically:
- ```
- GET_U_1(p)
- GET_S_1(p)
- GET_BE_U_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
- GET_BE_S_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
- ```
+ packet data using the `GET_*()` macros.
+
If your code uses the macros above only on some packet data, then the gaps
would have to be bounds-checked using the `ND_TCHECK_*()` macros:
```
@@ -142,7 +186,14 @@ and ask!
ND_TCHECK_SIZE(p)
ND_TCHECK_LEN(p, l)
```
- For the `ND_TCHECK_*` macros (if not already done):
+
+ where *p* points to the data not being decoded. For `ND_CHECK_n()`,
+ *n* is the length of the gap, in bytes. For `ND_CHECK_SIZE()`, the
+ length of the gap, in bytes, is the size of an item of the data type
+ to which *p* points. For `ND_CHECK_LEN()`, *l* is the length of the
+ gap, in bytes.
+
+ For the `GET_*()` and `ND_TCHECK_*` macros (if not already done):
* Assign: `ndo->ndo_protocol = "protocol";`
* Define: `ND_LONGJMP_FROM_TCHECK` before including `netdissect.h`
* Make sure that the intersection of `GET_*()` and `ND_TCHECK_*()` is minimal,
@@ -155,19 +206,170 @@ and ask!
```
You should try several values for snaplen to do various truncation.
-* Do invalid packet checks in code: Think that your code can receive in input
+* The `GET_*()` macros that fetch integral values are:
+ ```
+ GET_U_1(p)
+ GET_S_1(p)
+ GET_BE_U_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
+ GET_BE_S_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
+ GET_LE_U_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
+ GET_LE_S_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
+ ```
+
+ where *p* points to the integral value in the packet buffer. The
+ macro returns the integral value at that location.
+
+ `U` indicates that an unsigned value is fetched; `S` indicates that a
+ signed value is fetched. For multi-byte values, `BE` indicates that
+ a big-endian value ("network byte order") is fetched, and `LE`
+ indicates that a little-endian value is fetched. *n* is the length,
+ in bytes, of the multi-byte integral value to be fetched.
+
+ In addition to the bounds checking the `GET_*()` macros perform,
+ using those macros has other advantages:
+
+ * tcpdump runs on both big-endian and little-endian systems, so
+ fetches of multi-byte integral values must be done in a fashion
+ that works regardless of the byte order of the machine running
+ tcpdump. The `GET_BE_*()` macros will fetch a big-endian value and
+ return a host-byte-order value on both big-endian and little-endian
+ machines, and the `GET_LE_*()` macros will fetch a little-endian
+ value and return a host-byte-order value on both big-endian and
+ little-endian machines.
+
+ * tcpdump runs on machines that do not support unaligned access to
+ multi-byte values, and packet values are not guaranteed to be
+ aligned on the proper boundary. The `GET_BE_*()` and `GET_LE_*()`
+ macros will fetch values even if they are not aligned on the proper
+ boundary.
+
+* The `GET_*()` macros that fetch IPv4 address values are:
+ ```
+ GET_IPV4_TO_HOST_ORDER(p)
+ GET_IPV4_TO_NETWORK_ORDER(p)
+ ```
+
+ where *p* points to the address in the packet buffer.
+ `GET_IPV4_TO_HOST_ORDER()` returns the address in the byte order of
+ the host that is running tcpdump; `GET_IPV4_TO_NETWORK_ORDER()`
+ returns it in network byte order.
+
+ Like the integral `GET_*()` macros, these macros work correctly on
+ both big-endian and little-endian machines and will fetch values even
+ if they are not aligned on the proper boundary.
+
+* The `GET_*()` macro that fetches an arbitrary sequences of bytes is:
+ ```
+ GET_CPY_BYTES(dst, p, len)
+ ```
+
+ where *dst* is the destination to which the sequence of bytes should
+ be copied, *p* points to the first byte of the sequence of bytes, and
+ *len* is the number of bytes to be copied. The bytes are copied in
+ the order in which they appear in the packet.
+
+* To fetch a network address and convert it to a printable string, use
+ the following `GET_*()` macros, defined in `addrtoname.h`, to
+ perform bounds checks to make sure the entire address is within the
+ buffer and to translate the address to a string to print:
+ ```
+ GET_IPADDR_STRING(p)
+ GET_IP6ADDR_STRING(p)
+ GET_MAC48_STRING(p)
+ GET_EUI64_STRING(p)
+ GET_EUI64LE_STRING(p)
+ GET_LINKADDR_STRING(p, type, len)
+ GET_ISONSAP_STRING(nsap, nsap_length)
+ ```
+
+ `GET_IPADDR_STRING()` fetches an IPv4 address pointed to by *p* and
+ returns a string that is either a host name, if the `-n` flag wasn't
+ specified and a host name could be found for the address, or the
+ standard XXX.XXX.XXX.XXX-style representation of the address.
+
+ `GET_IP6ADDR_STRING()` fetches an IPv6 address pointed to by *p* and
+ returns a string that is either a host name, if the `-n` flag wasn't
+ specified and a host name could be found for the address, or the
+ standard XXXX::XXXX-style representation of the address.
+
+ `GET_MAC48_STRING()` fetches a 48-bit MAC address (Ethernet, 802.11,
+ etc.) pointed to by *p* and returns a string that is either a host
+ name, if the `-n` flag wasn't specified and a host name could be
+ found in the ethers file for the address, or the standard
+ XX:XX:XX:XX:XX:XX-style representation of the address.
+
+ `GET_EUI64_STRING()` fetches a 64-bit EUI pointed to by *p* and
+ returns a string that is the standard XX:XX:XX:XX:XX:XX:XX:XX-style
+ representation of the address.
+
+ `GET_EUI64LE_STRING()` fetches a 64-bit EUI, in reverse byte order,
+ pointed to by *p* and returns a string that is the standard
+ XX:XX:XX:XX:XX:XX:XX:XX-style representation of the address.
+
+ `GET_LINKADDR_STRING()` fetches an octet string, of length *length*
+ and type *type*, pointed to by *p* and returns a string whose format
+ depends on the value of *type*:
+
+ * `LINKADDR_MAC48` - if the length is 6, the string has the same
+ value as `GET_MAC48_STRING()` would return for that address,
+ otherwise, the string is a sequence of XX:XX:... values for the bytes
+ of the address;
+
+ * `LINKADDR_FRELAY` - the string is "DLCI XXX", where XXX is the
+ DLCI, if the address is a valid Q.922 header, and an error indication
+ otherwise;
+
+ * `LINKADDR_EUI64`, `LINKADDR_ATM`, `LINKADDR_OTHER` -
+ the string is a sequence of XX:XX:... values for the bytes
+ of the address.
+
+6) When defining a structure corresponding to a packet or part of a
+ packet, so that a pointer to packet data can be cast to a pointer to
+ that structure and that structure pointer used to refer to fields in
+ the packet, use the `nd_*` types for the structure members.
+
+ Those types all are aligned only on a 1-byte boundary, so a
+ compiler will not assume that the structure is aligned on a boundary
+ stricter than one byte; there is no guarantee that fields in packets
+ are aligned on any particular boundary.
+
+ This means that all padding in the structure must be explicitly
+ declared as fields in the structure.
+
+ The `nd_*` types for integral values are:
+
+ * `nd_uintN_t`, for unsigned integral values, where *N* is the number
+ of bytes in the value.
+ * `nd_intN_t`, for signed integral values, where *N* is the number
+ of bytes in the value.
+
+ The `nd_*` types for IP addresses are:
+
+ * `nd_ipv4`, for IPv4 addresses;
+ * `nd_ipv6`, for IPv6 addresses.
+
+ The `nd_*` types for link-layer addresses are:
+
+ * `nd_mac48`, for MAC-48 (Ethernet, 802.11, etc.) addresses;
+ * `nd_eui64`, for EUI-64 values.
+
+ The `nd_*` type for a byte in a sequence of bytes is `nd_byte`; an
+ *N*-byte sequence should be declared as `nd_byte[N]`.
+
+7) Do invalid packet checks in code: Think that your code can receive in input
not only a valid packet but any arbitrary random sequence of octets (packet
* built malformed originally by the sender or by a fuzz tester,
* became corrupted in transit or for some other reason).
Print with: `nd_print_invalid(ndo); /* to print " (invalid)" */`
-* Use `struct tok` for indexed strings and print them with
+8) Use `struct tok` for indexed strings and print them with
`tok2str()` or `bittok2str()` (for flags).
+ All `struct tok` must end with `{ 0, NULL }`.
-* Avoid empty lines in output of printers.
+9) Avoid empty lines in output of printers.
-* A commit message must have:
+10) A commit message must have:
```
First line: Capitalized short summary in the imperative (50 chars or less)
@@ -179,13 +381,14 @@ and ask!
the body.
```
-* Avoid non-ASCII characters in code and commit messages.
+11) Avoid non-ASCII characters in code and commit messages.
-* Use the style of the modified sources.
+12) Use the style of the modified sources.
-* Don't mix declarations and code.
+13) Don't mix declarations and code.
-* Don't use `//` for comments.
- Not all C compilers accept C++/C99 comments by default.
+14) tcpdump requires a compiler that supports C99 or later, so C99
+ features may be used in code, but C11 or later features should not be
+ used.
-* Avoid trailing tabs/spaces
+15) Avoid trailing tabs/spaces