From c248d29056abbc1fc4c5dc178bab48fb8d2c1fcb Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Fri, 19 Apr 2024 19:40:56 +0200
Subject: Adding upstream version 1:0.5.47.

Signed-off-by: Daniel Baumann
---
 docs/QUICK_START | 106 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)
 create mode 100644 docs/QUICK_START

(limited to 'docs/QUICK_START')

diff --git a/docs/QUICK_START b/docs/QUICK_START
new file mode 100644
index 0000000..30943e0
--- /dev/null
+++ b/docs/QUICK_START
@@ -0,0 +1,106 @@
+
+QUICK START
+-----------
+
+LibHTP is envisioned to be many things, but the only scenario in which it has been tested
+so far is when you need to parse a duplex HTTP stream which you have obtained by
+passively intercepting a communication channel. The assumption is that you have raw TCP data
+(after SSL, if SSL is used).
+
+Every parsing operation needs to follow these steps:
+
+  1. Configure-time:
+
+    1.1. Create one or more parser configuration structures.
+
+    1.2. Tweak the configuration of each parser to match the behaviour of the server
+         whose communication you are intercepting (htp_config_set_* functions).
+
+    1.3. Register the parser callbacks you'll need. You will need to use parser callbacks
+         if you want to monitor parsing events as they occur, and gain access to partial
+         transaction information. If you are processing data in batch (off-line) you may
+         simply parse entire streams at a time and only analyze complete transaction data
+         after the fact.
+
+         If you need to gain access to request and response bodies, your only option at
+         this time is to use the callbacks, because the parser will not preserve that
+         information.
+
+         For callback registration, look up the htp_config_register_* functions.
+
+         If your program operates in real-time then it may be desirable to dispose of
+         the used resources after each transaction is parsed. To do that, use the
+         htp_config_set_tx_auto_destroy() function to tell LibHTP to delete transactions
+         after they are no longer needed.
+
+  2. Run-time:
+
+    2.1. Create a parser instance for every TCP stream you want to process.
+
+    2.2. Feed the parser inbound and outbound data.
+
+         The parser will typically consume complete data chunks and return
+         STREAM_STATE_DATA, which means that you can continue to feed it more data
+         when you have it. If you have a queue of data chunks, always first send the
+         parser all the _request_ chunks you have. That will ensure that the parser
+         never encounters a response for which it has not seen a request (which
+         would result in a fatal error).
+
+         If you get STREAM_STATE_ERROR, the parser has encountered a fatal error and
+         is unable to continue to parse the stream. An error should never happen for
+         a valid HTTP stream. If you encounter such an error and you believe the
+         HTTP stream is valid, please send us a PCAP file we can use to diagnose
+         the problem.
+
+         There is one situation in which the parser will not be able to consume a complete
+         request data chunk, in which case it will return STREAM_STATE_DATA_OTHER. This
+         means that the parser needs to see some response data. You will then need to
+         do the following:
+
+         2.2.1. Remember how many bytes of the request chunk data were consumed (using
+                htp_connp_req_data_consumed()).
+
+         2.2.2. Suspend request parsing until you get some response data.
+
+         2.2.3. Feed some response data (when you have it) to the parser.
+
+                Note that it is also possible to receive STREAM_STATE_DATA_OTHER
+                from the response parser. If that happens, you will need to
+                remember how many bytes were consumed using
+                htp_connp_res_data_consumed().
+
+         2.2.4. After each chunk of response data fed to the parser, attempt
+                to resume request stream parsing.
+
+         2.2.5. If you again receive STREAM_STATE_DATA_OTHER, go back to 2.2.3.
+
+         2.2.6. Otherwise, feed the parser all the request data you have. This is
+                necessary to prevent the parser from seeing more responses
+                than requests (which would inevitably result in an error).
+
+         2.2.7. Send unprocessed response data from 2.2.3 (if any).
+
+         2.2.8. Continue sending request/response data as normal.
+
+         The above situation should occur very rarely.
+
+    2.3. Analyze transaction data in callbacks (if you want to have access to
+         the data as it is being produced).
+
+    2.4. Analyze transaction data after an entire TCP stream has been processed.
+
+    2.5. Destroy the parser instance to free up the allocated resources.
+
+
+USER DATA
+---------
+
+If you're using the callbacks and you need to keep state between invocations, you have two
+options:
+
+  1. Associate one opaque structure with a parser instance, using htp_connp_set_user_data().
+
+  2. Associate one opaque structure with a transaction instance, using htp_tx_set_user_data().
+     The best place to do this is in a TRANSACTION_START callback. Don't forget to free up
+     any resources you allocate on a per-transaction basis, before you delete each transaction.
+
--
cgit v1.2.3
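
The suspend-and-resume procedure in steps 2.2.1 to 2.2.6 of the QUICK_START
text can be sketched in C as shown below. This is a minimal illustration
under stated assumptions, not LibHTP code: the mock_* functions, the
MOCK_STATE_* constants, and the rule that the toy request parser stalls after
each '!' byte are all hypothetical stand-ins so that the example compiles on
its own. A real program would call the library's request and response data
functions instead and would read the consumed byte count with
htp_connp_req_data_consumed().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the stream states named in QUICK_START. */
enum { MOCK_STATE_ERROR = -1, MOCK_STATE_DATA = 1, MOCK_STATE_DATA_OTHER = 2 };

/* Toy parser, NOT LibHTP: the request side stalls after each '!' byte
 * until at least one response chunk has been fed. */
typedef struct {
    size_t req_consumed; /* bytes consumed from the most recent request call */
    size_t req_total;    /* total request bytes consumed so far */
    size_t res_total;    /* total response bytes consumed so far */
    int waiting;         /* non-zero while response data is needed */
} mock_parser;

int mock_req_data(mock_parser *p, const char *data, size_t len) {
    size_t i;
    p->req_consumed = 0;
    if (p->waiting) return MOCK_STATE_DATA_OTHER; /* still needs a response */
    for (i = 0; i < len; i++) {
        p->req_consumed++;
        p->req_total++;
        if (data[i] == '!') {                     /* stall point reached */
            p->waiting = 1;
            return MOCK_STATE_DATA_OTHER;
        }
    }
    return MOCK_STATE_DATA;                       /* whole chunk consumed */
}

int mock_res_data(mock_parser *p, const char *data, size_t len) {
    (void)data;                                   /* content is ignored here */
    p->res_total += len;
    p->waiting = 0;                               /* request side may resume */
    return MOCK_STATE_DATA;
}

/* Feeds one request chunk, starting at *off. If the parser asks for response
 * data and none is queued, returns MOCK_STATE_DATA_OTHER and leaves *off at
 * the first unconsumed byte so the caller can resume later. */
int feed_request(mock_parser *p, const char *req, size_t len, size_t *off,
                 const char **res, size_t nres, size_t *next_res) {
    while (*off < len) {
        int rc = mock_req_data(p, req + *off, len - *off);
        if (rc == MOCK_STATE_ERROR) return rc;
        *off += p->req_consumed;                  /* 2.2.1: remember progress */
        assert(*off <= len);
        if (rc == MOCK_STATE_DATA) break;         /* chunk fully consumed */
        if (*next_res == nres)                    /* 2.2.2: suspend and wait */
            return MOCK_STATE_DATA_OTHER;
        rc = mock_res_data(p, res[*next_res],     /* 2.2.3: feed a response */
                           strlen(res[*next_res]));
        if (rc == MOCK_STATE_ERROR) return rc;
        (*next_res)++;
        /* 2.2.4 and 2.2.5: the loop retries the rest of the request chunk. */
    }
    return MOCK_STATE_DATA;
}

/* Runs the toy scenario: one request chunk with a stall point in the middle
 * and one queued response chunk. */
int demo(size_t *req_total, size_t *res_total) {
    mock_parser p = {0, 0, 0, 0};
    const char *req = "CONNECT!GET /";
    const char *res[] = { "200 OK" };
    size_t off = 0, next_res = 0;
    int rc = feed_request(&p, req, strlen(req), &off, res, 1, &next_res);
    *req_total = p.req_total;
    *res_total = p.res_total;
    return rc;
}
```

Calling demo() feeds the request chunk until the toy parser stalls, feeds the
single queued response, and then resumes the request chunk from the saved
offset. Keeping the offset in a caller-owned variable is the point of step
2.2.1: the caller, not the parser, holds the unconsumed bytes while request
parsing is suspended.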