summaryrefslogtreecommitdiffstats
path: root/ucb/README.md
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-07 09:06:44 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-07 09:06:44 +0000
commited5640d8b587fbcfed7dd7967f3de04b37a76f26 (patch)
tree7a5f7c6c9d02226d7471cb3cc8fbbf631b415303 /ucb/README.md
parentInitial commit. (diff)
downloadlibreoffice-ed5640d8b587fbcfed7dd7967f3de04b37a76f26.tar.xz
libreoffice-ed5640d8b587fbcfed7dd7967f3de04b37a76f26.zip
Adding upstream version 4:7.4.7.upstream/4%7.4.7upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'ucb/README.md')
-rw-r--r--ucb/README.md78
1 files changed, 78 insertions, 0 deletions
diff --git a/ucb/README.md b/ucb/README.md
new file mode 100644
index 000000000..d77656104
--- /dev/null
+++ b/ucb/README.md
@@ -0,0 +1,78 @@
+# Universal Content Broker (UCB)
+
+Universal Content Broker (has ucps) which do things like convert files
+to strings in content broker world, or connect LibreOffice with
+various DMS and fileshare systems like WebDAV, CMIS, or GIO.
+
+The UCPs implement the Universal Content Provider UNO interfaces in
+C++, in particular the `com.sun.star.ucb.ContentProvider` service.
+
+## WebDAV UCP
+
+The WebDAV content provider is based on `libcurl` for much of the
+network and protocol stuff, including authentication.
+
+WebDAV as implemented here is defined in an IETF RFC 4918 extensions,
+and the code supports both unencrypted HTTP/1.1 (IETF RFC 2616) as
+well as TLS 1.2 or later.
+
+Our WebDAV `com.sun.star.ucb.ContentProvider` service implementation
+registers the `vnd.sun.star.webdav` and `http` URI schemes (and their
+encrypted TLS variants). See here for the specification:
+https://wiki.documentfoundation.org/Documentation/DevGuide/Universal_Content_Providers#The_WebDAV_Content_Provider
+
+Historically, webdav had two ucps, one based on `neon`, the second one
+based on `serf`. Both are superseded by the current `libcurl`
+implementation (since LibreOffice 7.3), but in case of behavioural
+differences, go hunt for code differences (ucb/source/ucp/webdav-neon
+and ucb/source/ucp/webdav).
+
+The WebDAV protocol is implemented on top of libcurl basic http GET,
+PUT, and POST requests (and is relatively straight-forward - see
+`ucb/source/ucp/webdav-curl/webdavcontent.cxx` for the main
+functionality), but incorporates custom handling for a number of
+server idiosyncrasies:
+
+* Nextcloud will reply to a PROPFIND request with "100 Continue" and
+ then after the data is uploaded it will send a "401 Unauthorized" if
+ the auth header is missing in the headers to which it replied with
+ "100 Continue".
+* Sharepoint 16 responds to PROPFIND, PROPPATCH and LOCK with
+ "Transfer-Encoding: chunked"
+ with "HTTP/1.1 200 OK" and an actual error message in the response *body*.
+ * apparently setting Content-Length works better, so we use that
+* Sharepoint returns redirect urls that curl can't parse, so we encode
+ them (check `WebDAVResponseParser` for the code)
+* Sharepoint may reply to HEAD with 200 OK but then 404 NOT FOUND to PROPFIND
+* Sharepoint does not appear to support Dead Properties
+* avoiding chunked encoding for PUT, since for Nextcloud:
+ * Transfer-Encoding: chunked creates a 0 byte file with response
+ "201 Created"
+ * see upstream bug: https://github.com/nextcloud/server/issues/7995
+
+To a first approximation, there are 3 parts involved in the UCP:
+
+* The upper layer implements the UNO API which is called by LibreOffice, and
+ translates the calls from generic sequence-of-any stringly typed abstractness
+ into HTTP or WebDAV protocol calls, and does some high level protocol
+ handling to figure out what the server supports and so on. This is
+ independent of the low-level library.
+* Then there is the lower layer of the UCP, which translates the generic HTTP
+ or WebDAV protocol calls to something that the particular third-party library
+ can understand, hook up its callbacks for data transfer and authentication,
+ and parse the reply XML documents.
+* At the bottom, there is the third-party library that implements the HTTP
+ protocol.
+
+The most important classes are:
+* ContentProvider: the UNO entry point/factory, creates Content instances
+* Content: the main UNO service, translates the UCP API to WebDAV methods,
+ one instance per URL
+* DAVResourceAccess: sits between Content and CurlSession
+* DAVSessionFactory: creates CurlSession for DAVResourceAccess
+* CurlSession: low-level interfacing with libcurl
+* SerfLockStore: singleton used by CurlSession to store DAV lock tokens, runs
+ a thread to refresh locks when they expire
+* WebDAVResponseParser: parse XML responses to LOCK, PROPFIND requests
+* DAVAuthListener_Impl: request credentials from UI via UNO
+