author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-15 05:54:39 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-15 05:54:39 +0000
commit: 267c6f2ac71f92999e969232431ba04678e7437e
tree: 358c9467650e1d0a1d7227a21dac2e3d08b622b2 /static/README.wasm.md
parent: Initial commit.
Adding upstream version 4:24.2.0.upstream/4%24.2.0
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'static/README.wasm.md')
-rw-r--r-- static/README.wasm.md 413
1 files changed, 413 insertions, 0 deletions
diff --git a/static/README.wasm.md b/static/README.wasm.md
new file mode 100644
index 0000000000..f39a79247d
--- /dev/null
+++ b/static/README.wasm.md
@@ -0,0 +1,413 @@
+# Support for Emscripten Cross Build
+
+This subdirectory provides support for building LibreOffice as WASM, with the Emscripten toolchain.
+
+You can build LibreOffice for WASM for two separate purposes: 1) to
+produce a WASM binary of LibreOffice as such, using Qt5 for its GUI,
+or 2) to compile just the LibreOffice core ("LibreOffice Technology")
+to WASM without any UI, for use in other software that provides the
+UI, like Collabora Online built as WASM.
+
+The first purpose was the original reason for the WASM port and this
+document was originally written with that in mind. For the second
+purpose, look towards the end of the document for the section
+"Building headless LibreOffice as WASM for use in another product".
+
+## Status of LibreOffice as WASM with Qt
+
+The build generates a Writer-only LO. You should be able to run any of
+
+    $ emrun --serve_after_close instdir/program/qt_soffice.html
+    $ emrun --serve_after_close workdir/LinkTarget/Executable/qt_vcldemo.html
+    $ emrun --serve_after_close workdir/LinkTarget/Executable/qt_wasm-qt5-mandelbrot.html
+
+REMINDER: Always start a new tab in the browser; a reload might fail or serve cached content!
+
+INFO: Recent browsers no longer work with 0.0.0.0 and need 127.0.0.1.
+
+## Setup for the LO WASM build (with Qt)
+
+We're using Qt 5.15.2 with Emscripten 2.0.31. There are a bunch of Qt patches
+to fix the gravest bugs. Also, newer Emscripten versions have various bugs
+with the FS image support.
+
+- See below under Docker build for another build option
+
+### Setup emscripten
+
+<https://emscripten.org/docs/getting_started/index.html>
+
+    git clone https://github.com/emscripten-core/emsdk.git
+    cd emsdk
+    ./emsdk install 2.0.31
+    ./emsdk activate --embedded 2.0.31
+
+Example `bashrc` scriptlet:
+
+    EMSDK_ENV=$HOME/Development/libreoffice/git_emsdk/emsdk_env.sh
+    [ -f "$EMSDK_ENV" ] && \. "$EMSDK_ENV" 1>/dev/null 2>&1
+
+### Setup Qt
+
+<https://doc.qt.io/qt-5/wasm.html>
+
+Most of the information from <https://doc.qt.io/qt-6/wasm.html> is still valid for Qt5;
+generally the Qt6 WASM documentation is much better, because it incorporates much
+information from the Qt Wiki.
+
+FWIW: Qt 5.15 LTS is not maintained publicly and Qt WASM has quite a few bugs. Most
+WASM fixes from Qt 6 are needed for Qt 5.15 too. Allotropia offers a Qt repository
+with the necessary patches cherry-picked.
+
+    git clone https://github.com/allotropia/qt5.git
+    cd qt5
+    git checkout v5.15.2+wasm
+    ./init-repository --module-subset=qtbase
+    ./configure -xplatform wasm-emscripten -feature-thread -prefix <whatever>
+    make -j<CORES> module-qtbase
+
+Optionally you can add the configure flag "-compile-examples". But then you also have to
+patch at least mkspecs/wasm-emscripten/qmake.conf with EXIT_RUNTIME=0, otherwise they will
+fail to run. In addition, the build will break on some of the examples, but at that
+point Qt itself and most examples are already built. Or just skip them. Other
+interesting flags might be "-nomake tests -no-pch -ccache".
+
+Linking takes quite a long time, because with some options emscripten-finalize rewrites the
+whole WASM file. Linked this way, the LO WASM needs at least 64GB RAM. For faster link times add
+"-s WASM_BIGINT=1", change to ASSERTIONS=1 and use -g3 to prevent rewriting the WASM file
+and generating source maps (see emscripten.py, finalize_wasm, and avoid modify_wasm = True).
+This is just needed for the Qt examples, as LO already uses the correct flags!
+
+The install is not really needed, as LO currently just uses qtbase on its own. You can do
+
+    make -j<CORES> install
+
+or
+
+    make -j8 -C qtbase/src install_subtargets
+
+Current Qt fails to start the demo webserver: <https://bugreports.qt.io/browse/QTCREATORBUG-24072>
+
+Use `emrun --serve_after_close` to run Qt WASM demos.
+
+### Setup LO
+
+`autogen.sh` is patched to use emconfigure. That basically sets various
+environment vars, especially `EMMAKEN_JUST_CONFIGURE`, which will create the
+correct output file names, checked by `configure` (`a.out`).
+
+There's a distro config for WASM, but it just provides --host=wasm32-local-emscripten, which
+should be enough setup. The build itself is a cross build and the cross-toolset just depends
+on a minimal toolset (gcc, libc-dev, flex, bison); everything else is built from source, because the
+final result does not depend on the build system at all.
+
+The recommended configure setup is:
+
+* grab defaults
+ `--with-distro=LibreOfficeWASM32`
+
+* local config
+ `QT5DIR=/dir/of/git_qt5/qtbase`
+
+* if you want to use ccache on both sides of the build
+ `--with-build-platform-configure-options=--enable-ccache`
+ `--enable-ccache`
+
+FWIW: it's also possible to build an almost static Linux LibreOffice by just using
+--disable-dynloading --enable-customtarget-components. System externals are still
+linked dynamically, but everything else is static.
+
+#### Experimental (AKA currently broken) WASM exception + SjLj build
+
+You can build LO with WASM exceptions, which should be "much" faster than the JS
+based Emscripten EH handling. For setjmp / longjmp (SjLj), used by the error handling
+of the PNG and JPEG libraries, this needs Emscripten 3.1.3+. That builds, but execution
+still fails early with a signature mismatch in a call to Task::UpdateMinPeriod in LO's
+job scheduler code. Unfortunately the build also needs a Qt build with
+"-s SUPPORT_LONGJMP=wasm", which is incompatible with the JS EH + SjLj.
+
+The LO configure flag is simply an additional --enable-wasm-exceptions. Qt5 can
+be patched in qtbase/mkspecs/wasm-emscripten/qmake.conf with the addition of
+
+    QMAKE_CFLAGS += -s SUPPORT_LONGJMP=wasm
+    QMAKE_CXXFLAGS += -s SUPPORT_LONGJMP=wasm
+
+### "Deploying" soffice.wasm
+
+    tar -chf wasm.tar --xform 's/.*program/lo-wasm/' instdir/program/soffice.* \
+        instdir/program/qt*
+
+Your HTTP server needs to provide additional headers (shown here in nginx syntax):
+* add_header Cross-Origin-Opener-Policy same-origin
+* add_header Cross-Origin-Embedder-Policy require-corp
+
+The default HTML file to use should be qt_soffice.html.
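+
+For local testing without a full web server, the two headers can also be set by a
+small Node.js script. The following is only a sketch under assumptions: the names
+`ISOLATION_HEADERS`, `contentTypeFor` and `createServer`, the MIME type table and the
+fallback to `qt_soffice.html` for `/` are made up for illustration and are not part
+of the LO build.
+
+```javascript
+// Minimal static file server sketch (Node.js standard library only).
+const http = require("http");
+const fs = require("fs");
+const path = require("path");
+
+// The two headers required above; together they enable cross-origin isolation.
+const ISOLATION_HEADERS = {
+  "Cross-Origin-Opener-Policy": "same-origin",
+  "Cross-Origin-Embedder-Policy": "require-corp",
+};
+
+// Assumed MIME type table; extend as needed.
+const MIME_TYPES = {
+  ".html": "text/html",
+  ".js": "application/javascript",
+  ".wasm": "application/wasm",
+  ".data": "application/octet-stream",
+};
+
+function contentTypeFor(file) {
+  return MIME_TYPES[path.extname(file)] || "application/octet-stream";
+}
+
+// Returns an http.Server that serves files below rootDir with the isolation headers.
+function createServer(rootDir) {
+  const root = path.resolve(rootDir);
+  return http.createServer((req, res) => {
+    let urlPath;
+    try {
+      urlPath = decodeURIComponent(req.url.split("?")[0]);
+    } catch (err) {
+      res.writeHead(400, ISOLATION_HEADERS);
+      res.end("Bad request");
+      return;
+    }
+    // Hypothetical default: serve qt_soffice.html for "/".
+    const rel = urlPath === "/" ? "qt_soffice.html" : urlPath.replace(/^\/+/, "");
+    const file = path.resolve(root, rel);
+    // Refuse paths that escape the root directory.
+    if (!file.startsWith(root + path.sep)) {
+      res.writeHead(403, ISOLATION_HEADERS);
+      res.end("Forbidden");
+      return;
+    }
+    fs.readFile(file, (err, data) => {
+      if (err) {
+        res.writeHead(404, ISOLATION_HEADERS);
+        res.end("Not found");
+        return;
+      }
+      res.writeHead(200, { ...ISOLATION_HEADERS, "Content-Type": contentTypeFor(file) });
+      res.end(data);
+    });
+  });
+}
+
+// Example (not executed here): createServer("lo-wasm").listen(8080, "127.0.0.1");
+```
+
+Binding to 127.0.0.1 matches the note above about browsers no longer accepting 0.0.0.0.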
+
+### Debugging setup
+
+For some time now, you can use DWARF information embedded by LLVM into the WASM
+to debug WASM in Chrome. You need to enable an experimental feature and install
+an additional extension. The whole setup is described in:
+
+<https://developer.chrome.com/blog/wasm-debugging-2020/>
+
+This way you don't need source maps (much faster linking!) and can resolve local
+WASM variables to C++ names!
+
+By default, the WASM debug build splits the DWARF information into an additional
+WASM file, suffixed with '.debug.wasm'.
+
+### Using Docker to cross-build with emscripten
+
+If you prefer a controlled environment that is easy to replicate
+across different machines (sadly emsdk install/activate is _not_
+stable over time, as e.g. nodejs versions evolve), consider the docker
+images we're providing.
+
+For the config/setup files, see
+<https://git.libreoffice.org/lode/+/ccb36979563635b51215477455953252c99ec013>
+
+Run
+
+    docker-compose build
+
+in the lode/docker dir to get the container prepared. Run
+
+    PARALLELISM=4 BUILD_OPTIONS= BUILD_TARGET=build docker-compose run --rm \
+        -e PARALLELISM -e BUILD_TARGET -e BUILD_OPTIONS builder
+
+to perform an actual `srcdir != builddir` build; the container mounts the
+checked-out git repo and the output dir via `docker-compose.yml` (so make
+sure the path names there match your setup).
+
+The lode setup expects, inside the lode/docker subdir, the following directories:
+
+- core (`git checkout`)
+- workdir (the output dir - gets written into)
+- cache (`ccache tree`)
+- tarballs (external project tarballs get written and cached there)
+
+
+## Ideas for an UNO bridge implementation
+
+My post to Discord #emscripten:
+
+"I'm looking for a way to do an abstract call
+from one WASM C++ object to another WASM C++ object, so like FFI / WebIDL,
+just within WASM. All my code is C++ and normally I have bridge code, with
+assembler to implement the function call /RTTI and exception semantics of the
+specified platform. Code is at
+<https://cgit.freedesktop.org/libreoffice/core/tree/bridges/source/cpp_uno>.
+I've read a bit about `call_indirect` and stuff, but I don't have yet a good
+idea, how I could implement this (and there is an initial feature/wasm branch
+for the interested). I probably need some fixed lookup table, like on iOS,
+because AFAIK you can't dynamically generate code in WASM. So any pointers or
+ideas for an implementation? I can disassemble some minimalistic WASM example
+and read clang code for `WASM_EmscriptenInvoke`, but if there were some
+standalone code or documentation I'm missing, that would be nice to know."
+
+We would basically go the same way as the other backends: write the bridge in
+C++, which is probably largely boilerplate code, but write the function call in WAT
+(<https://github.com/WebAssembly/wabt>), based on the LLVM WASM calling
+conventions in `WASM_EmscriptenInvoke`. I didn't get a reply to that question for
+hours. Maybe I'll open an Emscripten issue, if we really have to implement
+this.
+
+WASM dynamic dispatch:
+
+- <https://fitzgeraldnick.com/2018/04/26/how-does-dynamic-dispatch-work-in-wasm.html>
+
+### UNO bindings with Embind
+
+Right now there's a very rough implementation in place, with lots of different
+bits unimplemented, and it _might_ be leaking memory, i.e. lots of room for
+improvement! ;)
+
+Some usage examples of the current implementation from JavaScript:
+```js
+// inserts a string at the start of the Writer document.
+xModel = Module.getCurrentModelFromViewSh();
+xTextDocument = new Module.com$sun$star$text$XTextDocumentRef(xModel, Module.UnoReference_Query.UNO_QUERY);
+xText = xTextDocument.getText();
+xSimpleText = new Module.com$sun$star$text$XSimpleTextRef(xText, Module.UnoReference_Query.UNO_QUERY);
+xTextCursor = xSimpleText.createTextCursor();
+xTextRange = new Module.com$sun$star$text$XTextRangeRef(xTextCursor, Module.UnoReference_Query.UNO_QUERY);
+xTextRange.setString(new Module.OUString("string here!"));
+xModel.delete(); xTextDocument.delete(); xText.delete(); xSimpleText.delete(); xTextCursor.delete(); xTextRange.delete();
+```
+
+```js
+// changes each paragraph of the Writer document to a random color.
+xModel = Module.getCurrentModelFromViewSh();
+xTextDocument = new Module.com$sun$star$text$XTextDocumentRef(xModel, Module.UnoReference_Query.UNO_QUERY);
+xText = xTextDocument.getText();
+xEnumAccess = new Module.com$sun$star$container$XEnumerationAccessRef(xText, Module.UnoReference_Query.UNO_QUERY);
+xParaEnumeration = xEnumAccess.createEnumeration();
+
+while (xParaEnumeration.hasMoreElements()) {
+    xParagraph = new Module.com$sun$star$text$XTextRangeRef();
+    xParagraph.set(xParaEnumeration.nextElement(), Module.UnoReference_Query.UNO_QUERY);
+    if (xParagraph.is()) {
+        xParaProps = new Module.com$sun$star$beans$XPropertySetRef(xParagraph, Module.UnoReference_Query.UNO_QUERY);
+        xParaProps.setPropertyValue(new Module.OUString("CharColor"), new Module.Any(Math.floor(Math.random() * 0xFFFFFF), Module.UnoType.long));
+    }
+}
+```
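+
+Since Embind objects have to be released with `delete()`, as the first example does
+by hand, a small helper can collect the handles and release them even when an
+exception is thrown. This is just a sketch: `withHandles` and `track` are
+hypothetical names, not part of the current bindings.
+
+```javascript
+// Runs body(track) and afterwards calls delete() on every handle that was
+// passed through track(), also when body throws.
+function withHandles(body) {
+  const handles = [];
+  const track = (handle) => {
+    handles.push(handle);
+    return handle;
+  };
+  try {
+    return body(track);
+  } finally {
+    // Release in reverse order of creation.
+    for (const handle of handles.reverse()) {
+      if (handle && typeof handle.delete === "function") {
+        handle.delete();
+      }
+    }
+  }
+}
+
+// Possible usage with the calls from the examples above (not executed here):
+// withHandles((track) => {
+//   const xModel = track(Module.getCurrentModelFromViewSh());
+//   const xTextDocument = track(new Module.com$sun$star$text$XTextDocumentRef(
+//       xModel, Module.UnoReference_Query.UNO_QUERY));
+//   // ...
+// });
+```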
+
+
+
+## Tools for problem diagnosis
+
+* `nm -s` should list the symbols in the archive, based on the index generated by ranlib.
+ If you get linking errors, the archive probably has no index.
+
+
+## Emscripten filesystem access with threads
+
+This is closed, but not really fixed IMHO:
+
+- <https://github.com/emscripten-core/emscripten/issues/3922>
+
+## Dynamic libraries `/` modules in emscripten
+
+There is a good summary in:
+
+- <https://bugreports.qt.io/browse/QTBUG-63925>
+
+Summary: you can't use modules and threads.
+
+This is mentioned at the end of:
+
+- <https://github.com/emscripten-core/emscripten/wiki/Linking>
+
+The usage of `MAIN_MODULE` and `SIDE_MODULE` has other problems, a major one IMHO is symbol resolution at runtime only.
+So this works really more like plugins in the sense of symbol resolution without dependencies `/` rpath.
+
+There is some clang-level dynamic-linking in progress (WASM dlload). The following link is already a bit old,
+but I found it a good summary of problems to expect:
+
+- <https://iandouglasscott.com/2019/07/18/experimenting-with-webassembly-dynamic-linking-with-clang/>
+
+
+## Mixed information, links, problems, TODO
+
+More info on Qt WASM emscripten pthreads:
+
+- <https://wiki.qt.io/Qt_for_WebAssembly#Multithreading_Support>
+
+WASM needs `-pthread` at compile time, not just at link time, for atomics support. Alternatively you can provide
+`-s USE_PTHREADS=1`, but neither seems to work reliably on its own, so it is best to provide both.
+<https://github.com/emscripten-core/emscripten/issues/10370>
+
+The output file must have the suffix .o, otherwise the WASM files will get a
+`node.js` shebang (!) and ranlib won't be able to index the library (link errors).
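+
+To check quickly what a suspicious output file really is, one can look at its first
+bytes. The sketch below uses the hypothetical helper names `classifyHeader` and
+`classifyFile`; it only knows the WASM magic (`\0asm`), a shebang (`#!`) and the `ar`
+archive magic (`!<arch>\n`), so other object formats are reported as "unknown".
+
+```javascript
+const fs = require("fs");
+
+// Classifies a file by its leading bytes.
+function classifyHeader(bytes) {
+  const startsWith = (sig) => sig.every((b, i) => bytes[i] === b);
+  if (startsWith([0x00, 0x61, 0x73, 0x6d])) return "wasm"; // "\0asm"
+  if (startsWith([0x23, 0x21])) return "script"; // "#!"
+  if (startsWith([0x21, 0x3c, 0x61, 0x72, 0x63, 0x68, 0x3e, 0x0a])) return "archive"; // "!<arch>\n"
+  return "unknown";
+}
+
+// Reads at most the first 8 bytes of the file and classifies them.
+function classifyFile(file) {
+  const fd = fs.openSync(file, "r");
+  try {
+    const buf = Buffer.alloc(8);
+    const n = fs.readSync(fd, buf, 0, 8, 0);
+    return classifyHeader(buf.subarray(0, n));
+  } finally {
+    fs.closeSync(fd);
+  }
+}
+```
+
+A result of "script" for a file that should be an object file points to the shebang
+problem described above.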
+
+Qt with threads has a further memory limit. From Qt configure:
+````
+Project MESSAGE: Setting PTHREAD_POOL_SIZE to 4
+Project MESSAGE: Setting TOTAL_MEMORY to 1GB
+````
+
+You can actually allocate 4GB:
+
+- <https://bugzilla.mozilla.org/show_bug.cgi?id=1392234>
+
+LO uses a nested event loop to run dialogs in general, but that won't work, because you can't drive
+the browser event loop like VCL does with the system event loop in the various VCL backends.
+Changing this will need some major work (basically dropping Application::Execute).
+
+But with the known problems with exceptions and threads, this might change:
+
+- <https://github.com/emscripten-core/emscripten/pull/11518>
+- <https://github.com/emscripten-core/emscripten/issues/11503>
+- <https://github.com/emscripten-core/emscripten/issues/11233>
+- <https://github.com/emscripten-core/emscripten/issues/12035>
+
+We're also using emconfigure at the moment. Originally I patched emscripten, because it
+wouldn't create the correct a.out file for C++ configure tests. Later I found that
+`emconfigure` sets `EMMAKEN_JUST_CONFIGURE` to work around the problem.
+
+ICU bug:
+
+- <https://github.com/emscripten-core/emscripten/issues/10129>
+
+Alternative, probably:
+
+- <https://developer.mozilla.org/de/docs/Web/JavaScript/Reference/Global_Objects/Intl>
+
+There is a wasm64, but that still uses 32bit pointers!
+
+Old outdated docs:
+
+- <https://wiki.documentfoundation.org/Development/Emscripten>
+
+Reverted patch:
+
+- <https://cgit.freedesktop.org/libreoffice/core/commit/?id=0e21f6619c72f1e17a7b0a52b6317810973d8a3e>
+
+Generally <https://emscripten.org/docs/porting>:
+
+- <https://emscripten.org/docs/porting/guidelines/api_limitations.html#api-limitations>
+- <https://emscripten.org/docs/porting/files/file_systems_overview.html#file-system-overview>
+- <https://emscripten.org/docs/porting/pthreads.html>
+- <https://emscripten.org/docs/porting/emscripten-runtime-environment.html>
+
+This will be interesting:
+
+- <https://emscripten.org/docs/getting_started/FAQ.html#how-do-i-run-an-event-loop>
+
+This didn't help much yet:
+
+- <https://github.com/emscripten-ports>
+
+Emscripten supports standalone WASI binaries:
+
+- <https://github.com/emscripten-core/emscripten/wiki/WebAssembly-Standalone>
+- <https://www.qt.io/qt-examples-for-webassembly>
+- <http://qtandeverything.blogspot.com/2017/06/qt-for-web-assembly.html>
+- <http://qtandeverything.blogspot.com/2020/>
+- <https://emscripten.org/docs/api_reference/Filesystem-API.html>
+- <https://discuss.python.org/t/add-a-webassembly-wasm-runtime/3957/12>
+- <http://git.savannah.gnu.org/cgit/config.git>
+- <https://webassembly.org/specs/>
+- <https://developer.chrome.com/docs/native-client/>
+- <https://emscripten.org/docs/getting_started/downloads.html>
+- <https://github.com/openpgpjs/openpgpjs/blob/master/README.md#getting-started>
+- <https://developer.mozilla.org/en-US/docs/WebAssembly/Using_the_JavaScript_API>
+- <https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-intro.md>
+- <https://www.ip6.li/de/security/x.509_kochbuch/openssl-fuer-webassembly-compilieren>
+- <https://emscripten.org/docs/introducing_emscripten/about_emscripten.html#about-emscripten-porting-code>
+- <https://emscripten.org/docs/compiling/Building-Projects.html>
+
+## Building headless LibreOffice as WASM for use in another product
+
+### Set up Emscripten
+
+Follow the instructions in the first part of this document.
+
+### No Qt needed.
+
+You don't need any dependencies other than those that normally are
+downloaded and compiled when building LibreOffice.
+
+### Set up LO
+
+For instance, this autogen.input works for me:
+
+    --disable-debug
+    --enable-sal-log
+    --disable-crashdump
+    --host=wasm32-local-emscripten
+    --disable-gui
+    --with-main-module=writer
+
+For building LO core for use in COWASM, Emscripten 3.1.30 is known to
+work (and not just 2.0.31, which is what the LO+Qt5 work has been
+using).
+
+### That's all
+
+After all, in this case you are building LO core headless for it to be used by other software.
+
+Note that a soffice.wasm will be built, but that is just because of
+how the makefilery has been set up. We do need the soffice.data file
+that contains the in-memory file system needed by the LibreOffice
+Technology core code during run-time, though. That is at the moment
+built as a side-effect when building soffice.wasm.