author Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-17 05:47:55 +0000
committer Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-17 05:47:55 +0000
commit 31d6ff6f931696850c348007241195ab3b2eddc7
tree 615cb1c57ce9f6611bad93326b9105098f379609 /platform/nodejs/README.md
parent Initial commit.
Adding upstream version 1.55.0+dfsg. (upstream/1.55.0+dfsg)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'platform/nodejs/README.md')
-rw-r--r-- platform/nodejs/README.md 158
1 files changed, 158 insertions, 0 deletions
diff --git a/platform/nodejs/README.md b/platform/nodejs/README.md
new file mode 100644
index 0000000..0b3e3d8
--- /dev/null
+++ b/platform/nodejs/README.md
@@ -0,0 +1,158 @@
+# uBlock Origin Core
+
+This package contains the core filtering engines used in the uBlock Origin
+("uBO") extension. It has no external dependencies.
+
+## Installation
+
+Install: `npm install @gorhill/ubo-core`
+
+This is a very early version and the API is subject to change at any time.
+
+This package uses [native JavaScript modules](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules).
+
+
+## Description
+
+The package contains uBO's static network filtering engine ("SNFE"), whose
+purpose is to parse and enforce filter lists. The matching algorithm is highly
+efficient, and _especially_ optimized to match against large sets of pure
+hostnames.
+
+The SNFE can be fed filter lists from a variety of sources, such as [EasyList/EasyPrivacy](https://easylist.to/),
+[uBlock filters](https://github.com/uBlockOrigin/uAssets/tree/master/filters),
+as well as plain lists of domain names and lists in hosts file format (e.g. block lists from [The Block List Project](https://github.com/blocklistproject/Lists#the-block-list-project)
+or [Steven Black's HOSTS](https://github.com/StevenBlack/hosts#readme)).
+
+
+## Usage
+
+At the moment, there can be only one instance of the static network filtering
+engine ("SNFE"), whose proxy API must be imported as follows:
+
+```js
+import { StaticNetFilteringEngine } from '@gorhill/ubo-core';
+```
+
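+If you must import from a CommonJS module, use a dynamic `import()` (inside an
+`async` function, since CommonJS has no top-level `await`):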
+
+```js
+const { StaticNetFilteringEngine } = await import('@gorhill/ubo-core');
+```
+
+
+Create an instance of SNFE:
+
+```js
+const snfe = StaticNetFilteringEngine.create();
+```
+
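+Feed the SNFE with filter lists -- `useLists()` accepts an array of
+objects (or promises resolving to such objects) which expose the raw text of a
+list through the `raw` property, and optionally the name of the list through the
+`name` property. How you fetch the lists is up to you: in the example below,
+`fetch()` stands for your own loader resolving to the text of a list, not the
+standard Fetch API, which resolves to a `Response` object: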
+
+```js
+await snfe.useLists([
+ fetch('easylist').then(raw => ({ name: 'easylist', raw })),
+ fetch('easyprivacy').then(raw => ({ name: 'easyprivacy', raw })),
+]);
+```
+
+Now we are ready to match network requests:
+
+```js
+// Not blocked
+if ( snfe.matchRequest({
+ originURL: 'https://www.bloomberg.com/',
+ url: 'https://www.bloomberg.com/tophat/assets/v2.6.1/that.css',
+ type: 'stylesheet'
+}) !== 0 ) {
+ console.log(snfe.toLogData());
+}
+
+// Blocked
+if ( snfe.matchRequest({
+ originURL: 'https://www.bloomberg.com/',
+ url: 'https://securepubads.g.doubleclick.net/tag/js/gpt.js',
+ type: 'script'
+}) !== 0 ) {
+ console.log(snfe.toLogData());
+}
+
+// Unblocked
+if ( snfe.matchRequest({
+ originURL: 'https://www.bloomberg.com/',
+ url: 'https://sourcepointcmp.bloomberg.com/ccpa.js',
+ type: 'script'
+}) !== 0 ) {
+ console.log(snfe.toLogData());
+}
+```
+
+It is possible to pre-parse filter lists and save the intermediate results for
+later use -- useful to speed up the loading of filter lists. This will be
+documented eventually, but if you feel adventurous, you can look at the code
+and use this capability now if you figure out the details.
+
+---
+
+## Extras
+
+You can directly use specific APIs exposed by this package. Here are some of
+them, which are used internally by uBO's SNFE.
+
+### HNTrieContainer
+
+A well-optimized [compressed trie](https://en.wikipedia.org/wiki/Trie#Compressing_tries)
+container specialized to store and look up hostnames.
+
+The matching algorithm is designed for hostnames, i.e. the hostname labels
+making up a hostname are matched from right to left, such that `www.example.org`
+will be a match if `example.org` is stored in the trie, while
+`anotherexample.org` won't be a match.
+
+`HNTrieContainer` is designed to store a large number of hostnames with CPU and
+memory efficiency as a main concern -- and is a key component of uBO.
+
+To create and use a standalone `HNTrieContainer` object:
+
+```js
+import HNTrieContainer from '@gorhill/ubo-core/js/hntrie.js';
+
+const trieContainer = new HNTrieContainer();
+
+const aTrie = trieContainer.createOne();
+trieContainer.add(aTrie, 'example.org');
+trieContainer.add(aTrie, 'example.com');
+
+const anotherTrie = trieContainer.createOne();
+trieContainer.add(anotherTrie, 'foo.invalid');
+trieContainer.add(anotherTrie, 'bar.invalid');
+
+// matches() returns the position at which the match starts, or -1 when
+// there is no match.
+
+// Matches: returns 4
+console.log("trieContainer.matches(aTrie, 'www.example.org')", trieContainer.matches(aTrie, 'www.example.org'));
+
+// Does not match: returns -1
+console.log("trieContainer.matches(aTrie, 'www.foo.invalid')", trieContainer.matches(aTrie, 'www.foo.invalid'));
+
+// Does not match: returns -1
+console.log("trieContainer.matches(anotherTrie, 'www.example.org')", trieContainer.matches(anotherTrie, 'www.example.org'));
+
+// Matches: returns 0
+console.log("trieContainer.matches(anotherTrie, 'foo.invalid')", trieContainer.matches(anotherTrie, 'foo.invalid'));
+```
+
+The `reset()` method must be used to remove all the tries from a trie container;
+you can't remove a single trie from the container:
+
+```js
+trieContainer.reset();
+```
+
+When you reset a trie container, all references to previously created tries
+become invalid, i.e. `aTrie` and `anotherTrie` must not be used following a
+reset.
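
The label-wise, right-to-left matching described for `HNTrieContainer` can be illustrated without the package. The sketch below is not uBO's implementation: it swaps the compressed trie for a plain `Set`, uses a hypothetical helper named `matchHostnameSuffix()`, and only mirrors the documented return convention of `matches()` (the position at which the match starts, or -1). Which suffix gets reported when several stored hostnames match is an assumption of this sketch.

```js
// Hypothetical helper, NOT part of @gorhill/ubo-core: a plain Set stands in
// for the trie to illustrate label-aligned hostname suffix matching.
function matchHostnameSuffix(stored, hostname) {
    let pos = 0;
    for (;;) {
        // Candidate suffixes always start at a label boundary.
        if ( stored.has(hostname.slice(pos)) ) { return pos; }
        const dot = hostname.indexOf('.', pos);
        if ( dot === -1 ) { return -1; }
        // Skip the leftmost label and try the next, shorter suffix.
        pos = dot + 1;
    }
}

const stored = new Set([ 'example.org', 'example.com' ]);

console.log(matchHostnameSuffix(stored, 'www.example.org'));     // 4
console.log(matchHostnameSuffix(stored, 'example.org'));         // 0
console.log(matchHostnameSuffix(stored, 'anotherexample.org'));  // -1
```

Because candidate suffixes only ever start right after a dot, `anotherexample.org` cannot match a stored `example.org`, which is the behavior described for the trie above.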