From 31d6ff6f931696850c348007241195ab3b2eddc7 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Wed, 17 Apr 2024 07:47:55 +0200 Subject: Adding upstream version 1.55.0+dfsg. Signed-off-by: Daniel Baumann --- platform/nodejs/README.md | 158 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 158 insertions(+) create mode 100644 platform/nodejs/README.md (limited to 'platform/nodejs/README.md') diff --git a/platform/nodejs/README.md b/platform/nodejs/README.md new file mode 100644 index 0000000..0b3e3d8 --- /dev/null +++ b/platform/nodejs/README.md @@ -0,0 +1,158 @@ +# uBlock Origin Core + +The core filtering engines used in the uBlock Origin ("uBO") extension, and has +no external dependencies. + +## Installation + +Install: `npm install @gorhill/ubo-core` + +This is a very early version and the API is subject to change at any time. + +This package uses [native JavaScript modules](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules). + + +## Description + +The package contains uBO's static network filtering engine ("SNFE"), which +purpose is to parse and enforce filter lists. The matching algorithm is highly +efficient, and _especially_ optimized to match against large sets of pure +hostnames. + +The SNFE can be fed filter lists from a variety of sources, such as [EasyList/EasyPrivacy](https://easylist.to/), +[uBlock filters](https://github.com/uBlockOrigin/uAssets/tree/master/filters), +and also lists of domain names or hosts file format (i.e. block lists from [The Block List Project](https://github.com/blocklistproject/Lists#the-block-list-project), +[Steven Black's HOSTS](https://github.com/StevenBlack/hosts#readme), etc). + + +## Usage + +At the moment, there can be only one instance of the static network filtering +engine ("SNFE"), which proxy API must be imported as follow: + +```js +import { StaticNetFilteringEngine } from '@gorhill/ubo-core'; +``` + +If you must import as a NodeJS module: + +```js +const { StaticNetFilteringEngine } = await import('@gorhill/ubo-core'); +``` + + +Create an instance of SNFE: + +```js +const snfe = StaticNetFilteringEngine.create(); +``` + +Feed the SNFE with filter lists -- `useLists()` accepts an array of +objects (or promises to object) which expose the raw text of a list +through the `raw` property, and optionally the name of the list through the +`name` property (how you fetch the lists is up to you): + +```js +await snfe.useLists([ + fetch('easylist').then(raw => ({ name: 'easylist', raw })), + fetch('easyprivacy').then(raw => ({ name: 'easyprivacy', raw })), +]); +``` + +Now we are ready to match network requests: + +```js +// Not blocked +if ( snfe.matchRequest({ + originURL: 'https://www.bloomberg.com/', + url: 'https://www.bloomberg.com/tophat/assets/v2.6.1/that.css', + type: 'stylesheet' +}) !== 0 ) { + console.log(snfe.toLogData()); +} + +// Blocked +if ( snfe.matchRequest({ + originURL: 'https://www.bloomberg.com/', + url: 'https://securepubads.g.doubleclick.net/tag/js/gpt.js', + type: 'script' +}) !== 0 ) { + console.log(snfe.toLogData()); +} + +// Unblocked +if ( snfe.matchRequest({ + originURL: 'https://www.bloomberg.com/', + url: 'https://sourcepointcmp.bloomberg.com/ccpa.js', + type: 'script' +}) !== 0 ) { + console.log(snfe.toLogData()); +} +``` + +It is possible to pre-parse filter lists and save the intermediate results for +later use -- useful to speed up the loading of filter lists. This will be +documented eventually, but if you feel adventurous, you can look at the code +and use this capability now if you figure out the details. + +--- + +## Extras + +You can directly use specific APIs exposed by this package, here are some of +them, which are used internally by uBO's SNFE. + +### HNTrieContainer + +A well optimised [compressed trie](https://en.wikipedia.org/wiki/Trie#Compressing_tries) +container specialized to specifically store and lookup hostnames. + +The matching algorithm is designed for hostnames, i.e. the hostname labels +making up a hostname are matched from right to left, such that `www.example.org` +with be a match if `example.org` is stored into the trie, while +`anotherexample.org` won't be a match. + +`HNTrieContainer` is designed to store a large number of hostnames with CPU and +memory efficiency as a main concern -- and is a key component of uBO. + +To create and use a standalone `HNTrieContainer` object: + +```js +import HNTrieContainer from '@gorhill/ubo-core/js/hntrie.js'; + +const trieContainer = new HNTrieContainer(); + +const aTrie = trieContainer.createOne(); +trieContainer.add(aTrie, 'example.org'); +trieContainer.add(aTrie, 'example.com'); + +const anotherTrie = trieContainer.createOne(); +trieContainer.add(anotherTrie, 'foo.invalid'); +trieContainer.add(anotherTrie, 'bar.invalid'); + +// matches() return the position at which the match starts, or -1 when +// there is no match. + +// Matches: return 4 +console.log("trieContainer.matches(aTrie, 'www.example.org')", trieContainer.matches(aTrie, 'www.example.org')); + +// Does not match: return -1 +console.log("trieContainer.matches(aTrie, 'www.foo.invalid')", trieContainer.matches(aTrie, 'www.foo.invalid')); + +// Does not match: return -1 +console.log("trieContainer.matches(anotherTrie, 'www.example.org')", trieContainer.matches(anotherTrie, 'www.example.org')); + +// Matches: return 0 +console.log("trieContainer.matches(anotherTrie, 'foo.invalid')", trieContainer.matches(anotherTrie, 'foo.invalid')); +``` + +The `reset()` method must be used to remove all the tries from a trie container, +you can't remove a single trie from the container. + +```js +trieContainer.reset(); +``` + +When you reset a trie container, you can't use the reference to prior instances +of trie, i.e. `aTrie` and `anotherTrie` are no longer valid and shouldn't be +used following a reset. -- cgit v1.2.3