+# Miscellaneous notes and things to possibly do and such
+
+ (This started out as a short file of simple entries,
+ but then I went on adding more notes to individual items,
+ and it ended up as what it is now.)
+
+## Better node ID allocator
+
+ I’ve currently implemented the simplest possible node ID allocator: auto-increment without reuse;
+ for this, the client need only track a u32 of the next ID, and the VM side uses an ever-growing sparse array.
+
+ I should compare it with the following more serious allocator:
+ client and VM both track all allocated IDs in an array whose length equals the maximum number of IDs that have been live at any one time.
+ The VM code will look something like this:
+
+ ◊code.ts`
+ type u32 = number;
+
+ // Slot 0 is the document; a free slot stores the index of the next free slot,
+ // threading a singly linked free list through the array.
+ let nodes: (Node | u32)[] = [document];
+ let next_id: u32 = 1;
+
+ function free(id: u32): void {
+     // Push the freed slot onto the front of the free list.
+     nodes[id] = next_id;
+     next_id = id;
+ }
+
+ function push_node(node: Node): u32 {
+     // Grow the array when the free list is exhausted.
+     if (next_id === nodes.length) nodes.push(next_id + 1);
+     const id = next_id;
+     next_id = nodes[id] as u32;
+     nodes[id] = node;
+     return id;
+ }
+ `
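+ A quick sanity check of the free-list behaviour, as a self-contained copy with plain objects standing in for DOM nodes (illustrative only, not the real VM surface):

```typescript
// Free-list node ID allocator: a free slot stores the index of the
// next free slot, threading a singly linked list through the array.
type Slot = object | number;
const nodes: Slot[] = [{}]; // slot 0 stands in for the document
let next_id = 1;

function free(id: number): void {
  nodes[id] = next_id; // freed slot points at the old list head
  next_id = id;
}

function push_node(node: object): number {
  // Grow the array when the free list is exhausted.
  if (next_id === nodes.length) nodes.push(next_id + 1);
  const id = next_id;
  next_id = nodes[id] as number;
  nodes[id] = node;
  return id;
}

const a = push_node({}); // first allocation
const b = push_node({}); // second allocation
free(a);
const c = push_node({}); // reuses a's slot
console.log(a, b, c); // 1 2 1
```

+ The key property to check in a benchmark harness is exactly this reuse: after a free, the next allocation must hand back the freed ID.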
+
+ And the client side like this:
+
+ ◊code.rs`
+ use std::num::NonZeroU32;
+
+ #[derive(Clone, Copy)]
+ enum NodeSlot {
+     Free {
+         next_free_slot: NonZeroU32,
+     },
+     Full,
+ }
+
+ struct Vm {
+     nodes: Vec<NodeSlot>,
+     next_id: NonZeroU32,
+ }
+
+ impl Vm {
+     fn new() -> Self {
+         Vm {
+             // Slot 0 is the document, which is never freed.
+             nodes: vec![NodeSlot::Full],
+             next_id: NonZeroU32::new(1).unwrap(),
+         }
+     }
+
+     fn free(&mut self, id: NonZeroU32) {
+         self.nodes[id.get() as usize] = NodeSlot::Free { next_free_slot: self.next_id };
+         self.next_id = id;
+     }
+
+     fn allocate_node(&mut self) -> NonZeroU32 {
+         if self.next_id.get() as usize == self.nodes.len() {
+             self.nodes.push(NodeSlot::Free {
+                 next_free_slot: self.next_id.checked_add(1).unwrap(),
+             });
+         }
+         let id = self.next_id;
+         self.next_id = match self.nodes[id.get() as usize] {
+             NodeSlot::Free { next_free_slot } => next_free_slot,
+             NodeSlot::Full => unreachable!(),
+         };
+         self.nodes[id.get() as usize] = NodeSlot::Full;
+         id
+     }
+ }
+ `
+
+ If I ever decide that the client side needs to materialise an actual DOM-like tree
+ (which I’m trying to avoid, but which may become desirable for event propagation),
+ then this would become a fairly obvious choice
+ (though it might need some finesse to keep size down—2 billion nodes ought to be enough for anyone,
+ so I could limit the next free slot numbers to 31 bits and use the remaining bit as a discriminant over some kind of pointer to the actual node data).
+
+ Otherwise, I honestly don’t know;
+ the limit of only ever creating 2³² nodes in total is not too awful, and sparse arrays are probably fairly decent.
+ Try to benchmark it plausibly.
+ Try to keep it easy to swap in and out too.
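+ To that end, a common interface that both allocators could implement might look like this (a sketch; the names are assumptions, shown with the current auto-increment strategy):

```typescript
// Hypothetical common interface so allocators can be swapped for benchmarking.
interface NodeIdAllocator<N> {
  allocate(node: N): number; // returns the new node's ID
  free(id: number): void;
  get(id: number): N | undefined;
}

// The current simplest strategy: auto-increment without reuse;
// freeing just drops the reference, leaving the array sparse.
class AutoIncrementAllocator<N> implements NodeIdAllocator<N> {
  private nodes: (N | undefined)[] = [undefined]; // slot 0: the document
  allocate(node: N): number {
    this.nodes.push(node);
    return this.nodes.length - 1;
  }
  free(id: number): void {
    delete this.nodes[id]; // keeps the array sparse
  }
  get(id: number): N | undefined {
    return this.nodes[id];
  }
}

const alloc = new AutoIncrementAllocator<{ name: string }>();
const a = alloc.allocate({ name: "div" });
alloc.free(a);
const b = alloc.allocate({ name: "span" }); // IDs are never reused here
```

+ A benchmark could then be written once against `NodeIdAllocator` and run against both implementations.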
+
+## Terminology
+
+ The **host** (also known as the **VM host**)
+ is the bit of JavaScript that runs in the main browser thread,
+ and executes the VM bytecode.
+
+ The **client** (also known as the **VM client**)
+ is the bit that writes the VM bytecode and potentially receives serialised events.
+ In my primary designed use case,
+ it runs in a worker and is WebAssembly,
+ with ideally a SharedArrayBuffer or two as the channel between the client and host,
+ but it could be written in any language and even run on a completely different machine.
+ In fact, you could use this to implement something like Phoenix LiveView with a WebSocket as the channel—in that instance,
+ the VM client would run on the server!
+ (Mind you, you *would* need to be a bit careful about disconnect recovery.)
+
+## Event handling
+
+ Yes, the first part that’s not write-only.
+
+ 1. By some means, the host will learn what events the client is interested in.
+
+ (Qwik’s approach is interesting here:
+ it writes the event names of interest into the DOM,
+ and Qwik only even bothers registering listeners for the ones that are *on screen*,
+ IntersectionObserver and all.
+ I don’t intend to *start* that way,
+ but I’m keeping it in mind and definitely not ruling it out.)
+
+ 2. There will also be some means for the host to synchronously determine where `preventDefault` should be invoked.
+ (The two main approaches that I have in mind:
+ declare a JavaScript function that takes the target element and returns a boolean;
+ or specify a DOM attribute so that you can check ◊code.js`target.closest('[prevent-default]')`.)
+
+ 3. The host will add corresponding event listeners to the rendering root
+ (which will probably not be ◊code.js`document` or ◊code.js`document.documentElement`).
+
+ (It’d be quite at liberty to add and remove them dynamically, too.)
+
+ 4. When an event is fired, the host will serialise it along with the target’s node ID.
+ (The precise serialisation may vary by application, as with the instruction bytecode—
+ for optimal size and performance, this project is a VM factory, not a VM.)
+
+ 5. The client will deserialise this and can do what it likes with it,
+ which can, if desired, entail a total reimplementation of event dispatch except for `preventDefault`.
+ (It actually doesn’t even need a full tree,
+ only a mapping from node to parent
+ (which doesn’t even take more memory, given the ID-reusing allocator above:
+ limit it to ◊`2³¹ − 1` node IDs, and store the parent ◊code.rs`Option<NonZeroU31>` in ◊code.rs`NodeSlot::Full`):
+ trace the path to the root node,
+ fire capture phase while unwinding that,
+ then trace it again for bubbling.)
+
+ (In practice, I’m not decided on the precise event dispatch model.
+ There is some appeal to widget-based dispatch like Druid’s,
+ where propagation is entirely left to the component.)
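+ The parent-map dispatch described in step 5 might look like this (a pure sketch over plain numbers; the real event and listener types are undecided):

```typescript
// Capture-then-bubble dispatch using only a child → parent map:
// no child lists are needed, just the path from target to root.
type NodeId = number;

function dispatch(
  parents: Map<NodeId, NodeId>, // the root has no entry
  target: NodeId,
  fire: (id: NodeId, phase: "capture" | "bubble") => void,
): void {
  // Trace the path from the target up to the root.
  const path: NodeId[] = [target];
  for (let id = parents.get(target); id !== undefined; id = parents.get(id)) {
    path.push(id);
  }
  // Capture: root towards target; bubble: target towards root.
  for (let i = path.length - 1; i >= 0; i--) fire(path[i], "capture");
  for (const id of path) fire(id, "bubble");
}

// Tree: 1 → 2 → 3, with 3 as the event target.
const order: string[] = [];
dispatch(new Map([[3, 2], [2, 1]]), 3, (id, phase) => order.push(`${phase}:${id}`));
console.log(order.join(" "));
// capture:1 capture:2 capture:3 bubble:3 bubble:2 bubble:1
```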
+
+ With the proposed node → parent mapping,
+ the client side will still not be able to enumerate a node’s children,
+ so a single “remove node” instruction (… huh, haven’t implemented that yet)
+ may need to be followed by a large number of “free” instructions,
+ because I expect that’ll still be the cheapest way of implementing it.
+ I’ll still want a “remove all children” opcode, too,
+ because for nodes with many children its performance should be better
+ (though I should benchmark that at length in various realistic circumstances).
+
+## The race condition problem
+
+ It sounds like everything’s neat and sorted at this point,
+ but there’s one rather uncomfortable problem remaining: race conditions.
+ This architecture is asynchronous,
+ so the host could send an event for a node that has been removed or moved.
+ *Most* of the time, these race conditions should be harmless,
+ because root-finding on a removed node will fail due to a parentless or freed node,
+ or event semantics are probably equivalent on a moved node in its before and after locations.
+ But both could be bad: if a node is removed and freed,
+ that ID may already have been reallocated,
+ and so the event could be fired on the wrong node altogether.
+ The sorts of checks that would be required to catch that
+ (some kind of DOM epoch tracking) would be onerous and usually wasteful.
+ A moved node could also be bad: in a lazy-rendered scrolling component,
+ nodes are commonly reused, so the same node ID may refer to different content by the time the event arrives.
+ In each case, the problem is rendered extremely uncommon by the intended very low latency between client and host,
+ and mitigated further by the unlikelihood of a user event so very immediately following a UI state change,
+ but I’d find it a genuine concern for something like LiveView on high-latency connections, because that easily gives a problem window *baseline* of hundreds of milliseconds;
+ there I’d probably want to do event dispatch by something other than node ID,
+ e.g. serialise any event-data DOM attributes from the event target’s lineage.
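+ That attribute-lineage approach could be sketched like so (elements simulated as plain records; the `data-on-click` attribute name is an assumption):

```typescript
// Dispatch by event-data attributes rather than node ID: walk the
// target's lineage and collect every handler annotation found.
// This survives ID reuse, since nothing depends on the ID being current.
type FakeElement = { attrs: Record<string, string>; parent?: FakeElement };

function collectEventData(target: FakeElement, attr: string): string[] {
  const found: string[] = [];
  for (let el: FakeElement | undefined = target; el !== undefined; el = el.parent) {
    if (attr in el.attrs) found.push(el.attrs[attr]);
  }
  return found;
}

// A list item inside an annotated list: both annotations are collected,
// target first, then its ancestors.
const list: FakeElement = { attrs: { "data-on-click": "select-list" } };
const item: FakeElement = { attrs: { "data-on-click": "select-item" }, parent: list };
const collected = collectEventData(item, "data-on-click");
```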
+
+## Limitations
+
+ No dynamic `preventDefault`.
+
+ No rewriting of an ◊code.css`a[href]` in the mousedown or click handler,
+ which can occasionally be genuinely useful (and I don’t count Google’s historical use of it in search result pages),
+ and `preventDefault()` followed by href change followed by `click()` is not a suitable alternative because it’ll lose the effect of modifiers like ◊kbd`Ctrl`.
+ This will be one of those “redesign your UI” cases, which will I think *normally* be a good thing.
+
+ One concern of doing this asynchronously is that user-activation-gated API calls might not work.
+ <https://html.spec.whatwg.org/multipage/interaction.html#tracking-user-activation> has the spec details.
+ We should be OK now (though when they first started doing this we probably *wouldn’t* have been):
+ it’s time-based, and although the transient activation duration is not specified and not supposed to be more than a few seconds,
+ that should be enough for conceptually-relatively-synchronous actions even if the VM client is on the other side of the world.
+ (In Firefox as I write this on 2021-09-23,
+ it’s controlled by dom.user_activation.transient.timeout which defaults to 5000 milliseconds.)
+
+## Opcodes to implement
+
+ • `Remove { node: ChildNode }`: equivalent to `node.remove()`.
+ Does *not* free: you might want to put it back later.
+
+## Other potential opcodes
+
+ (But they need justification, whether of functionality or performance.)
+
+ • `AppendChildren { parent: ParentNode, number_of_children: u32, children: [...Node] }`:
+ equivalent to `parent.append(...children)`.
+ Compare performance with a sequence of `appendChild`.
+
+ • `AppendUntrackedString { parent: ParentNode, data: string }`:
+ equivalent to CreateTextNode followed by AppendChild, but without allocating a node ID.
+ The host can use `parent.append(data)` instead of `parent.appendChild(document.createTextNode(data))`.
+ Very unlikely to be warranted, but `AppendChildren` got me thinking of it;
+ I don’t think there’d be any sane way of integrating untracked string children into `AppendChildren`.
+
+ • `RemoveAllChildren { node: ParentNode }`:
+ removes all children;
+ equivalent to `node.textContent = ""` or any number of ways of doing the same thing.
+ (Doesn’t *free* them, though.)
+
+ • `SetUntrackedInnerHtml { parent: ParentNode, inner_html: string }`:
+ sets innerHTML.
+ The resulting child nodes will be invisible to both client and host.
+ At first glance this might seem to complicate event handling for the host, because now it may need to traverse the target’s parents until it finds one with a node ID,
+ but browser extensions mean that we could never quite trust that all targets would have node IDs anyway,
+ so making it more robust in this way is probably for the best.
+ This could be justified as a performance thing for what I’ll call “static trees”,
+ meaning parts of components that don’t contain anything where node identification is required
+ (though performance claims will naturally need to be verified).
+ In whatever sort of component layer my own client implementations end up with,
+ this could really start to shine if components could be identified as being static trees,
+ and allow such components to be part of static trees,
+ and merge them recursively—
+ I don’t know of any framework where components are able to be zero-cost like this,
+ even in environments that try to be at least a little bit clever about it
+ (I’m talking about DOM ones; among SSR ones there are plenty that write strings directly to a buffer, so the difference there would be negligible).
+ Any tracked children removed by this operation would still be allocated and need freeing.
+
+ • `SetTextContent { node: Node, data: string }`:
+ sets `textContent`.
+ TODO: verify performance on setting `data` and `textContent` on `CharacterData`,
+ as they do exactly the same thing for `CharacterData` and maybe I should remove `SetData` in favour of `SetTextContent`.
+ If this were implemented,
+ I’d probably also ditch plans for `RemoveAllChildren`
+ in favour of `SetTextContent { node, data: "" }`
+ as saving four bytes on the instruction isn’t worth the cost to the VM size.
+
+ • `SetTrackedInnerHtml { parent: ParentNode, inner_html: string }`:
+ sets innerHTML and then traverses the children recursively in tree order,
+ assigning a node ID to each.
+ (The client must naturally have done something equivalent.)
+ It’s *possible* that this could be faster than the corresponding sequence of `createElement`/`createTextNode`/`appendChild` calls,
+ but I think it’d need to be quite big before it’d stand much chance of being worthwhile.
+ As far as wire size is concerned,
+ the bytecode is very compact,
+ so serialised HTML would be unlikely to be smaller.
+ I’d levy a pretty big burden of proof on this one.
+
+ • `InsertTrackedAdjacentHtml { parent: ParentNode, position: enum { BeforeBegin, AfterBegin, BeforeEnd, AfterEnd }, html: string }`:
+ `SetTrackedInnerHtml`, but for `insertAdjacentHTML` instead of `innerHTML`.
+ A smidgeon more complicated.
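+ As a concreteness check on operand layout, a host-side decoder for something like `AppendChildren` might look like this (the flat little-endian u32 encoding is an assumption; the real bytecode is application-defined, since this project is a VM factory, not a VM):

```typescript
// Hypothetical decoder for AppendChildren over a u32 word stream:
// [parent, number_of_children, child0, …, childN-1], opcode already consumed.
function decodeAppendChildren(
  words: Uint32Array,
  at: number,
): { parent: number; children: number[]; next: number } {
  const parent = words[at];
  const n = words[at + 1];
  const children = Array.from(words.subarray(at + 2, at + 2 + n));
  return { parent, children, next: at + 2 + n };
}

// parent 7, three children (10, 11, 12); decoding resumes at word 5.
const insn = decodeAppendChildren(new Uint32Array([7, 3, 10, 11, 12]), 0);
```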
+
+## On dictionaries and string interning
+
+ I doubt that a dynamic dictionary will be worthwhile, though I could turn out to be wrong.
+ But a static dictionary, absolutely: I like the idea of applying profile-guided optimisation.
+ Definitely going to need to generate the remote-dom-vm JS and client together for this.
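+ A static dictionary could be as little as a generated string table shared by host and client, with common strings referenced by index and everything else sent inline (a sketch; the dictionary contents here are made up):

```typescript
// Profile-guided static dictionary: generation would pick the most
// frequent strings; instructions then reference them by index.
const DICTIONARY: readonly string[] = ["div", "span", "class", "click"];

type StringRef = { dict: number } | { inline: string };

function encodeString(s: string): StringRef {
  const i = DICTIONARY.indexOf(s);
  return i >= 0 ? { dict: i } : { inline: s };
}

function decodeString(ref: StringRef): string {
  return "dict" in ref ? DICTIONARY[ref.dict] : ref.inline;
}
```

+ Since both ends are generated from the same profile, the table never needs to be transmitted at all.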