+# Miscellaneous notes and things to possibly do and such
+
+ (This started out as a short file of simple entries,
+ but then I went on adding more notes to individual items,
+ and it ended up as what it is now.)
+
+## Better node ID allocator
+
+ I’ve currently implemented the simplest possible node ID allocator: auto-increment without reuse;
+ for this, the client need only track a u32 of the next ID, and the VM side uses an ever-growing sparse array.
+
+ I should compare it with the following more serious allocator:
+ client and VM both track all allocated IDs in an array whose length equals the maximum number of IDs that have been live at any one time.
+ The VM code will look something like this:
+
+ ◊code.ts`
+ type u32 = number;
+
+ // Slot 0 is the document; a free slot stores the index of the next free slot,
+ // threading a singly linked free list through the array.
+ let nodes: (Node | u32)[] = [document];
+ let next_id: u32 = 1;
+
+ function free(id: u32): void {
+     // Push the freed slot onto the front of the free list.
+     nodes[id] = next_id;
+     next_id = id;
+ }
+
+ function push_node(node: Node): u32 {
+     // Grow the array when the free list is exhausted.
+     if (next_id === nodes.length) nodes.push(next_id + 1);
+     const id = next_id;
+     next_id = nodes[id] as u32;
+     nodes[id] = node;
+     return id;
+ }
+ `
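+ A quick sanity check of the free-list behaviour, as a self-contained copy with plain objects standing in for DOM nodes (illustrative only, not the real VM surface):

```typescript
// Free-list node ID allocator: a free slot stores the index of the
// next free slot, threading a singly linked list through the array.
type Slot = object | number;
const nodes: Slot[] = [{}]; // slot 0 stands in for the document
let next_id = 1;

function free(id: number): void {
  nodes[id] = next_id; // freed slot points at the old list head
  next_id = id;
}

function push_node(node: object): number {
  // Grow the array when the free list is exhausted.
  if (next_id === nodes.length) nodes.push(next_id + 1);
  const id = next_id;
  next_id = nodes[id] as number;
  nodes[id] = node;
  return id;
}

const a = push_node({}); // first allocation
const b = push_node({}); // second allocation
free(a);
const c = push_node({}); // reuses a's slot
console.log(a, b, c); // 1 2 1
```

+ The key property to check in a benchmark harness is exactly this reuse: after a free, the next allocation must hand back the freed ID.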
+
+ And the client side like this:
+
+ ◊code.rs`
+ use std::num::NonZeroU32;
+
+ #[derive(Clone, Copy)]
+ enum NodeSlot {
+     Free {
+         next_free_slot: NonZeroU32,
+     },
+     Full,
+ }
+
+ struct Vm {
+     nodes: Vec<NodeSlot>,
+     next_id: NonZeroU32,
+ }
+
+ impl Vm {
+     fn new() -> Self {
+         Vm {
+             // Slot 0 is the document, which is never freed.
+             nodes: vec![NodeSlot::Full],
+             next_id: NonZeroU32::new(1).unwrap(),
+         }
+     }
+
+     fn free(&mut self, id: NonZeroU32) {
+         self.nodes[id.get() as usize] = NodeSlot::Free { next_free_slot: self.next_id };
+         self.next_id = id;
+     }
+
+     fn allocate_node(&mut self) -> NonZeroU32 {
+         if self.next_id.get() as usize == self.nodes.len() {
+             self.nodes.push(NodeSlot::Free {
+                 next_free_slot: self.next_id.checked_add(1).unwrap(),
+             });
+         }
+         let id = self.next_id;
+         self.next_id = match self.nodes[id.get() as usize] {
+             NodeSlot::Free { next_free_slot } => next_free_slot,
+             NodeSlot::Full => unreachable!(),
+         };
+         self.nodes[id.get() as usize] = NodeSlot::Full;
+         id
+     }
+ }
+ `
+
+ If I ever decide that the client side needs to materialise an actual DOM-like tree
+ (which I’m trying to avoid, but which may become desirable for event propagation),
+ then this would become a fairly obvious choice
+ (though it might need some finesse to keep size down—2 billion nodes ought to be enough for anyone,
+ so I could limit the next free slot numbers to 31 bits and use the remaining bit as a discriminant over some kind of pointer to the actual node data).
+
+ Otherwise, I honestly don’t know;
+ the limit of only ever creating 2³² nodes in total is not too awful, and sparse arrays are probably fairly decent.
+ Try to benchmark it plausibly.
+ Try to keep it easy to swap in and out too.
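+ To that end, a common interface that both allocators could implement might look like this (a sketch; the names are assumptions, shown with the current auto-increment strategy):

```typescript
// Hypothetical common interface so allocators can be swapped for benchmarking.
interface NodeIdAllocator<N> {
  allocate(node: N): number; // returns the new node's ID
  free(id: number): void;
  get(id: number): N | undefined;
}

// The current simplest strategy: auto-increment without reuse;
// freeing just drops the reference, leaving the array sparse.
class AutoIncrementAllocator<N> implements NodeIdAllocator<N> {
  private nodes: (N | undefined)[] = [undefined]; // slot 0: the document
  allocate(node: N): number {
    this.nodes.push(node);
    return this.nodes.length - 1;
  }
  free(id: number): void {
    delete this.nodes[id]; // keeps the array sparse
  }
  get(id: number): N | undefined {
    return this.nodes[id];
  }
}

const alloc = new AutoIncrementAllocator<{ name: string }>();
const a = alloc.allocate({ name: "div" });
alloc.free(a);
const b = alloc.allocate({ name: "span" }); // IDs are never reused here
```

+ A benchmark could then be written once against `NodeIdAllocator` and run against both implementations.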
+
+## Terminology
+
+ The **host** (also known as the **VM host**)
+ is the bit of JavaScript that runs in the main browser thread,
+ and executes the VM bytecode.
+
+ The **client** (also known as the **VM client**)
+ is the bit that writes the VM bytecode and potentially receives serialised events.
+ In my primary designed use case,
+ it runs in a worker and is WebAssembly,
+ with ideally a SharedArrayBuffer or two as the channel between the client and host,
+ but it could be written in any language and even run on a completely different machine.
+ In fact, you could use this to implement something like Phoenix LiveView with a WebSocket as the channel—in that instance,
+ the VM client would run on the server!
+ (Mind you, you *would* need to be a bit careful about disconnect recovery.)
+
+## Event handling
+
+ Yes, the first part that’s not write-only.
+
+ 1. By some means, the host will learn what events the client is interested in.
+
+ (Qwik’s approach is interesting here:
+ it writes the event names of interest into the DOM,
+ and Qwik only even bothers registering listeners for the ones that are *on screen*,
+ IntersectionObserver and all.
+ I don’t intend to *start* that way,
+ but I’m keeping it in mind and definitely not ruling it out.)
+
+ 2. There will also be some means for the host to synchronously determine where `preventDefault` should be invoked.
+ (The two main approaches that I have in mind:
+ declare a JavaScript function that takes the target element and returns a boolean;
+ or specify a DOM attribute so that you can check ◊code.js`target.closest('[prevent-default]')`.)
+
+ 3. The host will add corresponding event listeners to the rendering root
+ (which will probably not be ◊code.js`document` or ◊code.js`document.documentElement`).
+
+ (It’d be quite at liberty to add and remove them dynamically, too.)
+
+ 4. When an event is fired, the host will serialise it along with the target’s node ID.
+ (The precise serialisation may vary by application, as with the instruction bytecode—
+ for optimal size and performance, this project is a VM factory, not a VM.)
+
+ 5. The client will deserialise this and can do what it likes with it,
+ which can, if desired, entail a total reimplementation of event dispatch except for `preventDefault`.
+ (It actually doesn’t even need a full tree,
+ only a mapping from node to parent
+ (which doesn’t even take more memory, given the ID-reusing allocator above:
+ limit it to ◊`2³¹ − 1` node IDs, and store the parent ◊code.rs`Option<NonZeroU31>` in ◊code.rs`NodeSlot::Full`):
+ trace the path to the root node,
+ fire capture phase while unwinding that,
+ then trace it again for bubbling.)
+
+ (In practice, I’m not decided on the precise event dispatch model.
+ There is some appeal to widget-based dispatch like Druid’s,
+ where propagation is entirely left to the component.)
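+ The parent-map dispatch described in step 5 might look like this (a pure sketch over plain numbers; the real event and listener types are undecided):

```typescript
// Capture-then-bubble dispatch using only a child → parent map:
// no child lists are needed, just the path from target to root.
type NodeId = number;

function dispatch(
  parents: Map<NodeId, NodeId>, // the root has no entry
  target: NodeId,
  fire: (id: NodeId, phase: "capture" | "bubble") => void,
): void {
  // Trace the path from the target up to the root.
  const path: NodeId[] = [target];
  for (let id = parents.get(target); id !== undefined; id = parents.get(id)) {
    path.push(id);
  }
  // Capture: root towards target; bubble: target towards root.
  for (let i = path.length - 1; i >= 0; i--) fire(path[i], "capture");
  for (const id of path) fire(id, "bubble");
}

// Tree: 1 → 2 → 3, with 3 as the event target.
const order: string[] = [];
dispatch(new Map([[3, 2], [2, 1]]), 3, (id, phase) => order.push(`${phase}:${id}`));
console.log(order.join(" "));
// capture:1 capture:2 capture:3 bubble:3 bubble:2 bubble:1
```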
+
+ With the proposed node → parent mapping,
+ the client side will still not be able to enumerate a node’s children,
+ so a single “remove node” instruction (… huh, haven’t implemented that yet)
+ may need to be followed by a large number of “free” instructions,
+ because I expect that’ll still be the cheapest way of implementing it.
+ I’ll still want a “remove all children” opcode, too,
+ because for nodes with many children its performance should be better
+ (though I should benchmark that at length in various realistic circumstances).
+
+## The race condition problem
+
+ It sounds like everything’s neat and sorted at this point,
+ but there’s one rather uncomfortable problem remaining: race conditions.
+ This architecture is asynchronous,
+ so the host could send an event for a node that has been removed or moved.
+ *Most* of the time, these race conditions should be harmless,
+ because root-finding on a removed node will fail due to a parentless or freed node,
+ or event semantics are probably equivalent on a moved node in its before and after locations.
+ But both could be bad: if a node is removed and freed,
+ that ID may already have been reallocated,
+ and so the event could be fired on the wrong node altogether.
+ The sorts of checks that would be required to catch that
+ (some kind of DOM epoch tracking) would be onerous and usually wasteful.
+ A moved node could also be bad: in a lazy-rendered scrolling component,
+ nodes are commonly reused, so the same node ID may refer to different content by the time the event arrives.
+ In each case, the problem is rendered extremely uncommon by the intended very low latency between client and host,
+ and mitigated further by the unlikelihood of a user event so very immediately following a UI state change,
+ but I’d find it a genuine concern for something like LiveView on high-latency connections, because that easily gives a problem window *baseline* of hundreds of milliseconds;
+ there I’d probably want to do event dispatch by something other than node ID,
+ e.g. serialise any event-data DOM attributes from the event target’s lineage.
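+ That attribute-lineage approach could be sketched like so (elements simulated as plain records; the `data-on-click` attribute name is an assumption):

```typescript
// Dispatch by event-data attributes rather than node ID: walk the
// target's lineage and collect every handler annotation found.
// This survives ID reuse, since nothing depends on the ID being current.
type FakeElement = { attrs: Record<string, string>; parent?: FakeElement };

function collectEventData(target: FakeElement, attr: string): string[] {
  const found: string[] = [];
  for (let el: FakeElement | undefined = target; el !== undefined; el = el.parent) {
    if (attr in el.attrs) found.push(el.attrs[attr]);
  }
  return found;
}

// A list item inside an annotated list: both annotations are collected,
// target first, then its ancestors.
const list: FakeElement = { attrs: { "data-on-click": "select-list" } };
const item: FakeElement = { attrs: { "data-on-click": "select-item" }, parent: list };
const collected = collectEventData(item, "data-on-click");
```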
+
+## Limitations
+
+ No dynamic `preventDefault`.
+
+ No rewriting of an ◊code.css`a[href]` in the mousedown or click handler,
+ which can occasionally be genuinely useful (and I don’t count Google’s historical use of it in search result pages),
+ and `preventDefault()` followed by href change followed by `click()` is not a suitable alternative because it’ll lose the effect of modifiers like ◊kbd`Ctrl`.
+ This will be one of those “redesign your UI” cases, which will I think *normally* be a good thing.
+
+ One concern of doing this asynchronously is that user-activation-gated API calls might not work.
+ <https://html.spec.whatwg.org/multipage/interaction.html#tracking-user-activation> has the spec details.
+ We should be OK now (though when they first started doing this we probably *wouldn’t* have been):
+ it’s time-based, and although the transient activation duration is not specified and not supposed to be more than a few seconds,
+ that should be enough for conceptually-relatively-synchronous actions even if the VM client is on the other side of the world.
+ (In Firefox as I write this on 2021-09-23,
+ it’s controlled by dom.user_activation.transient.timeout which defaults to 5000 milliseconds.)
+
+## Opcodes to implement
+
+ • `Remove { node: ChildNode }`: equivalent to `node.remove()`.
+ Does *not* free: you might want to put it back later.
+
+## Other potential opcodes
+
+ (But they need justification, whether of functionality or performance.)
+
+ • `AppendChildren { parent: ParentNode, number_of_children: u32, children: [...Node] }`:
+ equivalent to `parent.append(...children)`.
+ Compare performance with a sequence of `appendChild`.
+
+ • `AppendUntrackedString { parent: ParentNode, data: string }`:
+ equivalent to CreateTextNode followed by AppendChild, but without allocating a node ID.
+ The host can use `parent.append(data)` instead of `parent.appendChild(document.createTextNode(data))`.
+ Very unlikely to be warranted, but `AppendChildren` got me thinking of it;
+ I don’t think there’d be any sane way of integrating untracked string children into `AppendChildren`.
+
+ • `RemoveAllChildren { node: ParentNode }`:
+ removes all children;
+ equivalent to `node.textContent = ""` or any number of ways of doing the same thing.
+ (Doesn’t *free* them, though.)
+
+ • `SetUntrackedInnerHtml { parent: ParentNode, inner_html: string }`:
+ sets innerHTML.
+ The resulting child nodes will be invisible to both client and host.
+ At first glance this might seem to complicate event handling for the host, because now it may need to traverse the target’s parents until it finds one with a node ID,
+ but browser extensions mean that we could never quite trust that all targets would have node IDs anyway,
+ so making it more robust in this way is probably for the best.
+ This could be justified as a performance thing for what I’ll call “static trees”,
+ meaning parts of components that don’t contain anything where node identification is required
+ (though performance claims will naturally need to be verified).
+ In whatever sort of component layer my own client implementations end up with,
+ this could really start to shine if components could be identified as being static trees,
+ and allow such components to be part of static trees,
+ and merge them recursively—
+ I don’t know of any framework where components are able to be zero-cost like this,
+ even in environments that try to be at least a little bit clever about it
+ (I’m talking about DOM ones; among SSR ones there are plenty that write strings directly to a buffer, so the difference there would be negligible).
+ Any tracked children removed by this operation would still be allocated and need freeing.
+
+ • `SetTextContent { node: Node, data: string }`:
+ sets `textContent`.
+ TODO: verify performance on setting `data` and `textContent` on `CharacterData`,
+ as they do exactly the same thing for `CharacterData` and maybe I should remove `SetData` in favour of `SetTextContent`.
+ If this were implemented,
+ I’d probably also ditch plans for `RemoveAllChildren`
+ in favour of `SetTextContent { node, data: "" }`
+ as saving four bytes on the instruction isn’t worth the cost to the VM size.
+
+ • `SetTrackedInnerHtml { parent: ParentNode, inner_html: string }`:
+ sets innerHTML and then traverses the children recursively in tree order,
+ assigning a node ID to each.
+ (The client must naturally have done something equivalent.)
+ It’s *possible* that this could be faster than the corresponding sequence of `createElement`/`createTextNode`/`appendChild` calls,
+ but I think it’d need to be quite big before it’d stand much chance of being worthwhile.
+ As far as wire size is concerned,
+ the bytecode is very compact,
+ so serialised HTML would be unlikely to be smaller.
+ I’d levy a pretty big burden of proof on this one.
+
+ • `InsertTrackedAdjacentHtml { parent: ParentNode, position: enum { BeforeBegin, AfterBegin, BeforeEnd, AfterEnd }, html: string }`:
+ `SetTrackedInnerHtml`, but for `insertAdjacentHTML` instead of `innerHTML`.
+ A smidgeon more complicated.
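+ As a concreteness check on operand layout, a host-side decoder for something like `AppendChildren` might look like this (the flat little-endian u32 encoding is an assumption; the real bytecode is application-defined, since this project is a VM factory, not a VM):

```typescript
// Hypothetical decoder for AppendChildren over a u32 word stream:
// [parent, number_of_children, child0, …, childN-1], opcode already consumed.
function decodeAppendChildren(
  words: Uint32Array,
  at: number,
): { parent: number; children: number[]; next: number } {
  const parent = words[at];
  const n = words[at + 1];
  const children = Array.from(words.subarray(at + 2, at + 2 + n));
  return { parent, children, next: at + 2 + n };
}

// parent 7, three children (10, 11, 12); decoding resumes at word 5.
const insn = decodeAppendChildren(new Uint32Array([7, 3, 10, 11, 12]), 0);
```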
+
+## On dictionaries and string interning
+
+ I doubt that a dynamic dictionary will be worthwhile, though I could turn out to be wrong.
+ But a static dictionary, absolutely: I like the idea of applying profile-guided optimisation.
+ Definitely going to need to generate the remote-dom-vm JS and client together for this.
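+ A static dictionary could be as little as a generated string table shared by host and client, with common strings referenced by index and everything else sent inline (a sketch; the dictionary contents here are made up):

```typescript
// Profile-guided static dictionary: generation would pick the most
// frequent strings; instructions then reference them by index.
const DICTIONARY: readonly string[] = ["div", "span", "class", "click"];

type StringRef = { dict: number } | { inline: string };

function encodeString(s: string): StringRef {
  const i = DICTIONARY.indexOf(s);
  return i >= 0 ? { dict: i } : { inline: s };
}

function decodeString(ref: StringRef): string {
  return "dict" in ref ? DICTIONARY[ref.dict] : ref.inline;
}
```

+ Since both ends are generated from the same profile, the table never needs to be transmitted at all.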