Flyweight proxy

All the way back in 2012 (before JavaScript frameworks were all the rage), I started exploring more "serious" JS development beyond the wizardry of JQuery. We all have our moments of weakness.

A lot was being written about development within the confines of the browser. An excellent book I came across at the time was Learning JavaScript Design Patterns by Addy Osmani. It is still a relevant read, packed with proven programming patterns adapted to JavaScript, as well as some historical trends that the ecosystem was exploring back then. It's available for free on his website.

The Flyweight pattern

One particular design pattern that caught my attention in the book was the Flyweight. Here's how Addy describes it:

The Flyweight pattern [...] aims to minimize the use of memory in an application by sharing as much data as possible with related objects

You can read the chapter, but the basic idea is to separate intrinsic and extrinsic properties of an object in order to optimise memory usage. In other words, we separate an object's properties into those that are unique about the object (intrinsic), and things that are a factor of something else (extrinsic), that can be shared among other similar objects. For example, consider a "comment" object that could be used in a discussion board:

let comment = {
id: 54719,
username: "nemzes",
joined: "2018-06-09",
userRank: "Seasoned",
totalPosts: 122,
votingScore: 1982,
profileImage: "https://…",
date: "2020-08-19",
text: "I disagree, and here's why…",
likes: 3,
};

If we have many of these comment objects repeating the same user data we can extract the extrinsic information into a "manager" object, and link the two objects via an ID. This way comments that belong to the same user don't duplicate that user's information.

let comment = {
id: 54719,
userId: 3282,
date: "2020-08-19",
text: "I disagree, and here's why…",
likes: 3,
};

const userManager = {
// …
3282: {
username: "nemzes",
joined: "2018-06-09",
userRank: "Seasoned",
totalPosts: 122,
votingScore: 1982,
profileImage: "https://…",
},
// …
};

We reduced data duplication — and thus memory usage — at the cost of a more complex data structure. If you are familiar with database design you will recognise this process as data normalisation. We'll keep calling it the Flyweight pattern.

There are several ways to implement the pattern. If we are using classes to define our comment system, we can do something like this:

class Comment {
constructor(id, userId, date, text, likes) {
this.id = id;
this.userId = userId;
this.date = date;
this.text = text;
this.likes = likes;
}

getUser() {
return userManager[this.userId];
}
}

const myComment = new Comment(54719, 3282, "2020-08-19", "I disagree…", 3, 1);
const username = myComment.getUser().username;

console.log(`User ${username} wrote "${myComment.text}"`);

Note that when using myComment we must know how to access the extrinsic information (via the getUser() method). What if we can't or don't want to modify the consumer of the Comment objects? In this specific case, it would be possible to use getters to work around the issue:

class Comment {
constructor(id, userId, date, text, likes) {
// as above
}

get username() {
return userManager[this.userId].username;
}

get profileImage() {
return userManager[this.userId].profileImage;
}

// … other extrinsic property getters …
}

let myComment = new Comment(54719, 3282, "2020-08-19", "I disagree…", 3, 1);

console.log(`User ${myComment.username} wrote "${myComment.text}"`);

I have used this pattern, or variations, many times to great success. Easy peasy, right?

A tree of problems

Recently an interesting scenario showed up. An application I work with loads a large JSON object containing a hierarchical data structure — a tree. This tree is used to populate several navigation widgets on the screen, giving the user the ability to choose different nodes in different contexts. The data looks a bit like this:

[
{
"label": "Some parent",
"children": [
{
"label": "Some child",
"children": []
},
{
"label": "Another child",
"children": [
{
"label": "Grandchild",
"children": []
}
]
}
]
}
]

The JSON file is currently over 5MB. Since the application runs in a browser this is a particularly Bad Thing™: we shouldn't load large amounts of data like this if we can avoid it. Ideally we would load data lazily, maybe by calling a backend system that delivers branches of the tree as the user opens them. Due to practical reasons this was not an option in this case — data has to be loaded in one large chunk. Until the 5MB are downloaded and parsed, the navigation widgets are not useable — but that isn't even the worse part.

The ugly

Each widget expects to keep state of which nodes are open or closed in a boolean property on each node (the property is called, reasonably enough, open). When inspected, a widget's internal data property looks like this:

[
{
"label": "Some parent",
"open": true,
"children": [
{
"label": "Some child",
"open": false,
"children": []
},
{
"label": "Another child",
"open": true,
"children": [
{
"label": "Grandchild",
"open": false,
"children": []
}
]
}
]
}
]

Each widget's state (i.e. which nodes are open or closed) must be independent of each other so that the user can use the widgets independently. Since each widget mutates its data structure to keep track of this state, we must keep the data structures separate. The initial implementation simply cloned the full tree data into each widget when it was instantiated. This is slow and uses a lot of memory. 😭

const infoNav = new TreeWidget(deepClone(treeData));
const selectNav = new TreeWidget(deepClone(treeData));
const compareNav = new TreeWidget(deepClone(treeData));

This seemed like a perfect candidate for the Flyweight pattern. For each widget, the extrinsic (shared) data is the tree (all 5MB of it) and the intrinsic state is simply the open property in each node. But how do we separate this data?

The bad

We could keep a list of open nodes in each widget, separate from the tree data. As we render the tree, we can read the local list of open nodes and display each accordingly as open or closed. Opening a node would store its ID in the widget's "open nodes" list; closing it would remove it. Here's how that behaviour would look like:

class TreeWidget {
constructor(data) {
this.data = data;
this.openNodes = new Set();
}

// called when toggling a node
onToggleNode(node) {
if (this.openNodes.has(node.id)) {
this.openNodes.delete(node.id);
} else {
this.openNodes.add(node.id);
}
}

// called when rendering a node
isNodeOpen(node) {
return this.openNodes.has(node.id);
}
}

If you are not familiar with Set, think of it as a simple list (like an array) that inherently prevents duplicates (the Set documentation is pretty easy to follow). You could of course achieve the same with an array (but more code).

This would be a good solution, and a textbook example of the Flyweight. However, we had two problems…

  • We would prefer not to change the tree widget (it's a shared component in several projects)
  • The nodes in our data tree don't have unique identifiers

The second point was a deal-breaker. We can't associate a node's extrinsic and intrinsic data if there is no way to keep track these via identifiers. The hierarchical structure is itself what identifies a node, so we need to store the open-or-closed state in each node for each widget, and yet share the data structure between widgets somehow. In each node, we would need something like this:

{
"label": "I'm a node",
"open-in-widget-1": true,
"open-in-widget-2": false,
"open-in-widget-3": true,
"children": []
}

If we were changing our widget code (which we didn't want to do) this could be handled using a Symbol or some other unique identifier for each widget instance:

class TreeWidget {
constructor(data) {
this.data = data;
this.instanceId = Symbol();
// Math.random() would also work if you are unfamiliar with Symbols
}

// called when toggling a node
onToggleNode(node) {
if (node[this.instanceId]) {
node[this.instanceId] = false;
} else {
node[this.instanceId] = true;
}
}

// called when rendering a node
isNodeOpen(node) {
return node[this.instanceId];
}
}

This approach works well — but we are adding properties directly to the data structure instead of to a separate object. And if not using Symbol, these properties are just obfuscated, so it's not great encapsulation. Even though Symbol works quite well, it still requires that our widget is aware of this behaviour.

The good

So, what can we use to associate extra data (metadata!) to the tree structure without modifying the structure in a way that impacts its existing consumers (i.e. our widgets)? Proxy was pretty much designed for this purpose. If you're not familiar with Proxies, here is a quick example:

const order = {
food: 'pizza',
drink: 'soda'
};

const handler = {
get: (obj, prop) => (prop in obj) ? obj[prop].toUpperCase() : 'no thanks'
};

const loudOrder = new Proxy(order, handler);

console.log(loudOrder.food); // "PIZZA"
console.log(loudOrder.drink); // "SODA"
console.log(loudOrder.dessert); // "no thanks"

A proxy simply wraps another object and intercept calls (via get or set). Code that uses the proxied object can be completely oblivious to its existence.

Here's the interesting part: since the proxy is itself an object, we can use it to store data independently of the wrapped object.

Let's do that, using a proxy to keep track of whether a tree node is open. We'll declare our proxy as a class so that each node can keep its own proxy instance with state:

const aNode = {
"label": "Some parent",
"children": [ /* omitted for brevity */ ]
};

class ToggleNodeProxy {
isOpen = false; // The extrinsic data

get(obj, prop) {
if (prop === 'open') {
return this.isOpen;
}

return obj[prop];
};

set(obj, prop, value) {
if (prop === 'open') {
this.isOpen = value;
} else {
obj[prop] = value;
}

return true;
};
};

const proxyNode = new Proxy(aNode, new ToggleNodeProxy());

console.log(proxyNode.open); // false
proxyNode.open = true;
console.log(proxyNode.open); // true
console.log(proxyNode); // {label: "Some parent", children: Array(2)}

And we now have a flyweight proxy!

Now we just need to apply this to… all… nodes in our tree. Oops. 😬 We don't want to traverse the JSON and run new ToggleNodeProxy() everywhere; it will take a long time to walk the tree and still use a lot of memory — exactly what we set off to avoid.

Fortunately we don't need to do that. When the tree is read by the widget, it iterates over the root nodes. If any of these nodes are marked as open then the nodes in its children property are read (and so on, recursively). So nodes are only accessed via the children property as the tree's open nodes are traversed. And children is just a property of a node, so if the node is wrapped in a Proxy we can intercept the access to its children, just like we do with open. At this point do we can wrap the child nodes in proxies, if they haven't yet been wrapped. Here's how that code looks like:

class ToggleNodeProxy {
proxyChildren = null;

wrapChildrenInProxies(nodes) {
this.proxyChildren = nodes.map(
node => new Proxy(node, new ToggleNodeProxy())
);
}

get(obj, prop) {
if (prop === 'children') {
if (!this.proxyChildren) {
this.wrapChildrenInProxies(obj.children);
}

return this.proxyChildren;
}

return obj[prop];
};
}

We just need to wrap the root nodes of the tree in proxies before passing them to the our widgets. All navigation through children will trigger further proxy wrapping. Assuming our widget expects an array of root nodes, we wrap the root nodes like this:

widget = new Widget(nodesArray.map(
node => new Proxy(node, new ToggleNodeProxy()))
);

As you can see in the chart below, memory consumption and time-to-responsiveness had a radical improvement over the original data cloning approach. With a large number of widgets cloning used 68MB of memory, while the flyweight proxy kept memory usage flat around 9MB.

Comparison chart; cloning method uses up to 68MB, proxy method uses 9.3MB

And here is the final, tidied up code for our ToggleNodeProxy implementation:

class ToggleNodeProxy {
isOpen = false;
proxyChildren = null;

wrapChildrenInProxies(nodes) {
this.proxyChildren = nodes.map(
node => new Proxy(node, new ToggleNodeProxy())
);
}

get(obj, prop) {
switch(prop) {
case 'open':
return this.isOpen;

case 'children':
if (!this.proxyChildren) {
this.wrapChildrenInProxies(obj.children);
}

return this.proxyChildren;

default:
return obj[prop];
}
};

set(obj, prop, value) {
switch(prop) {
case 'open':
this.isOpen = value;
break;

case 'children':
this.proxyChildren = null;
break;

default:
obj[prop] = value;
}

return true;
};

deleteProperty(obj, prop) {
if (prop === 'children') {
this.proxyChildren = null;
}

return delete obj[prop];
};
}