JS Memory Leak Troubleshooting Methods

Preface

JS memory issues often appear in Single Page Applications (SPA), generally considered to have the following scenario characteristics:

Long page lifecycle (users may stay for 10 minutes, half an hour, or even 2 hours)
Many interactive features (pages are function-oriented rather than display-oriented)
Heavy JS application (complex data state and view management on the front-end)

Memory leaks accumulate over time, so they only become a problem once the page has been alive long enough (hence the saying "a refresh fixes everything"). Frequent interaction speeds up the accumulation, which is why display-oriented pages rarely expose such issues. Finally, memory issues usually need fairly complex JS logic before they show up (a large codebase means many bugs, and nobody can keep all of it in their head). If a page only does simple form validation and submission, it has little chance of affecting memory.

So what are the standards for many interactive features and complex JS logic? At what level does it become dangerous?

In reality, even a simple page with light interactivity (such as partial refresh) can leave memory hazards behind if handled carelessly. Once exposed, these hazards are what we call memory issues.

1. Tool Environment

Tools:

Chrome Task Manager
Chrome DevTools Performance Panel
Chrome DevTools Memory Panel

Environment:

Stable, remove variable factors like network (use fake data)
Easy to repeat operations, reduce "accumulation" difficulty (simplify operation steps, such as removing SMS verification links)
No interference, exclude plugin effects (use incognito mode)

In other words (on Mac):

Command + Shift + N to enter incognito mode
Command + Option + I to open DevTools
Enter URL to open the page

Then you can start pretending to work

2. Terminology Concepts

First, you need basic memory knowledge and understand the meaning of various records provided by DevTools

Mark-and-sweep

GC algorithms related to JS are mainly reference counting (IE's BOM, DOM objects) and mark-and-sweep (mainstream approach), each with advantages and disadvantages:

Reference counting recovers promptly (release immediately when reference count is 0), but circular references can never be released
Mark-and-sweep doesn't have circular reference issues (recycle if inaccessible), but recovery is not timely and requires Stop-The-World

Mark-and-sweep algorithm steps are as follows:

GC maintains a root list, roots are usually global variables holding references in the code. In JS, the window object is an example of a global variable serving as a root. The window object always exists, so GC considers it and all its children always exist (non-garbage)
All roots are checked and marked as active (non-garbage), and all their children are recursively checked. Everything accessible through roots will not be treated as garbage
All memory blocks not marked as active are treated as garbage, GC can release them back to the operating system

Modern GC technology has made various improvements to this algorithm, but the essence is the same: memory blocks marked as accessible are identified, and the rest is garbage
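
The steps above can be sketched on a toy heap. This is only an illustration of the idea, not how V8 is implemented: the heap is a plain object mapping made-up ids to the ids they reference, and "window" is assumed to be the only root. Note that x and y reference each other in a cycle, yet they are still collected because nothing reachable points at them.

```javascript
// Toy heap: each key is an object id, each value lists the ids it references.
// All ids are hypothetical.
const heap = {
  window: ['a', 'b'],
  a: ['c'],
  b: [],
  c: ['a'], // cycle a -> c -> a, but reachable from window
  x: ['y'], // x and y only reference each other
  y: ['x'],
};

// Mark phase: everything reachable from the roots is marked as active.
function mark(heap, roots) {
  const marked = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const id = stack.pop();
    if (marked.has(id)) continue;
    marked.add(id);
    for (const child of heap[id]) stack.push(child);
  }
  return marked;
}

// Sweep phase: everything that was not marked is garbage and gets released.
function sweep(heap, marked) {
  const garbage = [];
  for (const id of Object.keys(heap)) {
    if (!marked.has(id)) {
      garbage.push(id);
      delete heap[id];
    }
  }
  return garbage;
}

const live = mark(heap, ['window']);
const collected = sweep(heap, live);
console.log([...live].sort()); // [ 'a', 'b', 'c', 'window' ]
console.log(collected);        // [ 'x', 'y' ]
```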

Shallow Size & Retained Size

Memory can be viewed as a graph composed of primitive types (such as numbers and strings) and objects (associative arrays). More vividly, memory can be represented as a graph composed of multiple interconnected points, as shown below:

  3-->5->7
  ^      ^
 /|      |
1 |      6-->8
 \|     /^
  v    /
  2-->4

Objects can occupy memory in two ways:

Directly through the object itself
Implicitly by holding references to other objects, preventing these objects from being automatically processed by the garbage collector (GC)

In DevTools' heap memory snapshot analysis panel, you'll see Shallow Size and Retained Size representing the memory size occupied by objects through these two methods respectively

Shallow Size

The size of memory occupied by the object itself. Usually, only arrays and strings have significant Shallow Size. However, the main storage of strings and external arrays is generally located in renderer memory, with only a small wrapper object placed on the JavaScript heap

Renderer memory is the sum of memory for the page rendering process: native memory + page's JS heap memory + JS heap memory of all dedicated workers started by the page. Nevertheless, even a small object may indirectly occupy a large amount of memory by preventing other objects from being processed by the automatic garbage collection process

Retained Size

The amount of memory released after the object itself and objects depending on it (objects no longer accessible from GC root) are deleted

There are many internal GC roots, most of which don't need attention. From an application perspective, GC roots include the following categories:

Window global object (located in each iframe). There's a distance field in the heap snapshot, indicating the number of property references on the shortest retention path starting from window.
Document DOM tree, composed of all native DOM nodes accessible through traversing document. Not all nodes have JS wrappers, but if there's a wrapper and the document is active, the wrapper will also be active
Sometimes, objects may be retained by debugger context and DevTools console (for example, after console evaluation). So when creating heap snapshots for debugging, clear the console and remove breakpoints

The memory graph starts from a root, which can be the browser's window object or the global object of a Node.js module. We cannot control how the root object is garbage collected

  3-->5->7   9-->10
  ^      ^
 /|      |
1 |      6-->8
 \|     /^
  v    /
  2-->4

Among them, 1 is root (root node), 7 and 8 are primitive values (leaf nodes), 9 and 10 will be GC'd (isolated nodes), and the rest are objects (non-root non-leaf nodes)

Object's retaining tree

The heap is a network of interconnected objects. In mathematics, such a structure is called a "graph" or memory graph. A graph consists of nodes connected by edges, both represented by given labels:

Nodes (or objects) are labeled with the constructor name (used to construct nodes)
Edges are labeled with property names

distance refers to the distance from GC root. If most objects of a certain type have the same distance, with only a few objects having larger distances, it's necessary to investigate carefully

Dominator

The dominator relationships form a tree, because each object has exactly one direct (immediate) dominator. An object's dominator may hold no direct reference to the objects it dominates, so the dominator tree is not a spanning tree of the graph

In the object reference graph, if every path from the root to object B passes through object A, then A dominates B. If A is the nearest dominator of B, then A is B's direct dominator

In the diagram below:

  1     1 dominates 2
  |     2 dominates 3 4 6
  v
  2
/   \
v   v
4   3   3 dominates 5
|  /|
| / |
|/  |
v   v
6   5   5 dominates 8; 6 dominates 7
|   |
v   v
7   8

So 7's direct dominator is 6, while 7's dominators are 1, 2, 6
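
To double-check the diagram, here is a minimal sketch that computes dominators for the same 8-node graph with the simple iterative "intersect the predecessors" method. The edge list is my reading of the diagram, and retainedCount assumes every node has a shallow size of 1, so it only counts nodes. DevTools does its own, more efficient computation on real snapshots.

```javascript
// Edges of the diagram above, written as [from, to].
const edges = [[1, 2], [2, 3], [2, 4], [3, 5], [3, 6], [4, 6], [5, 8], [6, 7]];

// Returns a Map: node -> Set of its dominators (including the node itself).
function dominators(edges, root) {
  const nodes = [...new Set(edges.flat())];
  const preds = new Map(nodes.map((n) => [n, []]));
  for (const [from, to] of edges) preds.get(to).push(from);

  // Start from "every node dominates every node" and shrink to a fixed point.
  const dom = new Map(nodes.map((n) => [n, new Set(nodes)]));
  dom.set(root, new Set([root]));

  let changed = true;
  while (changed) {
    changed = false;
    for (const n of nodes) {
      if (n === root) continue;
      // Intersect the dominator sets of all predecessors, then add n itself.
      let next = null;
      for (const p of preds.get(n)) {
        const pd = dom.get(p);
        next = next === null ? new Set(pd) : new Set([...next].filter((d) => pd.has(d)));
      }
      next = next ?? new Set();
      next.add(n);
      // Sets only ever shrink here, so a size change means the set changed.
      if (next.size !== dom.get(n).size) {
        dom.set(n, next);
        changed = true;
      }
    }
  }
  return dom;
}

// The direct dominator is the strict dominator that sits deepest in the tree,
// which is the one with the largest dominator set of its own.
function directDominator(dom, n) {
  let best = null;
  for (const d of dom.get(n)) {
    if (d === n) continue;
    if (best === null || dom.get(d).size > dom.get(best).size) best = d;
  }
  return best;
}

// Number of nodes that would go away with n (n plus everything it dominates).
function retainedCount(dom, n) {
  let count = 0;
  for (const set of dom.values()) if (set.has(n)) count++;
  return count;
}

const dom = dominators(edges, 1);
console.log(directDominator(dom, 7));                                 // 6
console.log([...dom.get(7)].filter((d) => d !== 7).sort((a, b) => a - b)); // [ 1, 2, 6 ]
console.log(retainedCount(dom, 2));                                   // 7
```

Node 6 is the interesting one: it is referenced by both 3 and 4, so neither of them dominates it, and its direct dominator is 2.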

V8's JS Object Representation

primitive type

3 primitive types:

Numbers
Booleans
Strings

They cannot reference other values, so they are always leaf or terminal nodes

Numbers have two storage methods:

Direct 31-bit integer values called Small Integers (SMI)
Heap objects, referenced as heap numbers. Heap numbers are used to store values that don't fit SMI format (such as double type), or when a value needs to be boxed, such as setting properties on it

Strings also have two storage methods:

VM heap
Renderer memory (external), creating a wrapper object to access external storage space. For example, script source code and other content received from the web are placed in external storage space rather than copied to the VM heap

New JS object memory is allocated from dedicated JS heap (or VM heap), these objects are managed by V8's GC, therefore, as long as there's a strong reference to them, they will remain active

Native Object

Native objects are everything outside the JS heap. Compared to heap objects, the entire lifecycle of native objects is not managed by V8's GC, and can only be accessed from JS through wrapper objects

Cons String

A concatenated string (cons string) stores a pair of strings that have been joined together. The contents are only actually joined when needed, for example when taking a substring of the concatenated string

For example, concatenating a and b yields string (a, b) representing the concatenation result, then concatenating d with this result yields another concatenated string ((a, b), d)
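
The idea can be imitated in plain JS. The ConsString class below is a hypothetical toy, not V8's internal representation: it only keeps the two halves until something (here, substring) forces the content to be flattened.

```javascript
// Toy model of a cons string: a pair (left, right) that is flattened lazily.
class ConsString {
  constructor(left, right) {
    this.left = left;
    this.right = right;
    this.length = left.length + right.length;
    this.flat = null; // flattened content, filled in on first need
  }

  // Walk the tree of pairs once and join the pieces into one flat string.
  flatten() {
    if (this.flat === null) {
      const parts = [];
      const stack = [this];
      while (stack.length > 0) {
        const node = stack.pop();
        if (node instanceof ConsString) {
          // Push right first so that left is popped and processed first.
          stack.push(node.right, node.left);
        } else {
          parts.push(node);
        }
      }
      this.flat = parts.join('');
    }
    return this.flat;
  }

  substring(start, end) {
    return this.flatten().substring(start, end);
  }
}

const ab = new ConsString('foo', 'bar'); // (a, b)
const abd = new ConsString(ab, 'baz');   // ((a, b), d)
console.log(abd.length);          // 9, known without joining anything
console.log(abd.flat === null);   // true, nothing has been joined yet
console.log(abd.substring(2, 7)); // obarb, this call forces the flattening
```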

Array

Arrays are objects with numeric keys. They are used widely in the V8 VM to store large amounts of data; sets of key-value pairs used as dictionaries are also backed by arrays

Typical JS objects correspond to two array types, used to store:

Named properties
Numeric elements

If the number of properties is very small, they can be placed inside the JS object itself

Map

An object describing the type of object and its layout. For example, maps are used to describe implicit object hierarchy structures to implement fast property access

Object group

Each native object group is made up of objects that hold references to each other. For example, every node in a DOM subtree links to its parent, its next child, and its next sibling, so the subtree forms a connected graph. Native objects are not represented in the JS heap, which is why their size is 0. Instead, wrapper objects are created

Each wrapper object holds a reference to the corresponding native object and is used to redirect commands to it. The object group, in turn, holds the wrapper objects. This doesn't create an uncollectable cycle, because GC is smart enough to release an object group whose wrappers are no longer referenced. But if you forget to release even one wrapper, it will hold the entire object group and the related wrappers

3. Tool Usage

Task Manager

Used to roughly view memory usage

Entry point is Three dots in upper right -> More tools -> Task Manager, then Right-click header -> Check JS memory used, mainly focus on two columns:

Memory column represents native memory. DOM nodes are stored in native memory, if this value is increasing, it means DOM nodes are being created
JS memory used column represents JS heap. This column contains two values, what needs attention is the live value (the number in parentheses). The live value represents the amount of memory being used by accessible objects on the page. If this value is increasing, either new objects are being created, or existing objects are growing

Performance

Used to observe memory change trends

Entry point is the Performance panel in DevTools, then check Memory. If you want to see memory usage during the page's first load, Command + R refresh the page, it will automatically record the entire loading process. If you want to see memory changes before and after certain operations, click the "black dot" button to start recording before the operation, click the "red dot" button to end recording after the operation is complete

After recording is complete, check JS Heap in the middle, the blue line represents memory change trend. If the overall trend keeps rising without significant drop, confirm through manual GC: operate and record again, do manual GC several times (click the "black trash can" button) before or during the operation. If the line doesn't drop significantly at GC points and the overall trend keeps rising, there may be a memory leak

Or a more brute-force confirmation method, start recording -> repeat operation 50 times -> see if there's a significant drop caused by automatic GC. Automatic GC occurs when memory usage reaches the threshold. If there's a leak, operating n times will eventually reach the threshold, which can also be used to confirm whether the memory leak issue has been fixed
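
If clicking 50 times by hand gets tedious, the repetition can be scripted. Below is a rough sketch with two hypothetical helpers: repeatAndSample runs an operation n times and records a heap reading after each run, and largestDrop reports the biggest decrease between two consecutive readings. The heap reader is passed in as a function because the available API depends on the environment.

```javascript
// Run `operation` `times` times and record the heap size after each run.
// `readHeapBytes` is any function that returns the current heap size.
function repeatAndSample(operation, times, readHeapBytes) {
  const samples = [readHeapBytes()];
  for (let i = 0; i < times; i++) {
    operation(i);
    samples.push(readHeapBytes());
  }
  return samples;
}

// Largest decrease between two consecutive samples (0 if it never drops).
function largestDrop(samples) {
  let drop = 0;
  for (let i = 1; i < samples.length; i++) {
    drop = Math.max(drop, samples[i - 1] - samples[i]);
  }
  return drop;
}

// Possible readers (not executed here):
//   Chrome: () => performance.memory.usedJSHeapSize  // non-standard, Chrome-only, coarse
//   Node.js: () => process.memoryUsage().heapUsed
```

This assumes the operation is synchronous, and the numbers are only a rough hint. The trend line in the Performance panel remains the more reliable view.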

P.S. You can also see document count (possibly for iframes), node count, event listener count, GPU memory usage change trends, among which node count and event listener count changes are also instructive

Memory

This panel has 3 tools: heap snapshot, allocation profile, and allocation timeline:

Take Heap Snapshot, used to specifically analyze the survival status of various object types, including instance count, reference paths, etc.
Record Allocation Profile, used to view memory size allocated to each function
Record Allocation Timeline, used to view real-time memory allocation and reclamation

Among them, allocation timeline and heap snapshot are more useful, timeline is used to locate memory leak operations, snapshots are used for specific problem analysis

For more introduction about specific usage, please check Fix Memory Problems

Record Allocation Timeline

Open the timeline, perform various interactions on the page, blue bars appearing represent new memory allocation, gray represents release and reclamation. If there are regular blue bars on the timeline, there's a high probability of memory leak

Then repeatedly operate and observe to see what operation causes blue bar residue, isolate the specific operation

Take Heap Snapshot

Heap snapshots are used for further analysis to find the specific object types that are leaking

At this point, suspicious operations should already be locked. By repeatedly performing the operation and observing the quantity changes in heap snapshot items, locate the leaking object type

Heap snapshots have 4 viewing modes:

Summary: Summary view, expand and select child items to view Object's retaining tree (reference path)
Comparison: Comparison view, compare with other snapshots to see add, delete, Delta count and memory size
Containment: Overview view, look at the heap from top to bottom, root nodes include window object, GC root, native objects, etc.
Dominators: Dominator tree view, new Chrome seems to have removed it, displays the dominator tree mentioned in the terminology concepts section earlier

The most commonly used are comparison view and summary view. Comparison view can diff the snapshot taken after 2 operations against the one taken after 1 operation, so you can read the Delta and find out which type of object keeps growing. Summary view is then used to analyze these suspicious objects: look at Distance and find out which link on a strangely long path was never disconnected

There's a small tip for viewing summary view: newly added items are yellow background with black text, deleted items are red background with black text, and existing items are white background with black text. This is crucial

For more illustrations about snapshot usage, please check How to Record Heap Snapshots

4. Troubleshooting Steps

1. Confirm the problem, find suspicious operations

First confirm whether there's really a memory leak:

Switch to Performance panel, start recording (reload the page first if you need to record from the very beginning)
Start recording -> operate -> stop recording -> analyze -> repeat to confirm
If memory leak is confirmed, narrow down the scope, determine what interaction operation caused it

You can further confirm the problem through Memory panel's allocation timeline. The advantage of Performance panel is that you can see DOM node count and event listener change trends. Even when not sure if performance is dragged down by memory issues, you can also see network response speed, CPU usage and other factors through the Performance panel

2. Analyze heap snapshots, find suspicious objects

After locking down suspicious interaction operations, go deeper through memory snapshots:

Switch to Memory panel, take snapshot 1
Perform one suspicious interaction operation, take snapshot 2
Compare snapshot 2 and 1, see if count Delta is normal
Perform one more suspicious interaction operation, take snapshot 3
Compare 3 and 2, see if count Delta is normal, guess the object count change trend of abnormal Delta
Perform 10 suspicious interaction operations, take snapshot 4
Compare 4 and 3, verify the guess, determine what wasn't reclaimed as expected
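
What the comparison view computes in these steps is essentially a per-type count difference. The sketch below mimics it on hand-written counts. The countDeltas helper and every number in it are made up for illustration; real counts come from the DevTools snapshots.

```javascript
// Per-constructor instance counts of two snapshots -> non-zero count deltas.
function countDeltas(before, after) {
  const deltas = {};
  const names = new Set([...Object.keys(before), ...Object.keys(after)]);
  for (const name of names) {
    const delta = (after[name] ?? 0) - (before[name] ?? 0);
    if (delta !== 0) deltas[name] = delta;
  }
  return deltas;
}

// Hypothetical counts: snapshot 4 is taken 10 operations after snapshot 3.
const snapshot3Counts = { Array: 130, Closure: 40, Item: 12 };
const snapshot4Counts = { Array: 130, Closure: 50, Item: 22 };

console.log(countDeltas(snapshot3Counts, snapshot4Counts));
// { Closure: 10, Item: 10 }
```

A type whose delta matches the number of operations (10 operations, +10 instances) is a good candidate for "wasn't reclaimed as expected".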

3. Locate the problem, find the cause

After locking down suspicious objects, further locate the problem:

Is the Distance of this type of object normal? If most instances are level 3 or 4, individual ones reaching level 10 or above is abnormal
Look at instances with path depth 10+ levels (or obviously deeper than other instances of the same type), what's referencing them

4. Release references, fix and verify

At this point, the problem source is basically found. Next, solve the problem:

Find a way to disconnect this reference
Sort out the logic flow, see if there are references elsewhere that won't be used again, release them all
Modify and verify, if not resolved, re-locate

Of course, sorting out the logic flow can be done from the beginning. Analyze with tools while confirming logic flow loopholes, work on both fronts. Finally, verify by looking at the trend line in Performance panel or the timeline in Memory panel

5. Common Cases

These scenarios may have memory leak hazards. Of course, proper cleanup work can solve them

1. Implicit global variables

function foo(arg) {
    bar = "this is a hidden global variable";
}

bar gets hung on window. If bar points to a huge object or a DOM node, it will cause memory hazards

Another less obvious way is when a constructor is called directly (without calling through new):

function foo() {
    this.variable = "potential accidental global";
}

// foo is called on its own, so this points to the global object (window)
// rather than being undefined.
foo();

Or this in anonymous functions, which also points to global in non-strict mode. These obvious problems can be avoided through lint checks or enabling strict mode
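
A quick way to see what strict mode changes, as a small sketch: both accidental-global patterns above turn into errors. The function names and the errorNameOf helper are made up for this example, and strict mode is enabled per function so the snippet behaves the same wherever it is pasted.

```javascript
function assignUndeclared() {
  'use strict';
  // Strict mode: assigning to an undeclared name throws instead of
  // silently creating a property on the global object.
  hiddenGlobal = 'this would have been a hidden global variable';
}

function Foo() {
  'use strict';
  // Strict mode: when called without new, this is undefined, so this throws.
  this.variable = 'potential accidental global';
}

// Helper: run fn and return the name of the error it throws, or null.
function errorNameOf(fn) {
  try {
    fn();
    return null;
  } catch (e) {
    return e.name;
  }
}

console.log(errorNameOf(assignUndeclared)); // ReferenceError
console.log(errorNameOf(() => Foo()));      // TypeError
console.log(new Foo().variable);            // potential accidental global
```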

2. Forgotten timers or callbacks

var someResource = getData();
setInterval(function() {
    var node = document.getElementById('Node');
    if(node) {
        // Do stuff with node and someResource.
        node.innerHTML = JSON.stringify(someResource);
    }
}, 1000);

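If the node with id Node is removed later, the timer keeps firing and its callback stays alive, so everything the callback references (someResource here) cannot be released, even though the callback no longer does anything useful. A node reference cached outside the callback would likewise keep the detached DOM subtree from being released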
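
One way to avoid the forgotten timer, as a sketch: let the callback cancel its own interval once the node is gone, and also return a stop function for explicit teardown. pollWhileAttached and the timers parameter are hypothetical names. The timers are injectable only so the logic can be exercised with a fake timer outside the browser.

```javascript
// Real timers wrapped in plain functions (hypothetical helper shape).
const realTimers = {
  set: (fn, ms) => setInterval(fn, ms),
  clear: (id) => clearInterval(id),
};

function pollWhileAttached(findNode, update, intervalMs, timers = realTimers) {
  const id = timers.set(function tick() {
    const node = findNode();
    if (!node) {
      // The node is gone: cancel the interval so the callback, and whatever
      // it closes over, becomes collectable.
      timers.clear(id);
      return;
    }
    update(node);
  }, intervalMs);

  // Also let the caller stop explicitly, e.g. when the view is torn down.
  return function stop() {
    timers.clear(id);
  };
}

// Browser usage (not executed here):
// var someResource = getData();
// pollWhileAttached(
//   function () { return document.getElementById('Node'); },
//   function (node) { node.innerHTML = JSON.stringify(someResource); },
//   1000
// );
```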

Callback function scenarios are similar to timers:

var element = document.getElementById('button');

function onClick(event) {
    element.innerHTML = 'text';
}

element.addEventListener('click', onClick);
// Do stuff
element.removeEventListener('click', onClick);
element.parentNode.removeChild(element);
// Now when element goes out of scope,
// both element and onClick will be collected even in old browsers that don't
// handle cycles well.

In the past, event listeners had to be removed before removing the node, because IE6 could not handle circular references between DOM nodes and JS (its BOM and DOM objects were both garbage-collected by reference counting), which could cause memory leaks. Modern browsers no longer need this: once a node can no longer be accessed, its listeners are collected with it

3. References to detached DOM

var elements = {
    button: document.getElementById('button'),
    image: document.getElementById('image'),
    text: document.getElementById('text')
};

function doStuff() {
    elements.image.src = 'http://some.url/image';
    elements.button.click();
    console.log(elements.text.innerHTML);
    // Much more logic
}

function removeButton() {
    // The button is a direct child of body.
    document.body.removeChild(document.getElementById('button'));

    // At this point, we still have a reference to #button in the global
    // elements dictionary. In other words, the button element is still in
    // memory and cannot be collected by the GC.
}

DOM node references are often cached (for performance or code simplicity), but when a node is removed, its cached references should be released at the same time, otherwise the detached subtree cannot be released
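
A small sketch of "release the cached reference together with the node": removeCachedNode is a hypothetical helper that detaches the node and deletes its cache entry in one place, so the two steps cannot drift apart. It only relies on parentNode and removeChild, which real DOM nodes provide.

```javascript
// Detach the cached node (if it is still attached) and drop the cache entry.
// Returns true when an entry was found and released.
function removeCachedNode(cache, key) {
  const node = cache[key];
  if (!node) return false;
  if (node.parentNode) node.parentNode.removeChild(node);
  delete cache[key];
  return true;
}

// Browser usage with the elements dictionary from above (not executed here):
// removeCachedNode(elements, 'button');
```

If other code kept its own copy of the reference, the node would of course still be retained. The helper only covers the cache it is given.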

Another more hidden scenario:

// Bind to document, otherwise calling select() loses `this` and throws
var select = document.querySelector.bind(document);
var treeRef = select("#tree");
var leafRef = select("#leaf");
var body = select("body");

body.removeChild(treeRef);

// #tree can't be GC'd yet because of treeRef
treeRef = null;

// #tree still can't be GC'd because of the indirect
// reference from leafRef

leafRef = null;
// Now #tree can be GC'd

As shown below:

[Figure: treegc]

If any node reference in a detached subtree is not released, the entire subtree cannot be released, because all the other nodes can be found (accessed) through that one node, so they are all marked as active and won't be cleared

4. Closures

var theThing = null;
var replaceThing = function () {
  var originalThing = theThing;
  var unused = function () {
    if (originalThing)
      console.log("hi");
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log(someMessage);
    }
  };
};
setInterval(replaceThing, 1000);

Paste into console to execute, then look at memory changes through Performance panel trend line or Memory panel timeline. You can find very regular memory leaks (line steadily rises, one blue bar per second, straight as an arrow)

The typical implementation of closures gives each function object a link to a dictionary-like object that represents its lexical scope. If the functions defined in replaceThing both used originalThing, they would have to see the same variable, even as originalThing is reassigned over and over, so all functions defined in replaceThing share the same lexical environment. Here unused references originalThing, so originalThing lives in that shared environment. someMethod holds the same environment, so the new theThing keeps the previous theThing alive through originalThing, and each call adds one more link to the chain

But V8 is smart enough to remove variables not used by any closure from the lexical environment, so if you delete unused (or remove the originalThing access inside unused), you can solve the memory leak

As long as a variable is used by any closure, it will be added to the lexical environment, shared by all closures under that scope. This is the key to closures causing memory leaks
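
If unused really has to stay, another commonly suggested fix is to clear originalThing at the end of replaceThing. The sketch below is the snippet from above with that one line added: the variable still sits in the shared lexical environment, but it no longer points at the previous theThing. The someMessage placeholder is replaced by a string literal, and the setInterval call is left out so the snippet terminates. It calls replaceThing a few times by hand instead.

```javascript
var theThing = null;
var replaceThing = function () {
  var originalThing = theThing;
  var unused = function () {
    if (originalThing)
      console.log("hi");
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log("someMethod called");
    }
  };
  // The shared lexical environment still contains originalThing,
  // but it no longer links to the previous (large) theThing.
  originalThing = null;
};

// In the console you would use setInterval(replaceThing, 1000) as before.
for (var i = 0; i < 3; i++) {
  replaceThing();
}
console.log(theThing.longStr.length); // 999999
```

Running the interval version with this change and watching the Performance trend line should show the heap dropping back after GC instead of climbing steadily.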

P.S. For more detailed information about this interesting memory leak issue, please check An interesting kind of JavaScript memory leak

6. Other Memory Issues

Besides memory leaks, there are two other common memory issues:

Memory bloat
Frequent GC

Memory bloat means too much memory is occupied, but there's no clear boundary. Different devices have different performance, so it should be user-centric. Understand what devices are popular among the user base, then test pages on these devices. If the experience is poor, the page may have memory bloat issues

Frequent GC greatly affects experience (the feeling of page pause, because of Stop-The-World). You can see this through Task Manager memory size values or Performance trend lines:

In Task Manager, if memory or JS memory used values frequently rise and fall, it indicates frequent GC
In trend lines, if JS heap size or node count frequently rises and falls, it indicates frequent GC

Frequent GC issues can be solved by optimizing the storage structure (avoid creating large numbers of fine-grained small objects), caching and reusing objects (such as using a flyweight factory to implement reuse), etc.
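
As a sketch of the caching and reusing idea, here is a very small object pool. It is a plain pool rather than a full flyweight factory, and VectorPool with its {x, y} objects is invented for the example: released objects are handed out again, so a hot loop stops producing garbage for the GC to chase.

```javascript
// Minimal object pool for small {x, y} objects.
class VectorPool {
  constructor() {
    this.free = [];   // released objects waiting to be reused
    this.created = 0; // how many objects were ever allocated
  }

  acquire(x, y) {
    // Reuse a released object when there is one, allocate otherwise.
    const v = this.free.pop() ?? this.create();
    v.x = x;
    v.y = y;
    return v;
  }

  create() {
    this.created++;
    return { x: 0, y: 0 };
  }

  release(v) {
    // The caller must not touch v after releasing it.
    this.free.push(v);
  }
}

const pool = new VectorPool();
let sum = 0;
for (let i = 0; i < 1000; i++) {
  const v = pool.acquire(i, i * 2);
  sum += v.x + v.y;
  pool.release(v);
}
console.log(pool.created); // 1, the same object was reused 1000 times
```

The trade-off is that pooled objects stay alive on purpose, so a pool only pays off for objects that are created and dropped at a high rate.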