React SSR Principles

Preface

The previous article, React SSR API, covered the React SSR related APIs in detail. This article digs into the source code to understand how they are implemented, guided by the following 3 questions:

How do React components become HTML strings?
How are these strings streamed while being concatenated?
What does hydrate actually do?

I. How do React components become HTML strings?

Take a React component as input:

class MyComponent extends React.Component {
  constructor() {
    super();
    this.state = {
      title: 'Welcome to React SSR!',
    };
  }

  handleClick() {
    alert('clicked');
  }

  render() {
    return (
      <div>
        <h1 className="site-title" onClick={this.handleClick}>{this.state.title} Hello There!</h1>
      </div>
    );
  }
}

After ReactDOMServer.renderToString() processes it, the output is an HTML string:

'<div data-reactroot=""><h1 class="site-title">Welcome to React SSR!<!-- --> Hello There!</h1></div>'

What happened in between?

First, a component instance is created. Next, render is executed, preceded by the lifecycle methods that run before it. Finally, the resulting DOM elements are mapped to HTML strings.
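
The three steps can be pictured with a toy renderer. This is only a minimal sketch and not React's implementation: the element factory h, the escapeHtml helper, and toHtml are hypothetical names invented for the illustration, and the attribute handling is deliberately naive.

```javascript
// Toy element factory standing in for React.createElement (hypothetical helper).
function h(type, props, ...children) {
  return { type, props: props || {}, children };
}

// Escape text so it is safe inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

function toHtml(node) {
  if (typeof node === 'string' || typeof node === 'number') {
    return escapeHtml(node);
  }
  if (typeof node.type === 'function') {
    // Steps 1 and 2: create the component instance, then call render().
    const inst = new node.type(node.props);
    return toHtml(inst.render());
  }
  // Step 3: map a DOM element to a string; props starting with "on" are skipped.
  const attrs = Object.keys(node.props)
    .filter((key) => !/^on/i.test(key))
    .map((key) => ` ${key === 'className' ? 'class' : key}="${escapeHtml(node.props[key])}"`)
    .join('');
  return `<${node.type}${attrs}>${node.children.map(toHtml).join('')}</${node.type}>`;
}

class Title {
  constructor(props) {
    this.props = props;
    this.state = { title: 'Welcome to React SSR!' };
  }
  render() {
    return h('div', null, h('h1', { className: 'site-title', onClick() {} }, this.state.title));
  }
}

const html = toHtml(h(Title));
console.log(html); // <div><h1 class="site-title">Welcome to React SSR!</h1></div>
```

Unlike the real renderer, this sketch recurses over the whole tree in one call and adds neither data-reactroot nor the <!-- --> text separators.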

Create Component Instance

inst = new Component(element.props, publicContext, updater);

An external updater is injected through the third parameter, updater, to intercept setState and similar operations:

var updater = {
  isMounted: function (publicInstance) {
    return false;
  },
  enqueueForceUpdate: function (publicInstance) {
    if (queue === null) {
      warnNoop(publicInstance, 'forceUpdate');
      return null;
    }
  },
  enqueueReplaceState: function (publicInstance, completeState) {
    replace = true;
    queue = [completeState];
  },
  enqueueSetState: function (publicInstance, currentPartialState) {
    if (queue === null) {
      warnNoop(publicInstance, 'setState');
      return null;
    }

    queue.push(currentPartialState);
  }
};

Compared with the previous approach of maintaining a virtual DOM, intercepting state updates this way is faster:

In React 16, though, the core team rewrote the server renderer from scratch, and it doesn't do any vDOM work at all. This means it can be much, much faster.

(Excerpted from What's New With Server-Side Rendering in React 16)

The hook that allows React's built-in updater to be replaced is in the React.Component base class constructor:

function Component(props, context, updater) {
  this.props = props;
  this.context = context; // If a component has string refs, we will assign a different object later.

  this.refs = emptyObject; // We initialize the default updater but the real one gets injected by the
  // renderer.

  this.updater = updater || ReactNoopUpdateQueue;
}
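
To see why injecting an updater is enough to intercept setState, here is a minimal sketch. It uses a simplified stand-in for React.Component rather than the real class, and createQueueingUpdater and Counter are hypothetical names used only for this illustration.

```javascript
// Simplified stand-in for React.Component (not the real class).
function Component(props, context, updater) {
  this.props = props;
  this.context = context;
  this.updater = updater;
}

Component.prototype.setState = function (partialState) {
  // setState only delegates; what happens next is up to the injected updater.
  this.updater.enqueueSetState(this, partialState);
};

// Server-style updater (hypothetical): record updates instead of scheduling a re-render.
function createQueueingUpdater(queue) {
  return {
    enqueueSetState(publicInstance, partialState) {
      queue.push(partialState);
    },
  };
}

class Counter extends Component {
  constructor(props, context, updater) {
    super(props, context, updater);
    this.state = { count: 0 };
  }
  componentWillMount() {
    this.setState({ count: 1 });
  }
}

const queue = [];
const inst = new Counter({}, {}, createQueueingUpdater(queue));
inst.componentWillMount();

console.log(inst.state.count); // 0, the update has only been queued
console.log(queue.length); // 1
```

Because the component never touches the rendering machinery directly, swapping the updater changes what setState does without changing any component code.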

Render Component

Once the initial data (inst.state) is available, the component's lifecycle functions are executed in order:

// getDerivedStateFromProps
if (typeof Component.getDerivedStateFromProps === 'function') {
  var partialState = Component.getDerivedStateFromProps.call(null, element.props, inst.state);
  inst.state = _assign({}, inst.state, partialState);
}

// componentWillMount
if (typeof inst.componentWillMount === 'function' && typeof Component.getDerivedStateFromProps !== 'function') {
  inst.componentWillMount();
}

// UNSAFE_componentWillMount
if (typeof inst.UNSAFE_componentWillMount === 'function' && typeof Component.getDerivedStateFromProps !== 'function') {
  // In order to support react-lifecycles-compat polyfilled components,
  // Unsafe lifecycles should not be invoked for any component with the new gDSFP.
  inst.UNSAFE_componentWillMount();
}

Note that the old and new lifecycles are mutually exclusive: getDerivedStateFromProps takes priority, and componentWillMount/UNSAFE_componentWillMount are executed only when it does not exist. In particular, if both old lifecycle functions exist, both are executed, in the order shown above.
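
The mutual exclusion rule can be checked with a small sketch. preRenderLifecycles is a hypothetical helper that only reports which lifecycles would be called and in what order; it does not call them.

```javascript
// Returns the names of the pre-render lifecycles that would run, in order.
function preRenderLifecycles(Component, inst) {
  const called = [];
  const hasGDSFP = typeof Component.getDerivedStateFromProps === 'function';

  if (hasGDSFP) {
    called.push('getDerivedStateFromProps');
  }
  // The legacy lifecycles run only when the new one is absent.
  if (!hasGDSFP && typeof inst.componentWillMount === 'function') {
    called.push('componentWillMount');
  }
  if (!hasGDSFP && typeof inst.UNSAFE_componentWillMount === 'function') {
    called.push('UNSAFE_componentWillMount');
  }
  return called;
}

class Legacy {
  componentWillMount() {}
  UNSAFE_componentWillMount() {}
}

class Modern {
  static getDerivedStateFromProps() {
    return null;
  }
  componentWillMount() {}
}

const legacyOrder = preRenderLifecycles(Legacy, new Legacy());
const modernOrder = preRenderLifecycles(Modern, new Modern());

console.log(legacyOrder); // [ 'componentWillMount', 'UNSAFE_componentWillMount' ]
console.log(modernOrder); // [ 'getDerivedStateFromProps' ]
```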

Next comes render. Before that, though, the updater queue has to be checked, because componentWillMount/UNSAFE_componentWillMount may have triggered state updates:

if (queue.length) {
  // Take over the pending queue and reset the shared variables
  var oldQueue = queue;
  var oldReplace = replace;
  queue = null;
  replace = false;

  var nextState = oldReplace ? oldQueue[0] : inst.state;
  for (var i = oldReplace ? 1 : 0; i < oldQueue.length; i++) {
    var partial = oldQueue[i];
    var _partialState = typeof partial === 'function' ? partial.call(inst, nextState, element.props, publicContext) : partial;
    nextState = _assign({}, nextState, _partialState);
  }
  inst.state = nextState;
}
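
As a standalone illustration of how the queued updates are folded into one state object, consider the sketch below. applyQueuedState is a hypothetical helper name; it mirrors the loop above but takes everything as arguments, and it passes only the previous state and props to updater functions.

```javascript
// Fold queued partial states into a single state object (hypothetical helper).
function applyQueuedState(state, queue, replace, props) {
  // With replace, the first queue entry is the complete new state.
  let nextState = replace ? queue[0] : state;
  for (let i = replace ? 1 : 0; i < queue.length; i++) {
    const partial = queue[i];
    // Functional updates receive the state accumulated so far.
    const partialState = typeof partial === 'function' ? partial(nextState, props) : partial;
    nextState = Object.assign({}, nextState, partialState);
  }
  return nextState;
}

const merged = applyQueuedState(
  { count: 0, title: 'a' },
  [{ count: 1 }, (prev) => ({ count: prev.count + 1 })],
  false,
  {}
);
const replaced = applyQueuedState({ count: 0, title: 'a' }, [{ count: 5 }], true, {});

console.log(merged); // { count: 2, title: 'a' }
console.log(replaced); // { count: 5 }
```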

Then enter render:

child = inst.render();

Child components are then processed recursively in the same way (processChild):

while (React.isValidElement(child)) {
  // Safe because we just checked it's an element.
  var element = child;
  var Component = element.type;

  if (typeof Component !== 'function') {
    break;
  }

  // processChild reassigns `child` to this component's render result
  processChild(element, Component);
}

This continues until a native DOM element is reached (its type is not a function), at which point the DOM element is "rendered" to a string and output:

if (typeof elementType === 'string') {
  return this.renderDOM(nextElement, context, parentNamespace);
}
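
The unwrapping loop can be sketched on its own. This is a simplified illustration that handles only function components and plain-object elements; resolveToHost, Inner, and Outer are hypothetical names.

```javascript
// Unwrap composite components until a host element (string type) is reached.
function resolveToHost(element) {
  let child = element;
  const visited = [];
  while (child !== null && typeof child === 'object' && typeof child.type === 'function') {
    visited.push(child.type.name);
    // "Render" the component; its output becomes the next candidate.
    child = child.type(child.props);
  }
  return { host: child, visited };
}

function Inner(props) {
  return { type: 'h1', props: { children: props.text } };
}

function Outer(props) {
  return { type: Inner, props: { text: props.text.toUpperCase() } };
}

const resolved = resolveToHost({ type: Outer, props: { text: 'hi' } });

console.log(resolved.visited); // [ 'Outer', 'Inner' ]
console.log(resolved.host); // { type: 'h1', props: { children: 'HI' } }
```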

"Render" DOM Elements

As a special case, the props of [controlled components](/articles/从 componentwillreceiveprops 说起/#articleHeader5) are preprocessed first:

// input
props = _assign({
  type: undefined
}, props, {
  defaultChecked: undefined,
  defaultValue: undefined,
  value: props.value != null ? props.value : props.defaultValue,
  checked: props.checked != null ? props.checked : props.defaultChecked
});

// textarea
props = _assign({}, props, {
  value: undefined,
  children: '' + initialValue
});

// select
props = _assign({}, props, {
  value: undefined
});

// option
props = _assign({
  selected: undefined,
  children: undefined
}, props, {
  selected: selected,
  children: optionChildren
});
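
The effect of the input preprocessing is easiest to see with concrete props. The sketch below wraps the same merge in a hypothetical function, getServerInputProps, and uses Object.assign in place of _assign.

```javascript
// Hypothetical wrapper around the input props merge shown above.
function getServerInputProps(props) {
  return Object.assign({ type: undefined }, props, {
    defaultChecked: undefined,
    defaultValue: undefined,
    // An explicit value wins; otherwise fall back to the default.
    value: props.value != null ? props.value : props.defaultValue,
    checked: props.checked != null ? props.checked : props.defaultChecked,
  });
}

const uncontrolled = getServerInputProps({ type: 'text', defaultValue: 'draft' });
const controlled = getServerInputProps({ value: 'final', defaultValue: 'draft' });

console.log(uncontrolled.value); // draft
console.log(uncontrolled.defaultValue); // undefined
console.log(controlled.value); // final
```

In both cases the initial content ends up in value, so the generated markup carries a value attribute regardless of whether the component is controlled.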

Then string concatenation begins in earnest, starting with the opening tag:

// Create opening tag
var out = createOpenTagMarkup(element.type, tag, props, namespace, this.makeStaticMarkup, this.stack.length === 1);

function createOpenTagMarkup(tagVerbatim, tagLowercase, props, namespace, makeStaticMarkup, isRootElement) {
  var ret = '<' + tagVerbatim;
  for (var propKey in props) {
    var propValue = props[propKey];
    // Serialize style value
    if (propKey === STYLE) {
      propValue = createMarkupForStyles(propValue);
    }
    // Create tag attribute
    var markup = null;
    markup = createMarkupForProperty(propKey, propValue);
    // Append to opening tag
    if (markup) {
      ret += ' ' + markup;
    }
  }

  // renderToStaticMarkup() directly returns clean HTML tag
  if (makeStaticMarkup) {
    return ret;
  }
  // renderToString() adds extra react attribute data-reactroot="" to root element
  if (isRootElement) {
    ret += ' ' + createMarkupForRoot();
  }

  return ret;
}
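
A reduced version of the opening tag logic is sketched below. The helpers hyphenate, styleToString, escapeAttr, and openTag are hypothetical; React's real property and style handling covers many more cases (unitless numbers, boolean attributes, attribute name mapping, and so on).

```javascript
// Convert camelCase style names to CSS names, e.g. fontSize -> font-size.
function hyphenate(name) {
  return name.replace(/([A-Z])/g, '-$1').toLowerCase();
}

function styleToString(style) {
  return Object.keys(style)
    .map((name) => `${hyphenate(name)}:${style[name]}`)
    .join(';');
}

function escapeAttr(value) {
  return String(value).replace(/&/g, '&amp;').replace(/"/g, '&quot;').replace(/</g, '&lt;');
}

// Build '<tag attr="..."' without the closing '>', as in the excerpt above.
function openTag(type, props, { isRoot = false, staticMarkup = false } = {}) {
  let ret = '<' + type;
  for (const key of Object.keys(props)) {
    let value = props[key];
    // Skip empty values, children, and event handlers.
    if (value == null || key === 'children' || /^on/i.test(key)) continue;
    if (key === 'style') value = styleToString(value);
    const name = key === 'className' ? 'class' : key;
    ret += ` ${name}="${escapeAttr(value)}"`;
  }
  // Only the non-static mode marks the root element.
  if (!staticMarkup && isRoot) ret += ' data-reactroot=""';
  return ret;
}

const props = { className: 'box', style: { fontSize: '12px', color: 'red' }, onClick() {} };
const rootTag = openTag('div', props, { isRoot: true });
const staticTag = openTag('div', props, { isRoot: true, staticMarkup: true });

console.log(rootTag); // <div class="box" style="font-size:12px;color:red" data-reactroot=""
console.log(staticTag); // <div class="box" style="font-size:12px;color:red"
```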

Then create the closing tag:

// Create closing tag
var footer = '';
if (omittedCloseTags.hasOwnProperty(tag)) {
  out += '/>';
} else {
  out += '>';
  footer = '</' + element.type + '>';
}

And process the child nodes:

// Text child nodes, directly append to opening tag
var innerMarkup = getNonChildrenInnerMarkup(props);
if (innerMarkup != null) {
  out += innerMarkup;
} else {
  children = toArray(props.children);
}
// Non-text child nodes, output opening tag (return), push closing tag to stack
var frame = {
  domNamespace: getChildNamespace(parentNamespace, element.type),
  type: tag,
  children: children,
  childIndex: 0,
  context: context,
  footer: footer
};
this.stack.push(frame);
return out;

Note that the complete HTML fragment has not finished rendering at this point (the child nodes have not been converted to HTML yet, so the closing tag cannot be appended), but the opening tag is fully determined and can already be output to the client.

II. How are these strings streamed while being concatenated?

In this way, each pass renders only one node, and this repeats until no pending rendering tasks remain on the stack:

function read(bytes) {
  try {
    var out = [''];

    while (out[0].length < bytes) {
      if (this.stack.length === 0) {
        break;
      }

      // Get rendering task from top of stack
      var frame = this.stack[this.stack.length - 1];

      // All child nodes under this node have finished rendering
      if (frame.childIndex >= frame.children.length) {
        var footer = frame.footer;
        // Current node (rendering task) pops from stack
        this.stack.pop();
        // Append closing tag, current node is done
        out[this.suspenseDepth] += footer;
        continue;
      }

      // Each time a child node is processed, childIndex + 1
      var child = frame.children[frame.childIndex++];
      var outBuffer = '';

      try {
        // Render one node
        outBuffer += this.render(child, frame.context, frame.domNamespace);
      } catch (err) { /*...*/ }

      out[this.suspenseDepth] += outBuffer;
    }

    return out[0];
  } finally { /*...*/ }
}

This fine-grained task splitting is what makes it possible to stream while concatenating. It resembles the React Fiber scheduling mechanism in that both work in small task segments; the difference is that Fiber schedules by time, while SSR schedules by workload (while (out[0].length < bytes)).
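
The stack-driven loop can be tried out with a toy renderer. This is a minimal sketch under simplifying assumptions: nodes are plain strings or { type, children } objects, there are no props or components, and ToyRenderer is a hypothetical class. The bytes argument is compared against string length, as in the loop above.

```javascript
class ToyRenderer {
  constructor(root) {
    // The root frame has no footer; its only child is the root node.
    this.stack = [{ children: [root], childIndex: 0, footer: '' }];
  }

  // Render one node: return its opening part and push a frame for its children.
  render(node) {
    if (typeof node === 'string') return node;
    this.stack.push({ children: node.children, childIndex: 0, footer: `</${node.type}>` });
    return `<${node.type}>`;
  }

  // Produce output until at least `bytes` characters exist or the stack is empty.
  read(bytes) {
    let out = '';
    while (out.length < bytes && this.stack.length > 0) {
      const frame = this.stack[this.stack.length - 1];
      if (frame.childIndex >= frame.children.length) {
        // All children are done: pop the frame and emit its closing tag.
        this.stack.pop();
        out += frame.footer;
        continue;
      }
      out += this.render(frame.children[frame.childIndex++]);
    }
    return out;
  }
}

const tree = {
  type: 'div',
  children: [
    { type: 'h1', children: ['Hello'] },
    { type: 'p', children: ['World'] },
  ],
};

const renderer = new ToyRenderer(tree);
const chunks = [];
while (renderer.stack.length > 0) {
  chunks.push(renderer.read(8));
}

console.log(chunks);
console.log(chunks.join('')); // <div><h1>Hello</h1><p>World</p></div>
```

Each call to read(8) stops as soon as it has produced 8 or more characters, so the first chunk already contains opening tags whose closing tags are still waiting on the stack.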

Output is produced block by block according to a given target workload (bytes), which is precisely the basic characteristic of a stream:

A stream is a collection of data, similar to an array or a string. However, a stream does not access all of its data at once; it sends and receives it piece by piece (in chunks).

The producer's mode of production already fits the characteristics of a stream, so it only needs to be wrapped in a Readable Stream:

function ReactMarkupReadableStream(element, makeStaticMarkup, options) {
  var _this;

  // Create Readable Stream
  _this = _Readable.call(this, {}) || this;
  // Directly use renderToString's rendering logic
  _this.partialRenderer = new ReactDOMServerRenderer(element, makeStaticMarkup, options);
  return _this;
}

var _proto = ReactMarkupReadableStream.prototype;
// Override _read() method, read specified size of string each time
_proto._read = function _read(size) {
  try {
    this.push(this.partialRenderer.read(size));
  } catch (err) {
    this.destroy(err);
  }
};

With that in place, the streaming API is extremely simple:

function renderToNodeStream(element, options) {
  return new ReactMarkupReadableStream(element, false, options);
}
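
The wrapping pattern itself is plain Node.js and can be reproduced without React. In the sketch below, createProducer and MarkupStream are hypothetical names; the producer just slices a fixed string, standing in for the partial renderer's read(size).

```javascript
const { Readable } = require('stream');

// Hypothetical chunk producer standing in for the partial renderer.
function createProducer(markup) {
  let offset = 0;
  return {
    read(size) {
      if (offset >= markup.length) return null; // null signals the end of the stream
      const chunk = markup.slice(offset, offset + size);
      offset += chunk.length;
      return chunk;
    },
  };
}

class MarkupStream extends Readable {
  constructor(markup, options) {
    super(options);
    this.producer = createProducer(markup);
  }

  // Called by Node.js whenever the consumer wants more data.
  _read(size) {
    try {
      this.push(this.producer.read(size));
    } catch (err) {
      this.destroy(err);
    }
  }
}

const stream = new MarkupStream('<div><h1>Hello</h1></div>', { highWaterMark: 8 });

// Pull chunks manually (paused mode) until the stream reports the end.
const received = [];
let chunk;
while ((chunk = stream.read()) !== null) {
  received.push(chunk.toString());
}

console.log(received.join('')); // <div><h1>Hello</h1></div>
```

In a server, the stream would typically be piped into the HTTP response instead of being read manually.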

P.S. The non-streaming API simply reads everything at once (read(Infinity)):

function renderToString(element, options) {
  var renderer = new ReactDOMServerRenderer(element, false, options);

  try {
    var markup = renderer.read(Infinity);
    return markup;
  } finally {
    renderer.destroy();
  }
}

III. What does hydrate actually do?

Once components have been injected with data on the server and "rendered" to HTML, they can present meaningful content on the client right away, but they have no interactive behavior, because the server-side rendering process described above does not handle attributes such as onClick (in fact, it deliberately ignores them):

function shouldIgnoreAttribute(name, propertyInfo, isCustomComponentTag) {
  // Names starting with "on" (in any letter case) are treated as event handlers
  if (name.length > 2 && (name[0] === 'o' || name[0] === 'O') && (name[1] === 'n' || name[1] === 'N')) {
    return true;
  }
  return false;
}

Nor were the lifecycle methods that follow render executed, so the components were not completely "rendered". The remaining part of the rendering work therefore has to be completed on the client, and this process is hydrate.

Differences between hydrate and render

hydrate() and render() have identical function signatures, and both render a component into a specified container node:

ReactDOM.hydrate(element, container[, callback])
ReactDOM.render(element, container[, callback])

Unlike render(), which starts from scratch, hydrate() works on top of the output of server-side rendering, so the biggest difference is that the hydrate process reuses the DOM nodes the server has already rendered.

Node Reuse Strategy

In hydrate mode, component rendering is likewise divided into two phases:

First phase (render/reconciliation): find existing nodes that can be reused and attach them to the fiber node's stateNode
Second phase (commit): update the existing nodes that diffHydratedProperties found in need of correction; its rule is to check whether the attributes on the DOM node are consistent with props

In other words, a "possibly reusable" (hydratable) existing DOM node is found at the corresponding position and provisionally recorded as the rendering result, and the commit phase then tries to reuse it.

Specifically, existing nodes are selected as follows:

// When renderRoot, get the first (possibly reusable) child node
function updateHostRoot(current, workInProgress, renderLanes) {
  var root = workInProgress.stateNode;
  // In hydrate mode, find the first available child node from container
  if (root.hydrate && enterHydrationState(workInProgress)) {
    var child = mountChildFibers(workInProgress, null, nextChildren, renderLanes);
    workInProgress.child = child;
  }
}

function enterHydrationState(fiber) {
  var parentInstance = fiber.stateNode.containerInfo;
  // Get the first (possibly reusable) child node, record it on module-level global variable
  nextHydratableInstance = getFirstHydratableChild(parentInstance);
  hydrationParentFiber = fiber;
  isHydrating = true;
  return true;
}

The selection criterion is that the node is an element node (nodeType is 1) or a text node (nodeType is 3):

// Find the first element node or text node among sibling nodes
function getNextHydratable(node) {
  for (; node != null; node = node.nextSibling) {
    var nodeType = node.nodeType;

    if (nodeType === ELEMENT_NODE || nodeType === TEXT_NODE) {
      break;
    }
  }

  return node;
}
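
The sibling scan can be exercised without a browser by using plain objects in place of DOM nodes. In this sketch, linkSiblings is a hypothetical helper and the node objects are stand-ins that carry only the fields the scan reads.

```javascript
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const COMMENT_NODE = 8;

// Link plain objects into a sibling chain, mimicking nextSibling on DOM nodes.
function linkSiblings(nodes) {
  nodes.forEach((node, i) => {
    node.nextSibling = nodes[i + 1] || null;
  });
  return nodes[0] || null;
}

// Same scan as above: stop at the first element or text node.
function getNextHydratable(node) {
  for (; node != null; node = node.nextSibling) {
    if (node.nodeType === ELEMENT_NODE || node.nodeType === TEXT_NODE) {
      break;
    }
  }
  return node;
}

const first = linkSiblings([
  { nodeType: COMMENT_NODE, nodeName: '#comment' },
  { nodeType: ELEMENT_NODE, nodeName: 'DIV' },
  { nodeType: TEXT_NODE, nodeName: '#text' },
]);

const candidate = getNextHydratable(first);
const afterDiv = getNextHydratable(candidate.nextSibling);
const none = getNextHydratable(linkSiblings([{ nodeType: COMMENT_NODE, nodeName: '#comment' }]));

console.log(candidate.nodeName); // DIV
console.log(afterDiv.nodeName); // #text
console.log(none); // null
```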

With a node pre-selected, when rendering reaches a native component (HostComponent), the pre-selected node is attached to the fiber node's stateNode:

// Encounter native node
function updateHostComponent(current, workInProgress, renderLanes) {
  if (current === null) {
    // Try to reuse pre-selected existing node
    tryToClaimNextHydratableInstance(workInProgress);
  }
}

function tryToClaimNextHydratableInstance(fiber) {
  // Get pre-selected node
  var nextInstance = nextHydratableInstance;
  // Try to reuse
  tryHydrate(fiber, nextInstance);
}

Taking element nodes as an example (text nodes are similar):

function tryHydrate(fiber, nextInstance) {
  var type = fiber.type;
  // Judge if pre-selected node matches
  var instance = canHydrateInstance(nextInstance, type);

  // If pre-selected node is reusable, hang it on stateNode, temporarily record as rendering result
  if (instance !== null) {
    fiber.stateNode = instance;
    return true;
  }
  return false;
}

Note that attributes are not checked for a full match here. As long as the element nodes have the same tag name (such as div or h1), the node is considered reusable:

function canHydrateInstance(instance, type, props) {
  if (instance.nodeType !== ELEMENT_NODE || type.toLowerCase() !== instance.nodeName.toLowerCase()) {
    return null;
  }
  return instance;
}
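
A short sketch makes the consequence concrete: a node with different attributes is still claimed, while a node with a different tag is not. The fiber and DOM node objects are plain stand-ins, and claim and its needsCreate flag are hypothetical simplifications of what React does when a match fails.

```javascript
const ELEMENT_NODE = 1;

function canHydrateInstance(instance, type) {
  if (instance.nodeType !== ELEMENT_NODE || type.toLowerCase() !== instance.nodeName.toLowerCase()) {
    return null;
  }
  return instance;
}

// Hypothetical helper: attach the existing node, or flag the fiber for creation.
function claim(fiber, existingNode) {
  const instance = existingNode ? canHydrateInstance(existingNode, fiber.type) : null;
  if (instance !== null) {
    fiber.stateNode = instance;
    return true;
  }
  fiber.needsCreate = true; // illustrative flag, not a real fiber field
  return false;
}

const serverNode = { nodeType: ELEMENT_NODE, nodeName: 'H1', attributes: { class: 'old-title' } };

const sameTag = { type: 'h1', props: { className: 'new-title' }, stateNode: null };
const otherTag = { type: 'p', props: {}, stateNode: null };

const claimedSameTag = claim(sameTag, serverNode);
const claimedOtherTag = claim(otherTag, serverNode);

console.log(claimedSameTag); // true, even though the class differs
console.log(claimedOtherTag); // false
```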

The attribute consistency check is performed at the end of the first phase (completeWork), while the actual correction of attribute values happens in the second phase:

function completeWork(current, workInProgress, renderLanes) {
  var _wasHydrated = popHydrationState(workInProgress);
  // If matching existing node exists
  if (_wasHydrated) {
    // Check if attributes need updating
    if (prepareToHydrateHostInstance(workInProgress, rootContainerInstance, currentHostContext)) {
      // Correction action is done in the second phase
      markUpdate(workInProgress);
    }
  }
  // Otherwise document.createElement creates node
  else {
    var instance = createInstance(type, newProps, rootContainerInstance, currentHostContext, workInProgress);
    appendAllChildren(instance, workInProgress, false, false);
    workInProgress.stateNode = instance;

    if (finalizeInitialChildren(instance, type, newProps, rootContainerInstance)) {
      markUpdate(workInProgress);
    }
  }
}

The consistency check compares the attributes on the DOM node with the component's props, and mainly does 3 things:

Text child node values differ: warn and correct (the client state overrides the server rendering result)
Other values such as style or class differ: only warn, do not correct
The DOM node has extra attributes: also warn

In other words, automatic correction happens only when the text content of a child node differs. Differences in the number or values of attributes only produce warnings and are not corrected, so during development it is important to pay attention to warnings about mismatched rendering results.
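
The three rules can be condensed into a small sketch. diffHydrated is a hypothetical, heavily reduced stand-in for diffHydratedProperties: the DOM node is a plain object, props are compared against attributes by identical name, and the warning texts are only illustrative.

```javascript
// Compare a server-rendered node (plain object) with the client props.
function diffHydrated(domNode, props) {
  const warnings = [];
  let updatePayload = null;

  // Rule 1: differing text content is reported and scheduled for correction.
  if (typeof props.children === 'string' && domNode.textContent !== props.children) {
    warnings.push(`Text content did not match. Server: "${domNode.textContent}" Client: "${props.children}"`);
    updatePayload = ['children', props.children];
  }

  const extra = new Set(Object.keys(domNode.attributes));
  for (const name of Object.keys(props)) {
    if (name === 'children') continue;
    extra.delete(name);
    // Rule 2: differing attribute values are reported but not corrected.
    if (domNode.attributes[name] !== String(props[name])) {
      warnings.push(`Prop \`${name}\` did not match.`);
    }
  }

  // Rule 3: attributes that exist only on the server node are reported.
  if (extra.size > 0) {
    warnings.push(`Extra attributes from the server: ${[...extra].join(', ')}`);
  }
  return { warnings, updatePayload };
}

const result = diffHydrated(
  { textContent: 'Server', attributes: { class: 'a', id: 'x' } },
  { class: 'b', children: 'Client' }
);

console.log(result.warnings.length); // 3
console.log(result.updatePayload); // [ 'children', 'Client' ]
```

Only the text mismatch produces an update payload; the class mismatch and the extra id attribute leave the server-rendered node as it is.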

P.S. For details, see diffHydratedProperties. It is a large amount of code, so it is not expanded on here.

Component Rendering Flow

Like render, hydrate also executes the complete lifecycle (including the pre-render lifecycles that were already executed on the server):

// Create component instance
var instance = new ctor(props, context);
// Execute the pre-render lifecycle functions
// ...getDerivedStateFromProps
// ...componentWillMount
// ...UNSAFE_componentWillMount

// render
nextChildren = instance.render();

// componentDidMount
instance.componentDidMount();

So, in terms of client-side rendering performance, hydrate does roughly the same amount of work as render; it only saves the work of creating DOM nodes, setting initial attribute values, and so on.

At this point, the underlying implementation of React SSR has been fully laid out.

Reference Materials

react-dom @17.0.1