Babel Quick Guide

I. Function

Babel is a JavaScript compiler.

Structurally it is a compiler. Because both its input and its output are JS source code (so-called source-to-source compilation), it is also called a transpiler.

II. Principle

You give Babel some JavaScript code, it modifies the code and generates the new code back out.

Specifically, the source transformation is divided into 3 steps:

parsing -> transforming -> generation

First "understand" the semantics of the source code, then transform it at the semantic level, and finally map the semantic representation back to source code.

In Babel, that semantic representation is the AST (Abstract Syntax Tree):

How it modifies the code? Exactly! It builds AST, traverses it, modifies it based on plugins applied and then generate new code from modified AST.

So, in terms of how the code is represented, the semantic transformation is carried out by introducing an intermediate representation (the AST):

       parsing      transforming               generation
String -------> AST ------------> modified AST ----------> String

Throughout the process, parsing and generation are fixed. The critical step is transforming, which is driven by Babel plugins and is the key to Babel's extensibility.

P.S. For concepts related to compilation principles, see Looking at Compilation Principles Again

parsing

Input: JS source code. Output: AST.

Parsing corresponds to the lexical analysis and syntax analysis stages of a compiler. The input character sequence first goes through lexical analysis, which produces a sequence of tokens with lexical meaning (keywords, numbers, punctuation, etc. can be told apart). It then goes through syntax analysis, which produces an AST with syntactic meaning (statement blocks, comments, variable declarations, function parameters, etc. can be told apart).
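The lexical analysis step can be illustrated with a deliberately tiny tokenizer. This is only a sketch, not how @babel/parser works internally: the function name tokenize, the token type names, and the very small set of supported tokens are all assumptions chosen for illustration.

```javascript
// A toy tokenizer (hypothetical, not Babel's): it only knows the few token
// kinds needed for statements like `var a = 'A variable.';`.
const KEYWORDS = new Set(['var', 'let', 'const']);

function tokenize(source) {
  const tokens = [];
  // One alternative per token kind. Whitespace is matched so it can be skipped.
  // The sticky flag (y) forces each match to start exactly at lastIndex.
  const pattern = /\s+|([A-Za-z_$][\w$]*)|(\d+)|('(?:[^'\\]|\\.)*')|([=;,+])/y;
  let pos = 0;
  while (pos < source.length) {
    pattern.lastIndex = pos;
    const match = pattern.exec(source);
    if (!match) {
      throw new SyntaxError(`Unexpected character at ${pos}: ${source[pos]}`);
    }
    const [, word, number, string, punctuator] = match;
    if (word !== undefined) {
      tokens.push({ type: KEYWORDS.has(word) ? 'Keyword' : 'Identifier', value: word });
    } else if (number !== undefined) {
      tokens.push({ type: 'Numeric', value: number });
    } else if (string !== undefined) {
      tokens.push({ type: 'String', value: string });
    } else if (punctuator !== undefined) {
      tokens.push({ type: 'Punctuator', value: punctuator });
    }
    pos = pattern.lastIndex;
  }
  return tokens;
}

const tokens = tokenize("var a = 'A variable.';");
console.log(tokens.map(t => `${t.type}(${t.value})`).join(' '));
// Keyword(var) Identifier(a) Punctuator(=) String('A variable.') Punctuator(;)
```

Syntax analysis then consumes such a token sequence and groups it into nested nodes.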

In effect, this is the process of recognizing the meaning of a code string: given a string of code, how is its syntactic meaning recognized? For example:

var a = 'A variable.';

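After parsing, the resulting AST looks like this (shown in ESTree style; @babel/parser itself labels the string node StringLiteral and stores the raw text under extra.raw):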

{
  "type": "VariableDeclaration",
  "declarations": [
    {
      "type": "VariableDeclarator",
      "id": {
        "type": "Identifier",
        "name": "a"
      },
      "init": {
        "type": "Literal",
        "value": "A variable.",
        "raw": "'A variable.'"
      }
    }
  ],
  "kind": "var"
}

It says: this is a var variable declaration, the variable name is a, and the initial value is a literal whose value is "A variable."

The AST fully describes the syntactic meaning of the code. With this information the compiler can understand code much as a human does, which is the foundation for semantic-level transformation.

P.S. The AST structure for a piece of JS code can be inspected with the AST Explorer tool.

transforming

Input AST, output modified AST

Transforming corresponds to the machine-independent code optimization stage of a compiler (a bit of a stretch, but both consist of modifying the AST). It makes modifications to the AST, such as renaming variable a to input:

{
  "type": "VariableDeclaration",
  "declarations": [
    {
      "type": "VariableDeclarator",
      "id": {
        "type": "Identifier",
        "name": "input"
      },
      "init": {
        "type": "Literal",
        "value": "A variable.",
        "raw": "'A variable.'"
      }
    }
  ],
  "kind": "var"
}

That only modifies a node attribute. Splitting the declaration from the assignment, however, requires adding AST nodes:

[{
  "type": "VariableDeclaration",
  "declarations": [
    {
      "type": "VariableDeclarator",
      "id": {
        "type": "Identifier",
        "name": "input"
      },
      "init": null
    }
  ],
  "kind": "var"
},
{
  "type": "ExpressionStatement",
  "expression": {
    "type": "AssignmentExpression",
    "operator": "=",
    "left": {
      "type": "Identifier",
      "name": "input"
    },
    "right": {
      "type": "Literal",
      "value": "A variable.",
      "raw": "'A variable.'"
    }
  }
}]

It says: the first statement is a var variable declaration, the variable name is input, and there is no initial value. The second statement is an expression statement, specifically an assignment expression: the operator is =, the left operand is the identifier input, and the right operand is a literal whose value is "A variable."

Semantic-level transformation means adding, deleting, and modifying AST nodes. The modified AST may carry different semantics, and the code string it maps back to differs accordingly.
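The declaration/assignment split above can be sketched as a plain function over the ESTree-style objects shown in this section. The name splitDeclaration is hypothetical, and the sketch assumes simple identifier declarators; a real plugin would go through @babel/traverse and @babel/types instead of building object literals by hand.

```javascript
// Hypothetical helper: turns `var input = 'A variable.';` (as an AST node)
// into a declaration without initial values plus one assignment statement
// per initialized declarator.
function splitDeclaration(declaration) {
  // const bindings must keep their initializer, so leave them untouched.
  if (declaration.kind === 'const') {
    return [declaration];
  }
  const bare = {
    type: 'VariableDeclaration',
    declarations: declaration.declarations.map(d => ({
      type: 'VariableDeclarator',
      id: d.id,
      init: null
    })),
    kind: declaration.kind
  };
  const assignments = declaration.declarations
    .filter(d => d.init !== null)
    .map(d => ({
      type: 'ExpressionStatement',
      expression: {
        type: 'AssignmentExpression',
        operator: '=',
        left: { ...d.id }, // copy so the two statements do not share a node
        right: d.init
      }
    }));
  return [bare, ...assignments];
}

const declaration = {
  type: 'VariableDeclaration',
  declarations: [{
    type: 'VariableDeclarator',
    id: { type: 'Identifier', name: 'input' },
    init: { type: 'Literal', value: 'A variable.', raw: "'A variable.'" }
  }],
  kind: 'var'
};
const statements = splitDeclaration(declaration);
console.log(JSON.stringify(statements, null, 2));
```

The printed structure should match the two-statement AST shown above.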

generation

Input AST, output JS source code

Generation corresponds to the code generation stage of a compiler. It maps the AST back to a code string, for example:

var input;
input = 'A variable.';

Compared with parsing, generation is relatively easy: it is essentially string concatenation.
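To make the "string concatenation" point concrete, here is a minimal sketch of a generator that covers only the node types used in this section. The name generateCode is hypothetical; @babel/generator handles every node type, plus formatting, comments, and source maps, none of which is attempted here.

```javascript
// Toy code generator (hypothetical): one case per supported node type,
// each returning the source text for that node.
function generateCode(node) {
  switch (node.type) {
    case 'VariableDeclaration':
      return `${node.kind} ${node.declarations.map(generateCode).join(', ')};`;
    case 'VariableDeclarator':
      return node.init
        ? `${generateCode(node.id)} = ${generateCode(node.init)}`
        : generateCode(node.id);
    case 'ExpressionStatement':
      return `${generateCode(node.expression)};`;
    case 'AssignmentExpression':
      return `${generateCode(node.left)} ${node.operator} ${generateCode(node.right)}`;
    case 'Identifier':
      return node.name;
    case 'Literal':
      return node.raw;
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// The two-statement AST from the transforming step.
const program = [
  {
    type: 'VariableDeclaration',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'input' },
      init: null
    }],
    kind: 'var'
  },
  {
    type: 'ExpressionStatement',
    expression: {
      type: 'AssignmentExpression',
      operator: '=',
      left: { type: 'Identifier', name: 'input' },
      right: { type: 'Literal', value: 'A variable.', raw: "'A variable.'" }
    }
  }
];
const generated = program.map(generateCode).join('\n');
console.log(generated);
// var input;
// input = 'A variable.';
```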

III. Usage

4 core packages:

@babel/core: Use Babel programmatically (not via CLI)
@babel/parser: Parse input source code, create AST
@babel/traverse: Traverse AST
@babel/generator: Convert AST back to JS code

8 tool packages:

@babel/cli: Use Babel via the CLI; depends on @babel/core
@babel/types: AST utility library with 3 kinds of APIs: checks, assertions, and builders (isXXX, assertXXX, and xxx, e.g. t.isArrayExpression(node, opts), t.assertArrayExpression(node, opts), and t.arrayExpression(elements))
@babel/polyfill: Language feature patches (a complete ES2015+ environment), including core-js and the regenerator runtime; deprecated since Babel 7.4 in favor of importing core-js and regenerator-runtime directly
@babel/runtime: Helper functions emitted by Babel transforms (_classCallCheck, etc.) plus a copy of regenerator-runtime; used together with the @babel/plugin-transform-runtime plugin
@babel/register: Hooks require in Node so that every required file is compiled automatically; typically used alongside @babel/node
@babel/template: Template syntax for building ASTs quickly, with placeholder support
@babel/helpers: A set of predefined @babel/template templates for Babel plugins to use
@babel/code-frame: Prints error messages that point at a source line and column

P.S. For more information about the Babel packages, see babel/packages/README.md

P.S. As for why the package names all take the @babel/xxx form: it avoids naming conflicts on the one hand, and on the other makes official packages easy to tell apart from community packages. For details see Renames: Scoped Packages (@babel/x)

babylon and @babel/parser

@babel/parser was introduced with Babel 7; it was previously called Babylon:

The Babel parser (previously Babylon) is a JavaScript parser used in Babel.

It is Babel's JS parser, with several features:

The latest ES version (ES2017) is enabled by default
Comments are preserved (comment attachment)
Supports JSX, Flow, and TypeScript
Supports experimental language features (stage-0 and other stage proposals)

@babel/polyfill and @babel/runtime

Both are used to provide patches for ES features such as Promise, Set, and Map:

The babel-polyfill and babel-runtime modules are used to serve the same function in two different ways. Both modules ultimately serve to emulate an ES6 environment.

The differences are:

@babel/polyfill: Pollutes the global scope; suitable for apps and command-line tools
@babel/runtime: Is bundled as a runtime dependency and does not pollute the global scope; better suited to libraries

Simple Example

Convert constant names to uppercase, that is:

// Input
const numberFive = 5;
// Required output
const NUMBER_FIVE = 5;

For clarity, @babel/parser, @babel/traverse, and @babel/generator are used separately here (rather than the higher-level APIs provided by @babel/core):

const parser = require('@babel/parser');
const traverse = require('@babel/traverse').default;
const generate = require('@babel/generator').default;

let input = `
const number = 'number';
const numberFive = 5;
const numberSix = 6, numberSeven = numberSix + 1;
const XMLHttpRequest = window.XMLHttpRequest;
let aString = 'string';
var numberEight = numberSeven + 1;
function f() {
  const numberEleven = numberSeven + 4;
  return numberFive + numberEleven + numberEight;
}
`;

// 1. Parse
let ast = parser.parse(input);
// 2. Transform
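function renameConst(name) {
  // The g flag inserts an underscore at every lowercase-to-uppercase boundary,
  // not just the first one (e.g. numberFiveSix -> NUMBER_FIVE_SIX)
  return name.replace(/([a-z])([A-Z])/g, '$1_$2').toUpperCase();
}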
function renameConstBindings(path) {
  let ownBindings = path.scope.bindings;
  for (let name in ownBindings) {
    if (ownBindings[name].kind === 'const') {
      path.scope.rename(name, renameConst(name));
    }
  }
}
traverse(ast, {
  Program: {
    exit: renameConstBindings
  },
  Function: {
    exit: renameConstBindings
  }
});
// 3. Generate
let output = generate(ast);

// test
console.log(output.code);

Output:

const NUMBER = 'number';
const NUMBER_FIVE = 5;
const NUMBER_SIX = 6,
      NUMBER_SEVEN = NUMBER_SIX + 1;
const XMLHTTP_REQUEST = window.XMLHttpRequest;
let aString = 'string';
var numberEight = NUMBER_SEVEN + 1;

function f() {
  const NUMBER_ELEVEN = NUMBER_SEVEN + 4;
  return NUMBER_FIVE + NUMBER_ELEVEN + numberEight;
}

This consists purely of scope operations (find the constants, then rename them). For more scope-related APIs, see babel/packages/babel-traverse/src/scope/index.js
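The renameConst helper in the example only inserts an underscore at a lowercase-to-uppercase boundary, which is why XMLHttpRequest becomes XMLHTTP_REQUEST. If acronym boundaries should be split as well, a variant along the following lines might work. The name toConstantCase is hypothetical, and the rule it applies (an uppercase run followed by a capitalized word starts a new word) is an assumption about the desired naming style.

```javascript
// Hypothetical variant of renameConst that also splits acronyms.
function toConstantCase(name) {
  return name
    // lowercase or digit followed by uppercase: numberFive -> number_Five
    .replace(/([a-z0-9])([A-Z])/g, '$1_$2')
    // uppercase run followed by a capitalized word: XMLHttp -> XML_Http
    .replace(/([A-Z]+)([A-Z][a-z])/g, '$1_$2')
    .toUpperCase();
}

console.log(toConstantCase('numberFive'));     // NUMBER_FIVE
console.log(toConstantCase('XMLHttpRequest')); // XML_HTTP_REQUEST
```

Passing this function to path.scope.rename in place of renameConst would leave the rest of the example unchanged.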

IV. Plugins

Definition

The general format of a Babel plugin is:

export default function(babel) {
  return {
    // Required, visitor object used with traverse
    visitor: {},

    // Optional, inherit other plugins, such as recognizing JSX, async function etc. syntax
    inherits: OtherPlugin,
    // Optional, before plugin execution, initialize state, such as cache
    pre(state) {},
    // Optional, after plugin execution, cleanup work
    post(state) {}
  }
}

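So the constant renaming above is easy to package as a Babel plugin: just move the visitor from the transform step into the plugin (renameConst and renameConstBindings are the functions defined in the earlier example and must be included in the plugin file):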

// babel-plugin-transform-const-name.js
export default function(babel) {
  return {
    visitor: {
      Program: {
        exit: renameConstBindings
      },
      Function: {
        exit: renameConstBindings
      }
    }
  }
}

P.S. Plugin options set in the Babel configuration can be read through state.opts; for details see Plugin Options
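As a sketch of how state.opts might be used, the variant below skips constants listed in an exclude option. The option name exclude and the function name constNamePlugin are assumptions made for illustration; only state.opts, path.scope.bindings, and path.scope.rename come from the text above.

```javascript
function renameConst(name) {
  return name.replace(/([a-z])([A-Z])/g, '$1_$2').toUpperCase();
}

// Hypothetical options-aware version of the const renaming plugin.
// In .babelrc the option would be passed as:
//   "plugins": [["./lib/babel-plugin-transform-const-name.js", { "exclude": ["XMLHttpRequest"] }]]
function constNamePlugin() {
  return {
    visitor: {
      Program: {
        // Visitor methods receive the plugin state as their second argument.
        exit(path, state) {
          // `exclude` is an assumed option name; default to renaming everything.
          const exclude = new Set(state.opts.exclude || []);
          // Snapshot the keys first, because rename() changes the bindings.
          for (const name of Object.keys(path.scope.bindings)) {
            const binding = path.scope.bindings[name];
            if (binding.kind === 'const' && !exclude.has(name)) {
              path.scope.rename(name, renameConst(name));
            }
          }
        }
      }
    }
  };
}
```

In a real plugin file this function would be the default export, and the Function visitor from the earlier example could reuse the same logic.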

Compilation

The Node environment in which Babel and the plugins run does not support ES Module syntax (export default), so the plugins themselves need to be compiled first. Here that is done with @babel/cli:

npx babel plugins --no-babelrc --presets=@babel/preset-env --out-dir lib

The same can be done through npm scripts:

"scripts": {
  "compile-plugins": "babel plugins --no-babelrc --presets=@babel/preset-env --out-dir lib"
}

This transforms all plugin sources under ./plugins/ and writes them to ./lib/, keeping the file names unchanged.

Configuration

Plugins are generally applied through a .babelrc configuration file (placed in the project root):

{
  "plugins": ["./lib/babel-plugin-transform-const-name.js"]
}

Note that the compiled plugin (under lib) is used here; otherwise an error is reported because the export keyword is not supported:

SyntaxError: Unexpected token export

Application

Then run the plugin through @babel/core:

const babel = require('@babel/core');
const input = require('fs').readFileSync('./const-rename-input.js', 'utf-8');

// transformSync is the explicitly synchronous form of babel.transform
let output = babel.transformSync(input, {
  filename: 'const-rename-input.js'
});
console.log(output.code);

Note that for the .babelrc configuration to take effect, filename must be specified; for details see babel.transform API is not using .babelrc

.babelrc files are loaded relative to the file being compiled. If this option is omitted, Babel will behave as if babelrc: false has been set.

Or skip .babelrc and run directly via the CLI:

npx babel const-rename-input.js --no-babelrc --presets=@babel/preset-env --plugins=./lib/babel-plugin-transform-const-name.js

P.S. For more Babel CLI usage, see Usage

Output:

"use strict";

const NUMBER = 'number';
const NUMBER_FIVE = 5;
const NUMBER_SIX = 6,
      NUMBER_SEVEN = NUMBER_SIX + 1;
const XMLHTTP_REQUEST = window.XMLHttpRequest;
let aString = 'string';
var numberEight = NUMBER_SEVEN + 1;

function f() {
  const NUMBER_ELEVEN = NUMBER_SEVEN + 4;
  return NUMBER_FIVE + NUMBER_ELEVEN + numberEight;
}

V. Application Scenarios

Remove Debug Code

Remove console.xxx calls and debugger statements. The implementation is as follows:

function removeConsoleCall(path, {types: t}) {
  if (path.node.name === 'console') {
    let consoleCall = path.findParent(p => p.isCallExpression());
    if (consoleCall) {
      try {
        consoleCall.remove();
      } catch(ex) {
        consoleCall.replaceWith(t.identifier('undefined'));
      }
    }
  }
}
export default function(babel) {
  return {
    visitor: {
      Identifier: {
        enter(path) {
          removeConsoleCall(path, babel);
        }
      },
      DebuggerStatement: {
        enter(path) {
          path.remove();
        }
      }
    }
  }
}

Note one detail: console.xxx calls are deleted by default (consoleCall.remove();), but in some positions a call cannot simply be deleted, for example when it is an operand of an expression, where deletion would produce a syntax error. Here the built-in validation of path operations is relied on: such errors are caught and the call is replaced with undefined as a fallback.

Input:

console.log(1);
window.console.log(2);
console.error('err');
let result = 2 > 1 ? console.log(3) : window.console.log(4);
if (true) debugger;
if (true) {
  debugger;console.log(2);alert(3);
  let three = 2 + (console.info('info'), 1);
}

Output:

"use strict";

var result = 2 > 1 ? undefined : undefined;

if (true) {}

if (true) {
  var three = 2 + (1);
}

This looks good, but it is powerless against things that are hard to track, such as aliases. For example:

let log = console.log.bind(console);
log(4);
var c = window.console;
c.log(5);
// Collateral damage: the whole call below is removed
void function(c) {
  c.log(6);
  alert(7);
}(window.console);

Output:

var log;
log(4);
var c = window.console;
c.log(5); // Collateral damage: the whole call below is removed

void undefined;

Constant Compilation Replacement

At compile time, replace _GET_CONFIG('c3') with the corresponding configuration value, given a configuration such as:

{
  "c1": "#FFFFFF",
  "c2": "#00FFFF",
  "c3": "#FF00FF",
  "c4": "#FFFF00"
}

The plugin is as follows:

const CONFIG_MAP = {
  "c1": "#FFFFFF",
  "c2": "#00FFFF",
  "c3": "#FF00FF",
  "c4": "#FFFF00"
};

export default function({types: t}) {
  return {
    inherits: require("@babel/plugin-syntax-jsx").default,
    visitor: {
      CallExpression: {
        enter(path) {
          if (path.node.callee.name === '_GET_CONFIG') {
            let args = path.node.arguments.map(v => v.value);
            let configValue = CONFIG_MAP[args[0]] || '';
            path.replaceWith(t.stringLiteral(configValue));
          }
        }
      }
    }
  }
}

Input:

function render() {
  return <div style={{color: _GET_CONFIG('c3')}}></div>
}

Output:

"use strict";

function render() {
  return <div style={{
    color: "#FF00FF"
  }}></div>;
}

Likewise, only static replacement can be handled: aliases are not supported, and neither are variables:

let x = 'c3';
_GET_CONFIG(x);
let get = _GET_CONFIG;
get('c4');

Output:

var x = 'c3';
"";
var get = _GET_CONFIG;
get('c4');
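In the output above, _GET_CONFIG(x) is silently replaced with an empty string because the plugin falls back to '' whenever the lookup fails. One possible refinement, sketched below, is to replace only calls whose single argument is a string literal with a known key and to leave every other call untouched. The names replaceConfigCall and configPlugin are hypothetical; t.isIdentifier, t.isStringLiteral, and t.stringLiteral are @babel/types helpers.

```javascript
const CONFIG_MAP = {
  c1: '#FFFFFF',
  c2: '#00FFFF',
  c3: '#FF00FF',
  c4: '#FFFF00'
};

// Hypothetical helper: replaces _GET_CONFIG('<known key>') and nothing else.
function replaceConfigCall(path, t, configMap) {
  const { callee, arguments: args } = path.node;
  if (!t.isIdentifier(callee, { name: '_GET_CONFIG' })) return;
  // Variables, expressions, and missing arguments are left for runtime.
  if (args.length !== 1 || !t.isStringLiteral(args[0])) return;
  const key = args[0].value;
  // Unknown keys are left alone instead of becoming ''.
  if (!Object.prototype.hasOwnProperty.call(configMap, key)) return;
  path.replaceWith(t.stringLiteral(configMap[key]));
}

// Plugin wrapper in the same shape as the plugins above.
function configPlugin({ types: t }) {
  return {
    visitor: {
      CallExpression(path) {
        replaceConfigCall(path, t, CONFIG_MAP);
      }
    }
  };
}
```

With this guard, _GET_CONFIG(x) would stay in the output as a call, so a runtime implementation of _GET_CONFIG would still be needed for dynamic keys. Aliases such as get('c4') remain unsupported.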

Other Scenarios

Enforcing hard constraints: for example, Using babel plugins to create truly "private" properties uses a Symbol as the key of a private property, turning a convention into a hard constraint
Source code transformation: the dedicated tool facebook/jscodeshift provides more convenient APIs (such as findVariableDeclarators('foo').renameTo('bar')) and is especially suited to large-scale refactoring such as API upgrades, for example reactjs/react-codemod
Formatting: tools such as Prettier perform semantically equivalent code style transformations, such as whether arrow function parameters are wrapped in parentheses and whether statements end with semicolons
Visualization: js2flowchart can produce flowcharts from code, which is useful for reading source code and for analyzing inherited logic
