Lerna Beginner's Guide

1. Positioning

Lerna is a tool that optimizes the workflow around managing multi-package repositories with git and npm.

In short, it is a multi-package management tool that helps maintain a monorepo.

P.S. Lerna is the tool Babel itself uses day to day and has open-sourced; see "Why is Babel a monorepo?"

2. monorepo

A monorepo (monolithic repository) is the opposite of a multirepo: a single code repository versus multiple code repositories (one repository per module).

Multirepo is the traditional approach: the code is split into multiple repositories by module. In practice it runs into a few problems:

Issue management gets chaotic: people often file module issues in the core repo, so you have to close one issue and track another
Changelogs are hard to consolidate: you have to work out by hand which repositories changed and merge their changes
Updating the core repo version is tedious: every module has to be updated to the core version it depends on

A monorepo puts all related modules in one repo. Each module is still published independently, but all of them share a single version number with the repo (as Babel and React do). Issues and PRs are all collected in this one repo, and the changelog can be compiled simply from the commit list (if the commit convention also ties commits to issue labels, a standardized changelog can even be generated automatically).

A monorepo has its own problems, but they are less painful than the ones above:

The repo gets large, which can cause version control problems (Git is not well suited to managing very large repos)
A unified build tool is needed, which raises the bar for the build tooling: it must be able to build every related module

From a source code management perspective, multirepo and monorepo are two different philosophies. The former allows diversity: each module can have its own practices (build, dependency management, unit testing, and so on). The latter favors centralized management, which reduces the communication cost caused by differing practices.

The hallmark of a monorepo is its directory structure, for example React's:

react-16.2.0/
  packages/
    react/
    react-art/
    react-.../

Each module has its own dependencies (package.json) and can be published as an independent npm package; only the source code is maintained together.

Typical cases:

rollup: multirepo
babel: monorepo

P.S. Whenever I ran into a problem with rollup, I would first search the issues in the main repo, then follow the clues to the relevant plugin repo, then search the issues there. It always felt extremely tedious, though I could not say exactly what was wrong. It turns out the trouble came from the way the source code is organized.

3. Trying Out Lerna

# Install
npm install lerna -g
git init hoho-lerna && cd hoho-lerna
# Initialize the directory structure
lerna init

This produces the following structure:

hoho-lerna/
  packages/
  lerna.json
  package.json

Create a module:

mkdir packages/hoho-lerna-core && cd packages/hoho-lerna-core
npm init

Repeat this and you end up with a set of packages:

packages/
  hoho-lerna-core/
    package.json
  hoho-lerna-module-a/
    package.json
  hoho-lerna-module-b/
    package.json
  module.../

What we are actually doing is splitting the code into packages by module and declaring the dependencies between packages (through each module's package.json).
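To make "dependencies between packages" concrete, here is a minimal sketch in Node.js that picks out which declared dependencies are sibling packages in the same repo. The manifest contents (versions, the lodash entry) and the function name localDependencies are assumptions made for illustration; this is not Lerna's own code.

```javascript
// Hypothetical manifests, standing in for the package.json files under packages/.
const manifests = [
  { name: 'hoho-lerna-core', version: '1.0.0', dependencies: {} },
  {
    name: 'hoho-lerna-module-a',
    version: '1.0.0',
    // lodash is an illustrative external dependency, not part of the original project.
    dependencies: { 'hoho-lerna-core': '^1.0.0', lodash: '^4.17.0' },
  },
  {
    name: 'hoho-lerna-module-b',
    version: '1.0.0',
    dependencies: { 'hoho-lerna-core': '^1.0.0' },
  },
];

// For each package, list the dependencies that are sibling packages in this repo.
function localDependencies(manifests) {
  const localNames = new Set(manifests.map((m) => m.name));
  const graph = {};
  for (const m of manifests) {
    const declared = Object.keys(m.dependencies || {});
    graph[m.name] = declared.filter((dep) => localNames.has(dep));
  }
  return graph;
}

console.log(localDependencies(manifests));
```

External dependencies such as lodash are filtered out, so what remains is the in-repo dependency graph that a monorepo tool has to wire up itself.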

Dependency Processing

If moduleA depends on core, then after dependencies are processed with the lerna bootstrap command, a symlink pointing to the core directory is created under moduleA's node_modules (there is a live example).

Note: npm (before v7) does not install peerDependencies automatically, and lerna does not do it for you either.

lerna bootstrap follows the declared dependencies and connects the packages by creating symlinks.
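The linking idea can be sketched with Node's fs module. This is a simplified illustration under stated assumptions, not what lerna bootstrap actually runs: the helper linkLocalPackage and the temp-directory layout are hypothetical, and real bootstrapping also installs external dependencies.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Link dependencyDir into dependentDir/node_modules/<name>.
function linkLocalPackage(dependentDir, dependencyDir, name) {
  const nodeModules = path.join(dependentDir, 'node_modules');
  fs.mkdirSync(nodeModules, { recursive: true });
  const linkPath = path.join(nodeModules, name);
  // The type argument only matters on Windows; other platforms ignore it.
  fs.symlinkSync(dependencyDir, linkPath, 'junction');
  return linkPath;
}

// Throwaway repo layout in a temp directory (hypothetical paths).
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'hoho-lerna-'));
const coreDir = path.join(root, 'packages', 'hoho-lerna-core');
const moduleADir = path.join(root, 'packages', 'hoho-lerna-module-a');
fs.mkdirSync(coreDir, { recursive: true });
fs.mkdirSync(moduleADir, { recursive: true });
fs.writeFileSync(path.join(coreDir, 'index.js'), "module.exports = 'core';\n");

const linkPath = linkLocalPackage(moduleADir, coreDir, 'hoho-lerna-core');

// The entry under node_modules is a link, and it resolves to the core directory.
console.log(fs.lstatSync(linkPath).isSymbolicLink());
console.log(fs.realpathSync(linkPath) === fs.realpathSync(coreDir));
```

Because the entry in node_modules is only a link, edits made in packages/hoho-lerna-core are visible to module-a immediately, without republishing.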

Publish Package

Since everything lives under packages and is easy to manage in one place, lerna supports publishing all packages to npm with a single command.

P.S. You first need an npm account (register one yourself) and must run npm adduser to add it to your local configuration.

With that in place, it is time to hit n birds with one stone:

lerna publish

If all goes well, the output looks something like this:

lerna info version 2.7.0
lerna info current version 0.0.0
lerna info Checking for updated packages...
lerna info Comparing with initial commit.
lerna info Checking for prereleased packages...
? Select a new version (currently 0.0.0) Major (1.0.0)

Changes:
 - hoho-lerna-core: 1.0.0 => 1.0.0
 - hoho-lerna-module-a: 1.0.0 => 1.0.0
 - hoho-lerna-module-b: 1.0.0 => 1.0.0

? Are you sure you want to publish the above changes? Yes
lerna info publish Publishing packages to npm...
lerna info published hoho-lerna-module-b
lerna info published hoho-lerna-core
lerna info published hoho-lerna-module-a
lerna info git Pushing tags...
Successfully published:
 - hoho-lerna-core @1.0.0
 - hoho-lerna-module-a @1.0.0
 - hoho-lerna-module-b @1.0.0
lerna success publish finished

And with that, the npm registry has three more junk packages...

The publish process is roughly:

Create a tag locally (such as git tag v1.0.0)
Automatically update the dependency version numbers (example)
Publish each package to npm
Finally, push the tag and the corresponding commits
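The version-related steps above can be sketched as plain data manipulation. This is an illustration under assumptions, not Lerna's implementation: the functions bump and applyVersion are hypothetical, only plain major.minor.patch versions are handled, and the choice of a caret range for sibling dependencies is an assumption.

```javascript
// Bump a plain "major.minor.patch" version string; prerelease tags are not handled.
function bump(version, level) {
  const [major, minor, patch] = version.split('.').map(Number);
  if (level === 'major') return `${major + 1}.0.0`;
  if (level === 'minor') return `${major}.${minor + 1}.0`;
  if (level === 'patch') return `${major}.${minor}.${patch + 1}`;
  throw new Error(`unknown level: ${level}`);
}

// Give every manifest the new version and repoint ranges on sibling packages.
function applyVersion(manifests, newVersion) {
  const localNames = new Set(manifests.map((m) => m.name));
  return manifests.map((m) => {
    const dependencies = { ...(m.dependencies || {}) };
    for (const dep of Object.keys(dependencies)) {
      // Assumption: siblings are referenced with a caret range on the new version.
      if (localNames.has(dep)) dependencies[dep] = `^${newVersion}`;
    }
    return { ...m, version: newVersion, dependencies };
  });
}

// Hypothetical manifests for two of the packages in this walkthrough.
const before = [
  { name: 'hoho-lerna-core', version: '1.0.0', dependencies: {} },
  { name: 'hoho-lerna-module-a', version: '1.0.0', dependencies: { 'hoho-lerna-core': '^1.0.0' } },
];

const nextVersion = bump('1.0.0', 'minor');
const after = applyVersion(before, nextVersion);

console.log(`tag: v${nextVersion}`);
console.log(JSON.stringify(after, null, 2));
```

Keeping one shared version makes the tag name unambiguous: a single v-prefixed tag describes the state of every package in the repo.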

Note: If publishing to npm fails (for example, because no npm account is configured), simply running lerna publish again will not work. The reason seems to be that the local tag is already v1.0.0, so lerna assumes the last publish succeeded. Rolling the tag back by hand does not help either: .git seems to have recorded some publish state, and after the rollback commit hash mismatch errors appear. This part is not very friendly.

P.S. For more commands, see the Lerna documentation.

Automatically Generate Changelog

First install the changelog tool:

npm install lerna-changelog -g

Then add the corresponding configuration to lerna.json:

"changelog": {
  "repo": "ayqy/hoho-lerna",
  "labels": {
    "enhancement": ":rocket: Enhancement",
    "bug": ":bug: Bug Fix",
    "doc": "Refine Doc",
    "feat": "New Feature"
  },
  "cacheDir": ".changelog"
}

Important: repo is required. The documentation says it can be inferred automatically, but in practice that is unreliable; see "The 'repo' field automatically inferred failed, but no error occurred".

P.S. In labels, each key is a label configured on GitHub and used to classify issues and PRs. The :bug: in the value is just a playful emoji; the value is used as the heading for that type of change in the changelog.

That is not all: the tool also needs access to the GitHub repo (to read issues and PRs). Expose a token as an environment variable (if you use it often, add it to ~/.bash_profile):

export GITHUB_AUTH="..."

Configuration is complete. For this to be "automatic", day-to-day development has to follow agreed conventions, otherwise the tool has no way of guessing the changelog. The conventions are:

(Suggested) Reference the corresponding issue in the commit message
(Required) Select one of the predefined labels when creating a PR
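The suggested convention is easy to check mechanically. The sketch below extracts issue references written with GitHub's closing keywords (close, fix, resolve and their inflections) and the leading type prefix of a message. The function names referencedIssues and commitType are hypothetical, and lerna-changelog is not known to run anything like this; it is only a way to lint messages on your side.

```javascript
// GitHub closing keywords: close(s|d), fix(es|ed), resolve(s|d), followed by #<number>.
const CLOSING_REFERENCE = /\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)/gi;

// Return the issue numbers a message refers to with a closing keyword.
function referencedIssues(message) {
  return Array.from(message.matchAll(CLOSING_REFERENCE), (m) => Number(m[1]));
}

// Return the leading "type:" prefix, such as the "feat" in "feat: changelog".
function commitType(message) {
  const match = /^(\w+):\s/.exec(message);
  return match ? match[1] : null;
}

const message = 'feat: changelog, Close #1';
console.log(commitType(message), referencedIssues(message));
```

A check like this could run in a commit hook, so that a message without an issue reference is caught before it ever reaches a PR.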

Because the tool only collects PRs on GitHub that carry one of the specified labels, and uses the commit message as the changelog entry, it is a good idea to reference the issue in the commit message so that the generated changelog links to the corresponding issue:

Uses github PR/Issue names categorized by labels with configurable headings.

For example:

git commit -m "feat: changelog, Close #1"

Then open a PR and attach the feat label. After it is merged, pull locally and try lerna-changelog:

## Unreleased (2018-01-13)

#### New Feature
* [#2](https://github.com/ayqy/hoho-lerna/pull/2) feat: changelog, Closes [#1](https://github.com/ayqy/hoho-lerna/issues/1). ([ @ayqy](https://github.com/ayqy))

#### Committers: 1
-  黯羽轻扬 ([ayqy](https://github.com/ayqy))

The result looks quite nice: https://github.com/ayqy/hoho-lerna/releases/tag/v1.1.0

P.S. The changelog temp files generated locally should be ignored in .gitignore. Run lerna-changelog locally only when publishing a new version, and paste the generated changelog into the release notes. That the tool does not publish release notes automatically may be due to an API limitation or to caution; after all, release notes matter.

Also, generating the changelog this way really depends on development constraints (the PR label convention and the convention of using commit messages as changelog entries) and has little to do with lerna. As long as a monorepo keeps its issues and PRs in one place, the same approach can be used to fetch issue and PR information and compile a changelog.

This effectively spreads the large, one-off job of compiling the changelog across day-to-day development: every change must go through a PR and must have an issue on record, which is still a hassle if you are not used to it (there are calls to allow a label in the commit message without going through a PR, which will probably be supported in the future).
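The grouping step behind that output can be sketched in a few lines. This is a simplified illustration, not lerna-changelog's code: the PR objects (including PR #3, which exists only to show an unlabeled PR being skipped), the function renderChangelog, and the trimmed-down line format are all assumptions.

```javascript
// Same shape as the "labels" block in lerna.json above.
const labels = {
  enhancement: ':rocket: Enhancement',
  bug: ':bug: Bug Fix',
  doc: 'Refine Doc',
  feat: 'New Feature',
};

// Hypothetical merged PRs, shaped roughly like data fetched from GitHub.
const pullRequests = [
  { number: 2, title: 'feat: changelog, Closes #1', labels: ['feat'], author: 'ayqy' },
  { number: 3, title: 'chore: tidy scripts', labels: [], author: 'ayqy' },
];

// Group PRs under the heading configured for their label; unlabeled PRs are skipped.
function renderChangelog(pullRequests, labels, heading) {
  const lines = [`## ${heading}`];
  for (const [label, title] of Object.entries(labels)) {
    const matching = pullRequests.filter((pr) => pr.labels.includes(label));
    if (matching.length === 0) continue;
    lines.push('', `#### ${title}`);
    for (const pr of matching) {
      lines.push(`* #${pr.number} ${pr.title} (@${pr.author})`);
    }
  }
  return lines.join('\n');
}

console.log(renderChangelog(pullRequests, labels, 'Unreleased'));
```

The unlabeled PR never appears in the result, which is why attaching a predefined label to every PR is the one hard requirement of this workflow.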

4. Applicable Scenarios

Which scenarios suit a monorepo (managed with lerna)?

Projects that are not overly large: if the combined source code would reach 100 GB, think twice
Multi-module or plugin-based projects: officially maintained plugins fit very well as packages

In addition, you need:

Infrastructure
Team trust

Infrastructure means build tooling powerful enough to meet the build needs of every module (for pure front-end projects, the build pressure is not heavy).

A monorepo allows and even encourages changing other people's code. On the one hand this requires a continuous integration setup (such as React with CircleCI) to confirm the impact of each change. On the other hand it requires trust between teams; otherwise one team's changes will often affect another team, whose members then have to roll those changes back, which hurts efficiency instead of improving it.

P.S. Lerna has been around for a long time (it is about as old as Babel), and many projects use it.

Reference Materials

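Lerna: the official documentation, very concise
monorepo 新浪潮 | introduce lerna (The New Wave of Monorepos): a decent hello-world walkthrough by a more senior developer
REPO 风格之争：MONO VS MULTI (The Repo Style Debate: Mono vs Multi)
Mono Repository Tool Comparison: a comparison of monorepo tools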
New wave modularity with Lerna, monorepos, and npm organizations