I. What?
WebAssembly or wasm is a new portable, size- and load-time-efficient format suitable for compilation to the web.
That is, a portable, compact, fast-loading binary format that other languages can be compiled to for the Web.
Its main goal is to support high-performance applications on the Web. It is designed, however, not to depend on Web-specific features or to provide functionality aimed only at the Web, so it can be used in other environments as well.
Simply put, it defines a compilation target that can run at near-native speed in any environment that supports the format. In effect it lets you extend the platform with native-style modules: in performance-critical scenarios, implement the hot path in a more suitable language (such as C++), compile it ahead of time to WebAssembly, and get performance comparable to native code.
Its design goals fall into two groups:
- Fast, safe, and portable semantics
  - Fast: executes with near-native code performance, taking advantage of capabilities common to all modern hardware
  - Safe: code is validated and executes in a memory-safe, sandboxed environment, preventing data corruption or security breaches
  - Well-defined: fully and precisely defines valid programs and their behavior, in a way that is easy to reason about both informally and formally
  - Hardware-independent: can be compiled on all modern architectures, on desktop or mobile devices and embedded systems alike
  - Language-independent: does not privilege any particular language, programming model, or object model
  - Platform-independent: can be embedded in browsers, run as a stand-alone VM, or be integrated into other environments
  - Open: programs can interoperate with their environment in a simple and universal manner
- Efficient, portable representation
  - Compact: a binary format that is smaller than typical text or native code formats, so it is fast to transmit
  - Modular: programs can be split into smaller parts that can be transmitted, cached, and consumed separately
  - Efficient: can be decoded, validated, and compiled in a fast single pass, equally well with just-in-time (JIT) or ahead-of-time (AOT) compilation
  - Streamable: allows decoding, validation, and compilation to begin as early as possible, before all data has arrived
  - Parallelizable: allows decoding, validation, and compilation to be split into many independent parallel tasks
  - Portable: makes no architectural assumptions that are not broadly supported across modern hardware
Its standardization is being driven jointly by the mainstream browsers (Chrome, Edge, Firefox, and WebKit):
WebAssembly is currently being designed as an open standard by a W3C Community Group that includes representatives from all major browsers.
P.S. The effort is led by the browser vendors (all four standing together is something to look forward to), and establishing an open standard that is not limited to the Web is almost a side effect. The motivation is to push JS runtime performance further. After V8 introduced JIT compilation, further gains became hard to come by because of limits inherent in the JS language (it is interpreted and weakly typed, for example). Meanwhile Web capabilities keep growing and client-side JS keeps getting heavier, so the demand for better execution performance remains. WebAssembly is the answer that attacks the problem from the bottom up.
II. wasm and wast
WebAssembly defines a binary format, and that format is wasm. For example:
0061 736d 0100 0000 0187 8080 8000 0160
027f 7f01 7f03 8280 8080 0001 0004 8480
8080 0001 7000 0005 8380 8080 0001 0001
0681 8080 8000 0007 9080 8080 0002 066d
656d 6f72 7902 0003 6763 6400 000a ab80
8080 0001 a580 8080 0001 017f 0240 2000
450d 0003 4020 0120 0022 026f 2100 2002
2101 2000 0d00 0b20 020f 0b20 010b
The C code that this hex dump was compiled from is:
// Euclidean algorithm for the greatest common divisor
int gcd(int m, int n) {
  if (m == 0) return n;
  return gcd(n % m, m);
}
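As a quick sanity check on the dump above: every wasm binary starts with a fixed 8-byte header, the magic bytes `\0asm` followed by a little-endian 32-bit version number (1 here). The sketch below decodes only that header from the first line of the dump. `hexToBytes` and `readWasmHeader` are illustrative helper names, not part of any API.

```javascript
// Turn a whitespace-separated hex dump into a byte array.
function hexToBytes(hex) {
  const pairs = hex.match(/[0-9a-fA-F]{2}/g) || [];
  return new Uint8Array(pairs.map(pair => parseInt(pair, 16)));
}

// Read the 8-byte wasm header: 4 magic bytes + 4-byte little-endian version.
function readWasmHeader(bytes) {
  const magic = String.fromCharCode(...bytes.subarray(0, 4));
  const version = new DataView(bytes.buffer, bytes.byteOffset, 8).getUint32(4, true);
  return { magic, version };
}

// First line of the hex dump above.
const firstLine = hexToBytes('0061 736d 0100 0000 0187 8080 8000 0160');
const header = readWasmHeader(firstLine);
console.log(JSON.stringify(header.magic), header.version);
```

Everything after the header is a sequence of sections (types, functions, exports, code, and so on), each introduced by a one-byte id and a length.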
A wasm binary is essentially unreadable to humans. To alleviate this, a more readable text format called wast is defined:
(module
  (table 0 anyfunc)
  (memory $0 1)
  (export "memory" (memory $0))
  (export "gcd" (func $gcd))
  (func $gcd (; 0 ;) (param $0 i32) (param $1 i32) (result i32)
    (local $2 i32)
    (block $label$0
      (br_if $label$0
        (i32.eqz
          (get_local $0)
        )
      )
      (loop $label$1
        (set_local $0
          (i32.rem_s
            (get_local $1)
            (tee_local $2
              (get_local $0)
            )
          )
        )
        (set_local $1
          (get_local $2)
        )
        (br_if $label$1
          (get_local $0)
        )
      )
      (return
        (get_local $2)
      )
    )
    (get_local $1)
  )
)
The parentheses give it a Lisp flavor, but at least it is readable. For example:
;; Two things are exported, named `memory` and `gcd`
(export "memory" (memory $0))
(export "gcd" (func $gcd))
;; Function signature: takes two i32 parameters and returns an i32
(func $gcd (; 0 ;) (param $0 i32) (param $1 i32) (result i32)
;; Function body... not deciphered here
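For readers who do want to follow the body, here is a rough hand transliteration of the wast for `gcd` into JavaScript. It is only a reading aid: it assumes `$0`, `$1`, and `$2` map to `m`, `n`, and `t`, and the `| 0` coercions stand in for i32 arithmetic. Note that the compiler has turned the recursive C function into a loop.

```javascript
// Hand-written reading of the wast body; not generated by any tool.
function gcdFromWast(m, n) {
  m |= 0; n |= 0;            // (param $0 i32) (param $1 i32)
  let t = 0;                 // (local $2 i32)
  if (m !== 0) {             // block $label$0, left early when (i32.eqz $0)
    do {                     // loop $label$1
      t = m;                 //   tee_local $2 (get_local $0)
      m = (n % t) | 0;       //   set_local $0 (i32.rem_s $1 $2)
      n = t;                 //   set_local $1 (get_local $2)
    } while (m !== 0);       //   br_if $label$1 (get_local $0)
    return t;                // return (get_local $2)
  }
  return n;                  // fall-through result: (get_local $1)
}

console.log(gcdFromWast(328, 648));
```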
P.S. wast and wasm can be converted into each other; see WABT: The WebAssembly Binary Toolkit for details.
In addition, the browser's Sources panel shows yet another textual form of the instructions:
func (param i32 i32) (result i32)
  (local i32)
  block
    get_local 0
    i32.eqz
    br_if 0
    loop
      get_local 1
      get_local 0
      tee_local 2
      i32.rem_s
      set_local 0
      get_local 2
      set_local 1
      get_local 0
      br_if 0
    end
    get_local 2
    return
  end
  get_local 1
end
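It looks very similar to wast. I don't know whether it has its own name or also counts as wast. It appears to be the flat (unfolded) instruction form of the same text format, as opposed to the nested S-expression form above. The browser generates it from the wasm binary.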
III. Trial Environment
Requirements:
- A C/C++ compilation environment: Emscripten
- A browser that supports WebAssembly (the latest Chrome supports it by default)
Online Environment
There is a zero-install way to try things out: WebAssembly Explorer.
Click COMPILE and then DOWNLOAD to get the wasm file; simple and practical.
Note that the default is a C++ environment. To use C, select C99 or C89 on the left; otherwise function names get mangled. For example, the wast produced under C++11:
(module
  (table 0 anyfunc)
  (memory $0 1)
  (export "memory" (memory $0))
  (export "_Z3gcdii" (func $_Z3gcdii))
  (func $_Z3gcdii (; 0 ;) (param $0 i32) (param $1 i32) (result i32)
    (local $2 i32)
    (block $label$0
      (br_if $label$0
        (i32.eqz
          (get_local $0)
        )
      )
      (loop $label$1
        (set_local $0
          (i32.rem_s
            (get_local $1)
            (tee_local $2
              (get_local $0)
            )
          )
        )
        (set_local $1
          (get_local $2)
        )
        (br_if $label$1
          (get_local $0)
        )
      )
      (return
        (get_local $2)
      )
    )
    (get_local $1)
  )
)
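The function name is mangled to _Z3gcdii. This is C++ name mangling rather than anything namespace-related: the compiler encodes the name (3gcd, a length-prefixed identifier) and the parameter types (ii, two ints) into the symbol so that overloads stay distinct. Declaring the function extern "C" would avoid it, but compiling as plain C is simpler.
P.S. Besides C/C++, other languages can also target WebAssembly, for example Rust.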
Local Environment
- Download the platform SDK
- Follow the installation steps
If all goes well, installation is now complete and you can try emcc -v:
INFO:root:(Emscripten: Running sanity checks)
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 1.37.22
clang version 4.0.0 (emscripten 1.37.22 : 1.37.22)
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: D:\emsdk-portable-64bit\clang\e1.37.22_64bit
INFO:root:(Emscripten: Running sanity checks)
On Windows you may hit a missing DLL error (MSVCP140.dll). Manually installing the required Visual C++ runtime resolves it; see MSVCP140.dll not found · Issue #5605 · kripken/emscripten for details.
Then try compiling (save the earlier C code as gcd.c; the command below expects it under ./c/):
emcc ./c/gcd.c -Os -s WASM=1 -s SIDE_MODULE=1 -s BINARYEN_ASYNC_COMPILATION=0 -o ./output/gcd.wasm
P.S. See the Emscripten Tutorial for more usage.
The resulting gcd.wasm looks like this:
0061 736d 0100 0000 000c 0664 796c 696e
6b80 80c0 0200 010a 0260 027f 7f01 7f60
0000 0241 0403 656e 760a 6d65 6d6f 7279
4261 7365 037f 0003 656e 7606 6d65 6d6f
7279 0200 8002 0365 6e76 0574 6162 6c65
0170 0000 0365 6e76 0974 6162 6c65 4261
7365 037f 0003 0403 0001 0106 0b02 7f01
4100 0b7f 0141 000b 072b 0312 5f5f 706f
7374 5f69 6e73 7461 6e74 6961 7465 0002
0b72 756e 506f 7374 5365 7473 0001 045f
6763 6400 0009 0100 0a40 0327 0101 7f20
0004 4003 4020 0120 006f 2202 0440 2000
2101 2002 2100 0c01 0b0b 0520 0121 000b
2000 0b03 0001 0b12 0023 0024 0223 0241
8080 c002 6a24 0310 010b
Note that exported function names get an underscore (_) prefix by default; in this example the exported name is _gcd. See Interacting with code for details:
The keys passed into mergeInto generate functions that are prefixed by _. In other words my_func: function() {}, becomes function _my_func() {}, as all C methods in emscripten have a _ prefix. Keys starting with $ have the $ stripped and no underscore added.
So remember to add the underscore when calling the module's interface from JS (I don't know whether there is a config option to remove it).
IV. Trial
WebAssembly.compile(new Uint8Array(`
0061 736d 0100 0000 0187 8080 8000 0160
027f 7f01 7f03 8280 8080 0001 0004 8480
8080 0001 7000 0005 8380 8080 0001 0001
0681 8080 8000 0007 9080 8080 0002 066d
656d 6f72 7902 0003 6763 6400 000a ab80
8080 0001 a580 8080 0001 017f 0240 2000
450d 0003 4020 0120 0022 026f 2100 2002
2101 2000 0d00 0b20 020f 0b20 010b
`.match(/\S{2}/g).map(s => parseInt(s, 16))
)).then(module => {
  const instance = new WebAssembly.Instance(module);
  console.log(instance.exports);
  const { gcd } = instance.exports;
  console.log('gcd(328, 648)', gcd(328, 648));
});
The hex string comes from the online trial and is identical to the first wasm example. Paste the snippet into Chrome's Console and run it. On an ordinary page you will most likely get an error:
VM40:1 Uncaught (in promise) CompileError: WasmCompile: Wasm code generation disallowed in this context
This is caused by a CSP (Content Security Policy) restriction in the current context. It is easy to work around: open an incognito window (Ctrl/Cmd + Shift + N) and run it there.
You will then get the output:
{memory: Memory, gcd: ƒ}
gcd(328, 648) 8
The first line is what our WebAssembly module exports: a memory object and the gcd function. The second line is the result of calling the high-performance module to compute the greatest common divisor.
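The export list can also be inspected without instantiating the module: `WebAssembly.Module.exports()` returns name/kind descriptors. The sketch below uses the synchronous `WebAssembly.Module` and `WebAssembly.Instance` constructors, a simplification that is reasonable only for a tiny module like this one. `hexToBytes` is an illustrative helper name.

```javascript
// Turn a whitespace-separated hex dump into a byte array.
function hexToBytes(hex) {
  return new Uint8Array(hex.match(/\S{2}/g).map(s => parseInt(s, 16)));
}

// Same bytes as the gcd example above.
const wasmBytes = hexToBytes(`
0061 736d 0100 0000 0187 8080 8000 0160
027f 7f01 7f03 8280 8080 0001 0004 8480
8080 0001 7000 0005 8380 8080 0001 0001
0681 8080 8000 0007 9080 8080 0002 066d
656d 6f72 7902 0003 6763 6400 000a ab80
8080 0001 a580 8080 0001 017f 0240 2000
450d 0003 4020 0120 0022 026f 2100 2002
2101 2000 0d00 0b20 020f 0b20 010b
`);

// Compile synchronously and list exports as "name:kind".
const wasmModule = new WebAssembly.Module(wasmBytes);
const exportList = WebAssembly.Module.exports(wasmModule)
  .map(entry => `${entry.name}:${entry.kind}`);
console.log(exportList);

// Instantiate (this module needs no imports) and call the export.
const { gcd } = new WebAssembly.Instance(wasmModule).exports;
console.log('gcd(328, 648)', gcd(328, 648));
```

For anything larger than a toy module, the promise-based `WebAssembly.compile` or `WebAssembly.instantiate` is the better choice, since compilation then does not block the main thread.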
For WebAssembly.compile and related APIs, see:
- JavaScript API - WebAssembly: the specification
- WebAssembly - JavaScript | MDN: includes examples
In addition, the locally compiled version requires an env imports object (and its function names carry the underscore prefix):
WebAssembly.compile(new Uint8Array(`
0061 736d 0100 0000 000c 0664 796c 696e
6b80 80c0 0200 010a 0260 027f 7f01 7f60
0000 0241 0403 656e 760a 6d65 6d6f 7279
4261 7365 037f 0003 656e 7606 6d65 6d6f
7279 0200 8002 0365 6e76 0574 6162 6c65
0170 0000 0365 6e76 0974 6162 6c65 4261
7365 037f 0003 0403 0001 0106 0b02 7f01
4100 0b7f 0141 000b 072b 0312 5f5f 706f
7374 5f69 6e73 7461 6e74 6961 7465 0002
0b72 756e 506f 7374 5365 7473 0001 045f
6763 6400 0009 0100 0a40 0327 0101 7f20
0004 4003 4020 0120 006f 2202 0440 2000
2101 2002 2100 0c01 0b0b 0520 0121 000b
2000 0b03 0001 0b12 0023 0024 0223 0241
8080 c002 6a24 0310 010b
`.match(/\S{2}/g).map(s => parseInt(s, 16))
)).then(module => {
  let imports = {
    env: {
      memoryBase: 0,
      memory: new WebAssembly.Memory({ initial: 256 }),
      tableBase: 0,
      table: new WebAssembly.Table({ initial: 0, element: 'anyfunc' })
    }
  };
  const instance = new WebAssembly.Instance(module, imports);
  console.log(instance.exports);
  // Note the underscore prefix
  const { _gcd } = instance.exports;
  console.log('gcd(328, 648)', _gcd(328, 648));
});
This gives similar output:
{__post_instantiate: ƒ, runPostSets: ƒ, _gcd: ƒ}
gcd(328, 648) 8
Emscripten presumably adds a few incidental helpers by default; functionally the module is equivalent to our simpler version.
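The shape of the `env` object does not have to be guessed: `WebAssembly.Module.imports()` lists what a module expects. A sketch follows, again using the synchronous constructors for brevity and an illustrative `hexToBytes` helper. The import values mirror the ones used above.

```javascript
// Turn a whitespace-separated hex dump into a byte array.
function hexToBytes(hex) {
  return new Uint8Array(hex.match(/\S{2}/g).map(s => parseInt(s, 16)));
}

// Same bytes as the locally compiled gcd.wasm above.
const wasmBytes = hexToBytes(`
0061 736d 0100 0000 000c 0664 796c 696e
6b80 80c0 0200 010a 0260 027f 7f01 7f60
0000 0241 0403 656e 760a 6d65 6d6f 7279
4261 7365 037f 0003 656e 7606 6d65 6d6f
7279 0200 8002 0365 6e76 0574 6162 6c65
0170 0000 0365 6e76 0974 6162 6c65 4261
7365 037f 0003 0403 0001 0106 0b02 7f01
4100 0b7f 0141 000b 072b 0312 5f5f 706f
7374 5f69 6e73 7461 6e74 6961 7465 0002
0b72 756e 506f 7374 5365 7473 0001 045f
6763 6400 0009 0100 0a40 0327 0101 7f20
0004 4003 4020 0120 006f 2202 0440 2000
2101 2002 2100 0c01 0b0b 0520 0121 000b
2000 0b03 0001 0b12 0023 0024 0223 0241
8080 c002 6a24 0310 010b
`);

// List required imports as "module.name:kind".
const wasmModule = new WebAssembly.Module(wasmBytes);
const neededImports = WebAssembly.Module.imports(wasmModule)
  .map(entry => `${entry.module}.${entry.name}:${entry.kind}`);
console.log(neededImports);

// Supply one value per listed import, then call the prefixed export.
const env = {
  memoryBase: 0,
  memory: new WebAssembly.Memory({ initial: 256 }),
  tableBase: 0,
  table: new WebAssembly.Table({ initial: 0, element: 'anyfunc' })
};
const localGcd = new WebAssembly.Instance(wasmModule, { env }).exports._gcd;
console.log('gcd(328, 648)', localGcd(328, 648));
```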
V. Pros and Cons and Application Scenarios
Advantages
- Very small code size
  Around 300 KB of (compressed) JavaScript logic, rewritten in WebAssembly, comes to only about 90 KB.
  However, using WebAssembly requires bringing in a 50-100 KB JavaScript library as infrastructure.
- Slightly better security
  Although the WebAssembly text instructions corresponding to the source are still fully exposed, the cost of reverse engineering is higher.
- Better performance
  In theory WebAssembly runs at close to native speed because it skips the interpretation phase, and its smaller files also give it an edge in transfer.
  Of course, this presumes a large code base with extreme performance requirements. In benchmarks and other scenarios where the same code runs repeatedly, JIT is not much slower than AOT.
Disadvantages
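Capabilities are currently limited:
- Only a few basic numeric types are supported: the value types are i32, i64, f32, and f64 (8- and 16-bit integers exist only as memory load/store widths, not as value types)
- No direct access to the DOM or other Web APIs
- No control over GC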
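Because only numbers cross the JS/wasm boundary, richer data such as arrays is usually exchanged through linear memory: JavaScript writes into a typed-array view over a `WebAssembly.Memory` and hands the module a byte offset and a length. Below is a minimal sketch of the JavaScript side only. `writeInt32Array`, the placement at offset 0, and the absence of a real allocator are all simplifying assumptions.

```javascript
// One page of linear memory (a page is 64 KiB).
const memory = new WebAssembly.Memory({ initial: 1 });

// Copy numbers into linear memory at byteOffset and return what a wasm
// function would need to locate them: a pointer (byte offset) and a length.
function writeInt32Array(mem, byteOffset, values) {
  const view = new Int32Array(mem.buffer, byteOffset, values.length);
  view.set(values);
  return { ptr: byteOffset, length: values.length };
}

const ref = writeInt32Array(memory, 0, [328, 648, 8]);

// Reading back through a fresh view shows the data lives in the shared buffer.
const readBack = Array.from(new Int32Array(memory.buffer, ref.ptr, ref.length));
console.log(ref, readBack);
```

A wasm function taking `(ptr, length)` as two i32 parameters could then read the same bytes from its side of the memory.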
Application Scenarios
WebAssembly defines a standard executable binary format for browsers, so that more developers can take part through a unified compilation mechanism and jointly build a thriving Web ecosystem. The vision is appealing, but it faces some practical problems.
First, WebAssembly's original intent is "supporting high-performance applications in the Web environment", that is, breaking through performance bottlenecks. Likely application scenarios are therefore:
- Video decoding
- Image processing
- 3D/WebVR/AR visualization
- Rendering engines
- Physics engines
- Compression/encryption algorithms
- ...and other computation-heavy scenarios
Of course, browsers may build in support for some of these in the future, without the need for "extension plugins" or similar. But the real significance of WebAssembly is that it lets you extend the platform yourself with high-performance "native" modules. Waiting for browsers to ship a feature, and then for compatibility to become acceptable, can take a long time. With this capability there is no need to wait for the mainstream browsers to support a given native feature: you can build it yourself, with no compatibility differences. Conversely, a set of popular community modules may emerge and gradually be absorbed into native browser support, so that the ecosystem feeds back into the Web platform.
References
- WebAssembly Practice: How to Write Code: a very good introductory guide
- How to Comment on Browser's Latest WebAssembly Bytecode Technology?
- WebAssembly: Silver Bullet for Solving JavaScript Chronic Diseases?
- wasm-arrays: a WebAssembly array wrapper library