At ZuriHac this year, my goal was to use GHC’s relatively new WebAssembly (wasm) backend to do something cool. I accomplished this goal, and learned a ton about wasm and how GHC’s wasm backend works along the way. In this post, I’ll document everything I learned, from the basics of wasm to some nitty-gritty details of how GHC targets wasm from Haskell.
To demonstrate what we cover, we’ll be building this interactive example webpage! Thrilling!
In a future post, I’ll document the actual project I worked on while learning all of this and include a demo of the finished product, so stay tuned!
Installing a GHC wasm cross compiler
Currently, the best resources for installing a GHC wasm cross compiler are the GHC User’s Guide and the ghc-wasm-meta repository on the GHC GitLab. If you follow the instructions in those resources, you should be able to build/install a wasm cross compiler with minimal pain, even outside of Nix. If you’re on Linux, you might even be able to install a cross compiler via GHCup. Otherwise, you’ll need to build a cross compiler GHC from source using the directions in ghc-wasm-meta. This is what I did, since I was on macOS, and it went pretty smoothly.
If you have installed the compiler, and it exists on your PATH, you should be
able to run a --version command:
$ wasm32-wasi-ghc --version
The Glorious Glasgow Haskell Compilation System, version 9.11.20240607
Hello, wasm!
We can use the wasm32-wasi-ghc cross compiler just like we would a
non-cross-compiling build of ghc. For example, with the following code in a
Main.hs file:
module Main where
main :: IO ()
main = putStrLn "Wasm? I hardly know 'em!"
We can simply run:
$ wasm32-wasi-ghc Main.hs
This results in the usual Main.o[1] and Main.hi outputs,
along with an executable program in a Main.wasm file. At this point, we can
execute the Main.wasm program in a shell using the
Wasmtime runtime:
$ wasmtime Main.wasm
Wasm? I hardly know 'em!
Brief overview of wasm modules
A compiled .wasm file contains a single wasm module. Wasm modules consist of
(among other things) a set of functions, imports, and exports. Let’s
explore this a little by looking at an example wasm module in the standard text
format.
Files containing text-formatted wasm typically have a .wat extension. Here’s
our example:
(module
;; Imports
(import "console" "log" (func $log (param i32 i32)))
(import "js" "mem" (memory 1)) ;; 1 page = 64 KiB
;; Data (automatically written to the imported memory at offset 0)
(data (i32.const 0) "Wasm? I hardly know 'em!")
;; Function
(func $logMsg
i32.const 0 ;; offset
i32.const 24 ;; length
call $log
)
;; Exports
(export "logMsg" (func $logMsg))
)
Lines preceded by ;; are comments. The order of these items is not important.
As you can see, the text format syntax uses
s-expressions. The root node is
the module keyword.
Imports
An import node specifies a two-level name space and the item being imported.
For example, the first import above must be provided in the console.log name
space and it must be a function which accepts two i32 parameters. This means,
for example, that to instantiate (i.e. compile and run) this module from
JavaScript we will need to provide an object that looks like:
{
console:
{
log: (i1, i2) => ...
}
}
Where i1 and i2 are the i32 parameters to $log.
Our next import is a memory 1 import and it must be provided in the js.mem
name space at instantiation. The 1 means that the memory must be at least one
page, which wasm currently defines to be 64 KiB. From JavaScript, this memory is
essentially a buffer of bytes that can be viewed as a Uint8Array.
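To make the two-level name space concrete, here is a minimal sketch that inspects a module's declared imports from JavaScript. It assumes a hand-assembled binary containing only the two imports from the example (no data, functions, or exports), and the helper name str is made up for this illustration. It uses the synchronous WebAssembly.Module and WebAssembly.Instance constructors so it can run as a plain Node.js script.

```javascript
// Encode a name as a length byte followed by its UTF-8 bytes (hypothetical helper).
const str = (s) => {
  const b = [...new TextEncoder().encode(s)];
  return [b.length, ...b];
};

// Contents of the import section:
//   (import "console" "log" (func (param i32 i32)))
//   (import "js" "mem" (memory 1))
const importSection = [
  0x02,                                          // two imports
  ...str("console"), ...str("log"), 0x00, 0x00,  // function import, type index 0
  ...str("js"), ...str("mem"), 0x02, 0x00, 0x01, // memory import, min 1 page, no max
];

const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x06, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x00, // type 0: (i32, i32) -> ()
  0x02, importSection.length, ...importSection,   // import section
]);

const wasmModule = new WebAssembly.Module(bytes);

// Each entry names both levels of the name space and the kind of item expected.
const imports = WebAssembly.Module.imports(wasmModule);
console.log(imports);

// Instantiating without the imports fails...
let failure = null;
try {
  new WebAssembly.Instance(wasmModule, {});
} catch (e) {
  failure = e;
}

// ...while an import object with the matching shape is accepted.
const instance = new WebAssembly.Instance(wasmModule, {
  console: { log: (i1, i2) => {} },
  js: { mem: new WebAssembly.Memory({ initial: 1 }) },
});
```

The nesting of the import object mirrors the two strings in each import declaration: the first is the outer key, the second is the inner key.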
Data
The data node declares static data to be included in the module, very much
like the .data section of x86 assembly. The data declaration above will
cause the string Wasm? I hardly know 'em! to be written to our imported memory
at offset 0 at instantiation time.
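As a rough illustration of what that means on the JavaScript side, the sketch below creates a one-page memory and copies the string into it by hand at offset 0. This only imitates the effect of the data segment (a real instantiation does the copy for us), and the variable names are chosen just for this example.

```javascript
// One page of wasm memory, matching the `(memory 1)` import.
const memory = new WebAssembly.Memory({ initial: 1 });

// Stand-in for the data segment: copy the string's bytes to offset 0.
const message = "Wasm? I hardly know 'em!";
const encoded = new TextEncoder().encode(message);
new Uint8Array(memory.buffer).set(encoded, 0);

// Read the bytes back by offset and length, as the $log import will have to.
const view = new Uint8Array(memory.buffer, 0, encoded.length);
const decoded = new TextDecoder("utf-8").decode(view);

console.log(memory.buffer.byteLength); // 65536 bytes in one page
console.log(encoded.length);           // 24, the length pushed in $logMsg
console.log(decoded);
```

The length of 24 is why the function in the example pushes i32.const 24 before calling $log.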
Functions
The func node in our example defines a function with a symbolic identifier of
$logMsg. The function takes no parameters, and simply pushes the offset and
length of the string we’ve written to memory onto the stack and then calls the
imported $log function.
We declare an export of the function just beneath its definition. The export
states that the module exports a symbol called logMsg corresponding to the
$logMsg function. This means once we instantiate this module from JavaScript,
we’ll have a running instance of the module whose exports will contain a
runnable function logMsg that actually dispatches the wasm function.
Converting the wasm text format to bytecode
With our example above in a file named Hello.wat, we can convert it to a
Hello.wasm file containing wasm bytecode using the wat2wasm tool included in
the WebAssembly Binary Toolkit:
$ wat2wasm Hello.wat
Interacting with wasm from JavaScript
Previously, we ran a compiled .wasm program using the wasmtime runtime. Now,
we want to run a compiled .wasm file from JavaScript, which must run inside a
JavaScript runtime like Node.js or a browser. Most
JavaScript runtimes support working with wasm via the global
WebAssembly
object.
Instantiating a wasm module in Node.js
The WebAssembly JavaScript API provides several methods for instantiating a
wasm module. They all accept the wasm bytecode and the necessary module imports,
and give back a running instance of the module through which we can access the
exports.
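Before turning to Hello.wasm, here is a small self-contained sketch of that bytecode-in, exports-out flow. It assumes a hand-assembled module that exports a single add function and needs no imports, and it uses the synchronous WebAssembly.Module and WebAssembly.Instance constructors rather than the promise-based functions so that it runs as a plain script.

```javascript
// Hand-assembled binary, equivalent to the text-format module:
//   (module
//     (func (export "add") (param i32 i32) (result i32)
//       local.get 0
//       local.get 1
//       i32.add))
const addBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type 0: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Compile the bytecode, then instantiate it with an (empty) import object.
const addModule = new WebAssembly.Module(addBytes);
const addInstance = new WebAssembly.Instance(addModule, {});

// The export is an ordinary callable JavaScript function.
const sum = addInstance.exports.add(2, 3);
console.log(sum); // 5
```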
Let’s instantiate our compiled Hello.wasm module from the previous example in
Node.js. We’ll do this using the
WebAssembly.instantiate()
function, which takes an
ArrayBuffer
holding the raw bytecode, and the module’s import object. We’ll read the
Hello.wasm file using the
fs.readFileSync()
function, which returns a Node.js Buffer (a Uint8Array subclass that WebAssembly.instantiate() also accepts). So, to start, our JavaScript module (call it Hello.mjs) looks like
this:
import fs from "node:fs";
const wasm = fs.readFileSync('./Hello.wasm');
const { instance } = await WebAssembly.instantiate(
wasm,
{} // empty imports
);
instance.exports.logMsg();
Let’s try running this. We expect it to fail since we haven’t provided any of the module’s declared imports:
$ node Hello.mjs
...
[TypeError: WebAssembly.instantiate(): Import #0 module="console" error: module is not an object or function]
...
We need to provide the import that the module expects in the console.log
namespace. We’ll fill it with a simple lambda function that just returns
immediately for now.
We also need to provide the js.mem memory import declared by the module. To do
this, we’ll need to create a
WebAssembly.Memory
object. The constructor of a WebAssembly.Memory expects a memoryDescriptor
object, which specifies at least the initial size of the memory in pages (recall
that 1 page is 64 KiB). Our JavaScript module now looks like:
import fs from "node:fs";
const memory = new WebAssembly.Memory({ initial: 1 });
const wasm = fs.readFileSync('./Hello.wasm');
const { instance } = await WebAssembly.instantiate(
wasm,
{
console: { log: (i1, i2) => { return; } },
js: { mem: memory }
}
);
instance.exports.logMsg();
If we run this module in node, we don’t get any errors! Let’s finally hook up
the wires so that it actually prints the string that the wasm instance writes to
memory. To do this, we need to make the function that we provide the instance as
its console.log import read the given number of bytes at the given offset in
the memory and print them to the console.
import fs from "node:fs";
const memory = new WebAssembly.Memory({ initial: 1 });
function logString(offset, length) {
const bytes = new Uint8Array(memory.buffer, offset, length);
const string = new TextDecoder("utf8").decode(bytes);
console.log(string);
}
const wasm = fs.readFileSync('./Hello.wasm');
const { instance } = await WebAssembly.instantiate(
wasm,
{
console: { log: logString },
js: { mem: memory }
}
);
instance.exports.logMsg();
Running this in node, we see the string from memory written to the console:
$ node Hello.mjs
Wasm? I hardly know 'em!
Instantiating a wasm module in the browser
We only need to modify the JavaScript a tiny bit to instantiate the same wasm
module in the browser. Instead of using WebAssembly.instantiate() on an
ArrayBuffer, we’ll use
WebAssembly.instantiateStreaming()
on a fetch() of
the .wasm file:
const memory = new WebAssembly.Memory({ initial: 1 });
function logString(offset, length) {
const bytes = new Uint8Array(memory.buffer, offset, length);
const string = new TextDecoder("utf8").decode(bytes);
console.log(string);
}
const { instance } = await WebAssembly.instantiateStreaming(
fetch('./Hello.wasm'),
{
console: { log: logString },
js: { mem: memory }
}
);
instance.exports.logMsg();
Then we can include this script in our HTML:
<!doctype html>
<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
<body>
Open the developer console to see the message.
<script type="module" src="Hello.mjs"></script>
</body>
</html>
You can access this page and see the message yourself here.
Running GHC-compiled wasm in the browser
At this point, we know how to generate wasm modules from Haskell using GHC and
how to instantiate wasm modules in the browser. Let’s combine our knowledge and
try to instantiate a GHC-generated wasm module in the browser. Using the same
Haskell source code as previously, we will create another
JavaScript module which attempts to instantiate Main.wasm:
const { instance } = await WebAssembly.instantiateStreaming(fetch("Main.wasm"), {});
Including this script in an HTML document, we’ll see the following error in the console:
Uncaught TypeError: WebAssembly.instantiate(): Import #0 "wasi_snapshot_preview1": module is not an object or function
This means that the wasm module produced by GHC is expecting an import module
named wasi_snapshot_preview1 to be provided.
WASI
WASI stands for WebAssembly System Interface, and it is a standardized set of APIs that enable wasm modules to interact with the host environment, very much like the interfaces that POSIX defines.
This is where the wasi is coming from in the name wasm32-wasi-ghc. GHC
generates wasm modules which expect a WASI API to be provided by the host
runtime. For example, our program expects to print to standard output via the
putStrLn function. GHC therefore produces a program that tries to use the
given WASI API for printing to standard output. This API is currently provided
automatically by the wasmtime runtime, but not by JavaScript runtimes. We need to do
a bit more work to create an appropriate WASI API for the browser and provide it as
an import when instantiating from JavaScript.
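To give a feel for what such an API has to do, here is a toy sketch of a single WASI function, fd_write, which is the call behind writing to standard output. It follows the wasi_snapshot_preview1 calling convention as I understand it (a file descriptor, a pointer to an array of 8-byte pointer/length pairs, the number of pairs, and a pointer where the byte count is written back). The factory name makeFdWrite and the onStdout callback are made up for this example. This is not how any particular library implements it, and a GHC-compiled module expects many more functions than this one, so it would very likely not be enough to instantiate Main.wasm by itself.

```javascript
// Build a toy fd_write that forwards writes on file descriptor 1 (stdout)
// to the given callback. `makeFdWrite` and `onStdout` are hypothetical names.
function makeFdWrite(memory, onStdout) {
  return function fd_write(fd, iovsPtr, iovsLen, nwrittenPtr) {
    const view = new DataView(memory.buffer);
    const decoder = new TextDecoder("utf-8");
    let written = 0;
    let text = "";
    for (let i = 0; i < iovsLen; i++) {
      // Each entry is two little-endian u32s: buffer pointer and buffer length.
      const bufPtr = view.getUint32(iovsPtr + 8 * i, true);
      const bufLen = view.getUint32(iovsPtr + 8 * i + 4, true);
      const bytes = new Uint8Array(memory.buffer, bufPtr, bufLen);
      text += decoder.decode(bytes, { stream: true });
      written += bufLen;
    }
    text += decoder.decode(); // flush any buffered partial character
    if (fd === 1) onStdout(text);
    // Report how many bytes were consumed, then return errno 0 (success).
    view.setUint32(nwrittenPtr, written, true);
    return 0;
  };
}

// Simulate what a wasm program would do: place "hi\n" at offset 100 and a
// single { pointer: 100, length: 3 } entry at offset 0, then write to stdout.
const memory = new WebAssembly.Memory({ initial: 1 });
const chunks = [];
const fd_write = makeFdWrite(memory, (text) => chunks.push(text));

new Uint8Array(memory.buffer).set(new TextEncoder().encode("hi\n"), 100);
const setup = new DataView(memory.buffer);
setup.setUint32(0, 100, true);
setup.setUint32(4, 3, true);

const errno = fd_write(1, 0, 1, 16);
const nwritten = setup.getUint32(16, true);
console.log(errno, nwritten, chunks);
```

A real import object would place functions like this one under the wasi_snapshot_preview1 key, which is exactly the import module the error message above complains about.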
Connecting the wires
Thankfully, there are some libraries that make this pretty easy. I am aware of three options:
None of these seem extremely actively maintained, and one might be better suited than the others for particular workloads. We’ll somewhat arbitrarily use Runno’s WASI runner in the rest of this post.
All WASI browser API libraries follow the same usage pattern. We construct an API specification that determines what should happen when our module attempts to do things like write to standard output or create files. Let’s add the required import to our JavaScript and set up the WASI browser API. For our example program, this only requires specifying what should happen with standard output:
import { WASI } from "https://cdn.jsdelivr.net/npm/@runno/wasi@0.7.0/dist/wasi.js";
const wasi = new WASI({
stdout: (out) => console.log("[wasm stdout]", out)
});
const wasm = await WebAssembly.instantiateStreaming(fetch("./Main.wasm"), wasi.getImportObject());
wasi.start(wasm, {});
The way that this specific WASI API works might feel a little strange. We build
the API (using new WASI()), provide the resulting import object for
instantiation (wasi.getImportObject()), and then give the result of
instantiation back to the API (wasi.start()) so it can do some internal
instance management.
This works! If we include this script in an HTML document, we’ll see our message printed to the console. See it for yourself here.
Accessing the DOM from Haskell (GHC wasm JavaScript FFI)
We’re now successfully running GHC-compiled wasm in the browser. This is absolutely exhilarating (I’m sweating), but we’re greedy and we want to take this even further. Specifically, we want to access the DOM from GHC wasm so that we can drive interesting page logic from Haskell. This has recently been made possible with the new GHC (>=9.10) wasm backend JavaScript FFI.
This works similarly to Haskell’s other FFI capabilities. We can use foreign import javascript to embed a bit of JavaScript code into our program, making it
accessible from Haskell. For example, we can get an HTML element as a value of
type JSVal by its HTML id using something like:
foreign import javascript unsafe "document.getElementById($1)"
  js_document_getElementById :: JSString -> IO JSVal
The difference between unsafe and safe imports here is as follows:
- unsafe: Calls to unsafe imports block the entire runtime waiting for the result, and exceptions during execution of these imports cannot be handled in Haskell. Just like unsafe C imports, the imported JavaScript cannot call back into Haskell.
- safe: The JavaScript code is wrapped in async, thus calling safe imports does not block the GHC runtime. Instead, safe calls immediately return a thunk corresponding to the resulting JavaScript Promise. Evaluating the thunk will block until the Promise resolves.
For more information on this, see the GHC User’s Guide.
We can convert Haskell functions into callable JSVals using foreign import javascript "wrapper":
foreign import javascript "wrapper"
  asEventListener :: (JSVal -> IO ()) -> IO JSVal
Putting it all together
Let’s write a simple web page that allows us to change the opacity of an image
on the page with an HTML <input type="range">
element. Here’s our HTML:
<!doctype html>
<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
<body>
<div>
<input type="range" id="opacity-input" name="opacity-input" min="0" max="1" step="0.01" value="0" />
<label for="opacity-input">Opacity</label>
</div>
<img src="./spj.jpg" id="surprise-image" style="height: 75vh; opacity: 0;" alt="May I interest you in some lambda?">
<script type="module" src="Finale.mjs"></script>
</body>
</html>
In Haskell, our program will get the input element and register another
Haskell function as an event listener:
module Main where
import Control.Concurrent
import GHC.Wasm.Prim
-- | See the explanation in the post
main :: IO ()
main = error "not necessary"
foreign export javascript "setup" setup :: IO ()
-- | Adds an @input@ event listener to the input element which sets the opacity
-- of the image to the input value.
setup :: IO ()
setup = do
-- Get the input element as a JSVal
opacityInput <- js_document_getElementById (toJSString "opacity-input")
-- Create a callable JSVal wrapper from our onOpacityInput function
opacityInputCallback <- asEventListener onOpacityInput
-- Set the callable JSVal as the listener for input events
js_addEventListener opacityInput (toJSString "input") opacityInputCallback
onOpacityInput :: JSVal -> IO ()
onOpacityInput event = do
-- Get the input value
inpOpacity <- js_event_target_value event
-- Get the image element
img <- js_document_getElementById (toJSString "surprise-image")
-- Set the image's opacity
js_setOpacity img inpOpacity
foreign import javascript unsafe "document.getElementById($1)"
js_document_getElementById :: JSString -> IO JSVal
foreign import javascript unsafe "$1.target.value"
js_event_target_value :: JSVal -> IO Double
foreign import javascript unsafe "$1.style.opacity = $2"
js_setOpacity :: JSVal -> Double -> IO ()
foreign import javascript unsafe "$1.addEventListener($2, $3)"
js_addEventListener :: JSVal -> JSString -> JSVal -> IO ()
foreign import javascript "wrapper"
  asEventListener :: (JSVal -> IO ()) -> IO JSVal
The only bits that we haven’t discussed yet are the main = error ... and
foreign export at the top. To understand where those are coming from, we need
to explain just one more wasm quirk.
Command modules vs. reactor modules
The wasm modules that GHC emits by default are called command modules. Command
modules export a symbol named _start which initializes, runs, and then
finalizes the entire program state when executed. After that finalization, the
other exports of the instance are no longer safe to run since the instance state
will have been destructed. Command modules thus expect to be run as traditional
short-lived commands. This makes them a poor fit for a browser
context, where we typically want instances to stay alive the whole time the page
is live in the browser.
Reactor modules were standardized to fill this use case. They export a symbol
named _initialize which only initializes the instance state. After
that, the instance remains alive and any other exports will be safe to access.
For more information on the distinction between command and reactor modules, see
these docs in the WASI GitHub
repository.
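One way to see the distinction from JavaScript is to look at a compiled module's export names. The sketch below is only an illustration: wasiModuleKind and tinyModuleExporting are hypothetical helpers, and the test modules are hand-assembled stubs that export one empty function, not anything GHC produced.

```javascript
// Classify a compiled module by the entry point it exports.
function wasiModuleKind(wasmModule) {
  const names = WebAssembly.Module.exports(wasmModule).map((e) => e.name);
  if (names.includes("_start")) return "command";
  if (names.includes("_initialize")) return "reactor";
  return "unknown";
}

// Hand-assemble a stub module that exports a single empty function under the
// given name (names shorter than 100 bytes only, to keep lengths one byte).
function tinyModuleExporting(name) {
  const nameBytes = [...new TextEncoder().encode(name)];
  const exportBody = [0x01, nameBytes.length, ...nameBytes, 0x00, 0x00];
  return new WebAssembly.Module(new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic, version 1
    0x01, 0x04, 0x01, 0x60, 0x00, 0x00,             // type 0: () -> ()
    0x03, 0x02, 0x01, 0x00,                         // one function of type 0
    0x07, exportBody.length, ...exportBody,         // export it under `name`
    0x0a, 0x04, 0x01, 0x02, 0x00, 0x0b,             // empty function body
  ]));
}

const commandKind = wasiModuleKind(tinyModuleExporting("_start"));
const reactorKind = wasiModuleKind(tinyModuleExporting("_initialize"));
const otherKind = wasiModuleKind(tinyModuleExporting("setup"));
console.log(commandKind, reactorKind, otherKind);
```

The same check could be pointed at a module compiled from a fetched Main.wasm to confirm which kind GHC emitted.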
To make GHC emit a reactor module instead of a command module, we must use the
-optl-mexec-model=reactor linker flag. As a reactor module, our program no
longer has an entrypoint, so we also need to manually export any symbols we plan
to call directly from JavaScript with a -optl-Wl,--export=symbol linker flag.
In our program, the Haskell setup function simply sets up the necessary event
listeners and completes, so we manually export it via the foreign export
and we will include a -optl-Wl,--export=setup flag for compilation.
Lastly, unless we name the module Main, GHC will not generate a linked .wasm
output. Since our module is named Main, it must include a main :: IO ()
function. We could have named our setup function main, but I don’t think the
typical semantics of main fit, so instead we just include a useless main
function to keep GHC happy. What’s worse is that even though GHC forced us to
include this main function, if we compile our module above with:
$ wasm32-wasi-ghc -optl-Wl,--export=setup -optl-mexec-model=reactor Finale.hs
We’ll get an error:
wasm-ld: error: duplicate symbol: main
So we need to exclude Haskell’s main symbol from the output using the
-no-hs-main flag. Our final compilation command is:
$ wasm32-wasi-ghc -no-hs-main -optl-Wl,--export=setup -optl-mexec-model=reactor Finale.hs
Welcome to the bleeding edge!
Accommodating GHC’s wasm JSFFI
Any GHC wasm module that uses the JavaScript FFI requires an extra import. All GHC wasm cross compiler distributions include a script that can automatically generate the import object based on the compiled wasm. Here is how we run that script:
$ $(wasm32-wasi-ghc --print-libdir)/post-link.mjs -i Finale.wasm -o ghc_wasm_jsffi.js
This outputs a JavaScript module to ghc_wasm_jsffi.js which exports an object
that includes all the items that our wasm module (Finale.wasm) is going to
need for its FFI with JavaScript. When we instantiate our module from
JavaScript, we need to provide it as an import. Furthermore, we need to use our
WASI API’s initialize() function instead of start() since we are dealing
with a reactor module now. Here’s the full JavaScript:
import { WASI } from "https://cdn.jsdelivr.net/npm/@runno/wasi@0.7.0/dist/wasi.js";
import ghc_wasm_jsffi from "./ghc_wasm_jsffi.js";
const wasi = new WASI({
stdout: (out) => console.log("[wasm stdout]", out)
});
const jsffiExports = {};
const wasm = await WebAssembly.instantiateStreaming(
fetch('./Finale.wasm'),
Object.assign(
{ ghc_wasm_jsffi: ghc_wasm_jsffi(jsffiExports) },
wasi.getImportObject()
)
);
Object.assign(jsffiExports, wasm.instance.exports);
wasi.initialize(wasm, {
ghc_wasm_jsffi: ghc_wasm_jsffi(jsffiExports)
});
wasi.instance.exports.setup();
We need to provide the extra jsffiExports object to ghc_wasm_jsffi, and then
fill it in with the instance exports after instantiation for the FFI to work.
The WASI API we’re using also mandates that we pass any extra imports to the
initialize() function as well.
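The reason for the empty-object-then-fill dance is a chicken-and-egg problem: the imports must exist before instantiation, but they need to call the instance's exports, which only exist afterwards. The sketch below shows the pattern in isolation with plain JavaScript stand-ins. makeImports, fakeInstance, and the callBack and double functions are all hypothetical and say nothing about what the generated ghc_wasm_jsffi.js actually contains.

```javascript
// Hypothetical stand-in for the generated function: it builds imports that
// close over an exports object which is still empty at this point.
function makeImports(exportsRef) {
  return {
    // An import that calls back into the instance's exports when invoked.
    callBack: (x) => exportsRef.double(x),
  };
}

const lateExports = {};
const imports = makeImports(lateExports);

// Hypothetical stand-in for instantiation: the instance is created with the
// imports, and only afterwards do its exports exist.
const fakeInstance = { exports: { double: (x) => 2 * x } };

// Tie the knot: copy the real exports into the object the imports closed over.
Object.assign(lateExports, fakeInstance.exports);

const result = imports.callBack(21);
console.log(result); // 42
```

Because the imports hold a reference to the same object that is later filled in, nothing needs to be rebuilt once the exports are known.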
Finally we have a working page whose logic is pretty much all coming from Haskell! Try it out here.
Summary
We’ve covered a lot! Here are some of the important points:
- GHC generates command modules which expect the wasi_snapshot_preview1 API by default, and these can be run easily in runtimes like wasmtime.
- To run GHC wasm in the browser, we need (at least) to use a WASI browser API to make the module’s system calls have the intended effect in the browser.
- Any non-trivial GHC wasm modules that we intend to run in the browser should be reactor modules, which we generate by passing the -optl-mexec-model=reactor linker flag to GHC.
- Reactor modules need to have any symbols that we intend to access from JavaScript exported by passing the -optl-Wl,--export=symbol_name linker flag to GHC. We also need to pass -no-hs-main to prevent duplicate symbol errors.
- Using GHC’s wasm JavaScript FFI requires that we pass an additional import to our module at instantiation time. This import object can be generated using the post-link.mjs script available in the directory printed by wasm32-wasi-ghc --print-libdir.
We’ve also briefly covered how to use the wasm JavaScript FFI, but the GHC User’s Guide covers it much more comprehensively.
In a future post, we’ll apply our new knowledge and jump right into solving Jane Street’s problem using a Haskell-driven simulation in the browser.
Footnotes
[1] The .o files that a GHC wasm cross compiler produces are actually linkable wasm modules.