At ZuriHac this year, my goal was to use GHC’s relatively new WebAssembly (wasm) backend to do something cool. I accomplished this goal, and learned a ton about wasm and how GHC’s wasm backend works along the way. In this post, I’ll document everything I learned, from the basics of wasm to some nitty-gritty details of how GHC targets wasm from Haskell.

To demonstrate what we cover, we’ll be building this interactive example webpage! Thrilling!

In a future post, I’ll document the actual project I worked on while learning all of this and include a demo of the finished product, so stay tuned!

Installing a GHC wasm cross compiler

Currently, the best resources for installing a GHC wasm cross compiler are the GHC User’s Guide and the ghc-wasm-meta repository on the GHC GitLab. If you follow the instructions in those resources, you should be able to build/install a wasm cross compiler with minimal pain, even outside of Nix. If you’re on Linux, you might even be able to install a cross compiler via GHCup. Otherwise, you’ll need to build a cross compiler GHC from source using the directions in ghc-wasm-meta. This is what I did, since I was on macOS, and it went pretty smoothly.

If you have installed the compiler, and it exists on your PATH, you should be able to run a --version command:

$ wasm32-wasi-ghc --version
The Glorious Glasgow Haskell Compilation System, version 9.11.20240607

Hello, wasm!

We can use the wasm32-wasi-ghc cross compiler just like we would a non-cross-compiling build of ghc. For example, with the following code in a Main.hs file:

module Main where

main :: IO ()
main = putStrLn "Wasm? I hardly know 'em!"

We can simply run:

$ wasm32-wasi-ghc Main.hs

This results in the usual Main.o[1] and Main.hi outputs, along with an executable program in a Main.wasm file. At this point, we can execute the Main.wasm program in a shell using the Wasmtime runtime:

$ wasmtime Main.wasm
Wasm? I hardly know 'em!

Brief overview of wasm modules

A compiled .wasm file contains a single wasm module. Wasm modules consist of (among other things) a set of functions, imports, and exports. Let’s explore this a little by looking at an example wasm module in the standard text format. Files containing text-formatted wasm typically have a .wat extension. Here’s our example:

(module
  ;; Imports
  (import "console" "log" (func $log (param i32 i32)))
  (import "js" "mem" (memory 1)) ;; 1 page = 64 KiB

  ;; Data (automatically written to the imported memory at offset 0)
  (data (i32.const 0) "Wasm? I hardly know 'em!")

  ;; Function
  (func $logMsg
    i32.const 0   ;; offset
    i32.const 24  ;; length
    call $log
  )

  ;; Exports
  (export "logMsg" (func $logMsg))
)

Lines preceded by ;; are comments. The order of these items is not important. As you can see, the text format syntax uses s-expressions. The root node is the module keyword.

Imports

An import node specifies a two-level name space and the item being imported. For example, the first import above must be provided in the console.log name space and it must be a function which accepts two i32 parameters. This means, for example, that to instantiate (i.e. compile and run) this module from JavaScript we will need to provide an object that looks like:

{
    console:
        {
            log: (i1, i2) => ...
        }
}

Where i1 and i2 are the i32 parameters to $log.

Our next import is a memory 1 import and it must be provided in the js.mem name space at instantiation. The 1 means that the memory must be at least one page, which wasm currently defines to be 64 KiB. From JavaScript, the contents of this memory are exposed as an ArrayBuffer, which we can read and write through a Uint8Array view.
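As a quick illustration of the page arithmetic, here is a small sketch using the WebAssembly.Memory JavaScript API (we’ll use this API properly later in the post). The variable names are only for illustration.

```javascript
// A memory with an initial size of 1 page, matching the `(memory 1)` import.
const PAGE_SIZE = 64 * 1024; // 65536 bytes per wasm page
const mem = new WebAssembly.Memory({ initial: 1 });

// The memory's contents are exposed as an ArrayBuffer...
const sizeInBytes = mem.buffer.byteLength;
console.log(sizeInBytes === 1 * PAGE_SIZE); // true

// ...which we can view byte-by-byte through a Uint8Array.
const view = new Uint8Array(mem.buffer);
console.log(view.length); // 65536
```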

Data

The data node declares static data to be included in the module, very much like the .data section of x86 assembly. The data declaration above will cause the string Wasm? I hardly know 'em! to be written to our imported memory at offset 0 at instantiation time.
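To make the effect of the data segment concrete, the following sketch does by hand, from JavaScript, roughly what instantiation does for us: it copies the string’s bytes into a memory at offset 0. This is only an illustration. With the real module, the wasm engine performs the copy, and the mem and message names are assumptions of this sketch.

```javascript
const mem = new WebAssembly.Memory({ initial: 1 });
const message = "Wasm? I hardly know 'em!";

// Encode the string as UTF-8 and copy it to offset 0 of the memory.
const encoded = new TextEncoder().encode(message);
new Uint8Array(mem.buffer).set(encoded, 0);

// The encoded string is 24 bytes long, which is the length pushed by $logMsg.
console.log(encoded.length); // 24

// Reading the same bytes back gives the original string.
const readBack = new TextDecoder().decode(
  new Uint8Array(mem.buffer, 0, encoded.length)
);
console.log(readBack === message); // true
```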

Functions

The func node in our example defines a function with a symbolic identifier of $logMsg. The function takes no parameters, and simply pushes the offset and length of the string we’ve written to memory onto the stack and then calls the imported $log function.

We declare an export of the function just beneath its definition. The export states that the module exports a symbol called logMsg corresponding to the $logMsg function. This means once we instantiate this module from JavaScript, we’ll have a running instance of the module whose exports will contain a runnable function logMsg that actually dispatches the wasm function.

Converting the wasm text format to bytecode

With our example above in a file named Hello.wat, we can convert it to a Hello.wasm file containing wasm bytecode using the wat2wasm tool included in the WebAssembly Binary Toolkit:

$ wat2wasm Hello.wat

Interacting with wasm from JavaScript

Previously, we ran a compiled .wasm program using the wasmtime runtime. Now, we want to run a compiled .wasm file from JavaScript, which must run inside a JavaScript runtime like Node.js or a browser. Most JavaScript runtimes support working with wasm via the global WebAssembly object.

Instantiating a wasm module in Node.js

The WebAssembly JavaScript API provides several methods for instantiating a wasm module. They all accept the wasm bytecode and the necessary module imports, and give back a running instance of the module through which we can access the exports.

Let’s instantiate our compiled Hello.wasm module from the previous example in Node.js. We’ll do this using the WebAssembly.instantiate() function, which takes the raw bytecode (as an ArrayBuffer or a typed array) and the module’s import object. We’ll read the Hello.wasm file using the fs.readFileSync() function, which returns a Node.js Buffer (a Uint8Array subclass) that we can pass along directly. So, to start, our JavaScript module (call it Hello.mjs) looks like this:

import fs from "node:fs";

const wasm = fs.readFileSync('./Hello.wasm');
const { instance } = await WebAssembly.instantiate(
    wasm,
    {} // empty imports
);

instance.exports.logMsg();

Let’s try running this. We expect it to fail since we haven’t provided any of the module’s declared imports:

$ node Hello.mjs
...
  [TypeError: WebAssembly.instantiate(): Import #0 module="console" error: module is not an object or function]
...
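The error only names the first missing import. To list everything a module declares up front, we can use WebAssembly.Module.imports() and WebAssembly.Module.exports(). The sketch below is self-contained, so instead of reading Hello.wasm from disk it hand-assembles a binary shaped like Hello.wat. That byte array is an assumption of this sketch (it is not necessarily byte-for-byte what wat2wasm emits); in practice you would pass the bytes read from the file.

```javascript
// Helper: a wasm "name" is a length byte followed by the characters.
const str = (s) => [s.length, ...Array.from(s, (c) => c.charCodeAt(0))];
const text = Array.from("Wasm? I hardly know 'em!", (c) => c.charCodeAt(0));

// Hand-assembled stand-in for Hello.wasm (assumption of this sketch).
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic number, version 1
  // Type section: (i32, i32) -> () and () -> ()
  0x01, 0x09, 0x02, 0x60, 0x02, 0x7f, 0x7f, 0x00, 0x60, 0x00, 0x00,
  // Import section: console.log (function, type 0) and js.mem (memory, min 1 page)
  0x02, 0x19, 0x02,
  ...str("console"), ...str("log"), 0x00, 0x00,
  ...str("js"), ...str("mem"), 0x02, 0x00, 0x01,
  // Function section: one function of type 1
  0x03, 0x02, 0x01, 0x01,
  // Export section: "logMsg" is function index 1 (index 0 is the import)
  0x07, 0x0a, 0x01, ...str("logMsg"), 0x00, 0x01,
  // Code section: i32.const 0, i32.const 24, call 0, end
  0x0a, 0x0a, 0x01, 0x08, 0x00, 0x41, 0x00, 0x41, 0x18, 0x10, 0x00, 0x0b,
  // Data section: 24 bytes written to memory 0 at offset 0
  0x0b, 0x1e, 0x01, 0x00, 0x41, 0x00, 0x0b, 0x18, ...text,
]);

// Compile (but do not instantiate) the module, then inspect it.
const helloModule = new WebAssembly.Module(bytes);
const declaredImports = WebAssembly.Module.imports(helloModule);
const declaredExports = WebAssembly.Module.exports(helloModule);

for (const imp of declaredImports) {
  console.log(`import ${imp.module}.${imp.name} (${imp.kind})`);
}
for (const exp of declaredExports) {
  console.log(`export ${exp.name} (${exp.kind})`);
}
```

Compiling without instantiating means no imports are required yet, which makes this a handy way to find out what a module expects.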

We need to provide the import that the module expects in the console.log namespace. We’ll fill it with a simple lambda function that just returns immediately for now.

We also need to provide the js.mem memory import declared by the module. To do this, we’ll need to create a WebAssembly.Memory object. The constructor of a WebAssembly.Memory expects a memoryDescriptor object, which specifies at least the initial size of the memory in pages (recall that 1 page is 64 KiB). Our JavaScript module now looks like:

import fs from "node:fs";

const memory = new WebAssembly.Memory({ initial: 1 });

const wasm = fs.readFileSync('./Hello.wasm');
const { instance } = await WebAssembly.instantiate(
    wasm,
    {
        console: { log: (i1, i2) => { return; } },
        js: { mem: memory }
    }
);

instance.exports.logMsg();

If we run this module in node, we don’t get any errors! Let’s finally hook up the wires so that it actually prints the string that the wasm instance writes to memory. To do this, we need to make the function that we provide the instance as its console.log import read the given number of bytes at the given offset in the memory and print them to the console.

import fs from "node:fs";

const memory = new WebAssembly.Memory({ initial: 1 });

function logString(offset, length) {
    const bytes = new Uint8Array(memory.buffer, offset, length);
    const string = new TextDecoder("utf8").decode(bytes);
    console.log(string);
}

const wasm = fs.readFileSync('./Hello.wasm');
const { instance } = await WebAssembly.instantiate(
    wasm,
    {
        console: { log: logString },
        js: { mem: memory }
    }
);

instance.exports.logMsg();

Running this in node, we see the string from memory written to the console:

$ node Hello.mjs
Wasm? I hardly know 'em!

Instantiating a wasm module in the browser

We only need to modify the JavaScript a tiny bit to instantiate the same wasm module in the browser. Instead of using WebAssembly.instantiate() on an ArrayBuffer, we’ll use WebAssembly.instantiateStreaming() on a fetch() of the .wasm file:

const memory = new WebAssembly.Memory({ initial: 1 });

function logString(offset, length) {
    const bytes = new Uint8Array(memory.buffer, offset, length);
    const string = new TextDecoder("utf8").decode(bytes);
    console.log(string);
}

const { instance } = await WebAssembly.instantiateStreaming(
    fetch('./Hello.wasm'),
    {
        console: { log: logString },
        js: { mem: memory }
    }
);

instance.exports.logMsg();

Then we can include this script in our HTML:

<!doctype html>
<html>
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>

<body>
    Open the developer console to see the message.
    <script type="module" src="Hello.mjs"></script>
</body>
</html>

You can access this page and see the message yourself here.

Running GHC-compiled wasm in the browser

At this point, we know how to generate wasm modules from Haskell using GHC and how to instantiate wasm modules in the browser. Let’s combine our knowledge and try to instantiate a GHC-generated wasm module in the browser. Using the same Haskell source code as previously, we will create another JavaScript module which attempts to instantiate Main.wasm:

const { instance } = await WebAssembly.instantiateStreaming(fetch("Main.wasm"), {});

Including this script in an HTML document, we’ll see the following error in the console:

Uncaught TypeError: WebAssembly.instantiate(): Import #0 "wasi_snapshot_preview1": module is not an object or function

This means that the wasm module produced by GHC is expecting an import module named wasi_snapshot_preview1 to be provided.

WASI

WASI stands for WebAssembly System Interface, and it is a standardized set of APIs that enable wasm modules to interact with the host environment, very much like the interfaces that POSIX defines.

This is where the wasi in the name wasm32-wasi-ghc comes from. GHC generates wasm modules which expect a WASI API to be provided by the host runtime. For example, our program expects to print to standard output via the putStrLn function. GHC therefore produces a program that tries to use the given WASI API for printing to standard output. This API is provided automatically by the wasmtime runtime, but not by JavaScript runtimes. We need to do a bit more work to create an appropriate WASI API for the browser and provide it as an import when instantiating from JavaScript.
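To get a feel for what “providing a WASI API” involves, here is a hedged sketch of a single WASI function, fd_write, written by hand. The makeFdWrite helper and its onStdout callback are hypothetical names for this sketch, and a real WASI library implements many more functions with proper error handling. The usage part fakes what a wasm module would do: it places a string and an iovec record in memory and then calls the function.

```javascript
// Hypothetical, minimal stand-in for one WASI function (not a full shim).
function makeFdWrite(memory, onStdout) {
  // WASI signature: fd_write(fd, iovs, iovs_len, nwritten) -> errno
  return function fd_write(fd, iovsPtr, iovsLen, nwrittenPtr) {
    const view = new DataView(memory.buffer);
    const decoder = new TextDecoder();
    let written = 0;
    let text = "";
    for (let i = 0; i < iovsLen; i++) {
      // Each iovec is two little-endian u32 values: buffer pointer and length.
      const bufPtr = view.getUint32(iovsPtr + 8 * i, true);
      const bufLen = view.getUint32(iovsPtr + 8 * i + 4, true);
      const chunk = new Uint8Array(memory.buffer, bufPtr, bufLen);
      text += decoder.decode(chunk, { stream: true });
      written += bufLen;
    }
    text += decoder.decode(); // flush any buffered partial character
    if (fd === 1) onStdout(text); // file descriptor 1 is standard output
    view.setUint32(nwrittenPtr, written, true);
    return 0; // errno 0 means success
  };
}

// Usage: simulate a module writing "hello\n" to standard output.
const memory = new WebAssembly.Memory({ initial: 1 });
const payload = new TextEncoder().encode("hello\n");
new Uint8Array(memory.buffer).set(payload, 16); // string lives at offset 16
const dv = new DataView(memory.buffer);
dv.setUint32(0, 16, true);             // iovec: pointer to the string
dv.setUint32(4, payload.length, true); // iovec: length of the string

const captured = [];
const fd_write = makeFdWrite(memory, (s) => captured.push(s));
const errno = fd_write(1, 0, 1, 8); // one iovec at offset 0, result at offset 8
console.log(errno, captured);
```

A function like this would be supplied under the wasi_snapshot_preview1 import module, alongside every other WASI function the program needs, which is exactly the tedium the libraries below save us from.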

Connecting the wires

Thankfully, there are some libraries that make this pretty easy. I am aware of three options:

None of these seem extremely actively maintained, and one might be better suited than the others for particular workloads. We’ll somewhat arbitrarily use Runno’s WASI runner in the rest of this post.

All WASI browser API libraries follow the same usage pattern. We construct an API specification that determines what should happen when our module attempts to do things like write to standard output or create files. Let’s add the required import to our JavaScript and set up the WASI browser API. For our example program, this only requires specifying what should happen with standard output:

import { WASI } from "https://cdn.jsdelivr.net/npm/@runno/wasi@0.7.0/dist/wasi.js";

const wasi = new WASI({
    stdout: (out) => console.log("[wasm stdout]", out)
});

const wasm = await WebAssembly.instantiateStreaming(fetch("./Main.wasm"), wasi.getImportObject());

wasi.start(wasm, {});

The way that this specific WASI API works might feel a little strange. We build the API (using new WASI()), provide the resulting import object for instantiation (wasi.getImportObject()), and then give the result of instantiation back to the API (wasi.start()) so it can do some internal instance management.

This works! If we include this script in an HTML document, we’ll see our message printed to the console. See it for yourself here.

Accessing the DOM from Haskell (GHC wasm JavaScript FFI)

We’re now successfully running GHC-compiled wasm in the browser. This is absolutely exhilarating (I’m sweating), but we’re greedy and we want to take this even further. Specifically, we want to access the DOM from GHC wasm so that we can drive interesting page logic from Haskell. This has recently been made possible with the new GHC (>=9.10) wasm backend JavaScript FFI.

This works similarly to Haskell’s other FFI capabilities. We can use foreign import javascript to embed a bit of JavaScript code into our program, making it accessible from Haskell. For example, we can get an HTML element as a value of type JSVal by its HTML id using something like:

foreign import javascript unsafe "document.getElementById($1)"
    js_document_getElementById :: JSString -> IO JSVal

The difference between unsafe and safe imports here is as follows:

  • unsafe: Calls to unsafe imports block the entire runtime waiting for the result, and exceptions during execution of these imports cannot be handled in Haskell. Just like unsafe C imports, the imported JavaScript cannot call back into Haskell.
  • safe: The JavaScript code is wrapped in async, thus calling safe imports does not block the GHC runtime. Instead, safe calls immediately return a thunk corresponding to the resulting JavaScript Promise. Evaluating the thunk will block until the Promise resolves.
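The following plain-JavaScript sketch illustrates the difference in shape between the two kinds of import. It is not the code GHC generates, and the function names are hypothetical. It only shows that a synchronous body yields its result directly, while the same body wrapped in async yields a Promise that must resolve before the result is available.

```javascript
// Shape of an `unsafe` import body: a plain synchronous function.
const unsafeStyle = (x) => x * 2;

// Shape of a `safe` import body: the same expression wrapped in `async`.
const safeStyle = async (x) => x * 2;

const direct = unsafeStyle(21); // a number, available immediately
const pending = safeStyle(21);  // a Promise, resolved later

console.log(direct);                     // 42
console.log(pending instanceof Promise); // true
pending.then((value) => console.log(value)); // 42, once the Promise resolves
```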

For more information on this, see the GHC User’s Guide.

We can convert Haskell functions into callable JSVals using foreign import javascript "wrapper":

foreign import javascript "wrapper"
    asEventListener :: (JSVal -> IO ()) -> IO JSVal

Putting it all together

Let’s write a simple web page that allows us to change the opacity of an image on the page with an HTML <input type="range"> element. Here’s our HTML:

<!doctype html>
<html>
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>

<body>
    <div>
        <input type="range" id="opacity-input" name="opacity-input" min="0" max="1" step="0.01" value="0" />
        <label for="opacity-input">Opacity</label>
    </div>
    <img src="./spj.jpg" id="surprise-image" style="height: 75vh; opacity: 0;" alt="May I interest you in some lambda?">
    <script type="module" src="Finale.mjs"></script>
</body>
</html>

In Haskell, our program will get the input element and register another Haskell function as an event listener:

module Main where

import Control.Concurrent
import GHC.Wasm.Prim

-- | See the explanation in the post
main :: IO ()
main = error "not necessary"
foreign export javascript "setup" setup :: IO ()

-- | Adds an @input@ event listener to the input element which sets the opacity
-- of the image to the input value.
setup :: IO ()
setup = do
    -- Get the input element as a JSVal
    opacityInput <- js_document_getElementById (toJSString "opacity-input")

    -- Create a callable JSVal wrapper from our onOpacityInput function
    opacityInputCallback <- asEventListener onOpacityInput

    -- Set the callable JSVal as the listener for input events
    js_addEventListener opacityInput (toJSString "input") opacityInputCallback

onOpacityInput :: JSVal -> IO ()
onOpacityInput event = do
    -- Get the input value
    inpOpacity <- js_event_target_value event

    -- Get the image element
    img <- js_document_getElementById (toJSString "surprise-image")

    -- Set the image's opacity
    js_setOpacity img inpOpacity

foreign import javascript unsafe "document.getElementById($1)"
  js_document_getElementById :: JSString -> IO JSVal

foreign import javascript unsafe "$1.target.value"
  js_event_target_value :: JSVal -> IO Double

foreign import javascript unsafe "$1.style.opacity = $2"
  js_setOpacity :: JSVal -> Double -> IO ()

foreign import javascript unsafe "$1.addEventListener($2, $3)"
  js_addEventListener :: JSVal -> JSString -> JSVal -> IO ()

foreign import javascript "wrapper"
  asEventListener :: (JSVal -> IO ()) -> IO JSVal

The only bits that we haven’t discussed yet are the main = error ... and foreign export at the top. To understand where those are coming from, we need to explain just one more wasm quirk.

Command modules vs. reactor modules

The wasm modules that GHC emits by default are called command modules. Command modules export a symbol named _start which initializes, runs, and then finalizes the entire program state when executed. After that finalization, the other exports of the instance are no longer safe to run since the instance state will have been destructed. Command modules thus expect to be run as traditional short-lived commands. This makes them a poor fit for a browser context, where we typically want instances to stay alive the whole time the page is live in the browser.

Reactor modules were standardized to fill this use case. They export a symbol named _initialize which only initializes the instance state. After that, the instance remains alive and any other exports will be safe to access. For more information on the distinction between command and reactor modules, see these docs in the WASI GitHub repository.
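As a small sketch of this convention, the hypothetical moduleKind helper below classifies a module by looking for these two export names. With a real compiled module, the descriptors would come from WebAssembly.Module.exports(); here we pass hand-written stand-ins so the example runs on its own.

```javascript
// Classify a module by its exports, following the WASI naming convention.
function moduleKind(exportDescriptors) {
  const names = exportDescriptors.map((e) => e.name);
  if (names.includes("_start")) return "command";
  if (names.includes("_initialize")) return "reactor";
  return "unknown";
}

// Hand-written stand-ins for the result of WebAssembly.Module.exports(module).
const commandLike = [
  { name: "memory", kind: "memory" },
  { name: "_start", kind: "function" },
];
const reactorLike = [
  { name: "memory", kind: "memory" },
  { name: "_initialize", kind: "function" },
  { name: "setup", kind: "function" },
];

console.log(moduleKind(commandLike)); // command
console.log(moduleKind(reactorLike)); // reactor
```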

To make GHC emit a reactor module instead of a command module, we must use the -optl-mexec-model=reactor linker flag. As a reactor module, our program no longer has an entrypoint, so we also need to manually export any symbols we plan to call directly from JavaScript with a -optl-Wl,--export=symbol linker flag. In our program, the Haskell setup function simply sets up the necessary event listeners and completes, so we manually export it via the foreign export and we will include a -optl-Wl,--export=setup flag for compilation.

Lastly, unless we name the module Main, GHC will not generate a linked .wasm output. Since our module is named Main, it must include a main :: IO () function. We could have named our setup function main, but I don’t think the typical semantics of main fit, so instead we just include a useless main function to keep GHC happy. What’s worse is that even though GHC forced us to include this main function, if we compile our module above with:

$ wasm32-wasi-ghc -optl-Wl,--export=setup -optl-mexec-model=reactor Finale.hs

We’ll get an error:

wasm-ld: error: duplicate symbol: main

So we need to exclude Haskell’s main symbol from the output using the -no-hs-main flag. Our final compilation command is:

$ wasm32-wasi-ghc -no-hs-main -optl-Wl,--export=setup -optl-mexec-model=reactor Finale.hs

Welcome to the bleeding edge!

Accommodating GHC’s wasm JSFFI

Any GHC wasm module that uses the JavaScript FFI requires an extra import. All GHC wasm cross compiler distributions include a script that can automatically generate the import object based on the compiled wasm. Here is how we run that script:

$ $(wasm32-wasi-ghc --print-libdir)/post-link.mjs -i Finale.wasm -o ghc_wasm_jsffi.js

This outputs a JavaScript module to ghc_wasm_jsffi.js which exports an object that includes all the items that our wasm module (Finale.wasm) is going to need for its FFI with JavaScript. When we instantiate our module from JavaScript, we need to provide it as an import. Furthermore, we need to use our WASI API’s initialize() function instead of start() since we are dealing with a reactor module now. Here’s the full JavaScript:

import { WASI } from "https://cdn.jsdelivr.net/npm/@runno/wasi@0.7.0/dist/wasi.js";
import ghc_wasm_jsffi from "./ghc_wasm_jsffi.js";

const wasi = new WASI({
    stdout: (out) => console.log("[wasm stdout]", out)
});

const jsffiExports = {};
const wasm = await WebAssembly.instantiateStreaming(
    fetch('./Finale.wasm'),
    Object.assign(
        { ghc_wasm_jsffi: ghc_wasm_jsffi(jsffiExports) },
        wasi.getImportObject()
    )
);
Object.assign(jsffiExports, wasm.instance.exports);

wasi.initialize(wasm, {
    ghc_wasm_jsffi: ghc_wasm_jsffi(jsffiExports)
});
wasi.instance.exports.setup();

We need to provide the extra jsffiExports object to ghc_wasm_jsffi, and then fill it in with the instance exports after instantiation for the FFI to work. The WASI API we’re using also mandates that we pass any extra imports to the initialize() function.
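The empty-object-then-fill-in dance exists because of a circular dependency: the imports need the instance’s exports, but the instance can’t exist until the imports do. Here is a stripped-down sketch of that pattern with no wasm involved. The makeImports factory and the fake exports are hypothetical stand-ins for ghc_wasm_jsffi and the real instance, and I’m assuming the generated code looks up exports lazily in roughly this way.

```javascript
// Hypothetical stand-in for the generated factory: it returns import
// functions that look up exports lazily, at call time.
function makeImports(exportsRef) {
  return {
    callBack: (n) => exportsRef.double(n) + 1,
  };
}

const exportsRef = {};                      // empty at first
const ffiImports = makeImports(exportsRef); // imports close over the empty object

// Stand-in for instantiation: the "instance" can only exist once the imports do.
const fakeInstanceExports = { double: (n) => n * 2 };

// Fill in the shared object afterwards; the imports now see the real exports.
Object.assign(exportsRef, fakeInstanceExports);
console.log(ffiImports.callBack(20)); // 41
```

Calling ffiImports.callBack before the Object.assign would fail, since exportsRef.double would not exist yet. That is why the order of operations in the full script matters.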

Finally we have a working page whose logic is pretty much all coming from Haskell! Try it out here.

Summary

We’ve covered a lot! Here are some of the important points:

  • GHC generates command modules which expect the wasi_snapshot_preview1 API by default, and these can be run easily in runtimes like wasmtime.
  • To run GHC wasm in the browser, we need (at least) to use a WASI browser API to make the module’s system calls have the intended effect in the browser.
  • Any non-trivial GHC wasm modules that we intend to run in the browser should be reactor modules, which we generate by passing the -optl-mexec-model=reactor linker flag to GHC.
  • Reactor modules need to have any symbols that we intend to access from JavaScript exported by passing the -optl-Wl,--export=symbol_name linker flag to GHC. We also need to pass -no-hs-main to prevent duplicate symbol errors.
  • Using GHC’s wasm JavaScript FFI requires that we pass an additional import to our module at instantiation time. This import object can be generated using the post-link.mjs script found in the directory printed by wasm32-wasi-ghc --print-libdir.

We’ve also briefly covered how to use the wasm JavaScript FFI, but the GHC User’s Guide covers it much more comprehensively.

In a future post, we’ll apply our new knowledge and jump right in to solving Jane Street’s problem using a Haskell-driven simulation in the browser.

Footnotes


  1. The .o files that a GHC wasm cross compiler produces are actually linkable wasm modules.