A detailed guide to using GHC's WebAssembly backend

Welcome to the bleeding edge!

August 24, 2024

At ZuriHac this year, my goal was to use GHC’s relatively new WebAssembly (wasm) backend to do something cool. I accomplished this goal, and learned a ton about wasm and how GHC’s wasm backend works along the way. In this post, I’ll document everything I learned, from the basics of wasm to some nitty-gritty details of how GHC targets wasm from Haskell.

To demonstrate what we cover, we’ll be building this interactive example webpage! Thrilling!

In a future post, I’ll document the actual project I worked on while learning all of this and include a demo of the finished product, so stay tuned!

Choreographing a dance with the GHC specializer (Part 2)

A cross-post of my blog post for Well-Typed

June 13, 2024

This is the second of a two-part series of blog posts focused on GHC’s specialization optimization. Part 1 acts as a reference manual documenting exactly how, why, and when specialization works in GHC. In this post, we will finally introduce the new tools and techniques we’ve developed to help us make more precise, evidence-based decisions regarding the specialization of our programs. Specifically, we have:

Added two new automatic cost center insertion methods to GHC to help us attribute costs to overloaded parts of our programs using traditional cost center profiling.
Developed a GHC plugin that instruments overloaded calls and emits data to the event log when they are evaluated at runtime.
Implemented analyses to help us derive conclusions from that event log data.

To demonstrate the robustness of these methods, we’ll show exactly how we applied them to Cabal to achieve a 30% reduction in run time and a 20% reduction in allocations in Cabal’s .cabal file parser.

The intended audience of this post includes intermediate Haskell developers who want to know more about specialization and ad-hoc polymorphism in GHC, and advanced Haskell developers who are interested in systematic approaches to specializing their applications in ways that minimize compilation cost and executable sizes while maximizing performance gains.

This work was made possible thanks to Hasura, who have supported many of Well-Typed’s initiatives to improve tooling for commercial Haskell users.

I presented a summary of the content in Part 1 of this series on The Haskell Unfolder. Feel free to watch it for a brief refresher on what we have learned so far:

The Haskell Unfolder Episode 23: specialisation

Overloaded functions are common in Haskell, but they come with a cost. Thankfully, the GHC specialiser is extremely good at removing that cost. We can therefore write high-level, polymorphic programs and be confident that GHC will compile them into very efficient, monomorphised code. In this episode, we’ll demystify the seemingly magical things that GHC is doing to achieve this.

Choreographing a dance with the GHC specializer (Part 1)

A cross-post of my blog post for Well-Typed

April 15, 2024

Specialization is an optimization technique used by GHC to eliminate the performance overhead of ad-hoc polymorphism and enable other powerful optimizations. However, specialization is not free, since it requires more work by GHC during compilation and leads to larger executables. In fact, excessive specialization can result in significant increases in compilation cost and executable size with minimal runtime performance benefits. For this reason, GHC pessimistically avoids excessive specialization by default and may leave relatively low-cost performance improvements undiscovered in doing so.
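As a minimal sketch of what this looks like in source code (the function `sumSquares` is a hypothetical example, not taken from the posts), the overloaded function below takes its `Num` operations from a class dictionary passed at run time, and the `SPECIALIZE` pragma asks GHC to also compile a copy fixed to `Int`:

```haskell
module Main (main) where

-- An overloaded function (hypothetical example). The `Num a` constraint
-- becomes an extra dictionary argument, and (*) and (+) are looked up in
-- that dictionary at run time unless GHC specializes the function.
sumSquares :: Num a => [a] -> a
sumSquares = foldr (\x acc -> x * x + acc) 0

-- Request an additional copy of sumSquares specialized to Int. GHC acts on
-- this when optimizing (e.g. with -O); calls at type Int can then be
-- rewritten to use the specialized copy.
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}

main :: IO ()
main = do
  print (sumSquares [1, 2, 3 :: Int])      -- prints 14
  print (sumSquares [1.5, 2.5 :: Double])  -- prints 8.5
```

Only the `Int` call site can benefit from the pragma here. The `Double` call still goes through the overloaded version unless GHC decides to specialize or inline it on its own, which is the kind of trade-off this series is concerned with.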

Optimistic Haskell programmers hoping to take advantage of these missed opportunities are thus faced with the difficult task of discovering and enacting an optimal set of specializations for their program while balancing any performance improvements with the increased compilation costs and executable sizes. Until now, this dance was a clunky one involving desperately wading through GHC Core dumps only to come up with a precarious, inefficient, unmotivated set of pragmas and/or GHC flags that seem to improve performance.

In this two-part series of posts, I describe the recent work we have done to improve this situation and make optimal specialization of Haskell programs more of a science and less of a dark art. In this first post, I will

give a comprehensive introduction to GHC’s specialization optimization,
explore the various facilities that GHC provides for observing and controlling it, and
present a simple framework for thinking about the trade-offs of specialization.

In the next post of the series, I will

present the new tools and techniques we have developed to diagnose performance issues resulting from ad-hoc polymorphism,
demonstrate how these new tools can be used to systematically identify useful specializations, and
make sense of their impact in terms of the framework described in this post.

The intended audience of this post includes intermediate Haskell developers who want to know more about specialization and ad-hoc polymorphism in GHC, and advanced Haskell developers who are interested in systematic approaches to specializing their applications in ways that minimize compilation cost and executable sizes while maximizing performance gains.

This work was made possible thanks to Hasura, who have supported many of Well-Typed’s successful initiatives to improve tooling for commercial Haskell users.

I presented a summary of the content in this post on The Haskell Unfolder:

The Haskell Unfolder Episode 23: specialisation

Overloaded functions are common in Haskell, but they come with a cost. Thankfully, the GHC specialiser is extremely good at removing that cost. We can therefore write high-level, polymorphic programs and be confident that GHC will compile them into very efficient, monomorphised code. In this episode, we’ll demystify the seemingly magical things that GHC is doing to achieve this.

Announcing ebird-haskell

Haskell libraries for working with eBird data and the public eBird API

November 27, 2023

I have officially released ebird-haskell: a set of libraries and tools for working with eBird data in Haskell. Specifically, there are three components:

ebird-api: A library that provides a complete description of the public eBird API as a servant API type. It also provides types for the litany of values that the eBird API communicates in, and convenient instances and functions for operating on those types.
ebird-client: A library that provides functions for querying any endpoint of the eBird API, based on the description in the ebird-api library.
ebird-cli: An executable command-line utility that can query any endpoint of the eBird API and pretty-print the response data.

This post serves as an announcement of these tools (a “call for users”, if you will) and an informal tutorial to help birders turned Haskell programmers or Haskell programmers turned birders get started.

Reducing Haddock's Memory Usage

A cross-post of my blog post for Well-Typed

October 6, 2023

This is a cross-post of a post I authored for Well-Typed.

Haddock is the documentation generation tool for Haskell. Given a set of Haskell modules, Haddock can generate documentation for those modules in various formats based on specially formatted comments in the source code. Haddock is a very important part of the Haskell ecosystem, as many authors and users of Haskell software rely on Haddock-generated documentation to familiarize themselves with the code.

Recently, Mercury asked us to investigate performance problems in Haddock that were preventing their developers from using it effectively. In particular, when running on Mercury’s code, Haddock would eat up all of the memory on 64GB machines. This made it difficult for their developers to generate documentation to browse locally.

At a high level, the work covered by this post has resulted in Haddock’s memory usage being roughly halved. The full set of Haddock and GHC changes resulting in these improvements will be shipped with GHC 9.8.

All this profiling and debugging work was completed using the eventlog2html and ghc-debug tools, which are excellent for characterising and diagnosing Haskell program performance. However, these are much less widely known than they deserve to be. This post aims to act as a case study demonstrating how we have used the tools to make significant performance improvements to a complex, real-world application.

If you’re interested in finding out more about these tools, read on or check out some of our videos:

A website appears! Here's how you can make your own using Hakyll

Introducing my new personal website, and explaining how I built it

September 26, 2023

Welcome to my new personal website! I’m finally staking my claim on the ✨World Wide Web✨¹. It took me a while, but I’m quite happy with the end result. I will be using this space primarily to blog about software development and post photographs I take. For this introductory post, I thought I’d describe how I built the site. This post will serve partly as documentation for my future self, and partly as a tutorial for anybody interested in implementing their site using similar tools.

¹ Don’t worry, I promise to keep the emoji usage to a minimum. Just testing things out 😉
