Choreographing a dance with the GHC specializer (Part 2)

A cross-post of my blog post for Well-Typed

June 13, 2024
ghc1haskell1performance1unfolder1youtube1

This is the second of a two-part series of blog posts focused on GHC’s specialization optimization. Part 1 acts as a reference manual documenting exactly how, why, and when specialization works in GHC. In this post, we will finally introduce the new tools and techniques we’ve developed to help us make more precise, evidence-based decisions regarding the specialization of our programs. Specifically, we have:

  • Added two new automatic cost center insertion methods to GHC to help us attribute costs to overloaded parts of our programs using traditional cost center profiling.
  • Developed a GHC plugin that instruments overloaded calls and emits data to the event log when they are evaluated at runtime.
  • Implemented analyses to help us derive conclusions from that event log data.

To demonstrate the robustness of these methods, we’ll show exactly how we applied them to Cabal to achieve a 30% reduction in run time and a 20% reduction in allocations in Cabal’s .cabal file parser.

The intended audience of this posts includes intermediate Haskell developers that want to know more about specialization and ad-hoc polymorphism in GHC, and advanced Haskell developers that are interested in systematic approaches to specializing their applications in ways that minimize compilation cost and executable sizes while maximizing performance gains.

This work was made possible thanks to Hasura, who have supported many of Well-Typed’s initiatives to improve tooling for commercial Haskell users.

I presented a summary of the content in Part 1 of this series on The Haskell Unfolder. Feel free to watch it for a brief refresher on what we have learned so far:

The Haskell Unfolder Episode 23: specialisation

Overloaded functions are common in Haskell, but they come with a cost. Thankfully, the GHC specialiser is extremely good at removing that cost. We can therefore write high-level, polymorphic programs and be confident that GHC will compile them into very efficient, monomorphised code. In this episode, we’ll demystify the seemingly magical things that GHC is doing to achieve this.


Choreographing a dance with the GHC specializer (Part 1)

A cross-post of my blog post for Well-Typed

April 15, 2024
ghc1haskell1performance1unfolder1youtube1

Specialization is an optimization technique used by GHC to eliminate the performance overhead of ad-hoc polymorphism and enable other powerful optimizations. However, specialization is not free, since it requires more work by GHC during compilation and leads to larger executables. In fact, excessive specialization can result in significant increases in compilation cost and executable size with minimal runtime performance benefits. For this reason, GHC pessimistically avoids excessive specialization by default and may leave relatively low-cost performance improvements undiscovered in doing so.

Optimistic Haskell programmers hoping to take advantage of these missed opportunities are thus faced with the difficult task of discovering and enacting an optimal set of specializations for their program while balancing any performance improvements with the increased compilation costs and executable sizes. Until now, this dance was a clunky one involving desperately wading through GHC Core dumps only to come up with a precarious, inefficient, unmotivated set of pragmas and/or GHC flags that seem to improve performance.

In this two-part series of posts, I describe the recent work we have done to improve this situation and make optimal specialization of Haskell programs more of a science and less of a dark art. In this first post, I will

  • give a comprehensive introduction to GHC’s specialization optimization,
  • explore the various facilities that GHC provides for observing and controlling it, and
  • present a simple framework for thinking about the trade-offs of specialization.

In the next post of the series, I will

  • present the new tools and techniques we have developed to diagnose performance issues resulting from ad-hoc polymorphism,
  • demonstrate how these new tools can be used to systematically identify useful specializations, and
  • make sense of their impact in terms of the framework described in this post.

The intended audience of this post includes intermediate Haskell developers who want to know more about specialization and ad-hoc polymorphism in GHC, and advanced Haskell developers who are interested in systematic approaches to specializing their applications in ways that minimize compilation cost and executable sizes while maximizing performance gains.

This work was made possible thanks to Hasura, who have supported many of Well-Typed’s successful initiatives to improve tooling for commercial Haskell users.

I presented a summary of the content in this post on The Haskell Unfolder:

The Haskell Unfolder Episode 23: specialisation

Overloaded functions are common in Haskell, but they come with a cost. Thankfully, the GHC specialiser is extremely good at removing that cost. We can therefore write high-level, polymorphic programs and be confident that GHC will compile them into very efficient, monomorphised code. In this episode, we’ll demystify the seemingly magical things that GHC is doing to achieve this.