I have officially released ebird-haskell: a set of libraries and tools for working with eBird data in Haskell. Specifically, there are three components:

  1. ebird-api: A library that provides a complete description of the public eBird API as a servant API type. It also provides types for the litany of values that the eBird API communicates in, and convenient instances and functions for operating on those types.
  2. ebird-client: A library that provides functions for querying any endpoint of the eBird API, based on the description in the ebird-api library.
  3. ebird-cli: An executable command-line utility that can query any endpoint of the eBird API and pretty-print the response data.

This post serves as an announcement of these tools (a “call for users”, if you will) and an informal tutorial to help birders turned Haskell programmers or Haskell programmers turned birders get started.

What is eBird?

eBird is a massive collection of ornithological science projects developed by the Cornell Lab of Ornithology. The eBird application is a mobile and web application that allows birders to easily contribute their past and present observations to eBird’s database. Using the huge amount of data[1] that eBird collects and maintains, scientists are able to draw conclusions that inform and improve environmental conservation efforts across the globe. eBird is a great example of a citizen science project.

Accessing eBird data

eBird data is not only made available to scientists. Anyone can run simple queries against the latest data through their public web API (as we will see in this post), or create an eBird account to download the bulk data from their website.

Most endpoints of the public eBird API require an API key, which can be requested from eBird. My request was granted in under an hour.

Getting started with ebird-cli

ebird-cli is essentially a direct command-line interface to the eBird API. It can query every endpoint of the eBird API, and supports every query parameter for each endpoint. In this section, I’ll explain how to install ebird-cli and use it to retrieve data from the eBird API.

Installation

I have not yet gone through the trouble of properly packaging and distributing ebird-cli,[2] so for now it must be installed with cabal, which itself can be installed using ghcup.

Once you have the prerequisites installed, you can install ebird-cli with the following commands (the first is only necessary if you have a new cabal installation or your package index is very out of date):

cabal update
cabal install ebird-cli

By default, the executable will be placed in $HOME/.cabal/bin, so make sure that directory is on your $PATH. For more information on using cabal in this manner, see the cabal user’s guide.
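
As a minimal sketch of that check, the snippet below tests whether the directory is already on $PATH and prepends it for the current session if not. The helper name path_contains is made up for this example, and the path assumes the default location mentioned above; adjust cabal_bin if your setup places executables elsewhere.

```sh
# Directory where cabal is assumed to place installed executables (see above).
cabal_bin="$HOME/.cabal/bin"

# Hypothetical helper: succeeds if directory $1 is an entry of the
# colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$cabal_bin" "$PATH"; then
    echo "$cabal_bin is already on PATH"
else
    # This only affects the current shell session.
    PATH="$cabal_bin:$PATH"
    export PATH
    echo "added $cabal_bin to PATH for this session"
fi
```

To make the change permanent, the same PATH assignment would typically go in your shell's startup file.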

Usage

To ensure your installation is working as expected and begin familiarizing yourself with the tool, try the --help flag:

ebird-cli --help

At the time of writing, this command yields output that looks something like:

ebird-cli - Go birding on your command line!

Usage: ebird-cli [-k|--api-key API_KEY] COMMAND

  Query the official eBird API

Available options:
  -k,--api-key API_KEY     Specify an eBird API key
  -h,--help                Show this help text

Observation commands:
  observations             Get recent observations within a region
  ...

Product commands:
  recent-checklists        Get recent checklists within a region
  ...

Hotspot commands:
  region-hotspots          Get a list of hotspots in one or more regions
  ...

Taxonomy commands:
  taxonomy                 Get any version of the eBird taxonomy
  ...

Region commands:
  region-info              Get information about a region
  ...

As the output above suggests, there are five sections of subcommands. Each section roughly corresponds to a section of the eBird API as described by their documentation. Additionally, each subcommand has its own --help flag that outputs specific usage information for that command. For example, ebird-cli observations --help yields output like the following:

Usage: ebird-cli observations --region REGION_CODE [--back N]
                              [--taxonomy-categories CATEGORIES]
                              [--only-hotspots] [--include-provisional]
                              [--max-results N] [--extra-regions REGION_CODE]
                              [--spp-locale LOCALE]

  Get recent observations within a region

Available options:
  --region REGION_CODE     Specify the regions to fetch observations from (e.g.
                           "US-WY,US-CO,US-ID" or "US-CA-037")
  --back N                 Only fetch observations submitted within the last N
                           days (1 - 30, default: 14)
  --taxonomy-categories CATEGORIES
                           Specify a list of one or more taxonomy categories to
                           include observations of (e.g. "issf" or "hybrid")
                           (default: all categories)
  --only-hotspots          Only include observations from hotspots
  --include-provisional    Include observations which have not yet been reviewed
  --max-results N          Specify the max number of observations to include (1
                           to 10000, default: all)
  --extra-regions REGION_CODE
                           Up to 10 extra regions to fetch observations from
  --spp-locale LOCALE      Specify a locale to use for common names
  -h,--help                Show this help text

A simple example

Let’s use this information to come up with an ebird-cli invocation that will output the 5 most recent observations available from Deschutes County, Oregon. To do this, we need to determine the eBird “region code” corresponding to Deschutes County in Oregon.

Region codes

Region codes are custom values used by eBird to identify geographic regions of varying specificity. The region code corresponding to the entire world is the value “world”. The world region is segmented into countries, for example “US” for the United States or “NL” for The Netherlands. Two region codes separated by a comma form a new region code that identifies the combination of their geographic regions. For example, “US,NL” identifies a geographic region that includes the United States and The Netherlands.

Countries are further segmented into states. For example, Oregon’s region code is US-OR, and the region code of Alberta, Canada is CA-AB. Finally, states are segmented into counties. For example, Albany County, Wyoming’s region code is US-WY-001.[3]

Sometimes we need to specify whether we expect the eBird API to give us country, state, or county regions. To do this, we must specify the region type we are interested in. Country regions are simply referred to by the “country” region type, state regions are referred to by the “subnational1” region type, and county regions are referred to by the “subnational2” region type.
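
The relationship between the shape of a region code and its region type can be sketched in a few lines of shell. This is only an illustration of the scheme described above, not something ebird-cli provides: the function names region_type and region_types are made up here, the “world” label is this sketch's own choice, and nothing is checked against eBird's actual list of regions.

```sh
# Print the region type implied by the shape of a single region code.
region_type() {
    case "$1" in
        world)   echo "world" ;;          # label chosen for this sketch only
        *-*-*-*) echo "unknown" ;;        # more segments than described above
        *-*-*)   echo "subnational2" ;;   # county, e.g. US-OR-017
        *-*)     echo "subnational1" ;;   # state or province, e.g. US-OR
        ?*)      echo "country" ;;        # e.g. US or NL
        *)       echo "unknown" ;;        # empty input
    esac
}

# Classify each member of a comma-separated combination such as "US,NL".
region_types() {
    old_ifs=$IFS
    IFS=,
    for code in $1; do
        printf '%s %s\n' "$code" "$(region_type "$code")"
    done
    IFS=$old_ifs
}

region_types "US-WY,US-CO,US-ID"
region_type "US-CA-037"
```

The classification is purely syntactic: it counts hyphen-separated segments and does not verify that a code names a real region.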

Given the above, we know that the region code of Deschutes County, Oregon should be a subnational2 region code that looks something like US-OR-XYZ where XYZ is the county number of Deschutes County. Since it’s not obvious what the county number of Deschutes County is, we can use the subregions command to list all the subnational2 subregions of US-OR, i.e. all the counties in Oregon:

ebird-cli subregions -k YOUR_API_KEY_HERE --region US-OR --region-type subnational2

To avoid repeatedly pasting your API key for the -k flag, write your API key in a file located at ~/.ebird/key.txt. The key will be automatically read from this file by ebird-cli for any command that requires it.
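
One way to create that file from the shell is sketched below. The file location comes from the paragraph above; the chmod step is an extra precaution of this sketch rather than anything ebird-cli is known to require, and the key is written without a trailing newline because it is not stated whether one is tolerated.

```sh
# Placeholder: substitute the key issued to you by eBird.
api_key="YOUR_API_KEY_HERE"

# Location that ebird-cli reads the key from, as described above.
key_dir="$HOME/.ebird"
key_file="$key_dir/key.txt"

mkdir -p "$key_dir"
# printf '%s' writes the key with no trailing newline.
printf '%s' "$api_key" > "$key_file"
# Optional hardening (an assumption of this sketch): owner-only access.
chmod 600 "$key_file"
```

With the key file in place, the subregions command shown earlier can be run without its -k flag.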

The output of the subregions command above includes the following entry:

    {
        "code": "US-OR-017",
        "name": "Deschutes"
    }
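
Rather than scanning the whole list by eye, the entry can be picked out with grep. The sketch below assumes the output keeps the pretty-printed layout shown here, with "code" on the line directly before "name"; the function name find_region is made up for this example, and the second entry in the stand-in input is an obvious placeholder, not real eBird data.

```sh
# Print each matching "name" line together with the line before it,
# which holds the region code in the layout shown above.
find_region() {
    grep -B 1 "\"name\": \"$1\""
}

# Intended usage (needs network access and an API key):
#   ebird-cli subregions --region US-OR --region-type subnational2 | find_region Deschutes

# Stand-in input so the sketch runs offline:
find_region Deschutes <<'EOF'
    {
        "code": "XX-YY-000",
        "name": "Placeholder"
    },
    {
        "code": "US-OR-017",
        "name": "Deschutes"
    }
EOF
```

A JSON-aware tool such as jq would be more robust than matching on layout, if you have one installed.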

Knowing that the region code for Deschutes County in Oregon is US-OR-017, we can build the command that will get us the 5 most recent observations in that region:

ebird-cli observations --region US-OR-017 --max-results 5

At the time of writing, this command yields the following output:

[
    {
        "comName": "Ring-necked Duck",
        "howMany": 2,
        "lat": 44.069084,
        "lng": -121.163923,
        "locId": "L26617295",
        "locName": "Stenkamp",
        "locationPrivate": true,
        "obsDt": "2023-11-19 08:06",
        "obsReviewed": false,
        "obsValid": true,
        "sciName": "Aythya collaris",
        "speciesCode": "rinduc",
        "subId": "S154767518"
    },
    {
        "comName": "Bufflehead",
        "howMany": 1,
        "lat": 44.069084,
        "lng": -121.163923,
        "locId": "L26617295",
        "locName": "Stenkamp",
        "locationPrivate": true,
        "obsDt": "2023-11-19 08:06",
        "obsReviewed": false,
        "obsValid": true,
        "sciName": "Bucephala albeola",
        "speciesCode": "buffle",
        "subId": "S154767518"
    },
    {
        "comName": "California Scrub-Jay",
        "howMany": 2,
        "lat": 44.069084,
        "lng": -121.163923,
        "locId": "L26617295",
        "locName": "Stenkamp",
        "locationPrivate": true,
        "obsDt": "2023-11-19 08:06",
        "obsReviewed": false,
        "obsValid": true,
        "sciName": "Aphelocoma californica",
        "speciesCode": "cowscj1",
        "subId": "S154767518"
    },
    {
        "comName": "American Robin",
        "howMany": 4,
        "lat": 44.069084,
        "lng": -121.163923,
        "locId": "L26617295",
        "locName": "Stenkamp",
        "locationPrivate": true,
        "obsDt": "2023-11-19 08:06",
        "obsReviewed": false,
        "obsValid": true,
        "sciName": "Turdus migratorius",
        "speciesCode": "amerob",
        "subId": "S154767518"
    },
    {
        "comName": "Townsend's Solitaire",
        "howMany": 1,
        "lat": 44.069084,
        "lng": -121.163923,
        "locId": "L26617295",
        "locName": "Stenkamp",
        "locationPrivate": true,
        "obsDt": "2023-11-19 08:06",
        "obsReviewed": false,
        "obsValid": true,
        "sciName": "Myadestes townsendi",
        "speciesCode": "towsol",
        "subId": "S154767518"
    }
]

All of these observations happen to have been submitted on the same checklist, which is why their locations and times are all identical. While many of these fields are self-explanatory, the documentation for the Observation type in ebird-api may clarify what some of these fields mean.
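
Because the output is plain JSON on standard output, it is easy to post-process. As a rough sketch, the helper below (common_names is a name made up for this example) pulls out just the common names with sed. It assumes the pretty-printed layout shown above, with one field per line, and would need adjusting if that layout changed.

```sh
# Print the value of every "comName" field, one per line.
common_names() {
    sed -n 's/^ *"comName": "\(.*\)",\{0,1\}$/\1/p'
}

# Intended usage (needs network access and an API key):
#   ebird-cli observations --region US-OR-017 --max-results 5 | common_names

# Stand-in input taken from the output above, so the sketch runs offline:
common_names <<'EOF'
        "comName": "Ring-necked Duck",
        "howMany": 2,
        "comName": "Townsend's Solitaire",
        "howMany": 1,
EOF
```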

Footnotes