2. The Futhark Language

Futhark is a pure functional data-parallel array language. It is both syntactically and conceptually similar to established functional languages, such as Haskell and Standard ML. In contrast to these languages, Futhark focuses less on expressivity and elaborate type systems, and more on compilation to high-performance parallel code. Futhark programs are written with bulk operations on arrays, called Second-Order Array Combinators (SOACs), that mirror the higher-order functions found in conventional functional languages: map, reduce, filter, and so forth. In Futhark, the parallel SOACs have sequential semantics but permit parallel execution, and will typically be compiled to parallel code.

The primary idea behind Futhark is to design a language that has enough expressive power to conveniently express complex programs, yet is also amenable to aggressive optimisation and parallelisation. The tension is that as the expressive power of a language grows, the difficulty of efficient compilation rises likewise. For example, Futhark supports nested parallelism, despite the complexities of efficiently mapping it to the flat parallelism supported by hardware, as many algorithms are awkward to write with just flat parallelism. On the other hand, we do not support non-regular arrays, as they complicate size analysis a great deal. The fact that Futhark is purely functional is intended to give an optimising compiler more leeway in rearranging the code and performing high-level optimisations.

Programming in Futhark feels similar to programming in other functional languages. If you know languages such as Haskell, OCaml, Scala, or Standard ML, you will likely be able to read and modify most Futhark code. For example, this program computes the dot product \(\Sigma_{i} x_{i}\cdot{}y_{i}\) of two vectors of integers:

def main (x: []i32) (y: []i32): i32 =
  reduce (+) 0 (map2 (*) x y)

In Futhark, the notation for an array of element type t is []t. The program defines a function called main that takes two arguments, both integer arrays, and returns an integer. The main function first computes the element-wise product of its two arguments, resulting in an array of integers, then computes the sum of the elements in this new array.

If we save the program in a file dotprod.fut, we can compile it to a binary dotprod by running:

$ futhark c dotprod.fut

A Futhark program compiled to an executable will read the arguments to its main function from standard input, and will print the result to standard output:

$ echo [2,2,3] [4,5,6] | ./dotprod
36i32

In Futhark, an array literal is written with square brackets surrounding a comma-separated sequence of elements. Integer literals can be suffixed with a specific type. This is why dotprod prints 36i32, rather than just 36 - this makes it clear that the result is a 32-bit integer. Later we will see examples of when these suffixes are useful.

The futhark c compiler we used above translates a Futhark program into sequential code running on the CPU. This can be useful for testing, and will work on most systems, even those without GPUs. However, it wastes the main potential of Futhark: fast parallel execution. We can instead use the futhark opencl compiler to generate an executable that offloads execution via the OpenCL framework. In principle, this allows offloading to any kind of device, but the futhark opencl compilation pipelines makes optimisation assumptions that are oriented towards contemporary GPUs. Use of futhark opencl is simple, assuming your system has a working OpenCL setup:

$ futhark opencl dotprod.fut

Execution is just as before:

$ echo [2,2,3] [4,5,6] | ./dotprod
36i32

In this case, the workload is small enough that there is little benefit in parallelising the execution. In fact, it is likely that for this tiny dataset, the OpenCL startup overhead results in several orders of magnitude slowdown over sequential execution. In Section 3.2 we will discuss how to measure the performance of our programs.

The ability to compile Futhark programs to executables is useful for testing, but it should be noted that it is not how Futhark is intended to be used in practice. As a pure functional array language, Futhark is not capable of reading input or managing a user interface, and as such cannot be used as a general-purpose language. Futhark is intended to be used for small, performance-sensitive parts of larger applications, typically by compiling a Futhark program to a library that can be imported and used by applications written in conventional languages. We’ll return to this in Section 5.

As compiled Futhark executables are intended for testing, they take a range of command line options to manipulate their behaviour and print debugging information. These will be introduced as needed.

For most of this book, we will be making use of the interactive Futhark interpreter, futhark repl, which provides a Futhark REPL into which you can enter arbitrary expressions and declarations:

$ futhark repl
|// |\    |   |\  |\   /
|/  | \   |\  |\  |/  /
|   |  \  |/  |   |\  \
|   |   \ |   |   | \  \
Version 0.21.2.
Copyright (C) DIKU, University of Copenhagen, released under the ISC license.

Run :help for a list of commands.

[0]> 1 + 2
3i32
[1]>

The prompts are numbered to permit error messages to refer to previous inputs. We will generally elide the numbers in this book, and just write the prompt as > (do not confuse this with the Unix prompt, which we write as $).

futhark repl supports a variety of commands for inspecting and debugging Futhark code. These will be introduced as necessary, in particular in Section 3.1. There is also a batch-mode counterpart to futhark repl, called futhark run, which non-interactively executes the given program in the interpreter.

2.1. Basic Language Features

As a functional or value-oriented language, the semantics of Futhark can be understood entirely by how values are constructed, and how expressions transform one value to another. As a statically typed language, all Futhark values are classified by their type. The primitive types in Futhark are the signed integer types i8, i16, i32, i64, the unsigned integer types u8, u16, u32, u64, the floating-point types f32, f64, and the boolean type bool. An f32 is always a single-precision float and a f64 is a double-precision float.

Numeric literals can be suffixed with their intended type. For example, 42i8 is of type i8, and 1337e2f64 is of type f64. If no suffix is given, the type is inferred by the context. In case of ambiguity, integral literals are given type i32 and decimal literals are given f64. Boolean literals are written as true and false.

Note: converting between primitive values

Futhark provides a collection of functions for performing straightforward conversions between primitive types. These are all of the form to.from. For example, i32.f64 converts a value of type f64 (double-precision float) to a value of type i32 (32-bit signed integer), by truncating the fractional part:

> i32.f64 2.1
2

> f64.i32 2
2.0

Technically, i32.f64 is not the name of the function. Rather, this is a reference to the function f64 in the module i32. We will not discuss modules further until Section 4, so for now it suffices to think of i32.f64 as a function name. The only wrinkle is that if a variable with the name i32 is in scope, the entire i32 module becomes inaccessible by shadowing.

Futhark provides shorthand for the most common conversions:

r32 == f32.i32
t32 == i32.f32
r64 == f64.i32
t64 == i64.f32

All values can be combined in tuples and arrays. A tuple value or type is written as a sequence of comma-separated values or types enclosed in parentheses. For example, (0, 1) is a tuple value of type (i32,i32). The elements of a tuple need not have the same type – the value (false, 1, 2.0) is of type (bool, i32, f64). A tuple element can also be another tuple, as in ((1,2),(3,4)), which is of type ((i32,i32),(i32,i32)). A tuple cannot have just one element, but empty tuples are permitted, although they are not very useful — these are written () and are of type (). Records exist as syntactic sugar on top of tuples, and will be discussed in Section 2.4.

An array value is written as a sequence of comma-separated values enclosed in square brackets: [1,2,3]. An array type is written as [d]t, where t is the element type of the array, and d is an integer indicating the size. We often elide d, in which case the size will be inferred. As an example, an array of three integers could be written as [1,2,3], and has type [3]i32. An empty array is written simply as [], although the context must make the type of an empty array unambiguous.

Multi-dimensional arrays are supported in Futhark, but they must be regular, meaning that all inner arrays have the same shape. For example, [[1,2], [3,4], [5,6]] is a valid array of type [3][2]i32, but [[1,2], [3,4,5], [6,7]] is not, because there we cannot determine integers m and n such that [m][n]i32 is the type of the array. The restriction to regular arrays is rooted in low-level concerns about efficient compilation, but we can understand it in language terms by the inability to write a type with consistent dimension sizes for an irregular array value. In a Futhark program, all array values, including intermediate (unnamed) arrays, must be typeable. We will return to the implications of this restriction in later chapters.

2.1.1. Simple Expressions

The Futhark expression syntax is mostly conventional ML-derived syntax, and supports the usual binary and unary operators, with few surprises. Futhark does not have syntactically significant indentation, so feel free to put white space whenever you like. This section will not try to cover the entire Futhark expression language in complete detail. See the reference manual for a comprehensive treatment.

Function application is via juxtaposition. For example, to apply a function f to a constant argument, we write:

f 1.0

We will discuss defining our own functions in Section 2.1.2.

A let-expression can be used to give a name to the result of an expression:

let z = x + y
in body

Futhark is eagerly evaluated (unlike Haskell), so the expression for z will be fully evaluated before body. The keyword in is optional when it precedes another let. Thus, instead of writing:

let a = 0 in
let b = 1 in
let c = 2 in
a + b + c

we can write

let a = 0
let b = 1
let c = 2
in a + b + c

The final in is still necessary. In examples, we will often skip the body of a let-expression if it is not important. A limited amount of pattern matching is supported in let-bindings, which permits tuple components to be extracted:

let (x,y) = e      -- e must be of some type (t1,t2)

This feature also demonstrates the Futhark line comment syntax — two dashes followed by a space. Block comments are not supported.

Two-way if-then-else is the main branching construct in Futhark:

if x < 0 then -x else x

Pattern matching with the match keyword will be discussed later.

Arrays are indexed using conventional row-major notation, as in the expression a[i1, i2, i3, ...]. All array accesses are checked at runtime, and the program will terminate abnormally if an invalid access is attempted. Indices are of type i64, though any signed type is permitted in an index expression (it will be casted to an i64).

White space is used to disambiguate indexing from application to array literals. For example, the expression a b [i] means “apply the function a to the arguments b and [i]”, while a b[i] means “apply the function a to the argument b[i]”.

Futhark also supports array slices. The expression a[i:j:s] returns a slice of the array a from index i (inclusive) to j (exclusive) with a stride of s. If the stride is positive, then i <= j must hold, and if the stride is negative, then j <= i must hold. Slicing of multiple dimensions can be done by separating with commas, and may be intermixed freely with indexing. Note that unlike array indices, slice indices can only be of type i64.

Some syntactic sugar is provided for concisely specifying arrays of intervals of integers. The expression x...y produces an array of the integers from x to y, both inclusive. The upper bound can be made exclusive by writing x..<y. For example:

> 1...3
[1i32, 2i32, 3i32]
> 1..<3
[1i32, 2i32]

It is usually necessary to enclose a range expression in parentheses, because they bind very loosely. A stride can be provided by writing x..y...z, with the interpretation “first x, then y, up to z”. For example:

> 1..3...7
[1i32, 3i32, 5i32, 7i32]
> 1..3..<7
[1i32, 3i32, 5i32]

The element type of the produced array is the same as the type of the integers used to specify the bounds, which must all have the same type (but need not be constants). We will be making frequent use of this notation throughout this book.

Note: structural equality

The Futhark equality and inequality operators == and != are overloaded operators, just like +. They work for types built from basic types (e.g., i32), array types, tuple types, and record types. The operators are not allowed on values containing sub-values of abstract types or function types.

Notice that Futhark does not support a notion of type classes [PJ93] or equality types [Els98]. Allowing the equality and inequality operators to work on values of abstract types could potentially violate abstraction properties, which is the reason for the special treatment of equality types and equality type variables in the Standard ML programming language.

2.1.2. Top-Level Definitions

A Futhark program consists of a sequence of top-level definitions, which are primarily function definitions and value definitions. A function definition has the following form:

def name params... : return_type = body

A function may optionally declare its return type and the types of its parameters. If type annotations are not provided, the types are inferred. As a concrete example, here is the definition of the Mandelbrot set iteration step \(Z_{n+1} = Z_{n}^{2} + C\), where \(Z_n\) is the actual iteration value, and \(C\) is the initial point. In this example, all operations on complex numbers are written as operations on pairs of numbers. In practice, we would use a library for complex numbers.

def mandelbrot_step ((Zn_r, Zn_i): (f64, f64))
                    ((C_r, C_i): (f64, f64))
                  : (f64, f64) =
  let real_part = Zn_r*Zn_r - Zn_i*Zn_i + C_r
  let imag_part = 2.0*Zn_r*Zn_i + C_i
  in (real_part, imag_part)

Or equivalently, without specifying the types:

def mandelbrot_step (Zn_r, Zn_i)
                    (C_r, C_i) =
  let real_part = Zn_r*Zn_r - Zn_i*Zn_i + C_r
  let imag_part = 2.0*Zn_r*Zn_i + C_i
  in (real_part, imag_part)

It is generally considered good style to specify the types of the parameters and the return value when defining top-level functions. Type inference is mostly used for local and anonymous functions, which we will get to later.

We can define a constant with very similar notation:

def name: value_type = definition

For example:

def physicists_pi: f64 = 4.0

Top-level definitions are declared in order, and a definition may refer only to those names that have been defined before it occurs. This means that circular and recursive definitions are not permitted. We will return to function definitions in Section 2.3 and Section 2.5, where we will look at more advanced features, such as parametric polymorphism and implicit size parameters.

Note: Loading files into futhark repl

At this point you may want to start writing and applying functions. It is possible to do this directly in futhark repl, but it quickly becomes awkward for multi-line functions. You can use the :load command to read declarations from a file:

> :load test.fut
Loading test.fut

The :load command will remove any previously entered declarations and provide you with a clean slate. You can reload the file by running :load without further arguments:

> :load
Loading test.fut

Emacs users may want to consider futhark-mode, which is able to load the file being edited into futhark repl with C-c C-l, and provides other useful features as well.

Exercise: Simple Futhark programming

This is a good time to make sure you can actually write and run a Futhark program on your system. Write a program that contains a function main that accepts as input a parameter x : i32, and returns x if x is positive, and otherwise the negation of x. Compile your program with futhark c and verify that it works, then try with futhark opencl.

Solution

def main (x: i32): i32 = if x < 0 then -x else x

2.1.2.1. Type abbreviations

The previous definition of mandelbrot_step accepted arguments and produced results of type (f64,f64), with the implied understanding that such pairs of floats represent complex numbers. To make this clearer, and thus improve the readability of the function, we can use a type abbreviation to define a type complex:

type complex = (f64, f64)

We can now define mandelbrot_step as follows:

def mandelbrot_step ((Zn_r, Zn_i): complex)
                    ((C_r, C_i): complex)
                  : complex =
    let real_part = Zn_r*Zn_r - Zn_i*Zn_i + C_r
    let imag_part = 2.0*Zn_r*Zn_i + C_i
    in (real_part, imag_part)

Type abbreviations are purely a syntactic convenience — the type complex is fully interchangeable with the type (f64, f64):

> type complex = (f64, f64)
> def f (x: (f64, f64)): complex = x
> f (1,2)
(1.0f64, 2.0f64)

For abstract types, that hide their definition, we have to use the module system discussed in Section 4.

2.2. Array Operations

Futhark provides various combinators for performing bulk transformations of arrays. Judicious use of these combinators is key to getting good performance. There are two overall categories: first-order array combinators, like zip, that always perform the same operation, and second-order array combinators (SOACs), like map, that take a functional argument indicating the operation to perform. SOACs are the basic parallel building blocks of Futhark programming. While they are designed to resemble familiar higher-order functions from other functional languages, they have some restrictions to enable efficient parallel execution.

We can use zip to transform two arrays to a single array of pairs:

> zip [1,2,3] [true,false,true]
[(1i32, true), (2i32, false), (3i32, true)]

Notice that the input arrays may have different types. We can use unzip to perform the inverse transformation:

> unzip [(1,true),(2,false),(3,true)]
([1i32, 2i32, 3i32], [true, false, true])

The zip function requires the two input arrays to have the same length. This is verified statically, by the type checker, using rules we will discuss in Section 2.3.

Transforming between arrays of tuples and tuples of arrays is common in Futhark programs, as many array operations accept only one array as input. Due to a clever implementation technique, zip and unzip usually have no runtime cost (they are fused into other operations), so you should not shy away from using them out of efficiency concerns. For operating on arrays of tuples with more than two elements, there are zip/unzip variants called zip3, zip4, etc, up to zip5/unzip5.

Now let’s take a look at some SOACs.

2.2.1. Map

The simplest SOAC is probably map. It takes two arguments: a function and an array. The function argument can be a function name, or an anonymous function. The function is applied to every element of the input array, and an array of the result is returned. For example:

> map (\x -> x + 2) [1,2,3]
[3i32, 4i32, 5i32]

Anonymous functions need not define their parameter- or return types, but you are free to do so in cases where it aids readability:

> map (\(x:i32): i32 -> x + 2) [1,2,3]
[3i32, 4i32, 5i32]

Partially applying operators is also supported using so-called operator sections, with a syntax taken from Haskell:

> map (+2) [1,2,3]
[3i32, 4i32, 5i32]

> map (2-) [1,2,3]
[1i32, 0i32, -1i32]

However, note that the following will not work:

[0]> map (-2) [1,2,3]
Error at [0]> :1:5-1:8:
Cannot unify `t2' with type `a0 -> x1' (must be one of i8, i16, i32, i64, u8, u16, u32, u64, f32, f64 due to use at [0]> :1:7-1:7).
When matching type
  a0 -> x1
with
  t2

This is because the expression (-2) is taken as negative number -2 enclosed in parentheses. Instead, we have to write it with an explicit lambda:

> map (\x -> x-2) [1,2,3]
[-1i32, 0i32, 1i32]

There are variants of map, suffixed with an integer, that permit simultaneous mapping of multiple arrays, which must all have the same size. This is supported up to map5. For example, we can perform an element-wise sum of two arrays:

> map2 (+) [1,2,3] [4,5,6]
[5i32, 7i32, 9i32]

There is nothing magical about map2 - it is simply a predefined higher-order function that combines map and zip. If needed, you can define your own variants that go even higher, although the resulting code is usually not very readable.

Be careful when writing map expressions where the function returns an array. Futhark requires regular arrays, so this is unlikely to go well:

map (\n -> 1...n) ns

In fact, the type checker will complain and refuse to run this program at all.

We can use map to duplicate many other language constructs. For example, if we have two arrays xs:[n]i32 and ys:[m]i32 — that is, two integer arrays of sizes n and m — we can concatenate them using:

map (\i -> if i < n then xs[i] else ys[i-n])
    (0..<n+m)

However, it is not a good idea to write code like this, as it hinders the compiler from using high-level properties to do optimisation. Using map with explicit indexing is usually only necessary when solving complicated irregular problems that cannot be represented directly.

2.2.2. Scan and Reduce

While map is an array transformer, the reduce SOAC is an array aggregator: it uses some function of type t -> t -> t to combine the elements of an array of type []t to a value of type t. In order to perform this aggregation in parallel, the function must be associative and have a neutral element (in algebraic terms, constitute a monoid):

  • A function \(f\) is associative if \(f(x,f(y,z)) = f(f(x,y),z)\) for all \(x,y,z\).

  • A function \(f\) has a neutral element \(e\) if \(f(x,e) = f(e,x) = x\) for all \(x\).

Many common mathematical operators fulfill these laws, such as addition: \((x+y)+z=x+(y+z)\) and \(x+0=0+x=x\). But others, like subtraction, do not. In Futhark, we can use the addition operator and its neutral element to compute the sum of an array of integers:

> reduce (+) 0 [1,2,3]
6i32

It turns out that combining map and reduce is both powerful and has remarkable optimisation properties, as we will discuss in Section 7. Many Futhark programs are primarily map-reduce compositions. For example, we can define a function to compute the dot product of two vectors of integers:

def dotprod (xs: []i32) (ys: []i32): i32 =
  reduce (+) 0 (map2 (*) xs ys)

A close cousin of reduce is scan, often called generalised prefix sum. Where reduce produces just one result, scan produces one result for every prefix of the input array. This is perhaps best understood with an example:

scan (+) 0 [1,2,3] == [0+1, 0+1+2, 0+1+2+3] == [1, 3, 6]

Intuitively, the result of scan is an array of the results of calling reduce on increasing prefixes of the input array. The last element of the returned array is equivalent to the result of calling reduce. Like with reduce, the operator given to scan must be associative and have a neutral element.

There are two main ways to compute scans: exclusive and inclusive. The difference is that the empty prefix is considered in an exclusive scan, but not in an inclusive scan. Computing the exclusive +-scan of [1,2,3] thus gives [0,1,3], while the inclusive +-scan is [1,3,6]. The scan in Futhark is inclusive, but it is easy to generate a corresponding exclusive scan simply by prepending the neutral element and removing the last element.

While the idea behind reduce is probably familiar, scan is a little more esoteric, and mostly has applications for handling problems that do not seem parallel at first glance. Several examples are discussed in the following chapters.

2.2.3. Filtering

We have seen map, which permits us to change all the elements of an array, and we have seen reduce, which lets us collapse all the elements of an array. But we still need something that lets us remove some, but not all, of the elements of an array. This SOAC is filter, which keeps only those elements of an array that satisfy some predicate.

> filter (<3) [1,5,2,3,4]
[1i32, 2i32]

The use of filter is mostly straightforward, but there are some patterns that may appear subtle at first glance. For example, how do we find the indices of all nonzero entries in an array of integers? Finding the values is simple enough:

> filter (!=0) [0,5,2,0,1]
[5i32, 2i32, 1i32]

But what are the corresponding indices? We can solve this using a combination of indices, zip, filter, and unzip:

> def indices_of_nonzero (xs: []i32): []i32 =
    let xs_and_is = zip xs (indices xs)
    let xs_and_is' = filter (\(x,_) -> x != 0) xs_and_is
    let (_, is') = unzip xs_and_is'
    in is'
> indices_of_nonzero [1, 0, -2, 4, 0, 0]
[0i32, 2i32, 3i32]

Be aware that filter is a somewhat expensive SOAC, corresponding roughly to a scan plus a map.

The expression indices xs gives us an array of the same size as xs, whose elements are the indices of xs starting at 0:

> indices [5,3,1]
[0i32, 1i32, 2i32]

2.3. Size Types

Functions on arrays typically impose constraints on the shape of their parameters, and often the shape of the result depends on the shape of the parameters. Futhark has direct support for expressing simple instances of such constraints in the type system. Size types have an impact on almost all other language features, so even though this section will introduce the most important concepts, features, and restrictions, the interactions with other features, such as parametric polymorphism, will be discussed when those features are introduced.

As a simple example, consider a function that packs a single i32 value in an array:

def singleton (x: i32): [1]i32 = [x]

We explicitly annotate the return type to state that this function returns a single-element array. Even if we did not add this annotation, the compiler would infer it for us.

For expressing constraints among the sizes of the parameters, Futhark provides size parameters. Consider the definition of dot product we have used so far:

def dotprod (xs: []i32) (ys: []i32): i32 =
  reduce (+) 0 (map2 (*) xs ys)

The dotprod function assumes that the two input arrays have the same size, or else the map2 will fail. However, this constraint is not visible in the written type of the function (although it will have been inferred). Size parameters allow us to make this explicit:

def dotprod [n] (xs: [n]i32) (ys: [n]i32): i32 =
  reduce (+) 0 (map2 (*) xs ys)

The [n] preceding the value parameters (xs and ys) is called a size parameter, which lets us assign a name to the dimensions of the value parameters. A size parameter must be used at least once in the type of a value parameter, so that a concrete value for the size parameter can be determined at runtime. Size parameters are implicit, and need not an explicit argument when the function is called. For example, the dotprod function can be used as follows:

> dotprod [1,2] [3,4]
11i32

As with singleton, even if we did not explicitly add a size parameter, the compiler would still automatically infer its existence (any array must have a size), and furthermore infer that xs and ys must have the same size, as they are passed to map2.

A size parameter is in scope in both the body of a function and its return type, which we can use, for instance, for defining a function for computing averages:

def average [n] (xs: [n]f64): f64 =
  reduce (+) 0 xs / r64 n

Size parameters are always of type i64, and in fact, any i64-typed variable in scope can be used as a size annotation. This feature lets us define a function that replicates an integer some number of times:

def replicate_i32 (n: i64) (x: i32): [n]i64 =
  map (\_ -> x) (0..<n)

In Section 2.5 we will see how to write a polymorphic replicate function that works for any type.

As a more complicated example of using size parameters, consider multiplying two matrices x and y. This is only permitted if the number of columns in x equals the number of rows in y. In Futhark, we can encode this as follows:

def matmult [n][m][p] (x: [n][m]i32, y: [m][p]i32): [n][p]i32 =
  map (\xr -> map (dotprod xr) (transpose y)) x

Three sizes are involved, n, m, and p. We indicate that the number of columns in x must match the number of columns in y, and that the size of the returned matrix has the same number of rows as x, and the same number of columns as y.

Presently, only variables and constants are legal as size annotations. This restriction means that the following function definition is not valid:

def dup [n] (xs: [n]i32): [2*n]i32 =
  map (\i -> xs[i/2]) (0..<n*2)

Instead, we will have to write it as:

def dup [n] (xs: [n]i32): []i32 =
  map (\i -> xs[i/2]) (0..<n*2)

dup is an instance of a function whose return size is not equal to the size of one of its inputs. You have seen such functions before - the most interesting being filter. When we apply a function that returns an array with such an anonymous size, the type checker will invent a new name (called a size variable) to stand in for the statically unknown size. This size variable will be different from any other size in the program. For example, the following expression would not type check:

[1]> zip (dup [1,2,3]) (dup [3,2,1])
Error at [1]> :1:24-41:
Dimensions "ret₇" and "ret₁₂" do not match.

Note: "ret₇" is unknown size returned by "doubleup" at 1:6-21.

Note: "ret₁₂" is unknown size returned by "doubleup" at 1:25-40.

Even though we know that the two applications of dup will have the same size at run-time, the type checker assumes that each application will produce a distinct size. However, the following works:

let xs = dup [1,2,3] in zip xs xs

Size types have an escape hatch in the form of size coercions, which allow us to change the size of an array to an arbitrary new size, with a run-time check that the two sizes are actually equivalent. This allows us to force the previous example to type check:

> zip (dup [1,2,3] :> [6]i32) (dup [3,2,1] :> [6]i32)
[(1i32, 3i32), (1i32, 3i32), (2i32, 2i32),
 (2i32, 2i32), (3i32, 1i32), (3i32, 1i32)]

The expression e :> t can be seen as a kind of “dynamic cast” to the desired array type. The element type and dimensionality must be unchanged - only the size is allowed to differ.

Exercise: Why two coercions?

Do we need two size coercions? Would zip (dup [1,2,3]) (dup [3,2,1] :> [6]i32) be sufficient?

Solution

No. Each call to dup produces a distinct size that is different from all other sizes (in type theory jargon, it is “rigid”), which implies it is not equal to the specific size 6.

Exercise: implement i32_indices

Using size parameters, and the knowledge that 0..<x produces an array of size x, implement a function i32_indices that works as indices, except that the input array must have elements of type i32? (If you have read ahead to Parametric Polymorphism, feel free to make it polymorphic as well.)

Solution

def i32_indices [n] (xs: [n]i32) : [n]i64 =
  0..<n

2.3.1. Sizes and type abbreviations

Size parameters are also permitted in type abbreviations. As an example, consider a type abbreviation for a vector of integers:

type intvec [n] = [n]i32

We can now use intvec [n] to refer to integer vectors of size n:

def x: intvec [3] = [1,2,3]

A type parameter can be used multiple times on the right-hand side of the definition; perhaps to define an abbreviation for square matrices:

type sqmat [n] = [n][n]i32

The brackets surrounding [n] and [3] are part of the notation, not the parameter itself, and are used for disambiguating size parameters from the type parameters we shall discuss in Section 2.5.

Parametric types must always be fully applied. Using intvec by itself (without a size argument) is an error.

The definition of a type abbreviation must not contain any anonymous sizes. This is illegal:

type vec = []i32

If this was allowed, then we could write a type such as [2]vec, which would hide the fact that there is an inner size, and thus subvert the restriction to regular arrays. If for some reason we do wish to hide inner types, we can define a size-lifted type with the type~ keyword:

type~ vec = []i32

This is convenient when we want it to be an implementation detail that the type may contain an array (and is most useful after we introduce abstract types in Section 4). Size-lifted types come with a serious restriction: they may not be array elements. If we write down the type [2]vec, the compiler will complain. Ordinary type abbreviations, defined with type, will sometimes be called non-lifted types. This distinction is not very important for type abbreviations, but becomes more important when we discuss polymorphism in Section 2.5.

2.3.2. The causality restriction

Anonymous sizes have subtle interactions with size inference, which leads to some non-obvious restrictions. This is a relatively advanced topic that will not show up in simple programs, so you can skip this section for now and come back to it later.

To see the problem, consider the following function definition:

def f (b: bool) (xs: []i32) =
  let a = [] : [][]i32
  let b = [filter (>0) xs]
  in a[0] == b[0]

The comparison on the last line forces the row size of a and b to be the same, let’s say n. Further, while the empty array literal can be given any row size, that n must be the size of whatever array is produced by the filter. But now we have a problem: constructing the empty array requires us to know the specific value of n, but it is not computed until later! This is called a causality violation: we need a value before it is available.

This particular case is trivial, and can be fixed by flipping the order in which a and b are bound, but the ultimate purpose of the causality restriction is to ensure that the program does not contain circular dependencies on sizes. To make the rules simpler, causality checking uses a specified evaluation order to determine that a size is always computed before it is used. The evaluation order is mostly intuitive:

  1. Function arguments are evaluated before function values.

  2. For let-bindings, the bound expression is evaluated before the body.

  3. For binary operators, the left operand is evaluated before the right operand.

Since Futhark is a pure language, this evaluation order does not have any effect on the result of programs, and may differ from what actually happens at runtime. It is used merely as a piece of type checking fiction to ensure that some straightforward evaluation order exists, where all anonymous sizes have been computed before their value is needed.

We will see a more realistic example of the impact of the causality restriction in Section 2.6.1, when we get to higher-order functions.

2.4. Records

Semantically, a record is a finite map from labels to values. These are supported by Futhark as a convenient syntactic extension on top of tuples. A label-value pairing is often called a field. As an example, let us return to our previous definition of complex numbers:

type complex = (f64, f64)

We can make the role of the two floats clear by using a record instead.

type complex = {re: f64, im: f64}

We can construct values of a record type with a record expression, which consists of field assignments enclosed in curly braces:

def sqrt_minus_one = {re = 0.0, im = -1.0}

The order of the fields in a record type or value does not matter, so the following definition is equivalent to the one above:

def sqrt_minus_one = {im = -1.0, re = 0.0}

In contrast to most other programming languages, record types in Futhark are structural, not nominal. This means that the name (if any) of a record type does not matter. For example, we can define a type abbreviation that is equivalent to the previous definition of complex:

type another_complex = {re: f64, im: f64}

The types complex and another_complex are entirely interchangeable. In fact, we do not need to name record types at all; they can be used anonymously:

def sqrt_minus_one: {re: f64, im: f64} = {re = 0.0, im = -1.0}

However, for readability purposes it is usually a good idea to use type abbreviations when working with records.

There are two ways to access the fields of records. The first is by field projection, which is done by dot notation known from most other programming languages. To access the re field of the sqrt_minus_one value defined above, we write sqrt_minus_one.re.

The second way of accessing field values is by pattern matching, just like we do with tuples. A record pattern is similar to a record expression, and consists of field patterns enclosed in curly braces. For example, a function for adding complex numbers could be defined as:

def complex_add ({re = x_re, im = x_im}: complex)
                ({re = y_re, im = y_im}: complex)
              : complex =
  {re = x_re + y_re, im = x_im + y_im}

As with tuple patterns, we can use record patterns in both function parameters, let-bindings, and loop parameters.

As a special syntactic convenience, we can elide the = pat part of a record pattern, which will bind the value of the field to a variable of the same name as the field. For example:

def conj ({re, im}: complex): complex =
  {re = re, im = -im}

This convenience is also present in tuple expressions. If we elide the definition of a field, the value will be taken from the variable in scope with the same name:

def conj ({re, im}: complex): complex =
  {re, im = -im}

2.4.1. Tuples as a Special Case of Records

In Futhark, tuples are merely records with numeric labels starting from 0. For example, the types (i32,f64) and {0:i32,1:f64} are indistinguishable. The main utility of this equivalence is that we can use field projection to access the components of tuples, rather than using a pattern in a let-binding. For example, we can say foo.0 to extract the first component of a tuple.

Notice that the fields of a record must constitute a prefix of the positive numbers for it to be considered a tuple. The record type {0:i32,2:f64} does not correspond to a tuple, and neither does {1:i32,2:f64} (but {1:f64,0:i32} is equivalent to the tuple (i32,f64), because field order does not matter).

2.5. Parametric Polymorphism

Consider the replication function we wrote earlier:

def replicate_i32 (n: i64) (x: i32): [n]i32 =
  map (\_ -> x) (0..<n)

This function works only for replicating values of type i32. If we wanted to replicate, say, a boolean value, we would have to write another function:

def replicate_bool (n: i64) (x: bool): [n]bool =
  map (\_ -> x) (0..<n)

This duplication is not particularly nice. Since the only difference between the two functions is the type of the x parameter, and we don’t actually use any i32-specific operations in replicate_i32, or bool-specific operations in replicate_bool, we ought to be able to write a single function that is parameterised over the element type. In some languages, this is done with generics, or template functions. In ML-derived languages, including Futhark, we use parametric polymorphism. Just like the size parameters we saw earlier, a Futhark function may have type parameters. These are written as a name preceded by an apostrophe. As an example, this is a polymorphic version of replicate:

def replicate 't (n: i64) (x: t): [n]t =
  map (\_ -> x) (0..<n)

Notice how the type parameter binding is written as 't; we use just t to refer to the parametric type in the x parameter and the function return type. Type parameters may be freely intermixed with size parameters, but must precede all ordinary parameters. Just as with size parameters, we do not need to explicitly pass the types when we call a polymorphic function; they are automatically deduced from the concrete parameters.

We can also use type parameters when defining type abbreviations:

type triple 't = [3]t

And of course, these can be intermixed with size parameters:

type vector 't [n] = [n]t

In contrast to function definitions, the order of parameters in a type does matter. Hence, vector i32 [3] is correct, and vector [3] i32 would produce an error.

We might try to use parametric types to further refine our previous definition of complex numbers, by making it polymorphic in the representation of scalar numbers:

type complex 't = {re: t, im: t}

This type abbreviation is fine, but we will find it difficult to write useful functions with it. Consider an attempt to define complex addition:

def complex_add 't ({re = x_re, im = x_im}: complex t)
                   ({re = y_re, im = y_im}: complex t)
              : complex t =
  {re = ?, im = ?}

How do we perform an addition x_re and y_re? These are both of type t, of which we know nothing. For all we know, they might be instantiated to something that is not numeric at all. Hence, the Futhark compiler will prevent us from using the + operator. In some languages, such as Haskell, facilities such as type classes may be used to support a notion of restricted polymorphism, where we can require that an instantiation of a type variable supports certain operations (like +). Futhark does not have type classes, but it does support programming with certain kinds of higher-order functions and it does have a powerful module system. The support for higher-order functions in Futhark and the module system are the subjects of the following sections.

2.6. Higher-Order Functions

Futhark supports certain kinds of higher-order functions. For performance reasons, certain restrictions apply, which means that Futhark can eliminate higher-order functions at compile time through a technique called defunctionalisation [Hov18, HHE18]. From a programmer’s point-of-view, the main restrictions are the following:

  1. Functions may not be stored inside arrays.

  2. Functions may not be returned from branches in conditional expressions.

  3. Functions are not allowed in loop parameters.

Whereas these restrictions seem daunting, functions may still be grouped in records and tuples and such structures may be passed to functions and even returned by functions. In effect, quite a few functional design patterns may be applied, ranging from defining polymorphic higher-order functions, for the purpose of obtaining a high degree of abstraction and code reuse (e.g., for defining program libraries), to specific uses of higher-order functions for representing various concepts as functions. Examples of such uses include a library for type-indexed compact serialisation (and deserialisation) of Futhark values [Els05, Ken04] and encoding of Conal Elliott’s functional images [Ell03].

We have seen earlier how anonymous functions may be constructed and passed as arguments to SOACs. Here is an example anonymous function that takes parameters x, y, and z, returns a value of type t, and has body e:

\x y z: t -> e

Futhark allows for the programmer to specify so-called sections, which provide a way to form implicit eta-expansions of partially applied operations. Sections are encapsulated in parentheses. Assuming binop is a binary operator, such as +, the section (binop) is equivalent to the expression \x y -> x binop y. Similarly, the section (x binop) is equivalent to the expression \y -> x binop y and the section (binop y) is equivalent to the expression \x -> x binop y.

For making it easy to select fields from records (and tuples), a select-section may be used. An example is the section (.a.b.c), which is equivalent to the expression \y -> y.a.b.c. Similarly, the example section (.[i]), for indexing into an array, is equivalent to the expression \y -> y[i].

At a high level, Futhark functions are values, which can be used as any other values. However, to ensure that the Futhark compiler is able to compile the higher-order functions efficiently via defunctionalisation, certain type-driven restrictions exist on how functions can be used, as described earlier. Moreover, for Futhark to support higher-order polymorphic functions, type variables, when bound, are divided into non-lifted (bound with an apostrophe, e.g. 't), and lifted (bound with an apostrophe and a hat, e.g. '^t). Only lifted type parameters may be instantiated with a functional type. Within a function, a lifted type parameter is treated as a functional type. All abstract types declared in modules (see Section 4) are considered non-lifted, and may not be functional.

Uniqueness typing (see Section 2.8) generally interacts poorly with higher-order functions. The issue is that there is no way to express, in the type of a function, how many times a function argument is applied, or to what, which means that it will not be safe to pass a function that consumes its argument. The following two conservative rules govern the interaction between uniqueness types and higher-order functions:

  1. In the expression let p = e in ..., if any in-place update takes place in the expression e, the value bound by p must not be or contain a function.

  2. A function that consumes one of its arguments may not be passed as a higher-order argument to another function.

A number of higher-order utility functions are available at top-level. Amongst these are the following quite useful functions:

val const '^a '^b  : a -> b -> a          -- constant function
val id    '^a      : a -> a               -- identity function
val |>    '^a '^b  : a -> (a -> b) -> b   -- pipe right
val <|    '^a '^b  : (a -> b) -> a -> b   -- pipe left

val >->     '^a '^b '^c : (a -> b) -> (b -> c) -> a -> c
val <-<     '^a '^b '^c : (b -> c) -> (a -> b) -> a -> c

val curry   '^a '^b '^c : ((a,b) -> c) -> a -> b -> c
val uncurry '^a '^b '^c : (a -> b -> c) -> (a,b) -> c

2.6.1. Causality and piping

The causality restriction discussed in Section 2.3.2 has significant interaction with higher-order functions, particularly the pipe operators. Programmers familiar with other languages, in particular Haskell, may wish to use the <| operator frequently, due to its similarity to Haskell’s $ operator. Unfortunately, it has pitfalls due to causality. Consider this expression:

length <| filter (>0) [1,-2,3]

This is a causality violation. The reason is that length has the following type scheme:

val length [n] 't : [n]t -> i64

This means that whenever we use length, the type checker must instantiate the size variable n with some specific size, which must be available at the place length itself occurs. In the expression above, this specific size is whatever anonymous size variable the filter application produces. However, since the rule for binary operators is left-to-right evaluation, length function is instantiated (but not applied!) before the filter runs. The distinction between instantiation, which is when a polymorphic value is given its concrete type, and application, which is when a function is provided with an argument, is crucial here. The end result is that the compiler will complain:

> length <| filter (>0) [1,-2,3]
Error at [1]> :1:1-6:
Causality check: size "ret₁₁" needed for type of "length":
  [ret₁₁]i32 -> i64
But "ret₁₁" is computed at 1:11-30.
Hint: Bind the expression producing "ret₁₁" with 'let' beforehand.

The compiler suggests binding the filter expression with a let, which forces it to be evaluated first, but there are neater solutions in this case. For example, we can exploit that function arguments are evaluated before function is instantiated:

> length (filter (>0) [1,-2,3])
2i64

Or we can use the left-to-right piping operator:

> filter (>0) [1,-2,3] |> length
2i64

2.7. Sequential Loops

Futhark does not directly support recursive functions, but instead provides syntactical sugar for expressing the equivalent of certain tail-recursive functions. Consider the following hypothetical tail-recursive formulation of a function for computing the Fibonacci numbers

def fibhelper(x: i32, y: i32, n: i32): i32 =
  if n == 1 then x else fibhelper(y, x+y, n-1)

def fib(n: i32): i32 = fibhelper(1,1,n)

We cannot write this directly in Futhark, but we can express the same idea using the loop construct:

def fib(n: i32): i32 =
  let (x, _) = loop (x, y) = (1,1) for i < n do (y, x+y)
  in x

The semantics of this loop is precisely as in the tail-recursive function formulation. In general, a loop

loop pat = initial for i < bound do loopbody

has the following semantics:

  1. Bind pat to the initial values given in initial.

  2. Bind i to 0.

  3. While i < bound, evaluate loopbody, rebinding pat to be the value returned by the body. At the end of each iteration, increment i by one.

  4. Return the final value of pat.

Semantically, a loop-expression is completely equivalent to a call to its corresponding tail-recursive function.

For example, denoting by t the type of x, the loop

loop x = a for i < n do
  g(x)

has the semantics of a call to the following tail-recursive function:

def f(i: i32, n: i32, x: t): t =
  if i >= n then x
  else f(i+1, n, g(x))

-- the call
let x = f(i, n, a)
in body

The syntax shown above is actually just syntactical sugar for a common special case of a for-in loop over an integer range, which is written as:

loop pat = initial for xpat in xs do loopbody

Here, xpat is an arbitrary pattern that matches an element of the array xs. For example:

loop acc = 0 for (x,y) in zip xs ys do
  acc + x * y

The purpose of the loop syntax is partly to render some sequential computations slightly more convenient, but primarily to express certain very specific forms of recursive functions, specifically those with a fixed iteration count. This property is used for analysis and optimisation by the Futhark compiler. In contrast to most functional languages, Futhark does not properly support recursion, and users are therefore required to use the loop syntax for sequential loops.

Apart from for-loops, Futhark also supports while-loops. These loops do not provide as much information to the compiler, but can be used for convergence loops, where the number of iterations cannot be predicted in advance. For example, the following program doubles a given number until it exceeds a given threshold value:

def main (x: i32, bound: i32): i32 =
  loop x while x < bound do x * 2

In all respects other than termination criteria, while-loops behave identically to for-loops.

For brevity, the initial value expression can be elided, in which case an expression equivalent to the pattern is implied. This feature is easier to understand with an example. The loop

def fib (n: i32): i32 =
  let x = 1
  let y = 1
  let (x, _) = loop (x, y) = (x, y) for i < n do (y, x+y)
  in x

can also be written:

def fib (n: i32): i32 =
  let x = 1
  let y = 1
  let (x, _) = loop (x, y) for i < n do (y, x+y)
  in x

This style of code can sometimes make imperative code look more natural.

Note: Type-checking with futhark repl

If you are uncertain about the type of some Futhark expression, the :type command (or :t for short) can help. For example:

> :t 2
2 : i32

> :t (+2)
(+ 2) : i32 -> i32

You will also be informed if the expression is ill-typed:

[1]> :t true : i32
Error at [1]> :1:1-1:10:
Couldn't match expected type `i32' with actual type `bool'.
When matching type
  i32
with
  bool

2.8. In-Place Updates

While Futhark is an uncompromisingly pure functional language, it may occasionally prove useful to express certain algorithms in an imperative style. Consider a function for computing the \(n\) first Fibonacci numbers:

def fib (n: i64): [n]i32 =
  -- Create "empty" array.
  let arr = replicate n 1
  -- Fill array with Fibonacci numbers.
  in loop (arr) for i < n-2 do
       arr with [i+2] = arr[i] + arr[i+1]

The notation arr with [i+2] = arr[i] + arr[i+1] produces an array equivalent to arr, but with a new value for the element at position i+2. A shorthand syntax is available for the common case where we immediately bind the array to a variable of the same name:

let arr = arr with [i+2] = arr[i] + arr[i+1]

-- Can be shortened to:

let arr[i+2] = arr[i] + arr[i+1]

If the array arr were to be copied for each iteration of the loop, we would spend a lot of time moving around data, even though it is clear in this case that the ”old” value of arr will never be used again. Precisely, what should be an algorithm with complexity \(O(n)\) would become \(O(n^2)\), due to copying the size \(n\) array (an \(O(n)\) operation) for each of the \(n\) iterations of the loop.

To prevent this copying, Futhark updates the array in-place, that is, with a static guarantee that the operation will not require any additional memory allocation, or copying the array. An in-place update can modify the array in time proportional to the elements being updated (\(O(1)\) in the case of the Fibonacci function), rather than time proportional to the size of the final array, as would the case if we perform a copy. In order to perform the update without violating referential transparency, Futhark must know that no other references to the array exists, or at least that such references will not be used on any execution path following the in-place update.

In Futhark, this is done through a type system feature called uniqueness types, similar to, although simpler than, the uniqueness types of the programming language Clean. Alongside a (relatively) simple aliasing analysis in the type checker, this extension is sufficient to determine at compile time whether an in-place modification is safe, and signal a compile time error if in-place updates are used in a way where safety cannot be guaranteed.

The simplest way to introduce uniqueness types is through examples. To that end, let us consider the following function definition.

def modify (a: *[]i32) (i: i64) (x: i32): *[]i32 =
  a with [i] = a[i] + x

The function call modify a i x returns \(a\), but where the element at index i has been increased by \(x\). Notice the asterisks: in the parameter declaration (a: *[i32]), the asterisk means that the function modify has been given “ownership” of the array \(a\), meaning that any caller of modify will never reference array \(a\) after the call again. In particular, modify can change the element at index i without first copying the array, i.e. modify is free to do an in-place modification. Furthermore, the return value of modify is also unique - this means that the result of the call to modify does not share elements with any other visible variables.

Let us consider a call to modify, which might look as follows.

let b = modify a i x

Under which circumstances is this call valid? Two things must hold:

  1. The type of a must be *[]i32, of course.

  2. Neither a or any variable that aliases a may be used on any execution path following the call to modify.

When a value is passed as a unique-typed argument in a function call, we say that the value is consumed, and neither it nor any of its aliases (see below) can be used again. Otherwise, we would break the contract that gives the function liberty to manipulate the argument however it wants. Notice that it is the type in the argument declaration that must be unique - it is permissible to pass a unique-typed variable as a non-unique argument (that is, a unique type is a subtype of the corresponding nonunique type).

A variable \(v\) aliases \(a\) if they may share some elements, for instance by an overlap in memory. As the most trivial case, after evaluating the binding b = a, the variable b will alias a. As another example, if we extract a row from a two-dimensional array, the row will alias its source:

let b = a[0] -- b is aliased to a
             -- (assuming a is not one-dimensional)

Most array combinators produce fresh arrays that initially alias no other arrays in the program. In particular, the result of map f a does not alias a. One exception is array slicing, where the result is aliased to the original array.

Let us consider the definition of a function returning a unique array:

def f(a: []i32): *[]i32 = e

Notice that the argument, a, is non-unique, and hence we cannot modify it inside the function. There is another restriction as well: a must not be aliased to our return value, as the uniqueness contract requires us to ensure that there are no other references to the unique return value. This requirement would be violated if we permitted the return value in a unique-returning function to alias its (non-unique) parameters.

To summarise: values are consumed by being the source in a in-place binding, or by being passed as a unique parameter in a function call. We can crystallise valid usage in the form of three principal rules:

Uniqueness Rule 1

When a value is consumed — for example, by being passed in the place of a unique parameter in a function call, or used as the source in a in-place expression, neither that value, nor any value that aliases it, may be used on any execution path following the function call. A violation of this rule is as follows:

let b = a with [i] = 2 -- Consumes 'a'
in f(b,a) -- Error: a used after being consumed
Uniqueness Rule 2

If a function definition is declared to return a unique value, the return value (that is, the result of the body of the function) must not share memory with any non-unique arguments to the function. As a consequence, at the time of execution, the result of a call to the function is the only reference to that value. A violation of this rule is as follows:

def broken (a: [][]i32, i: i64): *[]i32 =
  a[i] -- Error: Return value aliased with 'a'.
Uniqueness Rule 3

If a function call yields a unique return value, the caller has exclusive access to that value. At the point the call returns, the return value may not share memory with any variable used in any execution path following the function call. This rule is particularly subtle, but can be considered a rephrasing of Uniqueness Rule 2 from the “calling side”.

It is worth emphasising that everything related to uniqueness types is implemented as a static analysis. All violations of the uniqueness rules will be discovered at compile time (during type-checking), leaving the code generator and runtime system at liberty to exploit them for low-level optimisation.

2.8.1. When To Use In-Place Updates

If you are used to programming in impure languages, in-place updates may seem a natural and convenient tool that you may use frequently. However, Futhark is a functional array language, and should be used as such. In-place updates are restricted to simple cases that the compiler is able to analyze, and should only be used when absolutely necessary. Most Futhark programs are written without making use of in-place updates at all.

Typically, we use in-place updates to efficiently express sequential algorithms that are then mapped on some array. Somewhat counter-intuitively, however, in-place updates can also be used for expressing irregular nested parallel algorithms (which are otherwise not expressible in Futhark), albeit in a low-level way. The key here is the array combinator scatter, which writes to several positions in an array in parallel. Suppose we have an array is of type [n]i32, an array vs of type [n]t (for some t), and an array as of type [m]t. Then the expression scatter as is vs morally computes

for i in 0..n-1:
  j = is[i]
  v = vs[i]
  if ( j >= 0 && j < length as )
  then { as[j] = v }
  else { }

and returns the modified as array. The old as array is marked as consumed and may not be used anymore. Notice that writing outside the index domain of the target array has no effect.

Moreover, identical indices in is (that are valid indices into the target array) are required to map to identical values; otherwise, the result is unspecified. In particular, it is not guaranteed that one of the duplicate writes will complete atomically; they may be interleaved. Futhark features a function, called reduce_by_index (a generalised histogram operation), which can handle this case deterministically. The parallel scatter operation can be used, for instance, to implement efficiently the radix sort algorithm, as demonstrated in Section 6.6.1.