## The Perils of Implementing Functor in Java

I’ve recently been spending a lot of time re-imagining the core constructs I would want from the Java programming language to consider it truly fit for enjoyable functional programming. One of the most glaringly missing interfaces in Java 8 is something like Haskell's `Functor` type class - a theoretical super-interface of `Function`, `Iterable` and friends, `Optional`, and many, many other parameterized types - and its corresponding `#fmap` function, a conceptual implementation of which in Java might look like:

```java
@FunctionalInterface
public interface Functor<A> {
    <B> Functor<B> fmap(Function<? super A, ? extends B> f);
}
```

The implementation is straightforward enough: any parameterized type that implements `Functor` must provide an implementation of `#fmap` that can take a function from its parameter to a different parameter, and produce a new `Functor` around that new parameter. An example for `Optional` might be something like:

```java
@Override
public <B> Optional<B> fmap(Function<? super A, ? extends B> f) {
    return map(f);
}
```
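Of course, `Optional` is final, so we can’t actually retrofit it. As a stand-in, here is a minimal, self-contained sketch of my own (the `Identity` container is hypothetical, not a JDK type) showing what a type implementing this `Functor` looks like end to end:

```java
import java.util.function.Function;

// A minimal sketch: Identity is a hypothetical one-slot container
// implementing the conceptual Functor interface from above.
public class IdentityDemo {

    interface Functor<A> {
        <B> Functor<B> fmap(Function<? super A, ? extends B> f);
    }

    static final class Identity<A> implements Functor<A> {
        final A value;

        Identity(A value) { this.value = value; }

        @Override
        public <B> Identity<B> fmap(Function<? super A, ? extends B> f) {
            // Unwrap, apply, re-wrap - the same shape as Optional#map.
            return new Identity<>(f.apply(value));
        }
    }

    public static void main(String[] args) {
        Identity<Integer> len = new Identity<>("hello").fmap(String::length);
        System.out.println(len.value); // prints 5
    }
}
```

Note that `Identity#fmap` narrows its return type to `Identity<B>` - Java’s covariant return types allow each implementor to stay within its own concrete type.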

That’s right - `Optional` *already supports this operation*. It’s `Optional#map`, and, in fact, `fmap` is really just a generic `#map` that isn’t coupled to a particular concrete type like `Optional` or `Stream`. However, although both `Stream` and `Optional` (and again, conceptually *many* others) define their own respective `#map` operations with effectively identical signatures, there is no common interface that they conform to that guarantees this capability.
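To see the gap concretely, here is a small demonstration (class name mine, JDK APIs real): the two `map` calls below are shape-identical, yet no supertype lets us write a single generic helper over both:

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Stream;

// Optional#map and Stream#map have effectively identical signatures,
// but share no common "mappable" interface to abstract over.
public class NoCommonInterface {
    public static void main(String[] args) {
        Function<Integer, Integer> inc = x -> x + 1;

        Optional<Integer> o = Optional.of(41).map(inc);
        Stream<Integer> s = Stream.of(41).map(inc);

        System.out.println(o.get());             // prints 42
        System.out.println(s.findFirst().get()); // prints 42
    }
}
```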

This is where a `Functor` interface comes in handy - once this core interface is defined ("core" albeit only in the conceptual sense, since Java doesn’t support ad-hoc polymorphism, fundamentally excluding core constructs like `Stream` and `Optional` from `Functor` membership without byte-code manipulation), more useful concepts like `BiFunctor`s (for things like tuples) and `ProFunctor`s (for things like functions of arity 2 and up) can emerge.

Looking again at the function `#fmap` takes, it can be seen to be covariant from `A` to `B`: it expects the current `Functor`’s parameter `A`, and produces some other parameter `B` that will be wrapped by a new `Functor`. The important point about covariance in this example is that the type unknown to the current `Functor` is `B`, not `A` (remember, `A` is the `Functor`’s own parameter - it’s known). This isn’t a problem because `#fmap`’s function is in charge of producing the unknown `B` type at each call site, since the method itself is parametrically polymorphic. However, this will be important later, so keep this in mind.

Now, the concept of `Functor`s - and more specifically, `#fmap` - exists in many other languages, although - as is usually the case - Haskell arguably implements it best. This is `#fmap`’s beautiful signature in Haskell:

```haskell
fmap :: Functor f => (a -> b) -> f a -> f b

{- and some examples: -}
instance Functor Maybe where
  fmap _ Nothing  = Nothing
  fmap f (Just a) = Just (f a)

instance Functor [] where
  fmap _ [] = []
  fmap f xs = map f xs
```

Briefly explained^{[1]}, this function states that it takes a function that produces some value of type `b` from some value of type `a`, and takes an instance `f` of `Functor` parameterized with `a`, and produces the same instance `f` of `Functor`, parameterized with `b`. This is apparent from both `Maybe` (anemically implemented as `Optional` in Java) and `[]` (Java's closest counterpart being perhaps `LinkedList`, although this is a terribly generous stretch).

I mentioned above that another example of a `Functor` is Java 8’s `Function` interface - but `Function` has two parameters. How can this be reconciled with `Functor`, which has one parameter? Which parameter is it? Traditionally (which is to say, I’ve never seen an implementation that diverged from this, but I can’t speak in proofs), a `Function<A, B>` (where `A` is the input and `B` is the output) is a `Functor<B>` - so it is a `Functor` over its return type, not its argument. This actually makes a lot of sense in practical usage, but what does this mean for `#fmap`? It might be implemented as follows (using the interface `MonadicFunction`, since, of course, `Function` is Java’s and its signature is therefore sealed):

```java
@FunctionalInterface
public interface MonadicFunction<A, B> extends Functor<B> {
    B apply(A a);

    @Override
    default <C> MonadicFunction<A, C> fmap(Function<? super B, ? extends C> fn) {
        return a -> fn.apply(apply(a));
    }
}
```

Seems fine, right? It certainly compiles, and it acts as a form of function composition: we started with two functions, and now have one that executes both in succession. However, we have a problem.

Look again at Haskell’s `fmap`. Now look at our `Functor`'s `#fmap`. Now back at Haskell’s. See the difference?

The arguments are reversed.

Because `fmap` in Haskell is a function, and not a method, it needs to receive both the function to resolve `B` from `A` as well as the `Functor` instance it operates on. This is a key distinction between the way functions receive arguments and the way methods receive arguments: with functions, the ordering of the proverbial "subject" of the function and its arguments can be defined arbitrarily, since the line between subject and argument blurs (they’re both just arguments, after all). With objects and methods, however, the method needs to be invoked on a particular object, and that object needs to be resolvable deterministically (this is referred to as method dispatch, and is a separate can of worms for a future blog post). In Java, the object a method is invoked on is implicitly resolved (by preceding the `.`, as in `a.toString()`), which is where the symmetry between Haskell’s `fmap` and `Functor#fmap` in Java begins to break down.

Since in the Java implementation `#fmap` is defined on a `Functor` interface, the `Functor` itself is the implied first argument to `#fmap`, and the mapping function is the second argument (or only argument, as it were). In Haskell, the mapping function is the first argument and the `Functor` is the second argument.

If it’s not yet obvious how this is a problem, let’s look at how a function is an instance of `Functor` in Haskell:

```haskell
{- (->) is "function" -}
instance Functor ((->) r) where
  fmap = (.)
```

What is that `(.)`? It’s the *compose* operator, and looks like this:

```haskell
(.) :: (b -> c) -> (a -> b) -> a -> c
f . g = \x -> f (g x)
```

It’s possibly the most commonly used operator in Haskell^{[2]}, due to function composition being such a fundamental operation in functional languages (arguably the core building block to code reuse).

Here's an example of `(.)` in action:

```haskell
doSomeMath :: Num a => a -> a
doSomeMath x = ((+3) . (*2)) x
```

What does that function do? Does it double the result of adding 3 to its argument? Or does it double the argument and add 3 to the result? It does the latter, because `(.)` applies its right-hand function first - composition evaluates right to left. This feels normal when reading code, but throws a giant wrench into the symmetry between our `#fmap` in Java implemented on `MonadicFunction` and `fmap` in Haskell implemented for `(->)` (aliased to `(.)`).
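Incidentally, Java’s own `Function` interface already ships both orderings as default methods: `compose` mirrors `(.)`’s right-to-left reading, while `andThen` reads left to right. A quick sketch (class and variable names mine, the two JDK methods real):

```java
import java.util.function.Function;

// Function#compose is right-to-left, like Haskell's (.);
// Function#andThen is left-to-right.
public class ComposeDirections {
    public static void main(String[] args) {
        Function<Integer, Integer> doubleIt = x -> x * 2;
        Function<Integer, Integer> addThree = x -> x + 3;

        // addThree . doubleIt: double first, then add 3 - like doSomeMath.
        System.out.println(addThree.compose(doubleIt).apply(5)); // prints 13

        // Left to right: add 3 first, then double.
        System.out.println(addThree.andThen(doubleIt).apply(5)); // prints 16
    }
}
```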

Here’s why: because Haskell’s `fmap` receives the mapping function as the first argument and the `Functor` instance as the second argument, and because the function being mapped over is the `Functor`, `fmap` can be implemented as simply an alias for `(.)` and everything works. In the mathematical notation of functions `f`, `g`, and their composition `f(g(x))`, this makes the function to the left of `(.)` `f`, and the function to the right of `(.)` `g`.

However, because `Functor#fmap`'s mapping function is covariant from `Functor<A>` to `Functor<B>` (and therefore from `MonadicFunction<A, B>` to `MonadicFunction<A, C>`), *and because we're already starting with* `MonadicFunction<A, B>`, the reverse compositional operation must be performed to maintain type compatibility.

So for a `MonadicFunction` `f`, `f.fmap(g)` is actually evaluated as `g(f(x))`, whereas in Haskell, `fmap f g` is expectedly evaluated as `f(g(x))`. This is an unfortunate and frustrating asymmetry that will likely cause confusion at first, due to the previously established convention of function composition evaluating right to left, and the fact that as a method on `Functor`, `#fmap` is inherently, conceptually, left-infix.
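Concretely (constants mine, everything self-contained, with `#fmap`’s parameter typed as a plain `Function` so the override type-checks): reading `f.fmap(g)` through the lens of `f . g` would predict `f(g(x))`, but the types force the opposite order:

```java
import java.util.function.Function;

// Demonstrating the reversed order: for f = (*2) and g = (+3),
// f.fmap(g) evaluates as g(f(x)), even though "f . g" in Haskell
// would mean f(g(x)).
public class ReversedFmap {

    interface Functor<A> {
        <B> Functor<B> fmap(Function<? super A, ? extends B> f);
    }

    @FunctionalInterface
    interface MonadicFunction<A, B> extends Functor<B> {
        B apply(A a);

        @Override
        default <C> MonadicFunction<A, C> fmap(Function<? super B, ? extends C> fn) {
            // The only type-correct composition: run this function, then fn.
            return a -> fn.apply(apply(a));
        }
    }

    public static void main(String[] args) {
        MonadicFunction<Integer, Integer> f = x -> x * 2;

        // g(f(5)) = (5 * 2) + 3 = 13, whereas f . g would give f(g(5)) = 16.
        System.out.println(f.fmap(x -> x + 3).apply(5)); // prints 13
    }
}
```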

We can’t even logically decouple `#fmap` from `Functor` and make it a standalone `Method` object (arguably the closest thing to a first-class function we can get in Java, and most other OO languages), leaving `Functor` as a marker interface, without coming up with some way of unifying the general interfaces and delegating to subtype-specific implementations - which is, of course, the whole purpose of `Functor` and `#fmap` in the first place. And here, our mental stack overflows.

So I suppose the only option is to acknowledge the reverse composition of `#fmap` in Java - or else, wait for the human brain to support tail-call elimination. Who knows, at this rate, it might even come before HotSpot’s.

^{[1] Less-briefly explained, Haskell’s fmap is actually much more powerful and useful than the best we can do with Java due to Java’s lack of higher-kinded type parameters: i.e. given two type parameters, T and V, V cannot be parameterized with T to form V<T>. This is a [serious] limitation of Java's support for parametric polymorphism, although not a total loss thanks to covariant return types (Java allows increased return-type specificity in subtype implementations without clashing: i.e. Optional#fmap returning Optional<B> more specifically than Functor<B>, Stream#fmap returning Stream<B> more specifically than Functor<B>, etc.).}

^{[2] According to the statistics I just pulled out of thin air.}

A tale of type variance and the asymmetry between method dispatch and function invocation.