Fold for cheap pattern-matching

A few days ago, Hakim Baaloudj reacted to a somewhat provocative message from Xavier Van de Woestyne, stating that all design patterns proposed in the book "Design Patterns: Elements of Reusable Object-Oriented Software" could be reduced to "if only my language had lambdas and modules". This led him to ask the following question in a conversation space that we all frequent:

"And one of the first thoughts I had was that we'd also need pattern matching (especially for the visitor design pattern), how can we avoid having to use the visitor's encoding, just by using modules and lambdas?"

I don't have a particular opinion on the book (or on design patterns in general), but here is a revised version of the response I sent to him. The article is obviously not groundbreaking, and it simply sketches out the encoding of visitors through fold. Moreover, it should only be taken as a response to the question "how to encode the visitor without pattern matching" and not as a plea against the usefulness of pattern matching. Indeed, I am instinctively convinced that, akin to the explicit ability to delineate algebraic types, pattern matching objectively enhances a language.

The main idea behind the `fold` function

We can offer an exceedingly broad definition of the fold function, which aligns with the domain of recursion schemes (referred to as a catamorphism), extensively expounded upon in this paper, encapsulating the concept of generic reduction. However, I believe that, for the purposes of demonstration, delving into extensive theory is unnecessary. It suffices to describe, in the case of the catamorphism, a case analysis. Indeed, one could summarise the fold function as one that will traverse the various branches of a type. Let's consider an initial example with the option type, described by the following type:

type 'a option = 
  | Some of 'a
  | None

The fold function will need to handle the case where the value is Some x or if the value is None. Nothing could be simpler, old chap! We shall describe a function of type: 'a option -> ('a -> 'b) -> (unit -> 'b) -> 'b.

let fold opt when_some when_none =
  match opt with
  | None   -> when_none ()
  | Some x -> when_some x

So, if our value happens to be Some x, we employ when_some x, and if it turns out to be None, we utilise when_none () (employing a function unit -> 'b to defer computation in the event of None). With this function, one can readily re-encode pattern matching cases (admittedly, at the expense of a bit more verbosity). For example, this peculiar function:

let my_f opt = 
  match opt with
  | Some ((3 | 10) as x) -> x
  | Some x -> 0 - x
  | None -> 0

Could possibly be rewritten in terms of fold in this manner:

let my_f_with_fold opt = fold opt
  (fun x  -> 
     if (x = 3 || x = 10) then x else 0 - x)
  (fun () -> 0)

In terms of outcome, the two functions are identical. However, while the approach with fold is, let's be honest, considerably more verbose than the one using pattern matching, but fold can also operate on abstract types (types whose structure/shape is unknown, or solely crystallised through its API). fold is indeed highly generalisable and it's a fascinating exercise to implement it for a plethora of sum types (also introducing recursion, such as List or for trees), and alongside recursion schemes, its properties are expansively documented in this paper.

However, more astute readers may have noticed a slight trickery! Indeed, we use pattern matching to suggest that we can do without pattern matching, which seems rather bold! No worries, let's reimplement Option in terms of objects and completely do away with pattern matching.

Objects and absence of pattern matching

Since OCaml boasts a perfectly decent object-oriented language, we don't need to switch languages, which is rather splendid. Firstly, let's describe the interface of our option using an OOP interface:

class type ['a] option_obj = object
  method fold : ('a -> 'b) -> (unit -> 'b) -> 'b
end

Now, we shall implement the constructors some and none so that they adhere to this interface:

module Option_obj : sig
  val some : 'a -> 'a option_obj
  val none : unit -> 'a option_obj 
  val fold : 'a option_obj -> ('a -> 'b) -> (unit -> 'b) -> 'b
end

The major difference with the constructors Some and None lies in the fact that none must take unit as an argument, due to the value restriction, but this is a detail that doesn't make its usage overly complex. Now, we can implement the structure of the module Option_obj, which, in fact, is conceptually very close to what we had done with pattern matching.

module Option_obj = struct 
  
  class ['a] some (x: 'a) = 
    object (_: 'a #option_obj)
      method fold when_some _when_none = when_some x
    end
  
  class ['a] none = 
    object (_: 'a #option_obj)
      method fold _when_some when_none = when_none ()
    end
    
  let fold (opt : 'a option_obj) when_some when_none = 
    opt#fold when_some when_none
  
  let some x = new some x
  let none () = new none
end

One can observe the natural symmetry between disjunctions via sum types and through subtyping relations. Our fold function (in the Option_obj module) has the same API as the one we defined using pattern matching, but this time, we're not using pattern matching at all!

The major difference from the usual definition of a visitor is that when we support lambdas, we can drastically reduce the need for boilerplate to add behaviour to our case analysis. Indeed, without lambdas, for each case analysis, we would need to add an interface unifying both cases through an interface (and thus concretize it with an implementation).

So, do we truly need pattern matching

We've seen that it's possible to do without pattern matching using subtyping relations and the presence of lambdas significantly simplifies the definition of visitors. However, something about the premise bothers me a bit. Indeed, we're using an encoding to build a missing feature of the language: pattern matching. But, just as we can encode cheap pattern matching, we can encode cheap lambdas via objects:

class type ['a, 'b] lam = object
  method apply : 'a -> 'b
end

Allowing us to entirely sidestep lambdas by redefining our type 'a option_obj in this fashion:

class type ['a] option_obj = object
  method fold : ('a, 'b) lam -> (unit, 'b) lam -> 'b
end

Enabling us, albeit with a considerable verbosity, to have visitors without lambdas and without the need for visitor unification. But who would want to write that? Especially in OCaml?

Conclusion

I perfectly understand that the support for lambdas in a language significantly reduces the necessity for complicated encodings, which rely on complex inheritance lattices, on many aspects. However, we have also seen that it is possible to encode lambdas in a similar manner to how we have used for pattern matching, prompting me to ask the following question: "if we can encode lambdas, in what way is pattern matching less necessary than lambdas"?

My conclusion is therefore a bit more extreme than Xavier's, considering that any functionality, comprehensible, which makes syntactical sense, deserves to be added to a language, and that as fulfilled OCaml programmers, we should not impose Church encodings on ourselves when our language offers us interesting syntactic constructions! However, understanding these encodings (and their generalizations, i.e., recursion schemes) sometimes allows for building generic intuitions and sometimes for supporting features when the language grammar does not offer first-class support, a bit like profunctors, which impose point-free style but allow for handling "kinds of functions" generically.

Bibliography

Design Patterns: Elements of Reusable Object-Oriented Software
- Erich Gamma
- Richard Helm
- Ralph Johnson
- John Vlissides
Functional programming with bananas, lenses, envelopes and barbed wire
- Erik Meijer
- Maarten Fokkinga
- Ross Paterson
A tutorial on the universality and expressiveness of fold
- Graham Hutton