[ACCEPTED]-What is a catamorphism and can it be implemented in C# 3.0?-recursion-schemes

Accepted answer
Score: 32

LINQ's Aggregate() is just for IEnumerables. Catamorphisms in 21 general refer to the pattern of folding 20 for an arbitrary data type. So Aggregate() is to IEnumerables what 19 FoldTree (below) is to Trees (below); both are catamorphisms 18 for their respective data types.

I translated 17 some of the code in part 4 of the series into C#. The code 16 is below. Note that the equivalent F# used 15 three less-than characters (for generic 14 type parameter annotations), whereas this 13 C# code uses more than 60. This is evidence 12 why no one writes such code in C# - there 11 are too many type annotations. I present 10 the code in case it helps people who know 9 C# but not F# play with this. But the code 8 is so dense in C#, it's very hard to make 7 sense of.

Given the following definition 6 for a binary tree:

using System;
using System.Collections.Generic;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Input;
using System.Windows.Media;
using System.Windows.Shapes;

class Tree<T>   // use null for Leaf
    public T Data { get; private set; }
    public Tree<T> Left { get; private set; }
    public Tree<T> Right { get; private set; }
    public Tree(T data, Tree<T> left, Tree<T> rright)
        this.Data = data;
        this.Left = left;
        this.Right = right;

    public static Tree<T> Node<T>(T data, Tree<T> left, Tree<T> right)
        return new Tree<T>(data, left, right);

One can fold trees and 5 e.g. measure if two trees have different 4 nodes:

class Tree
    public static Tree<int> Tree7 =
        Node(4, Node(2, Node(1, null, null), Node(3, null, null)),
                Node(6, Node(5, null, null), Node(7, null, null)));

    public static R XFoldTree<A, R>(Func<A, R, R, Tree<A>, R> nodeF, Func<Tree<A>, R> leafV, Tree<A> tree)
        return Loop(nodeF, leafV, tree, x => x);

    public static R Loop<A, R>(Func<A, R, R, Tree<A>, R> nodeF, Func<Tree<A>, R> leafV, Tree<A> t, Func<R, R> cont)
        if (t == null)
            return cont(leafV(t));
            return Loop(nodeF, leafV, t.Left, lacc =>
                   Loop(nodeF, leafV, t.Right, racc =>
                   cont(nodeF(t.Data, lacc, racc, t))));

    public static R FoldTree<A, R>(Func<A, R, R, R> nodeF, R leafV, Tree<A> tree)
        return XFoldTree((x, l, r, _) => nodeF(x, l, r), _ => leafV, tree);

    public static Func<Tree<A>, Tree<A>> XNode<A>(A x, Tree<A> l, Tree<A> r)
        return (Tree<A> t) => x.Equals(t.Data) && l == t.Left && r == t.Right ? t : Node(x, l, r);

    // DiffTree: Tree<'a> * Tree<'a> -> Tree<'a * bool> 
    // return second tree with extra bool 
    // the bool signifies whether the Node "ReferenceEquals" the first tree 
    public static Tree<KeyValuePair<A, bool>> DiffTree<A>(Tree<A> tree, Tree<A> tree2)
        return XFoldTree((A x, Func<Tree<A>, Tree<KeyValuePair<A, bool>>> l, Func<Tree<A>, Tree<KeyValuePair<A, bool>>> r, Tree<A> t) => (Tree<A> t2) =>
            Node(new KeyValuePair<A, bool>(t2.Data, object.ReferenceEquals(t, t2)),
                 l(t2.Left), r(t2.Right)),
            x => y => null, tree)(tree2);

In this second example, another tree 3 is reconstructed differently:

class Example
    // original version recreates entire tree, yuck 
    public static Tree<int> Change5to0(Tree<int> tree)
        return Tree.FoldTree((int x, Tree<int> l, Tree<int> r) => Tree.Node(x == 5 ? 0 : x, l, r), null, tree);

    // here it is with XFold - same as original, only with Xs 
    public static Tree<int> XChange5to0(Tree<int> tree)
        return Tree.XFoldTree((int x, Tree<int> l, Tree<int> r, Tree<int> orig) =>
            Tree.XNode(x == 5 ? 0 : x, l, r)(orig), _ => null, tree);

And in this 2 third example, folding a tree is used for 1 drawing:

class MyWPFWindow : Window 
    void Draw(Canvas canvas, Tree<KeyValuePair<int, bool>> tree)
        // assumes canvas is normalized to 1.0 x 1.0 
        Tree.FoldTree((KeyValuePair<int, bool> kvp, Func<Transform, Transform> l, Func<Transform, Transform> r) => trans =>
            // current node in top half, centered left-to-right 
            var tb = new TextBox();
            tb.Width = 100.0; 
            tb.Height = 100.0;
            tb.FontSize = 70.0;
                // the tree is a "diff tree" where the bool represents 
                // "ReferenceEquals" differences, so color diffs Red 
            tb.Foreground = (kvp.Value ? Brushes.Black : Brushes.Red);
            tb.HorizontalContentAlignment = HorizontalAlignment.Center;
            tb.VerticalContentAlignment = VerticalAlignment.Center;
            tb.RenderTransform = AddT(trans, TranslateT(0.25, 0.0, ScaleT(0.005, 0.005, new TransformGroup())));
            tb.Text = kvp.Key.ToString();
            // left child in bottom-left quadrant 
            l(AddT(trans, TranslateT(0.0, 0.5, ScaleT(0.5, 0.5, new TransformGroup()))));
            // right child in bottom-right quadrant 
            r(AddT(trans, TranslateT(0.5, 0.5, ScaleT(0.5, 0.5, new TransformGroup()))));
            return null;
        }, _ => null, tree)(new TransformGroup());

    public MyWPFWindow(Tree<KeyValuePair<int, bool>> tree)
        var canvas = new Canvas();
        canvas.Background = Brushes.Blue;
        canvas.LayoutTransform=new ScaleTransform(200.0, 200.0);
        Draw(canvas, tree);
        this.Content = canvas;
        this.Title = "MyWPFWindow";
        this.SizeToContent = SizeToContent.WidthAndHeight;
    TransformGroup AddT(Transform t, TransformGroup tg) { tg.Children.Add(t); return tg; }
    TransformGroup ScaleT(double x, double y, TransformGroup tg) { tg.Children.Add(new ScaleTransform(x,y)); return tg; }
    TransformGroup TranslateT(double x, double y, TransformGroup tg) { tg.Children.Add(new TranslateTransform(x,y)); return tg; }

    static void Main(string[] args)
        var app = new Application();
        //app.Run(new MyWPFWindow(Tree.DiffTree(Tree.Tree7,Example.Change5to0(Tree.Tree7))));
        app.Run(new MyWPFWindow(Tree.DiffTree(Tree.Tree7, Example.XChange5to0(Tree.Tree7))));
Score: 11

I've been doing more reading, including 12 a Micorosft Research paper on functional programming with catamorphisms ("bananas"), and it seems 11 that catamorphism just refers to any function that takes 10 a list and typically breaks it down to a 9 single value (IEnumerable<A> => B), like Max(), Min(), and in the general 8 case, Aggregate(), would all be a catamorphisms for 7 lists.

I was previously under the impression 6 that it refefred to a way of creating a 5 function that can generalize different folds, so 4 that it can fold a tree and a list. There may 3 actually still be such a thing, some kind 2 of functor or arrow maybe but right now that's beyond 1 my level of understanding.

Score: 5

Brian's answer in the first paragraph is 33 correct. But his code example doesn't really 32 reflect how one would solve similar problems 31 in a C# style. Consider a simple class node:

class Node {
  public Node Left;
  public Node Right;
  public int value;
  public Node(int v = 0, Node left = null, Node right = null) {
    value = v;
    Left = left;
    Right = right;

With 30 this we can create a tree in main:

var Tree = 
    new Node(4,
      new Node(2, 
        new Node(1),
        new Node(3)
      new Node(6,
        new Node(5),
        new Node(7)

We define 29 a generic fold function in Node's namespace:

public static R fold<R>(
  Func<int, R, R, R> combine,
  R leaf_value,
  Node tree) {

  if (tree == null) return leaf_value;

      fold(combine, leaf_value, tree.Left),
      fold(combine, leaf_value, tree.Right)

For 28 catamorphisms we should specify the states 27 of data, Nodes can be null, or have children. The 26 generic parameters determine what we do 25 in either case. Notice the iteration strategy(in 24 this case recursion) is hidden inside the 23 fold function.

Now instead of writing:

public static int Sum_Tree(Node tree){
  if (tree == null) return 0;
  var accumulated = tree.value;
  accumulated += Sum_Tree(tree.Left);
  accumulated += Sum_Tree(tree.Right);
  return accumulated; 

We 22 can write

public static int sum_tree_fold(Node tree) {
  return Node.fold(
    (x, l, r) => x + l + r,

Elegant, simple, type checked, maintainable, etc. Easy 21 to use Console.WriteLine(Node.Sum_Tree(Tree));.

It's easy to add new functionality:

public static List<int> In_Order_fold(Node tree) {
  return Node.fold(
    (x, l, r) => {
      var tree_list = new List<int>();
      tree_list.InsertRange(0, l);
      return tree_list;
    new List<int>(),
public static int Height_fold(Node tree) {
  return Node.fold(
    (x, l, r) => 1 + Math.Max(l, r),

F# wins 20 in the conciseness category for In_Order_fold but that's 19 to be expected when the language provides 18 dedicated operators for constructing and 17 using lists.

The dramatic difference between 16 C# and F# seems to be due to F#'s use of 15 closures, to act as implicit data structures, for 14 triggering the tail call optimization. The 13 example in Brian's answer also takes in 12 to account optimizations in F#, for dodging 11 reconstructing the tree. I'm not sure C# supports 10 the tail call optimization, and maybe In_Order_fold could 9 be written better, but neither of these 8 points are relevant when discussing how 7 expressive C# is when dealing with these 6 Catamorphisms.

When translating code between 5 languages, you need to understand the core 4 idea of the technique, and then implement 3 the idea in terms of the language's primitives.

Maybe 2 now you'll be able to convince your C# co-workers 1 to take folds more seriously.

Score: 3

I understand that it's a generalization 18 of folds (i.e., mapping a structure of 17 many values to one value, including a 16 list of values to another list).

I wouldn't 15 say one value.It maps it into another structure.

Maybe 14 an example would clarify.let's say summation 13 over a list.

foldr (\x -> \y -> x + y) 0 12 [1,2,3,4,5]

Now this would reduce to 15. But 11 actually,it can be viewed mapping to a purely 10 syntactic structure 1 + 2 + 3 + 4 + 5 + 0. It 9 is just that the programming language(in 8 the above case,haskell) knows how to reduce 7 the above syntactic structure to 15.

Basically,a 6 catamorphism replaces one data constructor 5 with another one.In case of above list,

[1,2,3,4,5] = 1:2:3:4:5:[] (: is 4 the cons operator,[] is the nil element) the 3 catamorphism above replaced : with + and 2 [] with 0.

It can be generalized to any recursive 1 datatypes.

Score: 2

Brian had great series of posts in his blog. Also 11 Channel9 had a nice video. There is no LINQ syntactic 10 sugar for .Aggregate() so does it matter 9 if it has the definition of LINQ Aggregate 8 method or not? The idea is of course the 7 same. Folding over trees... First we need 6 a Node... maybe Tuple could be used, but 5 this is more clear:

public class Node<TData, TLeft, TRight>
    public TLeft Left { get; private set; }
    public TRight Right { get; private set; }
    public TData Data { get; private set; }
    public Node(TData x, TLeft l, TRight r){ Data = x; Left = l; Right = r; }

Then, in C# we can make 4 a recursive type, even this is unusual:

public class Tree<T> : Node</* data: */ T, /* left: */ Tree<T>, /* right: */ Tree<T>>
    // Normal node:
    public Tree(T data, Tree<T> left, Tree<T> right): base(data, left, right){}
    // No children:
    public Tree(T data) : base(data, null, null) { }

Now, I 3 will quote some of Brian's code, with slight 2 LINQ-style modifications:

  1. In C# Fold is called Aggregate
  2. LINQ methods are Extension methods that have the item as first parameter with "this"-keyword.
  3. Loop can be private


public static class TreeExtensions
    private static R Loop<A, R>(Func<A, R, R, Tree<A>, R> nodeF, Func<Tree<A>, R> leafV, Tree<A> t, Func<R, R> cont)
        if (t == null) return cont(leafV(t));
        return Loop(nodeF, leafV, t.Left, lacc =>
                Loop(nodeF, leafV, t.Right, racc =>
                cont(nodeF(t.Data, lacc, racc, t))));
    public static R XAggregateTree<A, R>(this Tree<A> tree, Func<A, R, R, Tree<A>, R> nodeF, Func<Tree<A>, R> leafV)
        return Loop(nodeF, leafV, tree, x => x);

    public static R Aggregate<A, R>(this Tree<A> tree, Func<A, R, R, R> nodeF, R leafV)
        return tree.XAggregateTree((x, l, r, _) => nodeF(x, l, r), _ => leafV);

Now, the usage 1 is quite C#-style:

[TestMethod] // or Console Application:
static void Main(string[] args)
    // This is our tree:
    //     4 
    //  2     6 
    // 1 3   5 7 
    var tree7 = new Tree<int>(4, new Tree<int>(2, new Tree<int>(1), new Tree<int>(3)),
                            new Tree<int>(6, new Tree<int>(5), new Tree<int>(7)));

    var sumTree = tree7.Aggregate((x, l, r) => x + l + r, 0);
    Console.WriteLine(sumTree); // 28

    var inOrder = tree7.Aggregate((x, l, r) =>
            var tmp = new List<int>(l) {x};
            return tmp;
        }, new List<int>());
    inOrder.ForEach(Console.WriteLine); // 1 2 3 4 5 6 7

    var heightTree = tree7.Aggregate((_, l, r) => 1 + (l>r?l:r), 0);
    Console.WriteLine(heightTree); // 3

I still like F# more.

More Related questions