A Baptism of Functional Thinking Pattern

Preface

All contexts below are Haskell, no loop structures, can only use recursion to do homework

1. Recursion

First feel the syntactic charm of functional language from a simple recursion problem

Finding the maximum element in an array, can be done like this:

-- if-else
maximum' xs = if null xs then error "Empty list"
  else if length xs == 1 then head xs
    else max (head xs) (maximum' (tail xs))

The maximum element in the array is the larger one between the first element and the maximum of remaining elements

Doesn't look very pretty, let's modify:

-- Pattern matching
maximum'' [] = error "Empty list"
maximum'' (x:xs) = if null xs then x
  else max x (maximum' xs)

Starting to feel it a bit, modify again:

-- Guard
maximum''' [] = error "Empty list"
maximum''' (x:xs)
  | null xs = x
  | otherwise = max x (maximum''' xs)

After switching to guard if-else disappears, seems still not clear enough, modify again:

maximum'''' [] = error "Empty list"
maximum'''' [x] = x
maximum'''' (x:xs) = max x (maximum'''' xs)

Now semantics is quite natural:

1. Exception handling: Empty List has no maximum, error
2. Boundary condition: Maximum of single-element List is that element
3. Recursive definition: Maximum of multi-element List is the larger value between first element and maximum of remaining elements

Recursive algorithm itself has strong descriptiveness, cooperating with pattern matching can make semantics clearer and more natural

2. Functional Thinking Pattern

Okay, now we're thrown into a world without loop statements, recursion is the only way out

Simple Scenarios

First consider a simple problem, how to implement length function? Input a List, output its length

length :: Foldable t => t a -> Int

According to habitual imperative thinking, traverse List and count, done. So, how to describe List's length with recursion (i.e. provide its recursive definition)?

length' [] = 0
length' (_:xs) = 1 + length' xs

Very authentic writing, semantics as follows:

1. Boundary condition: Length of empty List is 0
2. Recursive definition: Length of non-empty List is 1 plus tail length

Vaguely getting the idea, recursive algorithm is giving recursive definition of problem's solution, generally need to give an exit, i.e. boundary condition

Some problems are recursively defined themselves, such as Fibonacci sequence:

fib 0 = 0
fib 1 = 1
fib n
  | n < 0 = error "Negative n is invalid"
  | otherwise = fib (n - 1) + fib (n - 2)

While other problems' recursive definitions are not so clear, such as above length function, seems counting one by one更符合 thinking habit, at this point need to give its recursive definition, then get recursive algorithm

P.S. There's also dead recursion without exit, such as repeat function:

repeat' :: t -> [t]
repeat' x = x : (repeat' x)

If support limiting length, it's replicate function:

replicate' times x
  | times <= 0 = []
  | otherwise = x : (replicate' (times - 1) x)

Not Very Simple Scenarios

List operations often use take function to take first n items of List:

take :: Int -> [a] -> [a]

In imperative style, naturally traverse and collect first n items. But we're now in functional context, no data structure can serve as container, nowhere to put. So result is naturally a new List pieced together through operations:

take'' n xs
  | n <= 0 || null xs = []
  | otherwise = (head xs) : (take'' (n - 1) (tail xs))

Semantics is still clear:

1. Boundary condition: take result of first 0 items or empty List is empty List
2. Recursive definition: First n items is first item consed with first n-1 items of tail

But writing is still not authentic enough:

take' _ [] = []
take' n _
  | n <= 0 = []
take' n (x:xs) = x : take' (n - 1) xs

Next try an interesting scenario, implement reverse function:

reverse :: [a] -> [a]

This time requires us to create a "subversive" List, pure data structure operation:

reverse' xs
  | length xs <= 1 = xs
  | otherwise = last xs : (reverse' (init xs))

Above function says, List reversal result is last element consed with reversal result of remaining elements

Another data structure operation, such as zip:

zip :: [a] -> [b] -> [(a, b)]

Combine two Lists into tuple List, longer part needs to be trimmed:

zip' [] _ = []
zip' _ [] = []
zip' (x:xs) (y:ys) = (x, y) : (zip' xs ys)

It says, combination result of two Lists is tuple composed of respective first elements consed with combination result of remaining parts

Playing with data structure recursion seems not very interesting, try a pure logic one, such as elem function:

elem :: (Foldable t, Eq a) => a -> t a -> Bool

How to judge if List contains specified element? Compare one by one:

elem'' _ [] = False
elem'' n (x:xs) = x == n || (elem'' n xs)

First empty List definitely doesn't have, for non-empty List, check first element, if not match then check remaining elements. Sounds not like recursion, reorganize language: containment of non-empty List is containment of first element or containment of remaining elements

Reorganize code too, more authentic version:

elem''' _ [] = False
elem''' n (x:xs)
  | x == n = True
  | otherwise = elem'' n xs

Slightly Complex Scenarios

This time facing first small checkpoint, how to swap two elements in List?

swap :: Int -> Int -> [a] -> [a]

Try our familiar "routine":

t = a
a = b
b = t

This seems not work in functional environment, so is there any other way? Of course:

swap _ _ [] = []
swap i j xs
  | i == j = xs
  | i > j = swap j i xs
  | otherwise  = (take i xs) ++ [xs !! j] ++ (drop (i + 1) (take j xs)) ++ [xs !! i] ++ (drop (j + 1) xs)

Above function says, a line is divided into 3 segments by 2 points, result of swapping two elements in List is first segment consed with second point, consed with middle segment, then consed with first point and last segment

Of course, this problem has nothing to do with recursion, so saying recursion is only choice is only for scenarios needing traversal, not all problems need to be nailed with this hammer

Note, among them drop a (take b xs) is used to take elements with index between (a, b] in List, more authentic writing is:

sublist _ _ [] = []
sublist a b xs
  | a > b = []
  | a == b = [xs !! a]
  | otherwise = (drop a . take (b + 1)) xs

So modified swap looks like this:

swap _ _ [] = []
swap i j xs
  | i == j = xs
  | i > j = swap j i xs
  | otherwise  = (take i xs) ++ [xs !! j] ++ (sublist (i + 1) (j - 1) xs) ++ [xs !! i] ++ (drop (j + 1) xs)
  where sublist _ _ [] = []
        sublist a b xs
          | a > b = []
          | a == b = [xs !! a]
          | otherwise = (drop a . take (b + 1)) xs

If pursuing stronger expressiveness, can also change remaining head and tail 2 segments to sublist

P.S. Functional language pursues small and beautiful combination, so no slice and such things, because can be combined through drop, take, specially providing one seems redundant

Complex Scenarios

Know quick sort, write one

Quick sort's core thinking is:

Men stand on left, women stand on right, teacher stands in middle

Okay, start arranging seats. Mathematical description as follows:

Input set: A = {x | x <- a}
Partition rules:
  Left = {x | x <- a, x < pivot}
  P = {pivot}
  Right = {x | x<- a, x >= pivot}
Recursive definition: sort(A) = sort(Left) U P U sort(Right)

Among them pivot is axis (generally choose first element, if pursuing stability, randomly choose one), i.e. "teacher" mentioned above, elements on left are all smaller than him, elements on right are all larger (or equal)

Goal of each quick sort pass is to find pivot's final position, this purpose is similar to bubble/selection sort: select one maximum/minimum element each pass. Problem-solving thinking is similar to merge sort: reduce problem scale through set partition, merge sort first partially ordered then make whole ordered, while quick sort is equivalent to first whole ordered, then make each part ordered

What we're familiar with classic quick sort algorithm is two pointers moving alternately, approaching pivot's final position from left and right ends, position where pointers coincide is partition point to find in this pass, divide set into two, then sort separately

Okay, then, hurry up get two pointers, start moving

Wait a minute, we're in functional world now, set operations are simple enough, what pointers needed

Look at quick sort's recursive definition again, isn't it just wanting to know which elements are greater than pivot, which elements are less than pivot, easy to say:

left (x:xs) = [a | a <- xs, a < x]
right (x:xs) = [a | a <- xs, a >= x]

Two lines of List Comprehension done. So quick sort implementation becomes very elegant:

quickSort' [] = []
quickSort' (x:xs) = (quickSort' left) ++ [x] ++ (quickSort' right)
  where left = [a | a <- xs, a < x]
        right = [a | a <- xs, a >= x]

Then, if insist on using two pointers classic way to implement? Of course can:

-- 2 pointers, common strategy for finding p's final position
-- p L ... H
quickSort'' [] = []
quickSort'' list @(x:xs) = (quickSort'' left) ++ [x] ++ (quickSort'' right)
  where assign i v xs = (take i xs) ++ [v] ++ (drop (i + 1) xs)
        pass p l h ltr xs
          -- Assign *l to p, partition ends
          | l >= h = l : (assign l p xs)
          -- l moves right
          | ltr == True = if low < p then pass p (l + 1) h True xs
              else pass p l (h - 1) False (assign h low xs)
          -- r moves left
          | ltr == False = if high > p then pass p l (h - 1) False xs
              else pass p (l + 1 ) h True (assign l high xs)
          where low = xs !! l
                high = xs !! h
        nextList = pass x 0 (length list - 1) False list
        (left, right) = let (x:xs) = nextList in ((take x xs), (drop (x + 1) xs))

assign used inside is quite interesting:

assign i v xs = (take i xs) ++ [v] ++ (drop (i + 1) xs)

Data is immutable, so again complete simple operation in this "novel" way: 1 point divides a line into 2 segments, dig out this point replace with other value, then cons left and right segments back up

Okay, now perverted desire is satisfied: we proved a feasible algorithm has nothing to do with functional/imperative environment, although wrote many more lines. Re-examine above differences between these two thinking patterns:

Imperative: Let me tell you, get two pointers, approach from left and right ends respectively, doing this can find partition point, then sort partitioned subsets separately

Functional: Sorting is dividing set from axis into two, then sort left and right sides separately. Among them, left all less than axis, right all greater than (equal to) axis

From problem description perspective, functional thinking focuses more on definition of problem's solution, while imperative focuses more on how to clearly explain every detailed step. Thinking pattern difference is roughly: former first abstract then concrete, latter first concrete then abstract

Of course, imperative language can also go from abstract to concrete (first output algorithm skeleton, then fill in), so saying it's just thinking pattern difference. Like interference brought by "main title, subtitle, list items" in PowerPoint template, convenient and efficient loop structures largely affected our thinking pattern:

This problem, feels need to traverse, let's write a loop first, then see what to do in loop body

Problem-solving method is equally important as specific steps, functional expressiveness lies in accurate description of method. Sounds very familiar, yes, when I was in 6th grade, math teacher said:

This problem is very easy to solve with system of linear equations in two variables, not easy to think with arithmetic method

From abstract to concrete, is a forward thinking process, wanting to reach concrete in one step, is much harder

P.S. How about, a JS quick sort?:

function quickSort([x, ...xs]) {
  return !arguments[0].length ?
    [] :
    quickSort(xs.filter(a => a < x)).concat([x]).concat(quickSort(xs.filter(a => a >= x)))
}

Why does !arguments[0].length look ugly, isn't arrow function good?

Not good, because JS has no function overloading/pattern matching, also no xxs @(x:xs) and such ways to preserve original reference, only resort to this. And arrow function cannot access arguments object, specifically see [4. About arguments object](/articles/箭头函数-es6 笔记 6/#articleHeader8), if must use, currently only uglier:

const quickSort = xxs => {
  if (!xxs.length) return []
  let [x, ...xs] = xxs
  return quickSort(xs.filter(a => a < x)).concat([x]).concat(quickSort(xs.filter(a => a >= x)))
}

3. Ceremony Complete

Okay, next time when talking about two pointers moving around, understand functional thinking a bit? Is really finding partition point that important?