Python - Recursive Algorithms

QTM 350: Data Science Computing

Davi Moreira

Introduction


This topic material is based on Professor Mike Gelbart Algorithms and Data Structures course. It was adapted for our purposes.

Outline:

  • Recursion intro: random tree graphics
  • Recursion intro: the details
  • Binary search, revisited
  • Multiple recursive calls

Learning objectives

  • Explain what recursion is, and why both the base case(s) and recursive step(s) are essential.
  • Recognize problems for which a recursive solution is appropriate.
  • Write a recursive solution for a simple problem.
  • Compare and contrast recursion and iteration.
In [2]:
import numpy as np
import matplotlib.pyplot as plt

Recursion intro: random tree graphics

In [12]:
def move_pen(x, y, angle, length):
    """
    Moves a point from initial location of x and y according to
    angle and length specified.

    Parameters
    ----------
    x : float
        initial location of pen on the x axis
    y : float
        initial location of pen on the y axis
    angle : float
        direction from initial location (0 means up)
    length: distance from the initial location desired to move

    Returns
    -------
    new_x : float
        new location on the x axis 
    new_y : float
        new location on the y axis 

    Examples
    --------
    >>> move_pen(0, 0, 90,2)
    (2.0, 0)
    >>> move_pen(0, 0, 180,2)
    (0, -2.0)

    """
    new_x = x + length * np.sin(angle * np.pi / 180.0)
    new_y = y + length * np.cos(angle * np.pi / 180.0)
    return new_x, new_y

def branch(x, y, angle=0, length=1, marker="."):
    """
    Plot a shape at the end of a tree branch.

    Parameters
    ----------
    x : float
        initial location on the x axis
    y : float
        initial location on the y axis
    angle : float, optional (default = 0)
        not used, needed to conform to the same interface as vee
    length: float, optional (default = 1)
        not used, needed to conform to the same interface as vee
    marker : str, optional (default = ".")
        the shape at the end of the branch
    """
    plt.plot(x, y, marker=marker, markersize="15",
             markerfacecolor="None", markeredgecolor="k")

def square(*args):
    return branch(*args, marker="s")

def hexagon(*args):
    return branch(*args, marker="h")

def star(*args, length=1):
    return branch(*args, marker="*")

def vee(x=0, y=0, angle=0, length=1):
    """
    Plots a tree with random shaped nodes given
    a trunk location, branch angle and branch length.

    Parameters
    ----------
    x : float, optional (default = 0)
        location of pen on the x axis
    y : float, optional (default = 5)
        location of pen on the y axis
    angle : float, optional (default = 0)
        direction from location desired to move in degrees 
    length: float,  (default = 1)
        distance from the location desired to move
    """
#     plt.figure(figsize=(4,3))
    
    # possible shapes
    shapes = [square, hexagon, star, vee, vee]

    # branch left
    shape = np.random.choice(shapes)
    new_angle = angle - 20/np.pi
    new_x, new_y = move_pen(x, y, new_angle, length)
    plt.plot([x, new_x], [y, new_y], 'k')
    shape(new_x, new_y, new_angle, length)

    # branch right
    shape = np.random.choice(shapes)
    new_angle = angle + 20/np.pi
    new_x, new_y = move_pen(x, y, new_angle, length)
    plt.plot([x, new_x], [y, new_y], 'k')
    shape(new_x, new_y, new_angle, length)

    plt.xticks(())
    plt.yticks(())
    plt.xlim((-2, 2))
    plt.ylim((0, 10))
#     plt.draw()
#     plt.pause(0.05)

np.random.seed(27)

It defines a small visualization framework for generating tree-like structures with different shapes at the branch ends, using matplotlib for plotting. Here's a breakdown of each function and its role in the framework:

Function move_pen(x, y, angle, length)

This function calculates a new point (x, y) given an initial point, a direction in degrees (angle), and a distance (length). The angle is adjusted from a Cartesian coordinate where 0 degrees is the positive x-axis, to one where 0 degrees points upwards by using sine for the x-coordinate and cosine for the y-coordinate. The function converts the angle from degrees to radians before performing these calculations.

Parameters:

  • x and y: Initial coordinates.
  • angle: Direction to move from the initial point, in degrees.
  • length: Distance to move from the initial point.

Returns:

  • new_x and new_y: New coordinates after moving.

Example Usage:

  • move_pen(0, 0, 90, 2) returns (2.0, 0), moving 2 units to the right from the origin.
  • move_pen(0, 0, 180, 2) returns (0, -2.0), moving 2 units downward from the origin.

Function branch(x, y, angle=0, length=1, marker=".") This is a generic plotting function used to place a marker at a specified coordinate. The function is designed to be flexible, allowing different markers which could be used to represent different shapes or features at the points of a tree structure.

Parameters:

  • x and y: Coordinates to place the marker.
  • marker: Shape of the marker (default is a point).

Functions square, hexagon, star

These functions are specific implementations of branch for different markers:

  • square: Uses a square marker.
  • hexagon: Uses a hexagon marker.
  • star: Uses a star marker.

Function vee(x=0, y=5, angle=0, length=1)

This function simulates branching in a tree. It starts at a given location, branches into two directions (left and right by adjusting the angle), and uses the move_pen function to determine the new endpoints for these branches. Each endpoint then recursively generates further branches using random shapes chosen from a list that includes square, hexagon, star, and vee.

Parameters:

  • x and y: Starting coordinates for the branching.
  • angle: Starting direction for branching.
  • length: Length of each branch.

The function randomly selects different shapes to be placed at the end of each branch, creating a varied tree-like structure. This randomness, along with the angle adjustments, helps simulate the natural variation in tree shapes.

Fundamental concept of recursion: A function that calls itself.

How each function contributes to the recursive drawing?

  • move_pen Function: It calculates new positions based on angles and distances. It simulates the 'recursive step' in moving to a new position in the tree.

  • Shape Functions (square, hexagon, star, vee): These functions place different markers at the end points, which can represent different recursive calls or branching points in a tree.

  • vee Function: This is the main recursive function. It makes two recursive calls to itself, one for the left branch and one for the right branch, each time with modified angles and positions.

In [19]:
vee()
No description has been provided for this image

Recursion intro: the details

  • We are going to take a very step-by-step approach to recursion.
  • It might be boring in the middle, but it's going somewhere.

Step 1: functions with outputs but no inputs

Consider this function:

In [20]:
def h():
    return 0

We can call it:

In [21]:
h()
Out[21]:
0

Now let's add a function g that calls h:

In [22]:
def g():
    return h() + 1
In [27]:
g()
Out[27]:
1
  • What's happening here? g is calling h, which returns $0$.
  • It then adds this result, $0$, to $1$ and gets $1$.
  • So it returns $1$.

Now let's add a function f that calls g:

In [23]:
def f():
    return g() + 1
In [24]:
f()
Out[24]:
2
  • What's happening here? f is calling g.
  • We already know g returns $1$, so no need to go through that again in ours heads (but Python does).
  • So f adds $1+1$, and returns $2$.

What's the idea here? That functions can call each other, and we can methodically "unpack" what happens when they call each other. And furthermore, if we already know what g does, we can forget about the fact that it calls h.

Step 2: functions with inputs but no outputs

Now let's make it more interesting by letting the functions take arguments. Consider these functions:

In [25]:
def h(n):
    print("You called h with argument", n)
    print("Ending h with argument", n)
In [26]:
h(3)
You called h with argument 3
Ending h with argument 3
In [27]:
h(2)
You called h with argument 2
Ending h with argument 2
In [28]:
def g(n):
    print("You called g with argument", n)
    h(n-1)  
    print("Ending g with argument", n)
In [29]:
g(1)
You called g with argument 1
You called h with argument 0
Ending h with argument 0
Ending g with argument 1
In [30]:
g(2)
You called g with argument 2
You called h with argument 1
Ending h with argument 1
Ending g with argument 2
In [31]:
g(3)
You called g with argument 3
You called h with argument 2
Ending h with argument 2
Ending g with argument 3

Ok, now let's add another function, f.

In [32]:
def f(n):
    print("You called f with argument", n)
    g(n-1)  
    print("Ending f with argument", n)
In [33]:
f(5)
You called f with argument 5
You called g with argument 4
You called h with argument 3
Ending h with argument 3
Ending g with argument 4
Ending f with argument 5
In [34]:
f(10)
You called f with argument 10
You called g with argument 9
You called h with argument 8
Ending h with argument 8
Ending g with argument 9
Ending f with argument 10

FYI (optional): what we're seeing above is a depiction of the functions' call stack.

Step 3: functions with inputs and outputs

Above, each function is passing data into the next function.

Below, we try passing data out of each function as well.

In [35]:
def h(n):
    print("Starting h with argument", n)
    value = 0
    print("Ending h with value", value)
    return value
In [36]:
h(5)
Starting h with argument 5
Ending h with value 0
Out[36]:
0

h is a weird function that ignores its input and just returns 0.

Below, g calls h will a smaller value of n, and then adds 1 to the result.

In [37]:
def g(n):
    print("Starting g with argument", n)
    value = h(n-1) + 1
    print("Ending g with value", value)
    return value
In [38]:
g(10)
Starting g with argument 10
Starting h with argument 9
Ending h with value 0
Ending g with value 1
Out[38]:
1

Let's look at the order of the print statements:

  1. We start g
  2. We start h
  3. We end h
  4. We end g

This is critical to understanding what happens next.

In [39]:
def f(n):
    print("Starting f with argument", n)
    value = g(n-1) + 1
    print("Ending f with value", value)
    return value
In [40]:
result = f(11)
result
Starting f with argument 11
Starting g with argument 10
Starting h with argument 9
Ending h with value 0
Ending g with value 1
Ending f with value 2
Out[40]:
2

Let's remove some of the print statements to make things more compact:

In [41]:
def f(n):
    print("Starting f with argument", n)
    return g(n-1) + 1


def g(n):
    print("Starting g with argument", n)
    return h(n-1) + 1


def h(n):
    print("Starting h with argument", n)
    return 0
In [42]:
f(11)
Starting f with argument 11
Starting g with argument 10
Starting h with argument 9
Out[42]:
2

So far, this isn't interesting because we're not using n for anything. We get the same result, $2$, for any value of n:

In [43]:
f(1000)
Starting f with argument 1000
Starting g with argument 999
Starting h with argument 998
Out[43]:
2

Now look at this code (we're getting close to the punch line):

In [44]:
def f(n):
    print("Starting f with argument", n)
    if n == 0:
        return 0
    else:
        return g(n-1) + 1


def g(n):
    print("Starting g with argument", n)
    if n == 0:
        return 0
    else:
        return h(n-1) + 1


def h(n):
    print("Starting h with argument", n)
    if n == 0:
        return 0
    else:
        return i(n-1) + 1

# ...
# ...
In [45]:
f(1)
Starting f with argument 1
Starting g with argument 0
Out[45]:
1
In [46]:
g(1)
Starting g with argument 1
Starting h with argument 0
Out[46]:
1

This code actually uses n. Here's what it f outputs

n f(n)
0 0
1 1
2 2
3+ error
In [47]:
f(0)
Starting f with argument 0
Out[47]:
0
In [48]:
f(1)
Starting f with argument 1
Starting g with argument 0
Out[48]:
1
In [49]:
f(2)
Starting f with argument 2
Starting g with argument 1
Starting h with argument 0
Out[49]:
2
In [50]:
f(3)
Starting f with argument 3
Starting g with argument 2
Starting h with argument 1
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[50], line 1
----> 1 f(3)

Cell In[44], line 6, in f(n)
      4     return 0
      5 else:
----> 6     return g(n-1) + 1

Cell In[44], line 14, in g(n)
     12     return 0
     13 else:
---> 14     return h(n-1) + 1

Cell In[44], line 22, in h(n)
     20     return 0
     21 else:
---> 22     return i(n-1) + 1

NameError: name 'i' is not defined

And now the key insight: all of the functions above are exactly the same. They check if n is zero. If so, they return zero. Otherwise, they call a function identical to themselves with argument n-1, add 1 to the result, and return it.

So why write

In [51]:
def f(n):
    if n == 0:
        return 0
    else:
        return g(n-1) + 1

# def g(n):
#     ...
#
# def h(n):
#     ...

a million times when you can just write

In [52]:
def f(n):
    if n == 0: # base case
        return 0
    else:
        return f(n-1) + 1
In [53]:
f(1)
Out[53]:
1
In [54]:
f(2)
Out[54]:
2
In [55]:
f(3)
Out[55]:
3

BAM! Recursion.

  • It is just incidental that all the functions have the same name, f.
  • It may be helpful for you to think of them as separate.
  • They do live in separate universes, with separate variables, like n and all other variables.
  • The fact that they are all named f is just a matter of convenience, to avoid writing an infinite amount of code.

Let's put our print string back in...

In [57]:
def f(n):
    print("Starting f with argument", n)
    if n == 0:
        print("Ending f with n =", n, "and returning 0 (base case)")
        return 0
    else:
        value = f(n-1) + 1
        print("Ending f with n =", n, "and returning", value)
        return value
In [60]:
f(2)
Starting f with argument 2
Starting f with argument 1
Starting f with argument 0
Ending f with n = 0 and returning 0 (base case)
Ending f with n = 1 and returning 1
Ending f with n = 2 and returning 2
Out[60]:
2
  • Does this look familiar? It's the same "call stack" we saw before with our f, g, and h. Except now all the functions are named f.
  • The if n == 0 part of the code is called the base case. This is like the function h earlier. It is the function that does not call itself, and it's what causes the recursion to terminate.
  • The call to itself, in this case f(n-1) is the recursive case.
In [61]:
f(5)
Starting f with argument 5
Starting f with argument 4
Starting f with argument 3
Starting f with argument 2
Starting f with argument 1
Starting f with argument 0
Ending f with n = 0 and returning 0 (base case)
Ending f with n = 1 and returning 1
Ending f with n = 2 and returning 2
Ending f with n = 3 and returning 3
Ending f with n = 4 and returning 4
Ending f with n = 5 and returning 5
Out[61]:
5

Binary search, revisited

  • Here is a case where recursion works beautifully.
In [62]:
def binary_search(data, key):
    """
    Searches the key in data using binary search. 
    Returns True if found and False otherwise. 

    Parameters
    ----------
    data : list
        the elements to search within
    key : int
        the key to search for

    Returns
    -------
    bool
        is the key contained in the data?

    Examples
    --------
    >>> binary_search([1, 7, 35, 45, 67], 3)
    False
    >>> binary_search([1, 7, 35, 45, 67], 7)
    True
    """
    if len(data) == 1:
        return data[0] == key

    mid = len(data)//2
    if key < data[mid]:
        return binary_search(data[:mid], key)
    else:
        return binary_search(data[mid:], key)

The function binary_search is designed to determine if a specific key is present within a sorted list called data. The binary_search function is an elegant demonstration of how recursive thinking can be applied to efficiently solve problems by dividing them into smaller, more manageable tasks. Here’s a breakdown:

  • Parameters:

    • data: A sorted list of elements within which the function searches for the key.
    • key: The value to search for within the list.
  • Returns:

    • A boolean (True or False), indicating whether the key is found in the list.
  1. Base Case: The function first checks if the list contains only one element. If so, it compares that element to the key. If they are equal, it returns True; otherwise, it returns False. This base case is essential for terminating the recursion.

  2. Finding the Middle: The function calculates the midpoint of the list using integer division (len(data)//2). This midpoint divides the list into two halves.

  3. Recursive Search:

    • If the key is less than the element at the midpoint (data[mid]), the function recursively calls itself with the first half of the list (from the beginning to mid-1).
    • If the key is greater than or equal to the element at the midpoint, the function recurses on the second half of the list (from mid to the end).

Important Considerations:

  • Sorted List Requirement: Binary search requires that the input list be sorted. If data is not sorted, the function will not perform correctly.

  • Recursive Calls: Each recursive call effectively reduces the size of the search space by half, leading to a logarithmic time complexity of O(log n), where n is the number of elements in the list.

In [63]:
data = [-12, 4, 7, 9, 45, 987, 987, 1000, 2000]

# Test cases for binary search
# key is the first element in the list
assert binary_search(data, -12)

# key is the last element in the list
assert binary_search(data, 2000)

# key is the middle element in the list
assert binary_search(data, 45)

# key occurs multiple times in the list
assert binary_search(data, 987)

# key is larger than the largest element in the list
assert not binary_search(data, 3000)

# key is smaller than the smallest element in the list
assert not binary_search(data, -18)
  • So elegant!

Multiple recursive calls

Consider the function below.

In [64]:
def f(n):
    if n == 1 or n == 2:
        return 1
    else: 
        return f(n-1) + f(n-2)

What is returned by f(3)? What is returned by f(4)?

In [65]:
f(3)
Out[65]:
2
In [66]:
f(4)
Out[66]:
3
In [67]:
def f(n):
    print("Starting f with argument", n)
    if n == 1 or n == 2:
        return 1
    else:
        return f(n-1) + f(n-2)
In [73]:
f(5)
Starting f with argument 5
Starting f with argument 4
Starting f with argument 3
Starting f with argument 2
Starting f with argument 1
Starting f with argument 2
Starting f with argument 3
Starting f with argument 2
Starting f with argument 1
Out[73]:
5
f(5)
├── f(4)
│   ├── f(3)
│   │   ├── f(2) → returns 1
│   │   └── f(1) → returns 1
│   └── f(2) → returns 1
└── f(3)
    ├── f(2) → returns 1
    └── f(1) → returns 1

Note this is equivalent to:

In [75]:
def f(n):
    print("Starting f with argument", n)
    if n == 1 or n == 2:
        return 1
    val1 = f(n-1)
    val2 = f(n-2)
    return val1 + val2

f(4)
Starting f with argument 4
Starting f with argument 3
Starting f with argument 2
Starting f with argument 1
Starting f with argument 2
Out[75]:
3

In other words, there's no significance to the two recursive calls being on the same line of code.

Let's draw out the execution tree.

  • First, f(4) is called.
  • The first thing it does is call f(n-1) = f(3)
            f(4)
           /     
        f(3)  
  • The first thing f(3) does is call f(n-1)=f(2).
            f(4)
           /    
        f(3)  
        /   
     f(2)  
  • f(2) triggers the base case and immediately returns $1$.
            f(4)
           /    
        f(3)  
        /1    
     f(2)=1  
  • The return value $1$ is passed back up to f(3).
  • f(3) is trying to call f(n-1) + f(n-2), and the first of those has just returned with value $1$.
  • So it now proceeds to call the second one, namely f(n-2) or f(1).
            f(4)
           /      
         f(3)  
        /1   \
     f(2)=1  f(1)
  • f(1) triggers the base case and immediately returns $1$.
  • The return value $1$ is passed back up to f(3).
            f(4)
           /     
         f(3)
        /1   \1
     f(2)=1  f(1)=1         
  • f(3) is now done, because it has computed $f(n-1) + f(n-2) = 1 + 1 = 2$
            f(4)
           /2      
         f(3)=2 
        /1   \1
     f(2)=1  f(1)=1         
  • It passes this value, $2$ up to its caller, namely f(4).
  • f(4) is trying to call f(n-1) + f(n-2), and the first of those has just returned with value $2$.
  • So it now proceeds to call the second one, namely f(n-2) or f(2).
             f(4)
           /2    \ 
        f(3)=2  f(2)
        /1   \1
     f(2)=1  f(1)=1         
  • f(2) triggers the base case and immediately returns $1$.
             f(4)
           /2    \1 
        f(3)=2  f(2)=1
        /1   \1
     f(2)=1  f(1)=1         
  • f(4) is now done, because it has computed $f(n-1) + f(n-2) = 2 + 1 = 3$
            f(4)=3
           /2    \1
        f(3)=2  f(2)=1
        /1   \1
     f(2)=1  f(1)=1         
  • The function above implements the Fibonacci sequence: $F_n=F_{n-1}+F_{n-2}$.
  • In this case we need 2 base cases to get things started, because $F_n$ depends on the previous 2 steps.
  • Recursive functions don't need to be sequences, as we'll see.
  • The time complexity here is something like $O(2^n)$
    • (Slightly less than $2^n$ steps but close enough.)
    • It is generally true, rougly speaking, that if a recursive function calls itself $k$ times and the "depth" of the recursion is $n$, then the time complexity is $k^n$. More on this next class.
  • This is not a good implementation for the Fibonacci numbers!

Revisiting the binary search code above:

def binary_search(data, key):
    if len(data) == 1:
        return data[0] == key   

    mid = len(data)//2
    if key < data[mid]:
        return binary_search(data[:mid], key)
    else:
        return binary_search(data[mid:], key)

Question: Is this function making multiple recursive calls and thus running in $O(2^n)$ time?







Answer: <!---No! Because of the if/else, it's only calling itself once.--->

  • Does that mean it runs in $O(n)$ time?
In [ ]:
!jupyter nbconvert _03-py-recursive-algorithms.ipynb --to html --template classic --output 03-py-recursive-algorithms.html

Thank you!