Can't figure out how the return command works

Hi,
I’m trying to figure out how the return command works. I understand that once a solution is found, the return command will stored the solution in the caller function and terminate the loop.

But this is not the case. The mergesort function blow is supposed to continuously split a list till there is only one element remaining in the list. The if statement says once the list length less than 2 (i.e. a list of one element), return the list and store the value in the caller left=mergesort(L[:middle]).
When I run a test with L=[1,3,4,2,5,6,7,8]. The printout is as below. I don’t understand the last two lines : “left is None”. Does that mean after the list is reduced to one element, the if…else statement still continues? Why the if…else statement terminates after two runs?

def mergesort(L):
 """Returns a new sorted list with the same elements as L"""
 print(L)
 if len(L) < 2:
  return L[:]
  
 else:
  middle = int(len(L) / 2)
  left = mergesort(L[:middle])
   #right = mergesort(L[middle:])
print('left is ',left)
  #together = merge(left,right)
  #print('merged', together)
  #return together
output from test
L=[1,3,4,2,5,6,7,8]
mergesort(L)

result pintout:
[1, 3, 4, 2, 5, 6, 7, 8]
[1, 3, 4, 2]
[1, 3]
[1]
left is  [1]
left is  None
left is  None

I’m trying to figure out how the return command works. I understand
that once a solution is found, the return command will stored the
solution in the caller function and terminate the loop.

Well, it leaves the function containing the return statement.

But this is not the case. The mergesort function blow is supposed to continuously split a list till there is only one element remaining in the list. The if statement says once the list length less than 2 (i.e. a list of one element), return the list

Technically, L[:] is a copy of the list, not the original.

and store the value in the caller left=mergesort(L[:middle]).
When I run a test with L=[1,3,4,2,5,6,7,8]. The printout is as below. I don’t understand the last two lines : “left is None”. Does that mean after the list is reduced to one element, the if…else statement still continues? Why the if…else statement terminates after two runs?

Look at the code:

 def mergesort(L):
  """Returns a new sorted list with the same elements as L"""
  print(L)
  if len(L) < 2:
   return L[:]
  else:
   middle = int(len(L) / 2)
   left = mergesort(L[:middle])
    #right = mergesort(L[middle:])
 print('left is ',left)
   #together = merge(left,right)
   #print('merged', together)
   #return together

The print('left is ',left) statement is indented at the same level as
the def keyword at the top of the function. That means it is no long
inside the function, and also that the function definition stops.

Indentation is critical in Python.

The lines you commented out:

   #together = merge(left,right)
   #print('merged', together)
   #return together

are not longer part of the function, and are anyway commented out so
they do not do anything. As a result, the function definition is
effectively just this:

 def mergesort(L):
  """Returns a new sorted list with the same elements as L"""
  print(L)
  if len(L) < 2:
   return L[:]
  else:
   middle = int(len(L) / 2)
   left = mergesort(L[:middle])

So, what happens when len(L) >= 2? The programme takes the else
branch and hits the end of the function. At that point it leaves the
function, via an implicit return statement.

Now, all Python functions always return a value, and if you do a “bare”
return it is the same as return None. This is where the None in
your results comes from.

There used to be a return together statement in that branch which
would have returned the merged halves of the list.

Cheers,
Cameron Simpson cs@cskk.id.au

Thanks a lot for the help. When len(L) >2, the program will call mergesort() function again, but this time it run the function with a new list, which is half length of the original L list (see middle variable).

This time the new list is assigned to L list and go through the if…else statement again. If the new L list has only one element, it hits the return condition. In other words, a list will always hit the if statement and ends with return command.

I have added indentation to the print command so that it’s at the same level as if…else. I still get the same printout. That means one thing to me: 1) after the if…else statement is complete, it will continue to execute the print command. However, I don’t understand why the print('left is ',left) is run three times? Since the if…else statement stops at the return command, and value is stored in the caller: left = mergesort(L[:middle]), the left should not be None.

def mergesort(L):
 """Returns a new sorted list with the same elements as L"""
 print(L)
 if len(L) < 2:
  return L[:]
  
 else:
  middle = int(len(L) / 2)
  left = mergesort(L[:middle])
   
 print('left is ',left)

When trying to comprehend recursion, I strongly recommend printing out lots and lots of things at every level, and knowing what level of function call you’re in. Or alternatively, use a tool like this to see what’s going on:

https://pythontutor.com/python-debugger.html#mode=edit

Thanks a lot for the help. When len(L) >2, the program will call
mergesort() function again, but this time it run the function with a
new list, which is half length of the original L list (see middle
variable).

Yes, but the code you cited had that logic commented out i.e. absent.

This time the new list is assigned to L list and go through the if…else statement again. If the new L list has only one element, it hits the return condition. In other words, a list will always hit the if statement and ends with return command.

Correct.

I have added indentation to the print command so that it’s at the same
level as if…else. I still get the same printout. That means one thing
to me: 1) after the if…else statement is complete, it will continue
to execute the print command. However, I don’t understand why the
print('left is ',left) is run three times? Since the if…else
statement stops at the return command, and value is stored in the
caller: left = mergesort(L[:middle]), the left should not be None.

Ok, there’s 2 things going on here. Let’s look at your current
closer-to-working code:

 def mergesort(L):
  """Returns a new sorted list with the same elements as L"""
  print(L)
  if len(L) < 2:
   return L[:]
  else:
   middle = int(len(L) / 2)
   left = mergesort(L[:middle])
  print('left is ',left)

It occurred to me while making coffee this morning that I’d been less
than clear about the implied return in Python functions. So first I
want to try that again:

At the end of any Python function is an implicit return None
statement. If code execution falls off the end of a function, that
implied return happens. So effectively your code has this
implementation:

 def mergesort(L):
  """Returns a new sorted list with the same elements as L"""
  print(L)
  if len(L) < 2:
   return L[:]
  else:
   middle = int(len(L) / 2)
   left = mergesort(L[:middle])
  print('left is ',left)
  return None

And let’s restructure this slightly to make things even more obvious.
Since execution leaves the function early when len(L) < 2, you do not
need the else, instead you can write this:

 def mergesort(L):
  """Returns a new sorted list with the same elements as L"""
  print(L)
  if len(L) < 2:
   return L[:]
  middle = int(len(L) / 2)
  left = mergesort(L[:middle])
  print('left is ',left)
  return None

which does exactly the same thing.

Do you see now why this function will return None when len(L) >= 2?

The flow above is why you’re getting a None result for longer lists,
this is thing 1 of the 2 things I alluded to earlier.

The other thing is that a merge sort is normally a recursive function:

  • for a very short list (0 or 1 elements), return the list immediately -
    it cannot be “unsorted” with so few elements.
  • for other longer lists, divide the list into 2 smaller pieces, sort
    each piece, then merge the two sorted lists together into a single
    sorted list and return that list

The reason for that approach is that it is pretty easy to merge two
lists which you know in advance are each sorted to make a new sorted
list: just take the smallest element from the start of each list until
there are no more elements.

This algorithm is called a recursive algorithm because the “sort_ each
piece” step above is done by calling the mergesort function again for
each piece. So the function calls itself with the sublists. Eventually
(because each sublist is always smaller) the sublist must have 0 or 1
elements and the “trivial case” return can happen.

Most recursive algorithms function by taking a large problem such as
“sort this arbitrarily large list” and dividing it into smaller versions
of the same problem until the smaller version is small enough to do
trivially.

Because of this, with a larger list, mergesort is called several
times, and therefore your print() statement is run several times
because it is part the function. When the function returns None (via
the implicit return None statement), your print() statement will
print:

 left is None

Let’s fix up your implementation of the second (recursive) step above:
“divide the list into 2 smaller pieces, sort each piece, then merge the
two sorted lists together into a single sorted list and return that
list”
:

 # divide the list into 2 smaller pieces
 middle = int(len(L) / 2)
 # sort the left piece
 left = mergesort(L[:middle])
 # sort the right piece
 right = mergesort(L[middle:])
 # merge the 2 pieces
 merged = list(merge(left, right))
 # return the merged list
 return merged

You can get the merge() function from the heapq module with this
statement:

 from heapq import merge

at the top of your script. Alternatively, if this is a class assignment
you might be expected to implement the merge yourself (just as you’re
implementing a merge sort instead of using Python’s built in sorted()
function).

And let’s run it with a 5 element list [5,4,3,2,1], showing the
recursive calls:

 mergesort([5,4,3,2,1])
   mergesort([5,4])      # left piece
     mergesort([5])      # left piece
     mergesort([4])      # right piece
   mergesort([3,2,1])    # right piece
     mergesort([3])      # left piece
     mergesort([2,1])    # right piece
       mergesort([2])    # left peice
       mergesort([1])    # right piece

That’s how many times the mergesort() function gets called, and you’d
get a print() on every call where len(L)>=2 (because that’s where
you put your print(), in the “else” branch).

The indentation above indicates that one instance of mergesort() calls
another with the smaller list piece.

Cheers,
Cameron Simpson cs@cskk.id.au

Thank you so much, Cameron for the help! I would never figure out the implicit “return” command by myself.

Indeed, I’m trying to learn the merge sort function. Below is the complete function. The merge function looks good to me. However, I get lost with the mergesort(). This time, I uncommented the right = mergesort(L[middle:]) part, and I get totally different results.
Previously, I only have left=mergesort(L[:middle]), and the function only returns one valid ouptut: L=[1], but with right=mergesort(L[middle:]) added, I get multiple values assigned to left and right varialbe. See below.
intermediate left is [1]
intermediate right is [2]
intermediate left is None
intermediate left is [3]
intermediate right is [4]
intermediate right is None
intermediate left is None
intermediate left is [5]
intermediate right is [6]
intermediate left is None
intermediate left is [7]
intermediate right is [8]
intermediate right is None
intermediate right is None

I know this is what the program is expected to do, i.e., split a list into lists of a single item. However, I don’t understand how the recursion works here. Why one line of code can totally change the recursion order?

def merge(left,right):
 """Assumes left and right are sorted lists.
Returns a new sorted list containing the same elements
as (left + right) would contain."""
 result = []
 i,j = 0, 0
 while i < len(left) and j < len(right):
  if left[i] <= right[j]:
    result.append(left[i])
    i = i + 1
    #print(result)
  else:
    result.append(right[j])
    j = j + 1
    #print(result)
 while (i < len(left)):
  result.append(left[i])
  i = i + 1
 while (j < len(right)):
  result.append(right[j])
  j = j + 1
 #print(result)
 return result

def mergesort(L):
 """Returns a new sorted list with the same elements as L"""
 #print("L=",L)
 if len(L) < 2:
  #print("L=",L)
  return L[:]
  
 else:
  middle = int(len(L) / 2)
  left = mergesort(L[:middle])
  print('intermediate left is ',left)
  right = mergesort(L[middle:])
  print('intermediate right is ',right)

 print( "left is ",left)
  #together = merge(left,right)
  #print('merged', together)
  #return together 

Thank you so much, Cameron for the help! I would never figure out the
implicit “return” command by myself.

It’s a lot like a main programme: it exits when you reach the bottom of
the script. Functions exit when you reach the bottom of the function,
but functions return a value, so the placeholder value None is what is
returned if you don’t supply the value yourself.

Indeed, I’m trying to learn the merge sort function. Below is the complete function. The merge function looks good to me.

To me as well. It can be written more concisely, but the logic all looks
correct.

However, I get lost with the mergesort(). This time, I uncommented the
right = mergesort(L[middle:]) part, and I get totally different
results.

Previously, I only have left=mergesort(L[:middle]), and the function only returns one valid ouptut: L=[1], but with right=mergesort(L[middle:]) added, I get multiple values assigned to left and right varialbe. See below.

intermediate left is [1]
intermediate right is [2]
intermediate left is None
intermediate left is [3]
intermediate right is [4]
intermediate right is None
intermediate left is None
intermediate left is [5]
intermediate right is [6]
intermediate left is None
intermediate left is [7]
intermediate right is [8]
intermediate right is None
intermediate right is None

I know this is what the program is expected to do, i.e., split a list into lists of a single item. However, I don’t understand how the recursion works here. Why one line of code can totally change the recursion order?

It doesn’t change the order. It interleaves a “right branch” into the
previous order.

Let’s look at your “else” branch again:

 else:
  middle = int(len(L) / 2)
  left = mergesort(L[:middle])
  print('intermediate left is ',left)
  right = mergesort(L[middle:])
  print('intermediate right is ',right)

Suppose the last 2 lines were commented out. With only the “left” branch
and starting with a list [1,2,3,4,5,6,7,8] the merge sort will split
of the left [1,2,3,4] and call mergesort on that, which in turn
splits off [1,2] and calls mergesort on that, which splits off [1]
and calls mergesort on that.

The print() calls happen after the call and print the result of the
call. So you’d see:

 intermediate left is  [1]

from the innermost call with the single element list. Because you still
do not have a return, all the other calls still return None so they
are hard to tell apart. But you’d get:

 intermediate left is  None # from [1,2]
 intermediate left is  None # from [1,2,3,4]

next as the calls returned.

You’re making 1 call to mergesort every time you halve the list,
until it is short.

With the addition of the “right” branch" there’s a whole subtree of
calls for the right branch between the first line above and the next 2,
and each of those subcalls has its own “left” and “right” branches.

All of this is hard to see in the flat linear printout you’ve got.

Let’s modify the function, entirely to make the calls more obvious.
Change the function header to this:

 def mergesort(L, indent=""):

and change the “else” branch like this:

 else:
  subindent = indent + "  "
  middle = int(len(L) / 2)
  print(indent, 'mergesort left ', L[:middle])
  left = mergesort(L[:middle], subindent)
  print(indent, 'intermediate left ', L[:middle], '->', left)
  print(indent, 'mergesort right ', L[middle:])
  right = mergesort(L[middle:], subindent)
  print(indent, 'intermediate right ', L[middle:], '->', right)

Here we’re providing a little string-of-spaces to go at the start of the
print() calls which indents the text. We compute subindent = indent + " "
to pass to the inner calls so that their printouts are indented an
additional 2 spaces.

We’re also doing a print() before each call to mergesort() so that
you can see the start of the call as well as its result:

  print(indent, 'mergesort left ', L[:middle], '...')
  left = mergesort(L[:middle], subindent)
  print(indent, 'intermediate left ', L[:middle], '->', left)

Because these lines will usually be separated in the output, we print
the original L[:middle] list in both print()s for better context.

When I try that here I get a nice little indented listing showing the
call and the returned result, and making the output ordering clear.

Try this at your end and see what you get, and if the output makes more
sense.

Remember that you’ve still got the merge() and return together
commented out, so all the calls with longer lists return None instead
of the merged sublists.

Cheers,
Cameron Simpson cs@cskk.id.au

THANK YOU THANK YOU THANK YOU, Mr. Simpson!
The print codes help a lot. At the beginning, I was so overwhelmed by the concept of recursion environment (and had to take a long walk outside). After a few simulations, I was able to understand the whole picture. To sum up, once the single element left =mergesort(xxx) calculation is complete, it will return to the parent context and try to run left=mergesort() again and get output as None. However, after this, it will continue to execute right=mergesort(xxxx), using L and middle defined in the parent context. This in turn opens a new branch of mergesort() recursion. If I uncomment the ‘together’ variable and return code, I see the value of ‘together’ is passed to left/right variable and build the basis of next together calculation.

I have to say that as a beginner, it’s really hard to understand the recursion context, but I see the potential of recursive calculation.

For note taking purpose. I am copying the final codes I used below.

def mergesort(L, indent=""):
  if len(L) < 2:
    return L[:]
  else:
    subindent = indent + "  "
    middle = int(len(L) / 2)
    print(indent, 'mergesort left ', L[:middle]," middle: ",middle)
    left = mergesort(L[:middle], subindent)
    print(indent, 'intermediate left ', L[:middle], '->', left," middle: ",middle)
    print(indent, 'mergesort right ', L[middle:]," middle: ",middle)
    right = mergesort(L[middle:], subindent)
    print(indent, 'intermediate right ', L[middle:], '->', right," middle: ",middle)
    together = merge(left,right)
    print(indent, "together->",together)
    return together

def merge(left,right):
 """Assumes left and right are sorted lists.
Returns a new sorted list containing the same elements
as (left + right) would contain."""
 result = []
 i,j = 0, 0
 while i < len(left) and j < len(right):
  if left[i] <= right[j]:
    result.append(left[i])
    i = i + 1
    #print(result)
  else:
    result.append(right[j])
    j = j + 1
    #print(result)
 while (i < len(left)):
  result.append(left[i])
  i = i + 1
 while (j < len(right)):
  result.append(right[j])
  j = j + 1
 return result

run
L=[1,2,3,4]
mergesort(L, indent=“”)
print(mergesort(L, indent=“”))

The print codes help a lot. At the beginning, I was so overwhelmed by
the concept of recursion environment (and had to take a long walk
outside).

Have you ever played with a Tower of Hanoi puzzle?

It is also amenable to a recursive solution.

After a few simulations, I was able to understand the whole picture.

Good!

I have to say that as a beginner, it’s really hard to understand the
recursion context, but I see the potential of recursive calculation.

Recursion is a common example of the larger notion of subdividing a
problem into smaller problems. With recursion, the smaller problem is
just a smaller version of the same problem, and so one method it to call
the same function again with smaller data. Thus recursion.

Sometimes the subdivision isn’t the same problem. In a sense, your
separate merge() function solves a different (but simpler) problem.
So your sort method calls itself (recursion) to obtain 2 smaller sorted
lists, then calls merge() to merge to sorted lists, which is a simpler
problem than “sort an unordered list”.

For note taking purpose. I am copying the final codes I used below.

Thank you!

Cheers,
Cameron Simpson cs@cskk.id.au