Building a Binary Search Tree from Array: Why must the Function Return the Node?

Slurms_MacKenzie · July 31, 2025, 1:45pm

Hello all,

below is a bit of code that builds a valid binary search tree from an array of values. It doesn’t work without the final line of the insert() routine, but I can’t fathom why. According to my understanding, we’re only working with references, so one should be free to store a reference to a new node as an attribute of some descendent of the root node, and be able to access that new node as long as a reference to the root (or some other ancestor of the new node) is maintained.

I don’t get why the insert() routine needs return node; it seems to me that it should be possible to build the tree with only the calls in which node is None returning a node; when you append a node to a linked list, every previous list element doesn’t need to know about it?

def buildBST(arr):


    def insert(node, val):

        if node is None:
            return Node(val)

        if val < node.val:
            node.left = insert(node.left, val)
        else:
            node.right = insert(node.right, val)

        return node


    root = None

    for val in arr:
        root = insert(root, val)

    return root

tjreedy · July 31, 2025, 2:36pm

You should have explained what “doesn’t work” means, and included full traceback if there was one. However, without return node, the function returns None when the input was not None, so I believe that the next call with that None will return a new root.

Slurms_MacKenzie · July 31, 2025, 3:21pm

My apologies. and yes, that’s it!

Amazingly I either did not know or had forgotten that a function returns None by default; I was of the thinking that without an explicit return statement it would just not return a value.

The result I’m getting is that root ends up being a node with no children and a value equal to the last element in the array, which makes sense given your explanation.

Thanks!

franklinvp · July 31, 2025, 3:30pm

If you were to compute all the values of the function F:\mathbb{N}\to\mathbb{N} defined recursively by F_0=0, F_1=1 and for all n\in\mathbb{N}\ F_{n+2}=F_{n+1}+F_n, you need to:

Compute the first value that you can compute from the data available. At first that would be F_2=F_1+F_0=1+0=1. Further ahead the values computed F_m and F_{m+1}, for some m, need to be given to the computation of F_{m+2}. Look at the lines node.left = insert(node.left, val) and the one for right.
Get the next n for which you you can/will compute F_n.

The insert method is doing both, inserting and telling the next position, the root of the next sub-tree that needs processing. You could rename the method to make this more clear.