Need Help with Function to Calculate Factorial

nolrahirose · October 11, 2024, 9:35pm

Hello everyone,

I’m trying to write a function in Python that calculates the factorial of a given number, but I’m running into some issues. Here’s what I have so far:

def factorial(n):
    if n < 0:
        return "Invalid input"
    elif n == 0:
        return 1
    else:
        result = 1
        for i in range(1, n + 1):
            result *= i
        return result

The function seems to work for positive integers and zero, but I’m not sure if it’s the most efficient way to calculate factorials.

Is there a more efficient way to implement this?
Are there any edge cases I should consider?
How can I handle very large numbers, as factorials grow quickly?

I appreciate any help or suggestions you can provide!

Thanks in advance!

elis.byberi · October 11, 2024, 10:10pm

Efficient:

import math

def factorial(n):
    if n < 0:
        return "Invalid input"
    return math.factorial(n)

Very large numbers:

import math

def stirling_approximation(n):
    if n < 0:
        return "Invalid input"
    if n == 0:
        return 1
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

JamesParrott · October 11, 2024, 10:16pm

I think you’ve hit on the main optimisation already for a single call, and made it iterative, not recursive.

The nice thing about the recursive approach though, is it can easily speed subsequent calls with a cache.

Also multiplication is associative (and commutative too). So does it matter which order the factors are multiplied?

The main *= loop is what we call “embarassingly parallel”, so if you have more cores available, or specialist hardware, you could split it up across multiple threads or processes.

Ultimately for large inputs, you will require Stirling’s formula. And even with Stirling’s formula, factorials grow so quickly, you will quickly exceed the upper limit of normal native-C fixed bit-length integer types , and will swiftly come to rely on Python’s arbitrary length integer implementation. There may be other implementations and even different representations of numbers entirely, that could be faster.

Stefan2 · October 11, 2024, 11:20pm

How very large? And doesn’t your solution handle that already?

MRAB · October 12, 2024, 12:45am

I’d just like to point out that returning an error string is a bad idea and not Pythonic. The function should either return a valid result or raise an exception.

AlSweigart · October 12, 2024, 2:36pm

Your code is perfectly fine as the solution to a programming exercise.

If you need to calculate factorials as part of a real-world production system, use math.factorial() so you don’t reinvent the wheel. (This is a lesson you learn as you get experience as a software engineer.)

If you are interested in writing “more maintainable” code, you should have the function raise ValueError instead of return a string. This way, the function always returns an int. You also want to add type hints to your function.

If you want the code to more maintainable by shortening it, you could get rid of the n == 0 case:

def factorial(n):
    if n < 0:
        raise ValueError('arg must be a positive int')
    else:
        result = 1
        for i in range(1, n + 1):
            result *= i
        return result

The for loop does a needless * 1 multiplication if n is 1 but it makes the code simpler. You are exchanging performance for an increase in code readability, which is a fair thing to do sometimes. (The entire value of code is not just how few nanoseconds it takes to run. This is something you learn as you get experience as a software engineer.)

Since the exception sends the execution out of the function, you could make that else an if, which gets rid of one level of indentation for the rest of the code:

def factorial(n):
    if n < 0:
        raise ValueError('arg must be a positive int')
    
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

You can also trade memory for performance. For example, if you precalculate a bunch of factorials, you can re-use those calculation results or use the last one as a starting point if you need to calculate a larger factorial than what you have cached:

CACHE = {0: 1, 1: 1, 2: 2, 3: 6, 4: 24, 5: 120, 6: 720, 7: 5040, 8: 40320, 9: 362880, 10: 3628800}
CACHE_MAX_N = max(CACHE.keys())
FACT_OF_MAX_N = CACHE[CACHE_MAX_N]

def factorial(n):
    if n < 0:
        raise ValueError('arg must be a positive int')

    if n in CACHE:
        return CACHE[n]

    result = FACT_OF_MAX_N
    for i in range(CACHE_MAX_N + 1, n + 1):
        result *= i
    return result

How many values you want to cache is a judgement call on what’s more important for your application: fast speed or low memory usage. But keep in mind that factorials grow so large so fast that even 2000! can’t be converted to string to display it in Python with increasing the default limit:

>>> math.factorial(2000)
Traceback (most recent call last):
  File "<python-input-3>", line 1, in <module>
    math.factorial(1600)
    ~~~~~~~~~~~~~~^^^^^^
ValueError: Exceeds the limit (4300 digits) for integer string conversion; use sys.set_int_max_str_digits() to increase the limit

All of these issues come up and the answer is always “it depends”. It depends on what size factorials you’ll be calculating (like, 40! or 4000000000000!) and whether you need them fast or memory is limited. Or maybe the schedule doesn’t give you time to come up with some optimum solution so you should just stick to the already-existing math.factorial() (in my experience, this is the most common case and best solution.)

But like I said, your code is perfectly fine too.