Is it necessary to refactor my entire class to test private methods?

Hey there folks,
I am currently tasked with testing a class/script that only exposes one public method, called aggregate().
this method calls about 10 individual private methods, that each load some parquet files, and format them as needed.
It looks something like this:

class Aggregation:
    def __init__(self):
    ...
    def __format_foo(self, df):
        ...
    def __aggregate_foo(self, data: dict[str, pd.DataFrame]):
        for tag, df in data:
            self.__format_foo(df)
            ...
        ...
    def aggregate(self):
        data = self.__get_foo_data()
        df = self.__aggregate_foo(data)
        ...

My question now is, how can I test this code? I know that code that needs to be tested should usually be available in public methods, but in this case it would hurt encapsulation, as its not needed by any other classes.
If I were to only test aggregate() as a whole, I feel like it would be hard to test the behaviour of each individual method, to then know where exactly errors might occur.

Are there any experienced software-engineering individuals, that can help a newbie like me out in this situation?

Thank you very much! :slight_smile:

Well, my first recommendation would be: Don’t use that form of private method :slight_smile: It’s a LOT easier to do things like testing if, instead, you name your method like this:

   def _aggregate_foo(self, data: dict[str, pd.DataFrame]):

It’s still private, but now you don’t need to worry about name mangling.

When it comes to testing things in private methods, it all depends on WHAT you’re testing. Unit tests should be allowed to reach into a class’s internals and test specific behaviour, as they will tend to be fairly tightly coupled with the code itself. End-to-end testing would only ever use public methods, though, so it would be calling the aggregate() method without concerning itself with the implementation details.

The purpose of testing is not to determine where the bugs are. That’s part of debugging. The purpose of testing is to verify the absence of bugs (at least for the use cases you’ve thought of so far, notwithstanding interactions with other parts of the system). You do this by making sure (by trying it - in an organized, automated way) that, when you use the code according to the specification, it does what the specification says should happen. (The alternate view here is that you create the specification by writing the test.) Using the code according to the specification entails using the public interface. That’s what we mean by “public”.

So that’s what we test.

That said: designing this as a class, as described, makes no sense. A well designed interface will make dependencies explicit (e.g., at least a name of a folder that contains the files); and it will not involve instantiating a class in order to call one specific public method, once. Well-designed interfaces look more like:

def aggregate_files(folder):
    return pd.concat(_make_dataframe(_load(file)) for file in _find_files(folder))

Also: none of the above is really Python-specific. It’s just proper OOP practice, understood pragmatically in the context of proper software engineering practice.

As a Python-specific thing, though: the language does not actually provide the kind of protection that private etc. keywords do in other languages, by design. Method names with leading double underscores get mangled, but the mangling scheme can easily be worked around to defeat the protection at runtime, and there are no compile-time checks. It’s only intended to prevent accidents.

By convention, Pythonistas usually use a single leading underscore instead. That avoids name mangling (which provides less value than it first appears, while sometimes turning out counterproductive later), while providing a useful signal that methods are not part of the public interface (along with the lack of a docstring). This also applies to ordinary functions as well as methods, and even to class names (yes, there are all kinds of valid reasons to write an entire class that is not part of the public interface). Such names do have some special significance; in particular, they will be excluded from the default __all__ for a module or package (i.e., will not be imported by default using a star import).

Alright thank you very much!
I do not think that refactoring the entire script/class is in the scope of my current task.
I was aware that there technically aren’t any private classes in Python, but just didn’t know where it would be a good idea to test it that way, as it also seemed very unpythonic to me.
I will definitely give the video you linked a watch and see for myself, whether not writing classes might be a good idea. For now, I guess that I have to test it the ‘ugly’ way, don’t I?
Thank you :slight_smile:

Thanks, I wasn’t quite aware of that. :slight_smile:
Currently, we’re only concerned with unittests (at least for now).
So are you saying, that it would be still be considered good practice, if I were to access the code’s “private” methods?
So far I have already been patching quite a lot of imports, as I am trying to create a deterministic environment for my tests. This leads to my test class already looking kind of bloated as is.

Do you think it would be a good idea, to write tests for each “private” method individually, or should I combine some tests depending if they are logically intertwined?

Yes, that’s still good practice. In C++ terminology, your unit tests could be considered “friend classes”.

The single underscore convention (which is broader than just Python, incidentally - I’ve no idea how broad, exactly) basically means “hey, you probably shouldn’t be using this, it’s private”, and more specifically, “this is NOT covered by any sort of backward compatibility guarantees”. So external callers should avoid them, since they could change at any time (even in a bugfix release); but your unit tests are part of your own project. If you change a private method, you probably had to change the test anyway, because you’re changing intended behaviour. So it’s fine to directly call private methods.

Welcome to testing, I’m afraid.

One thing I will say, though: Tests need to actually be useful. The purpose of unit tests (and other forms of automated testing) is to save you time, effort, or hassle. You are not here to serve the test suite; it is here to serve you. As you write a test, think about whether it’s more likely for the test to break as part of normal iterative coding, or as the result of a bug. For example, snapshot testing a React component is, in my opinion, worse than useless - it gives the illusion that you have perfect test coverage, but since any edit will invalidate the test, it becomes completely normal to just regenerate your snapshots every edit. So they protect you against… accidentally editing the wrong file, maybe? In contrast, a unit test of a library function to ensure that it behaves correctly in a particular edge case is incredibly useful, as it’s an easy way to check boundary conditions, and is part of the function’s contract.

With these private methods, they’re not part of any sort of external contract, but it’s not easy to test every part of the public methods cleanly. So with a well-designed internal method set like you’re hinting at here, it should be possible to test each of the separate parts (for example, I would expect that you could unit test _aggregate_foo while mocking _format_foo and thus independently test the two), making the tests quite useful.

You’ll still have a lot of boilerplate in the tests, unfortunately. But maybe a small number of well-chosen tests will be less hassle than a large number of “shotgun” tests.

Unit tests? Test each one individually. Each one has a purpose, and if designed properly, has clearly-defined inputs and outputs (and possibly effects). You should be able to test accordingly.

But if a private method exists solely as part of something else - these are usually inner functions rather than their own methods - then don’t bother testing them, they lack individual purpose and are simply tested as part of their whole.