Using a module for configuration settings

wigging · October 30, 2024, 4:33am

I have a Python package that defines various configuration settings in a config.py module. The layout of the package is shown below:

mypackage/
├── src/
│   └── mypackage/
│       ├── __init__.py
│       ├── adder.py
│       ├── config.py
│       └── logger.py
├── README.md
└── pyproject.toml

The contents of config.py, adder.py, and logger.py are shown below. For this example, the config.py module only contains three variables, but the actual module may contain 10 to 20 variables that are used for configuration.

# --- adder.py ---

from . import config

def adder(x) -> float:
    """Add value to config price."""
    a = x + config.PRICE
    return a

# --- config.py ---

HOST = "localhost"
PORT = 8080
PRICE = 5.89

# --- logger.py ---

from . import config

def log_config():
    """Display the configuration settings."""
    host = config.HOST
    port = config.PORT
    price = config.PRICE
    print("host", host)
    print("port", port)
    print("price", price)

A simple example of using the package is shown next:

import mypackage as pkg

pkg.log_config()

new_price = pkg.adder(10)
print("new price", new_price)

This prints the following:

host localhost
port 8080
price 5.89
new price 15.89

By defining the configuration in the config.py module, I can import the settings into the other files for use by the functions in those files. This approach is fine if I never want to change the configuration settings defined in config.py. However, I want users of the package to be able to provide their own configuration file which would override the default config values.

A possible solution is something like the example shown below. In the example, a configuration object is created from the path to a config file. That object is passed as an input parameter to the other functions. But I don’t want to pass a config object to every function or class that is imported from the package.

import mypackage as pkg

config = pkg.Config("~/Users/home/config.json")

pkg.log_config(config)

new_price = pkg.adder(10, config)
print("new price", new_price)

What I would like to accomplish in this package is the following:

Use default configuration settings in the package if a configuration file is not available.
Override the default configuration values in the package if the user provides a configuration file. The configuration file could be a config.json, or settings.toml, or something else.

So how can I implement these features in the example package that I provided above? Is it even possible to do this with a module or is there a better approach that I should use?
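For reference, a module can cover both goals as long as the other modules read config.NAME at call time, which adder.py and logger.py already do. Below is a minimal sketch of that idea. The load_json() helper and the _OVERRIDABLE set are hypothetical additions to config.py, and the sketch assumes a flat JSON object whose keys match the variable names.

```python
# config.py (sketch): module-level defaults plus an optional override step
import json

HOST = "localhost"
PORT = 8080
PRICE = 5.89

# Names that a user file is allowed to override (hypothetical).
_OVERRIDABLE = {"HOST", "PORT", "PRICE"}


def load_json(path):
    """Hypothetical helper: replace defaults with values from a JSON file."""
    with open(path, encoding="utf-8") as f:
        overrides = json.load(f)

    unknown = set(overrides) - _OVERRIDABLE
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")

    # Rebind the module-level names. Settings that are missing from the
    # file keep their default values.
    globals().update(overrides)
```

Because this rebinds module globals, the override applies to the whole process, and any code that ran `from .config import PRICE` before the call would keep the old value.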

JamesParrott · October 30, 2024, 8:24am

This design seems complicated. How many variables will go into config.py that can’t simply be arguments the user passes to a function, e.g. main(host='localhost', port=8080, price=5.89)?

Otherwise, hardcoding the defaults in a config.py isn’t terrible. But you’ve got a pyproject.toml file anyway, so it’s pretty straightforward to get hatch, or whatever your build backend is, to ship a data file (e.g. .toml or .json) along with your .py files.

The user might not appreciate digging through their venv or system Python installation looking for wherever the default config.json ends up, or will simply not bother. So at the very least I’d let them specify the path to their config file at the main entry point or base class of your library. But then why not just let them specify all the args and keyword args they want there instead?

dpdani · October 30, 2024, 12:35pm

I think you might find this useful, even if you don’t necessarily need validation

wigging · October 30, 2024, 12:46pm

In the example, the config.py module only has three variables but the actual module may have 10 to 20 variables that define various configuration values. So I don’t want to pass all those variables as arguments to a function.

I would also like to note that I don’t want to require the user to provide a config file. The idea is that the package will use some default configuration settings if the user does not provide a config file.

wigging · October 30, 2024, 12:49pm

I’m aware of Pydantic, but I would like to avoid adding a dependency to the project that is just used for configuration. I would prefer to do this with standard Python features.

sirosen · October 30, 2024, 1:51pm

There are a number of reasonable patterns for handling configs, so it’s hard to give precise feedback without knowing more about your use case. Configuration of an application looks very different from a library. Here are some broad recommendations:

Unless you are certain that there will only be one config per user rather than per project, do not put a config in the user’s home – have them put it in the working directory instead or allow them to provide a path. Layering or overloading configs is messy, so if there’s going to be per project config, start from there. I would suggest .mytool.toml as a default config path, and allowing users to specify a config path.

Define config as an object, not a module, and explicitly instantiate configs with some kind of loader (even just a function which takes a path). Not only is this better for writing tests, but if your tool is a library it also allows users to explicitly switch and set configs.

If this is a library, rather than an application, consider avoiding the filesystem altogether. Config can be an object, which users instantiate and pass in.

Consider the pattern of having a primary object, which can be configured, and whose methods are the main interface for the package. For an example, check out the responses mocking/testing library. This would allow you to bind config to an instance, but still have simple methods as interfaces. Methods of the default instance can be exposed for a functional interface.

wigging · October 30, 2024, 2:54pm

What do you mean by “application” and “library”? I think of an application as being something like a command line tool, GUI desktop software, or a web app. And when you say library, that makes me think you are referring to a Python package. Am I understanding these terms correctly?

sirosen · October 31, 2024, 3:32am

Yes, that’s what I meant by the terms, more or less. A Python package can be an application though. httpie, tox, and sphinx are all applications written in Python.

If you’re building a library, I’m a big fan of the way responses does this. You have instances which can be configured programmatically, but there’s a default instance available through module-level methods.
Implicitly loading config from files would likely make the library uncomfortable to use, as it’s hard to control or change on the fly.

wigging · October 31, 2024, 4:07am

Is github.com/getsentry/responses the responses package that you keep referring to? If it is, where is it defining the default configuration?

sirosen · October 31, 2024, 1:00pm

Yes, that’s the library to which I’m referring.

The primary object you use in that library is a ResponsesMock, which has various configurable settings. If you use the package-level methods like responses.add, you’re implicitly using the default ResponsesMock.

As far as I know, there’s no dedicated object called a “config” because one is not necessary in this case. But the same model is applicable if you have a dedicated Config object model, and attach that to the primary interfaces for your library.

One way:

from foolib import cool_method, Config

# with defaults
cool_method("hello world")

# with custom config
myconf = Config(phasers="stun")
cool_method("hello world", config=myconf)

Another way (similar to responses):

from foolib import cool_method, CoolModel

# with defaults
cool_method("hello world")

# with custom config
myinstance = CoolModel(phasers="stun")
myinstance.cool_method("hello world")

Or you can combine these: define a model object that represents your library for users, along with a config class.

These approaches make config possible to manipulate in-process, and allow a single process to have multiple interactions, potentially in parallel, with different configs.
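To illustrate the point about parallel use, here is a small sketch in which two instances with different settings run side by side in one process. CoolModel and phasers follow the made-up foolib names above. The default value and the return string are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class CoolModel:
    phasers: str = "off"  # hypothetical default

    def cool_method(self, message: str) -> str:
        # Each call reads the settings of its own instance only.
        return f"{message} (phasers={self.phasers})"


models = [CoolModel(), CoolModel(phasers="stun")]

# Run both instances concurrently; map() returns results in input order.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(lambda m: m.cool_method("hello world"), models))
```

With a single global config module, the two calls would have to share one set of values.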

wigging · November 1, 2024, 3:09pm

Here is an example that I think adheres to your suggestion.

In a package I might have a function like what is shown below. It has an optional settings argument that uses default values if the user does not provide a Settings object.

# multiplier.py

from .settings import Settings

def multiplier(x: int, settings: Settings | None = None) -> float:
    """Multiply the price and quantity by a value."""

    # Fall back to the default settings when none are provided.
    s = settings if settings is not None else Settings()
    return s.price * x + s.quantity * x

The configuration settings we discussed earlier are represented by the Settings class in the package as shown below. This only contains two settings but the actual class may contain many more. The class could also have a method that would read a YAML, JSON, or TOML file and use the settings defined in that file to overwrite the default settings.

# settings.py

class Settings:
    price: float = 12.89
    quantity: int = 4

An example of using the multiplier and settings object is given below.

import mypackage as pkg

# default settings are used
m = pkg.multiplier(3)
print("m is", m)

# custom settings from user are used
s = pkg.Settings()
s.price = 49.05
s.quantity = 10

mm = pkg.multiplier(2, settings=s)
print("mm is", mm)

I agree that something like this is better than just defining the configuration (settings) as global variables in a module. But I’m curious about how you would write the config or settings object. Should it just be a class, should it be a data class, or something else?

sirosen · November 2, 2024, 12:59am

I’d probably make it a dataclass if it’s a dedicated config / settings object. I’ve seen folks use frozen dataclasses for similar cases, which is nice in that it establishes that the only supported interface for setting values is instantiation.

All that matters IMO is that it’s an easy to read and document container for values.
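A minimal sketch of the frozen-dataclass variant, reusing the price and quantity fields from the Settings example earlier in the thread. dataclasses.replace is the standard-library way to derive a modified copy of an instance.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Settings:
    price: float = 12.89
    quantity: int = 4


defaults = Settings()
custom = Settings(price=49.05, quantity=10)

# Assigning to a field on a frozen instance raises an error.
try:
    defaults.price = 1.0
    mutated = True
except FrozenInstanceError:
    mutated = False

# To change one value, build a new instance from an existing one.
discounted = replace(custom, price=39.0)
```

Note that with a frozen class, the `s.price = 49.05` style of assignment shown earlier would no longer work. Values have to be passed when the object is created.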

wigging · November 2, 2024, 1:09pm

If I used a dataclass then how would I load the settings/config values from a file like YAML, JSON, or TOML? As I showed in the Settings class above, I would like to have default values but also have the ability to overwrite those values by reading a config/settings file.

sirosen · November 2, 2024, 1:21pm

Define a loader whose job is to do that conversion.
Options are various, e.g.,

myconf = Config.from_toml("foo.toml")

# or

with open("foo.toml", "rb") as f:
    foo = tomllib.load(f)

myconf = Config.from_dict(foo)

# or

loader = ConfigLoader()
loader.add_source("foo.toml")
myconf = loader.load()

There are probably some other good patterns I’m forgetting here, but I hope that helps!

wigging · November 2, 2024, 2:09pm

If Config is a dataclass then how is from_toml() overwriting the default attribute values? I thought dataclasses are static but I guess you could use a class method like this:

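import tomllib
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Config:
    price: float = 12.89
    quantity: int = 4

    @classmethod
    def load_toml(cls, file: str) -> "Config":
        """Return a new Config; keys missing from the file keep their defaults."""

        # expanduser() resolves a leading "~", which open() does not do.
        with Path(file).expanduser().open("rb") as f:
            conf = tomllib.load(f)

        # Return a new instance. Assigning to cls.price would change the
        # default for every Config, and the method would return None.
        return cls(**conf)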

The idea is that you can do conf = Config() to use the default values or you can do conf = Config.load_toml("~/User/home/config.toml") to load the values from a file. I haven’t tried to run this code so it might not work but hopefully it conveys what I’m trying to do.