Standardized Interface for Configuration Management in Python

This idea proposes the introduction of a standardized interface within the Python standard library for managing configuration files across multiple formats, such as JSON, YAML, TOML, INI, and .env. The interface will provide consistent methods for loading, merging, validating, and accessing configurations in a Pythonic way. This proposal also includes schema validation to ensure the correctness and structure of configuration data before it is used by an application.

Motivation

Configuration files are ubiquitous in Python applications, serving as the central point for managing settings, preferences, and environment-specific configurations. Currently, Python developers must rely on a variety of external libraries, each with its own API and functionality, to handle different configuration formats. This leads to inconsistencies in codebases, increased complexity, and a higher likelihood of errors.

A standardized interface within the Python standard library would address these issues by offering a unified approach to configuration management. By including support for schema validation, the interface would further enhance the reliability and maintainability of Python applications.

Rationale

Existing Solutions and Their Limitations

  • Existing Libraries: Python developers currently combine standard-library modules such as configparser (for INI files) and json with third-party packages such as PyYAML, toml, and python-dotenv. Each has its own API and none integrates with the others, so developers must write custom code to handle configuration merging, environment-specific overrides, and validation.

  • Fragmentation: The lack of a unified interface means that configuration management is often fragmented and inconsistent across projects, making it difficult to maintain and extend codebases.

Benefits of a Standardized Interface

  • Consistency: A single, unified API for configuration management across multiple formats would reduce the cognitive load on developers and promote consistent coding practices.

  • Ease of Use: By abstracting away the complexities of different configuration formats, the standardized interface would simplify configuration handling in Python applications.

  • Validation: Built-in schema validation would catch configuration errors early, improving the robustness of applications.

  • Environment-Specific Handling: The interface would support easy switching between environments (e.g., development, staging, production), ensuring that the correct configurations are loaded and validated.

Specification

1. ConfigManager Class

The ConfigManager class will serve as the central interface for managing configurations. It will provide methods to load configurations from various formats, merge them, and access configuration values.

Example Usage

import os

from config_manager import ConfigManager

# Determine the environment (e.g., 'development', 'production')
env = os.getenv('ENV', 'development')

# Initialize the configuration manager
config = ConfigManager()

# Load base configuration (common settings)
config.load('config/base_config.yaml')

# Load environment-specific configuration
if env == 'development':
    config.load('config/config.dev.yaml')
    config.load_from_env_file('.env.dev')
elif env == 'production':
    config.load('config/config.prod.yaml')
    config.load_from_env_file('.env.prod')

# Load additional settings from environment variables
config.load_from_env()

# Validate configurations before any value is used
if not config.validate():
    raise ValueError("Invalid configuration settings")

# Access configuration values
db_host = config.get('database.host')

# Application logic here...
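
The dotted lookup in config.get('database.host') could be implemented in several ways. The following is only a minimal sketch of one of them, assuming the loaded configuration is held as nested dictionaries; the helper name get_by_path is hypothetical and not part of the proposal.

```python
from typing import Any


def get_by_path(data: dict, dotted_key: str, default: Any = None) -> Any:
    """Walk nested dicts following a dotted key such as 'database.host'."""
    current: Any = data
    for part in dotted_key.split("."):
        # Stop as soon as the path leaves the nested-dict structure.
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


settings = {"database": {"host": "localhost", "port": 5432}}
db_host = get_by_path(settings, "database.host")
db_user = get_by_path(settings, "database.user", "guest")
```

Returning a default for a missing key is a design choice; raising KeyError instead would surface typos in key names earlier.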

2. Supported Formats

ConfigManager will support the following formats out of the box:

  • JSON: Using the json standard library.
  • INI: Using the configparser standard library.
  • .env Files: Using a built-in parser similar to python-dotenv.
  • YAML: Using the PyYAML library.
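  • TOML: Using the toml library (on Python 3.11 and later, the standard library's tomllib module can also read TOML, though it cannot write it).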

3. Schema Validation

Schema validation will be implemented using jsonschema for JSON-like schemas. The ConfigManager will allow developers to set a schema and validate the loaded configuration against it.

Example Schema

config.set_schema({
    "type": "object",
    "properties": {
        "database": {
            "type": "object",
            "properties": {
                "host": {"type": "string"},
                "port": {"type": "integer", "minimum": 1024, "maximum": 65535},
                "username": {"type": "string"},
                "password": {"type": "string"}
            },
            "required": ["host", "port", "username", "password"]
        },
        "logging": {
            "type": "object",
            "properties": {
                "level": {"type": "string", "enum": ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]}
            },
            "required": ["level"]
        }
    },
    "required": ["database", "logging"]
})

4. Merging and Overrides

ConfigManager will support hierarchical configuration loading, allowing base configurations to be loaded first and then overridden by environment-specific settings. Environment variables will be given the highest priority, enabling dynamic configuration at runtime.
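
The override order described above (base, then environment-specific file, then environment variables) can be sketched as a recursive dictionary merge. This is a minimal illustration under stated assumptions: the function names, the APP_ prefix, and the double-underscore nesting convention are all hypothetical choices made for the example, not part of the proposal.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def env_overrides(environ: dict, prefix: str = "APP_") -> dict:
    """Turn APP_DATABASE__HOST=x into {'database': {'host': 'x'}}."""
    result: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        parts = name[len(prefix):].lower().split("__")
        node = result
        for part in parts[:-1]:
            child = node.get(part)
            if not isinstance(child, dict):
                child = node[part] = {}
            node = child
        node[parts[-1]] = value  # environment values are always strings
    return result


base = {"database": {"host": "localhost", "port": 5432}, "debug": False}
prod = {"database": {"host": "db.internal"}}
env = {"APP_DATABASE__PORT": "6543", "HOME": "/root"}

# Later layers take priority: base < environment file < environment variables.
merged_config = deep_merge(deep_merge(base, prod), env_overrides(env))
```

Note that the port arrives as the string "6543" rather than an integer, so type coercion (or schema validation that rejects it) would still be needed for values supplied through the environment.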

5. Error Handling

If the configuration does not validate against the schema, ConfigManager will raise a ValidationError, providing detailed information about what went wrong.
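
To show what "detailed information" might look like, here is a simplified sketch that reports every failure together with the dotted path of the offending key. It is a hand-written stand-in that understands only a small subset of JSON Schema keywords (type, required, properties, enum, minimum, maximum). It is not the jsonschema library, and the class and function names are assumptions for illustration.

```python
class ValidationError(Exception):
    """Raised with the full list of problems found in a configuration."""

    def __init__(self, errors):
        self.errors = list(errors)
        super().__init__("; ".join(self.errors))


_TYPES = {"object": dict, "string": str, "integer": int}


def collect_errors(value, schema, path="config"):
    """Return a list of human-readable errors for `value` against `schema`."""
    errors = []
    expected = _TYPES.get(schema.get("type"))
    if expected is not None:
        # bool is a subclass of int, so reject it explicitly for "integer".
        wrong = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong:
            found = type(value).__name__
            return [f"{path}: expected {schema['type']}, got {found}"]
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: {value!r} is not one of {schema['enum']}")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: {value} is less than the minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: {value} is greater than the maximum {schema['maximum']}")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: required key is missing")
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(collect_errors(value[key], subschema, f"{path}.{key}"))
    return errors


def validate(config, schema):
    """Raise ValidationError listing every problem, or return None if valid."""
    errors = collect_errors(config, schema)
    if errors:
        raise ValidationError(errors)
```

Collecting all errors before raising, rather than stopping at the first one, lets a user fix a broken configuration file in a single pass.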

6. Dynamic Reloading (Optional)

For applications that require dynamic reloading of configurations (e.g., servers), ConfigManager could support reloading configurations when the underlying files change. This feature would be particularly useful for long-running applications.
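
One simple way to approximate reloading, without a file-watching dependency, is to compare the file's modification time whenever the application polls. The sketch below does this for a single JSON file; the class name ReloadingJSONConfig and its method are hypothetical, and a real implementation would also need to consider thread safety and partially written files.

```python
import json
import os


class ReloadingJSONConfig:
    """Hold the parsed contents of a JSON file and reload them on change."""

    def __init__(self, path):
        self.path = path
        self._mtime_ns = None
        self.data = {}
        self.reload_if_changed()

    def reload_if_changed(self) -> bool:
        """Re-read the file if its modification time differs from the last load."""
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns == self._mtime_ns:
            return False
        with open(self.path, encoding="utf-8") as handle:
            self.data = json.load(handle)
        self._mtime_ns = mtime_ns
        return True
```

A long-running server could call reload_if_changed() at the start of each request or on a timer; event-based watching would avoid the polling but needs platform-specific support.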

Backward Compatibility

The introduction of ConfigManager will not affect existing codebases, as it will be a new addition to the standard library. Existing projects can continue to use their current configuration management practices, but they will have the option to adopt ConfigManager for a more unified approach.

Implementation

The initial implementation will focus on integrating existing libraries (json, configparser, PyYAML, toml, jsonschema) under a unified interface. Future versions could extend this functionality or improve performance based on community feedback.

Reference Implementation

A reference implementation will be developed as a Python package to gather feedback from the community before potentially integrating it into the standard library. This package will be available on PyPI for testing and evaluation.

Unresolved Questions

  • Dynamic Schemas: How to best support dynamic schemas that can change based on environment or runtime conditions.
  • Extending Format Support: Whether additional configuration formats should be supported natively or via plugins.
  • Integration with Existing Tools: How ConfigManager should interact with existing configuration management tools and libraries.

Conclusion

This PEP proposes a standardized interface for configuration management in Python, addressing the need for consistency, ease of use, and robustness in handling configuration files across different formats and environments. By integrating schema validation and supporting environment-specific configurations, ConfigManager aims to streamline the development and maintenance of Python applications, making configuration management more intuitive and less error-prone.

I thought the pyproject.toml was already trying to do this?
(That is, being the one standardised interface that’s native to Python, and therefore the one that ought to be the default.)

The problem I have with actually using pyproject.toml that way is a lack of conversion tools, and some redundancy in the format.

If you want to help Python move towards the standardised interface, I think creating good conversion tools would be more useful than creating yet another configuration interface. I know I’d certainly appreciate a tool to help me move away from setup.py.

Another problem I have with the current configurations is redundancy. For example, some projects specify the line length in the Black settings, some in Ruff, and some in both. My IDE has to read it from wherever it is supplied, which it sometimes can’t. If I understand your proposal correctly, it would make that worse. I’d rather have 1 preferred way of specifying linter settings that can be read by all linters than 20 ways of specifying linter settings that can all be read by one class.

If there isn’t one already, this would make a great third-party library on PyPI. All those formats loosely map to dictionaries.

Everything’s doable with core libraries, except YAML (and writing TOML). Implementing YAML support from scratch, without an existing library, is a huge can of worms, almost as large as the YAML spec itself: YAML Ain’t Markup Language (YAML™) revision 1.2.2 (it prints to 65 pages of A4).

This hierarchical multi-format configuration system that you’re proposing sounds a lot like hydra.

I think having pyyaml in that list is an instant non-starter. Not only is it not in the standard library, but it contains Cython, and it requires the user to choose between its insecure “grant shell access” mode and its “safe” but also insecure “easiest DoS attack you’ve ever written” mode.

Reminds me of xkcd: Standards.

Mapping most of these winds up producing a dict, and most formats already have load/dump methods for that.

Is the idea that it would figure out the file’s format and then deserialize it?

Looks like pydantic-settings or dynaconf.

You know there is a solution to the XKCD 927 problem, right? You have to port all configuration in the Python interpreter itself, and in some of the most important modules (pip, setuptools, packaging), to your new configuration system and thus show its superiority. Is that too much work? If yes, then you are firmly in the centre of that XKCD.