Add a minimal CLI functionality to urllib.request

It would be useful if urllib.request.urlopen could be executed from the command line via python -m urllib.request, similarly to how curl -f works, to perform simple HTTP requests that fail on HTTP errors. For example, running:

python -m urllib.request http://localhost:8080/health

This functionality would be helpful in lightweight Docker containers, such as Python slim images, where curl or other similar tools are not installed by default.

Proposal:

  • Enable urllib.request to support direct command line execution for GET requests, similar to other -m modules in Python.
  • Allow the return of a non-zero exit code if the request fails (e.g., for non-2xx HTTP responses), so that it can be easily integrated with health checks like in ECS or Docker Compose.
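As a rough sketch of what such an entry point could look like (the function name, usage message, and exit codes here are illustrative, not a concrete implementation proposal):

```python
import sys
import urllib.error
import urllib.request


def main(argv):
    """Hypothetical entry point: GET a URL, failing like curl -f does.

    Returns 0 on success, 1 on any network or HTTP error, 2 on bad usage.
    """
    if not argv:
        print("usage: python -m urllib.request <url>", file=sys.stderr)
        return 2
    try:
        # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx
        # responses, so simply returning here means the request was OK.
        urllib.request.urlopen(argv[0])
    except urllib.error.URLError:
        return 1
    return 0
```

Wired up as a module __main__, this would give exactly the curl -f style exit-code behavior described above.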

Use Case:

In containerized environments, specifically with Python slim images, it is a good practice to avoid installing extra tools to keep the image minimal and fast to build. A built-in python -m urllib.request command would allow checking HTTP endpoints without additional installations.

Alternative:

It’s possible to run inline Python code:

python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')" 

But this requires escaping quotes in some contexts, like ECS task definitions (JSON), making it harder to maintain and understand:

"healthCheck": {
    "command": [
        "CMD-SHELL",
        "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8080/health')\""
    ],
    "interval": 30,
    "timeout": 5,
    "retries": 3,
    "startPeriod": 10
}

Additional Information:

Installing curl in a Python 3.12 slim image

FROM python:3.12-slim-bookworm
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*

added approximately 5.13 MB (i.e. +4.3%), and took 14 s in my environment (poor bandwidth).


I suggest first reading the discussions about adding a CLI to other modules. Many of the concerns raised there apply equally here.


Thank you for the feedback. I found the timeit discussion and briefly reviewed it.

I understand the concerns about adding CLI functionality to modules, but I believe the use case I’m proposing is very specific and minimal. The idea is not to replace curl or create a full-featured CLI for urllib, but simply to provide a way to perform an HTTP GET request and fail on HTTP errors, mimicking what curl -f does.

This could even be introduced as python -m urllib.healthchecker, keeping it focused on basic health checks for services. The goal here is to reduce both image size (by around 4%) and build time across hundreds of Python-based microservices, where every megabyte and every second spent building the image counts.

Given the simplicity and the fact that it targets a common use case (health checks), I believe this could be a valid enhancement without introducing significant complexity or maintenance burden.

The discussion you linked isn’t about adding a CLI…
I’m having doubts about this proposal.

Here are a few other recent discussions about adding/improving CLIs in stdlib modules:


I think there is value in this proposal.

Unlike other modules, the urllib.request functionality here is key in environments where there are not yet any tools to do a GET, precisely because of that; it could serve as a first step to download other, more complex and powerful infrastructure (e.g. the very first command in the install instructions for httpie uses curl).

That said, aren’t there other alternatives to curl that fulfill this niche?


Which discussion is that? Thanks!

The discussion about the random module linked above. I probably had this comment in mind:

I’m not particularly sold by the arguments against just using python -c. It’s rarely going to be more than a couple of backslashes and a semicolon which is no worse than the standard docker run sh -c '...' stuff you’d do to run multiple commands in one container. Even in the most esoteric quote/escape sequence ridden system, the worst possible scenario is that you put your 2 lines into a /opt/health-check script and call that.

And I’d be surprised if this doesn’t open the floodgates for feature creep requests. Even in the unlikely event that people accept the scope being limited to health checks, I’d at least expect requests for different machine readable outputs and possibly certificate handling.

Having said that though, I think this proposal makes more sense than the equivalent re and random proposals.


Actually, it’s a bit more complicated. The correct inline Python would need to handle exceptions to avoid printing a traceback:

python -c """
import urllib.request, sys
try:
    urllib.request.urlopen('http://localhost:8080/health')
except Exception:
    sys.exit(1)
"""

Note that this requires triple quotes and proper indentation, making it more cumbersome to embed as a command directive in JSON, such as in the ECS example I gave earlier.

Without that exception handling, the behavior isn’t as clean as curl -f, since it would print a full traceback on failure, which would be noisy and potentially pollute logs in the case of health checks that typically run every few seconds.

While it’s possible to achieve this with python -c, a dedicated CLI tool would simplify the process and make it more robust, avoiding extra boilerplate.

That’s what redirection’s for, isn’t it?

python3 -c 'import urllib.request; urllib.request.urlopen("http://localhost:8080/health")' 2>/dev/null || echo handle the failure...

An alternative to all suggestions along these lines is that the proposed functionality could be provided by something installable with pip, making it just two steps:

python -m pip install pydownload
pydownload http://localhost:8080/health

There are a number of advantages to doing it this way: it is easier to update to support more URLs, redirection, security, etc. Python already comes with pip, so the bootstrapping problem is solved, and everything is infinitely extensible just by adding new PyPI packages for whatever CLI(s) anyone would want.

Someone could make a collection of lightweight CLI tools that are thin wrappers around Python’s stdlib functionality. Then it could be included as standard in your docker build or easily installed with pip.
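For illustration, such a thin wrapper could be little more than the following (the pydownload name comes from the example above, but this fetch function and its CLI shape are purely hypothetical):

```python
import sys
import urllib.error
import urllib.request


def fetch(url, dest=None):
    # Download url; write the body to dest (or stdout if dest is None)
    # and return a curl -f style exit code: 0 on success, 1 on failure.
    try:
        with urllib.request.urlopen(url) as resp:
            data = resp.read()
    except urllib.error.URLError:
        return 1
    if dest is not None:
        with open(dest, "wb") as f:
            f.write(data)
    else:
        sys.stdout.buffer.write(data)
    return 0
```

Exposed as a console script via a pyproject.toml entry point, this would cover both the health-check case (exit code only) and simple downloads.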


I like this idea at a high level but think it’s too slippery a slope. Soon someone will want to disable/enable redirects, pass specific headers, do POSTs… and each step gets us basically to curl.

A curl-like CLI that can be pip-installed could ultimately be cool, but I don’t think it would scale well in the stdlib.


A Docker solution is to use multi-stage builds. In your build container, you can install whatever you want and do the download. Later on, you copy just the artifact from your build container into your final container. You can use the most appropriate tools in your build container (wget, curl, a Python script, whatever) without having to worry about size or clean-up.
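A sketch of that pattern, extending the earlier Dockerfile (the stage name, the URL, and the artifact path are illustrative):

```dockerfile
# Build stage: curl is installed and used only here
FROM python:3.12-slim-bookworm AS build
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
RUN curl -fsSL http://example.com/artifact.tar.gz -o /tmp/artifact.tar.gz

# Final stage: only the artifact is copied in; curl never appears here
FROM python:3.12-slim-bookworm
COPY --from=build /tmp/artifact.tar.gz /app/artifact.tar.gz
```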

Even with multi-stage builds, the curl binary still ends up in the final image, adding that extra 4% of space.

Anyway, I agree there are existing solutions like redirection or installing third-party tools with pip (or using wget), but for simple Python-based applications, this requires external dependencies and package installations. When scaled to thousands of image builds, this adds up in terms of time, resources, and environmental impact.

Embedding multiline boilerplate is error-prone and could easily be replaced by a few lines in the standard library, which already has the necessary component. Python is a pragmatic language, and IMHO this is a missed opportunity to leverage its “batteries included” philosophy.

A minimal CLI tool integrated into the stdlib would avoid unnecessary complexity and overhead.

While my proposal is still under consideration, I’ve created a package called healthyurl:

It provides health check capabilities using only Python’s standard library and is available on PyPI.


I’d say that healthyurl just reiterates what I’m saying. Worst case scenario is that you add one tiny little script to your container. The size of curl is irrelevant after that.

Nothing in the build stage ends up in the final image, unless you explicitly COPY it.

There is already plenty of heavily tested code out there to download things.

Of course, if you need the command for the healthcheck, it needs to be copied to the final image.

That’s because it is the Debian and Ubuntu section; the first command on that page overall is the universal one, which uses python -m pip install.