Allowing indented code for `-c`

Jon Crall posted this feature proposal on the Ideas mailing list and a GitHub issue in April. It needs wider discussion, so I’m re-posting it here. I’ll add my own comment below.

The Python CLI should automatically dedent the argument given to “-c”.

I raised this issue on the Python-Ideas mailing list and it got some positive feedback, so I’m moving forward with it here and in a proof-of-concept PR.

Pitch

I have what I think is a fairly low impact quality of life improvement to suggest for the python CLI.

When I’m not working in Python I tend to be working in bash. But often I want to break out and do something quick in Python. I find the python -c CLI very useful for this. For one liners it’s perfect. E.g.

NEW_VAR=$(python -c "import pathlib; print(pathlib.Path('$MYVAR').parent.parent)")

And even if I want to do something multi-line it’s pretty easy

NEW_VAR=$(python -c "
import pathlib
for _ in range(10):
    print('this is a demo, bear with me')
")

But the problem is when I’m writing bash inside a function or some other nested code, I would like to have nice indentation in my bash file, but if I write something like this:

mybashfunc(){
    python -c "
    import pathlib
    for _ in range(10):
        print('this is a demo, bear with me')
    "
}

I get IndentationError: unexpected indent.

This means I have to write the function ugly like this:

mybashfunc(){
    python -c "
import pathlib
for _ in range(10):
    print('this is a demo, bear with me')
"
}

Or use a helper function like this:

codeblock()
{
    __doc__='
    copy-pastable implementation
    Prevents indentation errors in bash
    '
    echo "$1" | python -c "import sys; from textwrap import dedent; print(dedent(sys.stdin.read()).strip('\n'))"
}

mybashfunc(){
    python -c "$(codeblock "
    import pathlib
    for _ in range(10):
        print('this is a demo, bear with me')
    ")"
}

Or more recently I found that this is a low-impact workaround:

mybashfunc(){
    python -c "if 1:
    import pathlib
    for _ in range(10):
        print('this is a demo, bear with me')
    "
}

But as a certain Python dev may say: “There must be a better way.”

Would there be any downside to the Python CLI automatically dedenting the input string given to -c? I can’t think of any case off the top of my head where it would make a previously valid program invalid. Unless I’m missing something this would strictly make previously invalid strings valid.

Thoughts?

Previous discussion

On the mailing list there were these responses:

Lucas Wiman said:

Very strong +1 to this. That would be useful and it doesn’t seem like there’s a downside. I often make bash functions that pipe files or database queries to Python for post-processing. I also sometimes resort to Ruby because it’s easy to write one-liners in Ruby and annoying to write one-liners in python/bash.

I suppose there’s some ambiguity in the contents of multi-line """strings""". Should indentation be stripped at all in that case? E.g.

python -c "
    '''
    some text
    '''
"

But it seems simpler and easier to understand/document if you pre-process the input using an algorithm like this:

  • If the first nonempty line has indentation, and all subsequent lines either start with the same indentation characters or are empty, then remove that prefix from those lines.

I think that handles cases where editors strip trailing spaces or the first line is blank. So e.g.:

python -c "
    some_code_here()
"

Then python receives something like “\n some_code_here\n”

python -c "
    some_code_here()

    if some_some_other_code():
        still_more_code()
"

Then python receives something like “\n some_code_here\n\n if …”

This wouldn’t handle cases where indentation is mixed and there is a first line, e.g.:

python -c "first_thing()
    if second_thing():
        third_thing()

"
That seems simple enough to avoid, and raising a syntax error is reasonable in that case.

Best wishes,
Lucas

There was also positive feedback from Cameron Simpson, and Barry Scott suggested that a PR with an implementation might move this forward. I have a proof-of-concept PR and this is the corresponding issue for it.
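For illustration, the prefix-stripping rule Lucas describes can be sketched in Python roughly as follows. This is only a sketch under stated assumptions, not the code from the proof-of-concept PR: the name strip_common_prefix is hypothetical, and treating whitespace-only lines as "empty" is an assumption about the intended rule.

```python
def strip_common_prefix(source: str) -> str:
    """Hypothetical helper sketching the rule quoted above (not the PR's code)."""
    lines = source.split("\n")
    # Find the first line that contains something other than whitespace.
    first = next((line for line in lines if line.strip()), None)
    if first is None:
        return source
    prefix = first[: len(first) - len(first.lstrip(" \t"))]
    if not prefix:
        return source  # first nonempty line is not indented: nothing to do
    # Every other nonempty line must start with exactly the same prefix.
    if any(line.strip() and not line.startswith(prefix) for line in lines):
        return source  # mixed indentation: leave it for the parser to reject
    # Assumption: whitespace-only lines count as "empty" and are kept,
    # minus the prefix when they happen to carry it.
    return "\n".join(
        line[len(prefix):] if line.startswith(prefix) else line for line in lines
    )


indented = "\n    import pathlib\n    for _ in range(2):\n        print('demo')\n    "
dedented = strip_common_prefix(indented)
compile(dedented, "<string>", "exec")  # no IndentationError after stripping

# The "code on the first line" case is left untouched, so it still fails to parse.
mixed = "first_thing()\n    if second_thing():\n        third_thing()\n"
unchanged = strip_common_prefix(mixed)
```

In this sketch, input that does not fit the rule is returned unchanged, so the existing IndentationError still surfaces for it.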


This is not limited to Bash and Ruby of course; you can embed code in any other language – Rust, Markdown, even Python itself.
Sometimes it’s easier, like with YAML literal block or Python’s textwrap.dedent.
And, of course, adding an if 1: header solves the issue – once you know the trick. It’s far from obvious though.

The implementation raised a question for me: Why pre-process the string, only to immediately pass it to a parser that already handles indentation?

Python’s top-level syntax for files is currently:

file: [statements] ENDMARKER

A new rule could be added for -c:

script:
    | [statements] ENDMARKER 
    | INDENT [statements] DEDENT ENDMARKER

Such a rule would then need to be added to compile, exec, etc., which is not trivial – but probably easier to maintain than dedenting in C.
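One way to see why the grammar route is plausible: as far as I can tell, the standard-library tokenize module already produces INDENT and DEDENT tokens for a snippet whose first line is indented, and the rejection only happens later, when the code is parsed. A small sketch (the variable names are illustrative, and the exact token stream may vary between Python versions):

```python
import io
import tokenize

source = "    x = 1\n"  # a snippet whose first line is indented

# Tokenizing succeeds; collect the token type names for inspection.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
token_names = [tokenize.tok_name[tok.type] for tok in tokens]
print(token_names)

# Compiling the same text as a normal top-level program is what fails.
try:
    compile(source, "<string>", "exec")
    compiled = True
except IndentationError:
    compiled = False
print("compiled:", compiled)
```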


I imagine some people will want this to be a separate option on backwards-compatibility principle (maybe someone has a toolchain that uses python -c to detect indentation on a string somewhere…). It looks like -C should be possible, or -cc. I assume that e.g. -cd (“(c)ode to (d)edent first”) or -ci (“(c)ode that may be (i)ndented”) wouldn’t be accepted. (Did you know: python -ic foo is the same as python -i -c foo; but python -ci foo treats the i as the code and fails with a NameError, and then doesn’t give an interactive prompt?)

It would show up in 3.13, so I don’t think ultra strict backwards compatibility applies or is useful as a design goal.

Any syntax change in the language changes the behavior of python -c if “it fails with a syntax error when the input is X” is considered part of the interface there.

I don’t like the idea of -ci being asymmetric with -ic, but I suspect from what you shared that I may have other gripes if I start looking at the python CLI too closely.


I don’t see any significant reason not to do this? I’ve never been bothered by having dedented code in a heredoc, but preferences vary.


python accepts input from stdin, and that seems to solve all the use cases.

python < WHATEVER
WHATEVER | python

Your suggestion seems like a good way to implement the feature. I’ve also definitely wanted the Python REPL to ignore spurious leading whitespace (like the IPython REPL does), although I guess that’s the interactive rule.

Note that dedenting vs handling it in the grammar results in different behaviour for multiline strings. I’m not sure that there is a behaviour for multiline strings that is clearly great (and use with indented -c commands seems questionable), although I think I’d be less surprised by what you get when handling it in the grammar.
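To make the difference concrete, here is a small sketch. It uses textwrap.dedent for the pre-processing approach and an if 1: wrapper as a stand-in for grammar-level handling; that stand-in is an assumption, since the proposed grammar rule does not exist yet, but in both cases the parser sees the string literal with its indentation intact. The run helper is hypothetical.

```python
import textwrap

# An indented snippet containing a multi-line string literal.
indented = (
    "    text = '''\n"
    "    some text\n"
    "    '''\n"
)


def run(source: str) -> str:
    """Execute the source and return the value it bound to `text`."""
    namespace = {}
    exec(compile(source, "<string>", "exec"), namespace)
    return namespace["text"]


# Textual dedenting also strips the indentation inside the string literal.
via_dedent = run(textwrap.dedent(indented))

# Letting the parser handle the indented block leaves the literal untouched.
via_block = run("if 1:\n" + indented)

print(repr(via_dedent))
print(repr(via_block))
```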


For what it’s worth, there are many other situations I’ve seen (I did a huge roundup on Stack Overflow of old questions about simple IndentationErrors, and of course not everything that comes up in a search is simple) where people try to execute indented Python code dynamically and get bitten by the indentation:

  • Trying to copy and paste an example from e.g. a docstring, or try code from the middle of a function at the REPL (granted, this one is not a very sympathetic case)

  • Python embedded within JavaScript etc. using tools like Brython or Pyscript

  • Dynamically building code for exec using multi-line string literals, not at top level (also happens with programmatic use of timeit - yes, really)

  • Python deliberately embedded in a YAML file (especially in the context of doing weird things with Docker and Kubernetes that I don’t understand), a CDATA tag in an XML file, etc.

  • Code scraped from somewhere and fed to exec, eval, compile or - most interestingly - ast.parse (people might especially expect that one to work and just give a subtree for the not-independently-valid code; the grammar has explicit indent/dedent tokens after all)

  • The user’s own code being scraped and fed to ast.parse by a buggy library even though it tried to use textwrap.dedent to sidestep the issue (and this looks like another example in a different library)

(I also found a fair number of cases where people use some IDE’s interpreter - not the CPython REPL - and end up trying to interpret the code line-by-line, instead of treating a statement with its suite as a complete block.)
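As a minimal illustration of the ast.parse case from the list above (the snippet and names are made up for the example): parsing indented code fails with IndentationError, while running it through textwrap.dedent first works as long as the indentation is consistent.

```python
import ast
import textwrap

# Code as it might look when lifted from inside a class or a docstring.
snippet = """
    def greet(name):
        return f"hello {name}"
"""

try:
    ast.parse(snippet)
    parsed_raw = True
except IndentationError:
    parsed_raw = False

# Dedenting first gives the parser a valid top-level program.
tree = ast.parse(textwrap.dedent(snippet))
func_names = [
    node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
]
print(parsed_raw, func_names)
```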


There was another thread on this topic (python -c and indentation) and
someone posted a very nice incantation which I’ve taken to using myself:
start the python command with if 1:, and indent the following lines.
Here’s an example from my ffmpeg-docker shell script:

 set -- \
   python3 -c 'if 1:
                 import sys
                 from cs.ffmpegutils import ffmpeg_docker
                 sys.exit(ffmpeg_docker(*sys.argv[1:]).returncode)
              ' ${1+"$@"}

Here you can see nicely indented Python code, all enabled by the
leading if 1: line.

… and I see the if 1: incantation is one of the examples in the opening post. That example indents the main python code the same as the if 1:, but indenting the main code a little more makes it more readable to my eye.

I like the idea of automatically dedenting the -c argument. I would not go for a new command-line flag; I support the behavior change as is. I’ve never seen a use of -c that would be tripped up by this change.

I expect it to be transparent and go unnoticed other than people writing nicer command line -c code that no longer works on older Python versions. That’d count as success.
