Apologies for yet another bytecode-related discussion. This time I was dealing with COPY as a replacement for DUP_TOP and I stumbled upon some unexpected behaviour, which the following script reproduces (it requires the bytecode package to run):
from bytecode import Bytecode, Instr

print(
    [
        eval(
            Bytecode(
                [
                    Instr("LOAD_CONST", 24),
                    Instr("LOAD_CONST", 42),
                    Instr("COPY", i),
                    Instr("RETURN_VALUE"),
                ]
            ).to_code()
        )
        for i in range(3)
    ]
)
The dis documentation for COPY(i) says: "Push the i-th item to the top of the stack. The item is not removed from its original location."
I would have expected COPY 0 to duplicate the TOS, which just before the COPY opcode executes should be the literal 42. Instead I get a literal 3. I do get the expected result with COPY 1, though.
Based on the simple experiment above, I think I can conclude the following:
- the COPY oparg is not zero-based;
- COPY 0 does not crash the interpreter, but instead produces an unexpected TOS;
- COPY 1 is the replacement for DUP_TOP.
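For anyone generating bytecode across versions, the last conclusion can be sketched with the standard library alone. This is only an illustration: `dup_top_instruction` is a hypothetical helper name, and it assumes the only distinction that matters is whether the interpreter is 3.11 or newer (where COPY exists and DUP_TOP is gone).

```python
import dis
import sys


def dup_top_instruction():
    """Return (opname, oparg) for duplicating the top of stack on this interpreter.

    Hypothetical helper, not part of dis or the bytecode package.
    """
    if sys.version_info >= (3, 11):
        # COPY is 1-based: oparg 1 refers to the top of stack.
        return ("COPY", 1)
    # Older interpreters still have the dedicated, argument-less opcode.
    return ("DUP_TOP", None)


opname, oparg = dup_top_instruction()
# Either way, the instruction leaves one extra item on the stack.
effect = dis.stack_effect(dis.opmap[opname], oparg)
print(opname, oparg, effect)
```

The resulting pair could then be passed to `Instr` from the bytecode package in place of a hard-coded opcode name.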
This makes me wonder whether this behaviour is by design, or just accidental. Is there, perhaps, a special meaning for COPY 0 that is not documented in the section for the dis module?
If you look at the source code you’ll see that COPY with oparg=0 is invalid, but this is only checked using a C assert() call. That only does anything when built in debug mode (./configure --with-pydebug). Other than that, COPY uses PEEK(oparg), and PEEK is indeed 1-based.
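To make the 1-based PEEK concrete, here is a toy model in Python. It is a sketch and not CPython's actual code: `ToyStack` and `run` are made-up names, the value stack is assumed to be a fixed array with a stack pointer that points one slot past the top, and `None` stands in for whatever stale value happens to sit in an unused slot of the real interpreter.

```python
class ToyStack:
    """Toy fixed-size value stack with a stack pointer (illustrative only)."""

    def __init__(self, size):
        self.slots = [None] * size  # None models an unused slot
        self.sp = 0  # index of the first free slot, one past the top

    def push(self, value):
        self.slots[self.sp] = value
        self.sp += 1

    def peek(self, n):
        # 1-based: peek(1) is the top of stack, like stack_pointer[-n] in C.
        return self.slots[self.sp - n]

    def copy(self, oparg, checked=True):
        # The check models the debug-only assert; release builds skip it.
        if checked and oparg == 0:
            raise AssertionError("COPY 0 is invalid")
        self.push(self.peek(oparg))


def run(oparg, checked=True):
    """Mimic LOAD_CONST 24; LOAD_CONST 42; COPY oparg; RETURN_VALUE."""
    stack = ToyStack(4)
    stack.push(24)
    stack.push(42)
    stack.copy(oparg, checked)
    return stack.peek(1)


print(run(1))  # top of stack duplicated
print(run(2))  # item below the top copied
print(run(0, checked=False))  # reads the unused slot just past the top
```

In this model, COPY 0 without the check reads the slot just past the top of the stack, which would be consistent with the interpreter returning an unrelated object rather than crashing.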
I really urge you to (a) read the source code and (b) use a CPython binary built in debug mode before posting questions here.
https://github.com/python/cpython/pull/96462 is an attempt at improving the documentation of the stack effect for several opcodes. I am waiting for some feedback before dealing with the latest conflicts.
> I really urge you to (a) read the source code and (b) use a CPython binary built in debug mode before posting questions here.
I started the OP by apologising, so I hope I can be pardoned this time! I confess I didn’t go to the source code because the behaviour was already clear from my experiments, and I didn’t expect to find the rationale for a 1-based stack there either. Hence I thought I had a better chance of getting some insight here into why these new opcodes behave like this.
I think this could be regarded as one of those cases where no documentation is better than some documentation: I went to the dis documentation and assumed that indexing was 0-based (how often do you deal with a 1-based collection?). Given that I have also been swapping the TOS with itself with SWAP 1, I think it’s fair to conclude that the indexing mentioned in the dis module is 1-based. It would have been great if that had been mentioned somewhere, as @MatthieuDartiailh is doing in their PR. I look forward to those docs improvements!
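For what it's worth, the SWAP 1 observation can be sketched on a plain Python list. This is only a model of the 1-based convention, with a hypothetical `swap` helper; it says nothing about how the interpreter implements SWAP.

```python
def swap(stack, oparg):
    """Swap the top of stack with the item at 1-based position oparg.

    Hypothetical helper modelling SWAP on a list whose last element is the TOS.
    """
    if oparg < 1:
        raise ValueError("oparg must be >= 1 in this 1-based model")
    stack[-1], stack[-oparg] = stack[-oparg], stack[-1]
    return stack


print(swap([24, 42], 1))  # SWAP 1: the top is swapped with itself
print(swap([24, 42], 2))  # SWAP 2: the two topmost items trade places
```

With position 1 denoting the TOS, SWAP 1 is a no-op and SWAP 2 is the first oparg that changes anything.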