Make test module external

latot · June 20, 2023, 5:32pm

Hi, I have been working with several OSs and environments, and I have noticed several problems with Python on them, so I wanted to run the Python tests. But I was not able to do that in all cases, because the test module is internal to Python, and some OSs/environments compile and install Python without the test module, so the Python regression tests cannot be executed.

A possible solution would be to make the test module external and installable with pip, so we would be able to install it in every environment and check the integrity of Python.

Thx!

Rosuav · June 20, 2023, 6:15pm

What problems have you noticed?

Jelle · June 20, 2023, 6:36pm

The test package is already part of a Python installation created through CPython’s own mechanisms. Some vendors may choose not to distribute it as part of their distribution of CPython, but that’s not something we can fix.

It would be possible to create a PyPI module that distributes the test package, but I don’t think that’s a burden the core team would like to take on.

latot · June 20, 2023, 7:21pm

D:

For now… conda Python just crashes with some big projects; some libraries seem to overflow sometimes.

On Debian, I found a problem when writing data in a VM. I was not able to get simpler code to reproduce it: every test I did works, and it only fails in some cases.

Debian does not include the full test module, so when I wanted to find out where the problem could be, it got harder.

Maybe, instead of making the test module external, it could be possible to run the tests from the git repo but against the actual environment, instead of using the Python built from the repo.

nad · June 20, 2023, 8:02pm

I believe you’ll find the Python test suite in a separate Debian package, for example: Debian -- Details of package libpython3.11-testsuite in bookworm
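To check from inside a given environment whether the regression tests were actually shipped, a minimal sketch like the following may help. It assumes the usual CPython layout, where the tests live in the `test` package and are run with `python -m test`. The helper names `has_test_suite` and `regrtest_command` are made up for illustration, and `test.test_int` is just one test module picked as a probe.

```python
import importlib.util
import sys


def has_test_suite(probe: str = "test.test_int") -> bool:
    """Return True if the probe test module can be located by this interpreter."""
    try:
        return importlib.util.find_spec(probe) is not None
    except ModuleNotFoundError:
        # Raised when the parent package `test` itself is missing.
        return False


def regrtest_command(*test_names: str) -> list[str]:
    """Build the command line that runs the given tests with this interpreter."""
    return [sys.executable, "-m", "test", *test_names]


if __name__ == "__main__":
    if has_test_suite():
        print("Test suite found; try:", " ".join(regrtest_command("test_int")))
    else:
        print("Test suite not installed for", sys.executable)
```

Probing an individual test module rather than the `test` package itself is deliberate: a distribution may ship only part of the package, so the package alone being importable does not show that the tests are present.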

jeanas · June 20, 2023, 8:29pm

Although possible, I consider it quite unlikely that you found a bug in the Anaconda or Debian distributions of Python that is related to those distributions and not in Python itself. For packaging matters, it is possible, but your use case didn’t sound packaging-related.

It is far more likely that either your code, or a third-party library you are using has a problem.

latot · June 20, 2023, 8:47pm

The errors are not of that type; the code works fine on Linux outside a VM. Different code sections fail for different reasons on different environments, and the conda crashes are weird.

I don’t know if the problems are related to the distributions, but at least Python on Linux is able to handle bigger numbers than the Windows one, so on Windows it crashes just loading a CSV.

In a Linux VM it just fails to write a numpy array with imageio, in some weird circumstances.

So it is not the code; I have tested it.

Since I can’t pin down the causes, I think it would be great to be able to run the test suites. Thanks for the comment above: I was able to install the test suite on Debian and I’m running the tests now, and I’m looking into how to run them on conda (I still can’t find how; conda has python-regr-testsuite, but it does not seem to be the test suite).

jeanas · June 20, 2023, 9:13pm

If you are more specific about what the “weird” crashes and circumstances are, or show some code, maybe we can be of more help.

Just accept my personal advice: in my own path as a beginner, 100% of the times I blamed Python, the bug was in my code or environment instead.

I don’t understand what you mean by “Python from Linux is able to handle bigger numbers than on Windows”. Can you explain what you mean by “handle numbers”? In Python, integers have arbitrary size (“bignums”), there are no limits to the integers you can create and manipulate [1].

[1] Other than the memory available to the computer, which is normally huge.
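A short sketch of what “arbitrary size” means in practice, using only built-in integers (the variable name `big` is just for illustration):

```python
import sys

big = 2**200 + 1          # far beyond any 64-bit machine integer
print(big.bit_length())   # number of bits needed to represent it
print(big - 2**200)       # arithmetic stays exact, no overflow

# sys.maxsize is the largest container index (Py_ssize_t), not a limit on int.
print(sys.maxsize < big)
```

Overflow errors therefore tend to appear only at the boundary where a Python int is handed to C code that expects a fixed-width type.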

latot · June 20, 2023, 9:42pm

Sorry, I didn’t write explicitly what I meant by it, but again, I wrote it above.

In a VM I got an I/O error about writing to a closed file while writing a file with imageio. It is just a normal thing: a “with” block with an open file, writing an image. I can do this fine on Linux without a VM, but inside one I get this error.

Meanwhile, on Windows, opening a CSV and setting columns to int gives “int too large to convert to c long windows”, while exactly the same code with the same data runs on Linux.

So I want to be able to check whether Python is working fine, to rule it out.

elis.byberi · June 20, 2023, 9:59pm

It looks like you are storing the CSV data in some third party library. What library are you using, numpy, pandas…?

Rosuav · June 20, 2023, 10:04pm

That might be possible if you have a 64-bit Linux Python and a 32-bit Windows Python, but you can get 64-bit Python on Windows too. I don’t know how big your CSVs are or how you’re loading them, but I am extremely dubious that this is a problem in the core Python interpreter. Can you post some actual code?

jeanas · June 20, 2023, 10:18pm

Can you show the code that you are using to do this?

This I can reproduce:

$ python
Python 3.11.3 (main, May 24 2023, 00:00:00) [GCC 13.1.1 20230511 (Red Hat 13.1.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> csv.field_size_limit(2**65)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long

That’s the only way I can see to trigger this error in the csv module. The function csv.field_size_limit() indeed does not accept integers that are bigger than a machine long (because it’s written in C, so the rules are a little bit different from how it works when you code in pure Python). Do you have such a csv.field_size_limit() call, and why? Even on Windows, a limit of 2^32 bytes is enough for several gigabytes. Do you really need to set a higher limit than that? My guess is that you added a csv.field_size_limit() call with an exceedingly large number, in the hope that it would somehow fix some error.

Well, even on 64-bit Windows, sizeof(long) == 4 (whereas sizeof(void *) == 8). This is a known quirk of the Windows platform that causes much pain.
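To see how wide a C long is for the interpreter at hand, and where csv.field_size_limit() starts refusing values, something like the following sketch can be run on each platform. It relies only on the standard ctypes and csv modules; the helper name `limit_accepted` is hypothetical, and it restores the previous limit so the probe has no lasting effect.

```python
import csv
import ctypes

long_bits = ctypes.sizeof(ctypes.c_long) * 8       # width of C long
pointer_bits = ctypes.sizeof(ctypes.c_void_p) * 8  # width of a pointer
max_c_long = 2 ** (long_bits - 1) - 1              # largest signed C long


def limit_accepted(value: int) -> bool:
    """Try to set csv's field size limit to `value`, then restore the old one."""
    try:
        previous = csv.field_size_limit(value)
    except OverflowError:
        return False
    csv.field_size_limit(previous)
    return True


print(f"C long: {long_bits} bits, pointer: {pointer_bits} bits")
print(limit_accepted(max_c_long))      # fits in a C long
print(limit_accepted(max_c_long + 1))  # one past the maximum
```

If the first line reports a 32-bit long next to a 64-bit pointer, the same script and data can overflow there while working on a platform where long is 64 bits wide.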

latot · June 21, 2023, 6:21am

Hi, running this code from a module fails with the error:

with open(out_png_, "wb") as f:
  imageio.imwrite(f, imageio.imread(data), compress_level=9, format="png")
"ValueError: I/O operation on closed file" occurs when you try to perform an operation on a closed file

I don’t have any call to csv.field_size_limit()

Thx!

jeanas · June 21, 2023, 6:25am

Then what is your code that causes the long conversion error? And what is the backtrace that is printed when running it?

jeanas · June 21, 2023, 5:23pm

Latot:

running this code from a module fails with the error:

with open(out_png_, "wb") as f:
  imageio.imwrite(f, imageio.imread(data), compress_level=9, format="png")
"ValueError: I/O operation on closed file" occurs when you try to perform an operation on a closed file

Maybe it is not the imageio.imwrite call that fails, but the imageio.imread one. What is data? Have you tried moving this call out of the with block, e.g.,

im = imageio.imread(data)
with open(out_png, "wb") as f:
    imageio.imwrite(f, im, compress_level=9, format="png")

and seeing which is the line on which you get an exception?

MRAB · June 22, 2023, 2:03am

What is data? The docs say it can be “str, pathlib.Path, bytes, file”. If it’s a file, is that file open?

latot · June 22, 2023, 4:43am

Yeah, well, half: it is data that is in an io.BytesIO.

MRAB · June 22, 2023, 4:29pm

When you say “data that is in io.BytesIO”, do you mean that data is a BytesIO object? If it is, what does data.closed return? You don’t want to close it before reading back what it contains; you’d want to seek to the start (data.seek(0)) so that you could then read from it.
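The checks mentioned here can be tried on a plain BytesIO with no imaging library involved. This is only a sketch of the general behaviour, with `data` standing in for the object from the thread, whose real contents are unknown:

```python
import io

data = io.BytesIO()
data.write(b"example bytes")   # position is now at the end of the buffer

at_end = data.read()           # nothing left to read from the current position
data.seek(0)                   # rewind to the start
rewound = data.read()          # now the full contents come back

print(data.closed)             # False while the buffer is still usable

# Leaving a `with` block closes the buffer, which is one way to end up
# with a closed object without ever calling close() explicitly.
with io.BytesIO(b"abc") as tmp:
    pass
print(tmp.closed)

data.close()
try:
    data.read()
except ValueError as exc:
    closed_error = str(exc)
    print(closed_error)
```

Printing `data.closed` immediately before the failing call would show whether the buffer was already closed by the time it is read.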

latot · June 22, 2023, 4:56pm

I don’t close the BytesIO object until after reading the data. This only fails when it is executed in a VM; in a normal Linux environment it works correctly. I have not checked the closed property, and it is hard to check it now.