Regarding whether we should add a `Py_CurrentArch` or `Py_ArchName` function

While working on gh-102536 (Pull Request #124582 in python/cpython by rruuaanng, which adds an `_architecture` field to `sys.implementation`), I found that getting the architecture we are currently running on through the C API is quite troublesome. Although Unix-like systems have uname and Win32 has GetNativeSystemInfo, not everyone knows these two APIs (maybe I’m not familiar enough with the C API, or maybe there’s another way). So I think it’s worth adding a public (or private) function to the C API that returns the hardware architecture we’re currently running on.

Perhaps in the future, when other hardware architecture-specific optimizations are implemented, such an API could be used to identify the current system architecture and take the corresponding action.

Maybe this code can illustrate my dilemma:

#if !defined(MS_WINDOWS) && defined(HAVE_SYS_UTSNAME_H)
    res = uname(&u);
    if (res < 0)
        goto error;
    value = PyUnicode_FromString(u.machine);
#else
    /* ignore other */
    GetNativeSystemInfo(&s);
    switch (s.wProcessorArchitecture) {
    case PROCESSOR_ARCHITECTURE_AMD64:        break;
    case PROCESSOR_ARCHITECTURE_ARM:          break;
    case PROCESSOR_ARCHITECTURE_ARM64:        break;
    case PROCESSOR_ARCHITECTURE_IA64:         break;
    case PROCESSOR_ARCHITECTURE_INTEL:        break; 
    case PROCESSOR_ARCHITECTURE_UNKNOWN:      break;
    default: break;
    }
#endif /* !MS_WINDOWS */
    if (value == NULL)
        goto error;
    res = PyDict_SetItemString(impl_info, "_architecture", value);
    Py_DECREF(value);
    if (res < 0)
        goto error;

Actually, this is how I wanted to express it:

   arch = Py_ArchName();  /* proposed API; the caller frees the result */
   value = PyUnicode_FromString(arch);
   PyMem_RawFree(arch);
   if (value == NULL)
       goto error;
   res = PyDict_SetItemString(impl_info, "_architecture", value);
   Py_DECREF(value);
   if (res < 0)
       goto error;

And in the future, when it is used in other architecture-related areas, it could be even more useful:

arch = Py_ArchName();  /* or Py_CurrentArch() */
if (strcmp(arch, "win32") == 0) {
    /* ... */
}
else if (strcmp(arch, "arm") == 0) {
    /* ... */
}
/* some other */

If this proposal is accepted, I will submit my changes (I have already implemented and applied them in my personal compiler, and they are very practical).

I don’t recall ever needing to know the architecture while writing C code. I don’t understand the use case for such a function.

(By the way, a string is not convenient in C; I expected an integer or an enum.)


Based on what you mentioned, and if it’s allowed, maybe I can go ahead with this submission.

I don’t know that we’ve come up against a major need for checking “does the current OS differ from how CPython was compiled.”[1] If there are a few places we already do this kind of check, it might be worth creating a helper function, but I’m not aware of any. Usually we target optimisations for the compile-time architecture, and users get that behaviour wherever they run, so they can choose which build to use.

The proposed sys.implementation._architecture[2] attribute is also not meant to be the current OS, but it should be the architecture that CPython was compiled for. Right now, there’s no obvious way for Python code to tell whether it’s running on an ARM64, x64 or x86 build of Python (if we’re on Windows ARM64, any of these are possibilities).[3]


  1. Which mainly matters on Windows, with its transparent CPU emulation, and possibly on macOS. ↩︎

  2. Name is still under discussion ↩︎

  3. The non-obvious way is to check sys.winver, but it requires parsing that isn’t specified and won’t be specified, because that’s not the point of this field. It just happens to be the only place where the architecture is embedded at compile time. ↩︎


Isn’t it better to do a feature check at runtime?

Is the API you require supported?

  • you want to use a new Linux kernel system call, for example

Does the CPU have the instruction you want to use?

  • you want to use AVX512 to optimise an algorithm
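To make the first kind of check concrete, here is a minimal Python sketch of the "is the API supported?" idea. The helper name `call_if_supported` is hypothetical and not part of any library; it only assumes that a missing system call shows up either as a missing attribute on `os` or as an `OSError` with `errno.ENOSYS`. It does not cover the CPU-instruction case, which the standard library has no direct check for.

```python
import errno
import os


def call_if_supported(name, *args, fallback=None):
    """Call os.<name>(*args) if this build and this kernel support it.

    Hypothetical helper for illustration only. Returns `fallback` when the
    function is not exposed by this Python build, or when the OS reports
    ENOSYS ("function not implemented") at call time.
    """
    func = getattr(os, name, None)
    if func is None:
        # The function was not available when this Python was built.
        return fallback
    try:
        return func(*args)
    except OSError as exc:
        if exc.errno == errno.ENOSYS:
            # Exposed by Python, but the running kernel lacks the call.
            return fallback
        raise


# Usage: no architecture or OS name is inspected anywhere.
cwd = call_if_supported("getcwd")
missing = call_if_supported("no_such_call_xyz", fallback="n/a")
```

The point of the design is that the code asks about the feature itself instead of inferring it from an architecture or OS name.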

But from what I can tell, the result it returns is just the Python version, not the current architecture.

>>> import sys; sys.winver
'3.14'

I’m sorry, I don’t understand why sys.winver would be parsed.
So my PR does not include anything about winver; it only obtains the current architecture through the Win32 API.

Maybe, but this post is just to collect responses from developers. There may be other developers who need this API.

If you are using a 32-bit build, it’ll be 3.14-32, and on an ARM64 build it’ll be 3.13-arm64, regardless of what operating system or CPU you’re actually running on. You can try this with a 32-bit build easily enough.
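As a rough Python sketch of the difference being described: the suffix of `sys.winver` is fixed when the interpreter is built, while `platform.machine()` asks the running system. The helper names `winver_suffix` and `describe` are hypothetical, and the format of `sys.winver` is unspecified, so this splitting is an assumption based only on the example values quoted above.

```python
import platform
import sys


def winver_suffix(winver: str) -> str:
    """Return the text after the first '-' in a sys.winver-style string.

    Returns '' when there is no suffix (e.g. '3.14'). The format is not
    specified anywhere, so treat this as a best-effort guess.
    """
    _version, _sep, suffix = winver.partition("-")
    return suffix


def describe() -> dict:
    """Collect the build-time hint and the runtime machine name."""
    winver = getattr(sys, "winver", None)  # only defined on Windows
    return {
        # Embedded at compile time; None when not running on Windows.
        "winver_suffix": None if winver is None else winver_suffix(winver),
        # Queried from the running system.
        "runtime_machine": platform.machine(),
    }


print(describe())
```

On a build where the two disagree, the first value describes the build of Python and the second describes the system it happens to be running on.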


Yes, which we don’t need. The architecture used at compile time is available as preprocessor variables, and we typically code so that we assume the current CPU matches what we compiled for. If it doesn’t match, users may get worse performance, but they could switch to a build that matches. This ensures that we have portable code.

I’m not aware of anywhere currently that we would benefit from detecting that the current CPU is different from what we were compiled for. Hypothetical cases exist, sure, but not adding the API doesn’t prevent us from adding it in the future (or make it more complex in the future).


It seems that it would be better as a macro, because it can be passed as an optional parameter to config.h.

Feature checking is a frequently used design, because an API like the one you propose leads to maintenance issues in the code that uses it.

You would use answers from your proposed API to infer the availability of features. But history shows this is an error-prone way to determine whether a feature is available.

Yes! Here, I’ve gotten confirmation that I should probably try getting the value of sys.winver instead of using another method.

I know what adding this API entails, but it seems like it could provide a temporary solution to a specific issue. (So this post is about whether it’s worth adding the API, not about whether the action is correct).

I’m confused. Are you saying it is a bad idea to add this API, but you want it anyway?

I think it could work for getting the architecture when the program runs, instead of when it’s compiled. But it doesn’t seem to work well with CPython. Hmm, it looks like it’s only good for solving my specific issue and not something I want to put out there publicly.