Python code works differently on Windows vs CentOS

Hi!
I have a Python script that behaves differently when I run it on Windows and when I run it on CentOS.
Below is the relevant part of the code, with comments explaining its purpose. It processes a set of CSV files (some of which have different columns from the others) and merges them into a single CSV that has all the columns:

 import csv
 import json
 from glob import glob
 
 import pandas as pd
 
 #Get the names of the CSV files in the current folder:
 local_csv_files = glob("*.csv")
 #Define the columns and the order they should appear on the final file:
 global_csv_columns = ['Timestamp', 'a_country', 'b_country', 'call_setup_time','quality','latency','throughput','test_type']
 #Dataframe list:
 lista_de_dataframes=[]
 
 #Loop to be executed for all the CSV files in the current folder.
 for ficheiro_csv in local_csv_files:
    df = pd.read_csv(ficheiro_csv)
    #Store the CSV columns on a variable and collect the number of columns:
    colunas_do_csv_aux= df.columns.values
    global_number_of_columns = len(global_csv_columns)
    aux_csv_number_of_columns = len(colunas_do_csv_aux)
    #Normalize each CSV file so that all CSV files have the same columns
    for coluna_ in global_csv_columns:
       if coluna_ not in colunas_do_csv_aux:
          #If the column does not exist in the current CSV, add an empty column with the correct header:
          df.insert(0, coluna_, "")
    #Order the dataframe columns according to the order of the global_csv_columns list:
    df = df[global_csv_columns]
    lista_de_dataframes.append(df)
    del df
 big_unified_dataframe = pd.concat(lista_de_dataframes, copy=False).drop_duplicates().reset_index(drop=True)
 big_unified_dataframe.to_csv('global_file.csv', index=False)

#Create an additional txt file with each row of the CSV in JSON format:
with open('global_file.csv', 'r') as arquivo_csv:
   with open('global_file_c.txt', 'w') as arquivo_txt:
      reader = csv.DictReader(arquivo_csv, global_csv_columns)
      iterreader = iter(reader)
      next(iterreader)
      for row in iterreader:
         out=json.dumps(row)
         arquivo_txt.write(out)

On both Windows and CentOS, this works well for the final CSV, which has all the columns in the order defined in the list:
global_csv_columns = ['Timestamp', 'a_country', 'b_country', 'call_setup_time','quality','latency','throughput','test_type']
This ordering is achieved by this line of code:

    #Order the dataframe columns according to the order of the global_csv_columns list:
    df = df[global_csv_columns]

But the final ‘txt’ file is different on CentOS: the order of the keys in each row is changed. Below is the output of the txt file on both platforms (Windows and CentOS).
Windows:

{"Timestamp": "06/09/2022 10:33", "a_country": "UAE", "b_country": "UAE", "call_setup_time": "7.847", "quality": "", "latency": "", "throughput": "", "test_type": "voice_call"}
{"Timestamp": "06/09/2022 10:30", "a_country": "Saudi_Arabia", "b_country": "Saudi_Arabia", "call_setup_time": "10.038", "quality": "", "latency": "", "throughput": "", "test_type": "voice_call"}
...

CentOS:

{"latency": "", "call_setup_time": "7.847", "Timestamp": "06/09/2022 10:33", "test_type": "voice_call", "throughput": "", "b_country": "UAE", "a_country": "UAE", "quality": ""}
{"latency": "", "call_setup_time": "10.038", "Timestamp": "06/09/2022 10:30", "test_type": "voice_call", "throughput": "", "b_country": "Saudi_Arabia", "a_country": "Saudi_Arabia", "quality": ""}
...

Is there any way to guarantee the column order on CentOS?

Are you running the same Python version on both platforms?

On CentOS I’m running: Python 2.7.18
On Windows I’m running: Python 3.9.6

I tried to install a recent version on CentOS but wasn’t able to. If you know which command/version/repository I should use to install a similar version on CentOS please let me know.

That’s a HUGE difference and you’ll definitely want to upgrade to Python 3. Use your package manager and look for something called python3 - most likely it’ll be at least broadly recent enough. Once you do that, check the version again and you can see if there are any significant differences.
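
To confirm which interpreter actually runs the script, a small check along these lines may help. This is only a sketch: the helper name `dicts_preserve_order` is made up for illustration, and the `(3, 7)` threshold is the release in which insertion-ordered dicts became a language guarantee.

```python
import sys


def dicts_preserve_order(version_info=sys.version_info):
    """Return True if plain dicts are guaranteed to keep insertion order.

    Hypothetical helper: it only compares (major, minor) against (3, 7).
    """
    return tuple(version_info[:2]) >= (3, 7)


# Report the interpreter that is really executing this file.
print("Running Python %d.%d.%d" % tuple(sys.version_info[:3]))
print("Ordered dicts guaranteed: %s" % dicts_preserve_order())
```

Running this with both `python` and `python3` on the CentOS machine should show whether the script is still being picked up by the old interpreter.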

Yes. I installed Python 3.7 and it’s now working properly.


By RV via Discussions on Python.org at 12Sep2022 16:04:

Yes. I installed Python 3.7 and it’s now working properly.

Leaving aside that you should avoid Python 2 without a really good
reason these days, the notable change is probably that dicts became
insertion-order-preserving: an implementation detail in CPython 3.6 and
a language guarantee from Python 3.7 onwards. So you’d have had that on
Windows (3.9.6) and not on CentOS (2.7.18), thus the order difference.
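
If the key order has to hold regardless of the interpreter version, one option is to stop relying on dict ordering and rebuild each row in the column order before serializing it. The sketch below assumes that approach; `row_to_json` is a hypothetical helper, and the sample row is the first CentOS line from the question.

```python
import json
from collections import OrderedDict

# Column order taken from the original script.
global_csv_columns = ['Timestamp', 'a_country', 'b_country', 'call_setup_time',
                      'quality', 'latency', 'throughput', 'test_type']


def row_to_json(row, columns):
    """Serialize a row with its keys in the order given by `columns`.

    Hypothetical helper: OrderedDict keeps insertion order on Python 2.7
    as well as Python 3, so the output no longer depends on dict ordering.
    """
    ordered = OrderedDict((name, row[name]) for name in columns)
    return json.dumps(ordered)


# A row whose keys arrive in scrambled order, as in the CentOS output.
scrambled_row = {"latency": "", "call_setup_time": "7.847",
                 "Timestamp": "06/09/2022 10:33", "test_type": "voice_call",
                 "throughput": "", "b_country": "UAE", "a_country": "UAE",
                 "quality": ""}

print(row_to_json(scrambled_row, global_csv_columns))
```

In the original loop this would mean writing `row_to_json(row, global_csv_columns)` instead of `json.dumps(row)`.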

Cheers,
Cameron Simpson cs@cskk.id.au