How to keep floating-point precision when converting a data frame to a PyTorch tensor

elenora · July 17, 2024, 1:51am

Hello. I have a data frame, and one of its columns contains ids. I need to convert the id column and the remaining columns to torch.tensor separately. After the conversion, the id values, which should look like this:
[12., 13., 13.1, 13.2, 14.0, 14.1, 5004.0, 5008.0, 5010.0, ...]

are instead displayed like this:

[0.0000e+00, 1.0000e+00, 2.0000e+00, 3.0000e+00, 3.1000e+00, 3.2000e+00,
        4.0000e+00, 4.1000e+00, 4.2000e+00, 5.0000e+00, 5.1000e+00, 6.0000e+00,
        6.1000e+00, 6.2000e+00, 7.0000e+00, 7.1000e+00, 8.0000e+00, 8.1000e+00,
        8.2000e+00, 9.0000e+00, 9.1000e+00, 9.2000e+00, 1.0000e+01, 1.1000e+01...]

Here is the code I am working with:

import numpy as np
import torch

def get_datafeatures(self, subset_df):
    # Sort rows by id so the id tensor and the feature tensor stay aligned
    subset_df_sorted = subset_df.sort_values('id')

    # Select the required columns
    selected_columns = ['id', 'rolling_mean_speed', 'rolling_std_speed', 'rolling_mean_accel', 'rolling_std_accel', 'rolling_std_y', 'rolling_mean_y']
    sub_df_sorted = subset_df_sorted[selected_columns].copy()

    # Note: this only changes how NumPy arrays are printed, not torch tensors
    np.set_printoptions(suppress=True)

    # Convert DataFrame to NumPy array
    numpy_array = sub_df_sorted.to_numpy()

    id_column = numpy_array[:, 0]      # ID column (first column)
    x_without_id = numpy_array[:, 1:]  # remaining feature columns

    # Convert ID column and features to PyTorch tensors
    id_tensor = torch.tensor(id_column, dtype=torch.float32)
    x_tensor = torch.tensor(x_without_id)

    return id_tensor, x_tensor

franklinvp · July 17, 2024, 12:10pm

What was the dtype of the data frame column that contained the ids?
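
A minimal sketch of how that check could look, together with one possible way to keep the ids readable after conversion. The data frame below is a hypothetical stand-in, not the original data, and storing the ids as float64 is an assumption about what is wanted. Two separate effects may be involved: float32 cannot represent a value like 13.1 as closely as float64 does, and PyTorch may switch to scientific notation when printing a tensor whose values span a wide range. The second effect is display only and can be changed with torch.set_printoptions.

```python
import numpy as np
import pandas as pd
import torch

# Hypothetical stand-in for the data frame in the question.
df = pd.DataFrame({
    "id": [12.0, 13.0, 13.1, 13.2, 14.0, 14.1, 5004.0, 5008.0, 5010.0],
    "rolling_mean_speed": np.linspace(0.0, 1.0, 9),
})

# 1. Inspect the dtype pandas assigned to the id column.
print(df["id"].dtype)

# 2. Keep the ids in float64 so they are stored at the same precision
#    pandas uses; the features can stay in float32 for training.
id_tensor = torch.tensor(df["id"].to_numpy(), dtype=torch.float64)
x_tensor = torch.tensor(df[["rolling_mean_speed"]].to_numpy(),
                        dtype=torch.float32)

# 3. Change only how tensors are printed (stored values are untouched).
torch.set_printoptions(sci_mode=False, precision=4)
print(id_tensor)

# For comparison: the same ids rounded to float32.
id_tensor_f32 = id_tensor.to(torch.float32)
```

If the ids are only labels and never used in arithmetic, another option would be to map them to integer indices before building the tensor, which avoids floating-point representation altogether.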