Coming from C, I had all sorts of beginner confusion: pass-by-value and explicit pointers versus Python's object-reference model, and how that interacts with mutable and immutable types.
I kept getting caught out by reassigning function arguments (I thought of them as some kind of automatically dereferenced pointer). The idea of str being immutable was a totally new concept to me. Without the help of an IDE, there are no warnings either.
TL;DR:
These two look like they do the same thing and print the same result, but the first creates a new object and rebinds the name df to it, while the second mutates the existing object in place.
df = df * 2
print(df)
df[:] = df[:] * 2
print(df)
To a beginner this can be very confusing. It's not until you have a mental model of mutable types and object references that it becomes second nature. Coming from C pointers, the foundations were there, but the Pythonic way was still confusing, especially with immutable strings and the extended pandas notation.
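A minimal sketch of the string side of this, in plain Python. The function name shout is made up for illustration; the point is only that reassigning the parameter rebinds a local name, and that str offers no in-place mutation at all.

```python
def shout(text):
    # Rebinds the local name to a brand-new str; the caller's object is untouched.
    text = text.upper()
    return text

greeting = "hello"
result = shout(greeting)
print(greeting)  # hello
print(result)    # HELLO

# str cannot be changed in place: item assignment raises TypeError.
try:
    greeting[0] = "J"
    mutated = True
except TypeError:
    mutated = False
print(mutated)  # False
```

The only way to "change" a string the caller holds is to return the new one and let the caller rebind its own name.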
Full example:
import pandas as pd
data_dict = {'col_a': [1, 2, 3], 'col_b': [1, 2, 3]}
df = pd.DataFrame(data_dict)
df
Out[5]:
   col_a  col_b
0      1      1
1      2      2
2      3      3
df * 2
Out[6]:
   col_a  col_b
0      2      2
1      4      4
2      6      6
def double(df):
    df = df * 2
    print(df)
double(df)
   col_a  col_b
0      2      2
1      4      4
2      6      6
df
Out[9]:
   col_a  col_b
0      1      1
1      2      2
2      3      3
# aaargh why didn't the double stick??
# Even printed it in the function to make sure!!
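The same applies to any mutable object: rebinding the parameter name inside a function never changes what the caller's variable refers to.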
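For instance, here is a rough sketch with a plain list. The names double_rebind and double_in_place are hypothetical, picked just for this illustration.

```python
def double_rebind(values):
    # Builds a new list and binds it to the local name only.
    values = [v * 2 for v in values]

def double_in_place(values):
    # Slice assignment overwrites the contents of the list the caller passed in.
    values[:] = [v * 2 for v in values]

nums = [1, 2, 3]

double_rebind(nums)
after_rebind = list(nums)  # snapshot for comparison
print(after_rebind)        # [1, 2, 3]

double_in_place(nums)
print(nums)                # [2, 4, 6]
```

In C terms, the first function is like assigning to a local copy of a pointer, and the second is like writing through it.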
Of course, this works:
df[:] = df[:] * 2
df
Out[14]:
   col_a  col_b
0      2      2
1      4      4
2      6      6
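Put back inside a function, the slice assignment is what makes the doubling stick. The sketch below assumes pandas is installed. The function names double_in_place and doubled are made up here, and returning a new DataFrame is shown only as one common alternative, not as the one right way.

```python
import pandas as pd

def double_in_place(df):
    # Full-slice assignment writes into the DataFrame object the caller passed.
    df[:] = df[:] * 2

def doubled(df):
    # Alternative: leave the argument alone and hand back a new DataFrame.
    return df * 2

df = pd.DataFrame({'col_a': [1, 2, 3], 'col_b': [1, 2, 3]})

double_in_place(df)
print(df['col_a'].tolist())  # [2, 4, 6]

df = doubled(df)  # the caller rebinds its own name explicitly
print(df['col_a'].tolist())  # [4, 8, 12]
```

The second style keeps the rebinding visible at the call site, which tends to be easier to follow than a function that silently changes its argument.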