Error Code: Try using .loc[row_indexer,col_indexer] = value instead

I am trying to find books from the list with certain criteria:

german = (books[books['language_code'] == 'ger'])

german['year'] = pd.DatetimeIndex(german['publication_date']).year
filtered_german = german.loc[(german['year'] >= 2001) & (german['year'] <= 2005)]
print(filtered_german.shape)

filtered_german

After I executed it, I got this message from the kernel:

/var/folders/7y/rvhwt04n48g8nsfq443cymkh0000gn/T/ipykernel_57908/656079508.py:9: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a data frame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  german['year'] = pd.DatetimeIndex(german['publication_date']).year

Can someone explain the message and how I can fix this?
THANK YOU!

This is probably the most frequently asked of all pandas questions – and for a pandas user it’s pretty important to know how to avoid this warning. Have you tried some of the tried-and-true methods to find the answer for yourself first? (Like Google? Stack Overflow? The pandas documentation?)

Links:

I did. But it seems they didn’t have the same problem.

I tried different coding to eliminate the “warning”, but it’s still there. I wonder why the code triggers this warning.

So, this has nothing to do with any kind of encoding. It’s about how selections (slices) are made from a given DataFrame. The warning message (and the links I gave) point to the pandas docs, which explain in detail why pandas gives this warning and when it matters (sometimes it’s an innocuous warning).

>>> import pandas as pd
>>> df = pd.DataFrame(dict(A=[1, 2, 3, 4], B=["a", "a", "b", "b"]))
>>> df
   A  B
0  1  a
1  2  a
2  3  b
3  4  b
>>> z = df[df.B == 'b']  # selects subset of rows
>>> z['C'] = z.A + 1    # add column, implicitly using chained indexing

<stdin>:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

>>> w = df[df.B == 'b'].copy()  # explicitly creates a new, independent DataFrame based on the selection
>>> w['C'] = w.A + 1   # succeeds without warning

When column ‘C’ is created on z, this is really chained indexing (df[df.B == 'b']['C'] = some_series). In this simple case that’s fine, but in other cases the assignment may land on a temporary copy of the selected subset of the df, which is then thrown away, leading to bugs.
Pandas cannot always determine whether chained indexing is safe, so it emits the warning.
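
To make the "thrown away" case concrete, here is a minimal sketch using the same toy df as above. The warning is silenced only so the demo runs quietly; the exact warning class differs between pandas versions, so treat that detail as an assumption rather than a guarantee.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["a", "a", "b", "b"]})

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the warning for this demo only
    # Chained assignment: the boolean selection produces a temporary object,
    # and the value is written to that temporary, not to df.
    df[df.B == "b"]["A"] = 0

unchanged = df["A"].tolist()  # df still holds the original values

# A single .loc call on the original selects rows and column in one step,
# so the assignment reaches df itself.
df.loc[df.B == "b", "A"] = 0
changed = df["A"].tolist()

print(unchanged, changed)
```

The first assignment looks like it should work but leaves df untouched, which is exactly the kind of silent bug the warning is meant to flag.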

Alternatively,

>>> df['C'] = df.A + 1  # fine, simply creates a new column in original df
>>> z = df[df.B == 'b']
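#  z.C = z.A + 2  # will generate warning again (z.loc[:, 'C'] = ... may too, since z is still a selection); instead do:
>>> df.loc[df.B == 'b', 'C'] = df.A + 2  # one .loc call on the original df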

In simple cases like this the warning is harmless, but in general I think it’s best (simplest) to avoid it by
never trying to modify a selection or view of a given dataframe using chained indexing. If you need a modification, use .loc. If you need to add a new column, either add a dummy column to the original df and then modify that with .loc, or make an explicit copy of the view (as above with w).
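
Applied to the code in the question, that advice might look like the sketch below. The `books` rows here are made up for illustration, and the ISO date strings are an assumption about the format of `publication_date`; the real data may need a `format=` argument to `pd.to_datetime`.

```python
import pandas as pd

# Hypothetical stand-in for the asker's `books` table; the rows are made up.
books = pd.DataFrame(
    {
        "title": ["Buch A", "Book B", "Buch C", "Buch D"],
        "language_code": ["ger", "eng", "ger", "ger"],
        "publication_date": ["2003-03-15", "2004-07-01", "1999-11-30", "2005-01-05"],
    }
)

# Option 1: take an explicit copy of the selection, then add the column to it.
german = books[books["language_code"] == "ger"].copy()
german["year"] = pd.to_datetime(german["publication_date"]).dt.year
# Series.between includes both endpoints by default.
filtered_german = german[german["year"].between(2001, 2005)]

# Option 2: add the column to the full table first, then select in one .loc call.
# (The copy here only keeps the two options independent in this sketch.)
books_with_year = books.copy()
books_with_year["year"] = pd.to_datetime(books_with_year["publication_date"]).dt.year
mask = (books_with_year["language_code"] == "ger") & books_with_year["year"].between(2001, 2005)
filtered_german_2 = books_with_year.loc[mask]

print(filtered_german.shape)
```

Both options avoid assigning into a selection of another DataFrame, so neither should trigger the warning. Option 1 is the smaller change to the original code.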