How to remove extensions in a dataframe

Hello all. The “Paper_Code” column in my dataframe has values as shown below:

Paper_Code
a.doc
b.docx

I found a script to remove the extension, using the code below:

df["Paper_Code"] = df["Paper_Code"].str.replace(r'.doc$', '')
df.index = df.index + 1

But the problem is that it only removes the .doc extension. The “x” from the .docx file remains, see below.


Paper_Code
1	a
2	bx	

Please help me find out how to remove all extensions. Thanks.

One way is to use pathlib.Path.stem:

from pathlib import Path
df["Paper_Code"] = df["Paper_Code"].apply(lambda x: Path(x).stem)
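
If you would rather stay close to the original str.replace approach, the pattern itself can be widened so it matches any extension instead of only .doc. The sketch below uses the standard re module on a plain list to show the pattern; the sample values (including "notes", which has no extension) are made up for illustration, and the pandas line in the final comment is an assumed equivalent rather than something tested against your data.

```python
import re

codes = ["a.doc", "b.docx", "notes"]  # sample values; "notes" has no extension

# \.[^.]+$ matches the last dot and everything after it, whatever the extension is
pattern = re.compile(r"\.[^.]+$")
stems = [pattern.sub("", code) for code in codes]
print(stems)  # ['a', 'b', 'notes']

# Assumed pandas equivalent (regex=True must be passed explicitly on pandas 2.0 and later):
# df["Paper_Code"] = df["Paper_Code"].str.replace(r"\.[^.]+$", "", regex=True)
```

Note that the dot is escaped here. In the original pattern an unescaped . matches any character, not just a literal dot.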

Another way:

names = ("name_test.txt", "something.doc", "something_else.docx")

for name in names:
    dot = name.rfind('.') # position of the last dot, or -1 if there is none
    new_name = name[:dot] if dot != -1 else name # this being the line of code you need
    print(new_name)
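
For names that may contain more than one dot, or no dot at all, os.path.splitext from the standard library is a possible alternative to slicing by hand. This is only a sketch; the extra sample names are hypothetical and not taken from the question.

```python
import os.path

# Hypothetical sample names, including one with two dots and one without an extension
names = ("name_test.txt", "something.doc", "my.report.docx", "no_extension")

# splitext returns a (root, ext) pair; ext is an empty string when there is no extension
stems = [os.path.splitext(name)[0] for name in names]
print(stems)  # ['name_test', 'something', 'my.report', 'no_extension']
```

Only the last extension is removed, so "my.report.docx" keeps its "my.report" part.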

Thank you very much.