i200yrs
(Rhett)
April 18, 2023, 9:55am
1
Hello all. In my DataFrame, the column “Paper_Code” has values as shown below:
Paper_Code
a.doc
b.docx
I have a script that removes the extension with the code below:
df["Paper_Code"] = df["Paper_Code"].str.replace(r'.doc$', '')
df.index = df.index + 1
But the problem is that it only removes the .doc extension. The “x” from the .docx file remains, see below:
Paper_Code
1 a
2 bx
Please help me find out how to remove every extension. Thanks.
abessman
(Alexander Bessman)
April 18, 2023, 10:06am
2
One way is to use pathlib.Path.stem:
from pathlib import Path
df["Paper_Code"] = df["Paper_Code"].apply(lambda x: Path(x).stem)
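For anyone who wants to try this end to end, here is a minimal sketch on the two sample values from the question. The anchored regex variant is an addition for comparison, not something proposed in this thread; it assumes every value ends in a single ".something" suffix, and it passes regex=True explicitly because the default for that argument differs between pandas versions.

```python
from pathlib import Path

import pandas as pd

# Sample data mirroring the values shown in the question.
df = pd.DataFrame({"Paper_Code": ["a.doc", "b.docx"]})

# Path.stem drops only the final suffix, whatever its length.
stems = df["Paper_Code"].apply(lambda x: Path(x).stem)

# Vectorized alternative: remove a trailing ".something" with an anchored regex.
# The escaped dot matches a literal ".", and "$" pins the match to the end.
stripped = df["Paper_Code"].str.replace(r"\.[^.]+$", "", regex=True)

print(stems.tolist())
print(stripped.tolist())
```

Both approaches remove only the last suffix, so a value such as "report.v2.docx" would become "report.v2".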
2 Likes
rob42
(Rob)
April 18, 2023, 10:16am
3
Another way:
names = ("name_test.txt", "something.doc", "something_else.docx")
for name in names:
    new_name = name.replace(name[name.find('.'):], '')  # this being the line of code you need
    print(new_name)
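One thing to be aware of with find('.'): it locates the first dot, so a name containing earlier dots loses more than the extension, and a name with no dot at all makes find return -1, which this one-liner does not handle. The sketch below compares cutting at the first dot with os.path.splitext, which splits at the last dot. The helper names and the sample file names are made up for illustration.

```python
import os.path


def strip_at_first_dot(name: str) -> str:
    """Drop everything from the first dot onward; leave dot-free names alone."""
    first_dot = name.find(".")
    return name if first_dot == -1 else name[:first_dot]


def strip_at_last_dot(name: str) -> str:
    """Drop only the final extension, using the standard library helper."""
    return os.path.splitext(name)[0]


# Hypothetical names chosen to show where the two approaches differ.
names = ("name_test.txt", "archive.tar.gz", "v1.2_notes.docx", "README")

for name in names:
    print(name, "->", strip_at_first_dot(name), "|", strip_at_last_dot(name))
```

Which behaviour is right depends on the data: cutting at the first dot suits names like "archive.tar.gz" when the whole compound suffix should go, while cutting at the last dot is safer when the base name itself may contain dots.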
2 Likes