I have Python 3.11 on Windows 10. I’m still fairly new to Python.
I have a string of data that I got from a website. The website uses an extended ASCII character for the minus sign, and I’d like to change that to a normal printable dash. The hex value of the odd character is \x2212.
But I’ve never seen any way to represent a hex value in a Python string so that I can use the .replace() method.
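For reference, a minimal sketch of how Python string escapes treat this value. In a normal string literal, `\x` consumes exactly two hex digits, so `'\x2212'` is three characters, while `\u` consumes four hex digits and yields the single character U+2212 (Unicode name MINUS SIGN). The variable names below are illustrative only.

```python
# '\x' takes exactly two hex digits: '\x22' is a double quote,
# and the trailing '1' and '2' are ordinary characters.
x_escape = '\x2212'

# '\u' takes exactly four hex digits, giving one character.
u_escape = '\u2212'

# Equivalent ways to spell the same single character.
from_chr = chr(0x2212)
from_name = '\N{MINUS SIGN}'

print(len(x_escape), repr(x_escape))
print(len(u_escape), ord(u_escape))

# With the four-digit escape, .replace() sees the real character.
sample = '\u2212$220,500'
replaced = sample.replace('\u2212', '-')
print(replaced)
```

The `\U` escape (capital U) works the same way but takes eight hex digits, for code points above U+FFFF.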
This is what I’ve tried:
import inspect
import re

def cleanhtmllist(mylist):
    r'''Change extended ascii to normal characters.
    In the Wikipedia page there is an extended ascii character that must be changed to a minus sign.
    In: list of strings
    Out: List of strings
    Change this, to this:
    \x2212 (dec 8722) -
    '''
    procname = str(inspect.stack()[0][3]) + ":"
    newlist = []  # Collects the cleaned strings.
    for l in mylist:
        l = l.replace('\x2212', '-')
        newlist.append(l)
    return newlist
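One possible way to write the cleanup, sketched under the assumption that the character on the page really is U+2212 (decimal 8722). The name `clean_minus_signs` and the constant `MINUS_SIGN` are hypothetical and used only for illustration; the sample row is made up to mirror the data in this post.

```python
# Assumption: the odd character is U+2212 (MINUS SIGN), written with a
# four-digit \u escape rather than \x.
MINUS_SIGN = '\u2212'

def clean_minus_signs(strings):
    """Return a new list with every U+2212 replaced by an ASCII hyphen-minus."""
    return [s.replace(MINUS_SIGN, '-') for s in strings]

row = ['Walmart', 'more stuff', 'something else', '\u2212$220,500']
cleaned = clean_minus_signs(row)
print(cleaned)
```

The list comprehension builds the result list directly, so no separate accumulator list is needed, and the input list is left unchanged.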
The function actually receives a list of strings which I have to clean up.
I’m also having a problem finding the extended characters in the code below, which calls the function. This regex never finds them.
# I didn't know how to represent the hex ctr 0x2212 in Python so I used \x2212.
ind_row_data = ['Walmart', 'more stuff', 'something else', '\x2212$220,500']
tstr = ', '.join(ind_row_data)  # Change list to string.
if re.match(r'[^ -~]', tstr):  # Find high ascii ctrs.
    print(f"{ind_row_data[0]} profit={profit}")
    print(f"Row has ext ascii: {ind_row_data}")
    ind_row_data = cleanhtmllist(ind_row_data)
So the cleanhtmllist() function is never called.
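A likely reason the test never fires, shown as a small sketch: `re.match` only tries to match at the very start of the string, whereas `re.search` scans the whole string. The sample row below is an assumption that mirrors the data above, written with `\u2212` so that it actually contains the non-ASCII character.

```python
import re

# Assumed sample data, using the four-digit \u escape for U+2212.
ind_row_data = ['Walmart', 'more stuff', 'something else', '\u2212$220,500']
tstr = ', '.join(ind_row_data)

# Matches any single character outside the printable ASCII range ' ' to '~'.
non_printable_ascii = re.compile(r'[^ -~]')

# match() anchors at position 0, where the character is 'W', so it fails.
at_start = non_printable_ascii.match(tstr)

# search() looks at every position and finds the minus sign.
anywhere = non_printable_ascii.search(tstr)

print(at_start)
print(anywhere.group() == '\u2212')

# str.isascii() (Python 3.7+) is a regex-free check for non-ASCII content.
print(tstr.isascii())
```

Note that the two checks are not identical: the character class also flags ASCII control characters such as tabs and newlines, while `str.isascii()` only reports whether any character is above code point 127.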