{k: "" if math.isnan(v) else v for k, v in d.items()}
TypeError: must be real number, not str
What is the best way to check for nan that doesn’t break on non-numeric types? The only options I could come up with are:
Using an if statement that checks if each value is a float: {k: v if not (isinstance(v, float) and math.isnan(v)) else '' for k,v in d.items()}
Utilizing the json module: json.loads(json.dumps(d), parse_constant={"NaN":""}.get)
Relying on every other value being equal to itself: {k: v if v==v else "" for k,v in d.items()}
There are ways to do this with dependencies like Numpy/Pandas, but they are not available at the time this code is executed. I am wondering whether there is a standard, idiomatic way to do this with just Python and the standard library before I pick from one of the above.
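For anyone comparing the three options side by side, here is a minimal sketch that runs all of them on the same input. The sample dictionary `d` is made up for illustration and is not from the real data.

```python
import json
import math

# Hypothetical sample data: one NaN mixed in with non-float values.
d = {"a": 1.5, "b": float("nan"), "c": "text", "d": None, "e": 3}

# Option 1: only call math.isnan on values that are floats.
by_isinstance = {
    k: v if not (isinstance(v, float) and math.isnan(v)) else ""
    for k, v in d.items()
}

# Option 2: JSON round trip. parse_constant is called with the strings
# "NaN", "Infinity" or "-Infinity" whenever one of them is decoded.
by_json = json.loads(json.dumps(d), parse_constant={"NaN": ""}.get)

# Option 3: NaN is the only value here that is not equal to itself.
by_self_eq = {k: v if v == v else "" for k, v in d.items()}

expected = {"a": 1.5, "b": "", "c": "text", "d": None, "e": 3}
assert by_isinstance == by_json == by_self_eq == expected
```

All three agree on this input, but the JSON round trip has side effects on other data. Because the lookup uses `.get`, an infinite float would come back as `None`. Tuples come back as lists, and non-string keys come back as strings.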
Sure. If only specific NaNs are wanted, then a more specific test is needed.
I was reading between the lines and took an educated guess about what OP meant to say they wanted, not what they actually said they wanted, i.e. general data cleaning and fewer bugs.
Why would anyone want to keep one type of NaN, and not another (we already know they’re being replaced with an empty string, not a particular NaN option)?
Yep. That’s the standard idiom for nan testing. If there are any other values that aren’t equal to themselves, deal with it when you find it; you’ll probably find that those would be considered to be nan by various other tools too. There’s really no reason to force a float check first.
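A small sketch of why the self-comparison works, and of one case where it differs from the float check. The helper name `is_nan_like` is made up for this example, and `Decimal` is used only as one standard-library type that has its own NaN.

```python
import math
from decimal import Decimal


def is_nan_like(v):
    """Hypothetical helper: True for any value that is not equal to itself."""
    return not (v == v)


nan = float("nan")
assert is_nan_like(nan)
assert math.isnan(nan)

# Ordinary values, including infinity, are equal to themselves.
for value in ("text", None, 3, 1.5, float("inf")):
    assert not is_nan_like(value)

# A quiet Decimal NaN is also unequal to itself, so v == v catches it,
# while an isinstance(v, float) guard would leave it untouched.
dec_nan = Decimal("NaN")
assert is_nan_like(dec_nan)
assert not isinstance(dec_nan, float)
```

So the two tests only disagree on non-float NaNs. Which behavior is wanted depends on whether such values can show up in the data at all.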
To replace nan without external libraries when strings are present, you can iterate through your data and check each element. Use a conditional statement to replace ‘nan’ strings with a desired value, ensuring data integrity and consistency throughout your dataset.
Hey @micky388 and welcome to Python discussions! I’m actually really new here myself, but you seem to be newer so I thought I’d offer the welcome.
I appreciate you trying to help me, but I don’t see how I can take your comment and put it to action. The original post lists three ways to do what you suggest, but you don’t seem to identify which of them (if any) to go with. Let me know if I’m missing something from your post.
For everyone else, I went with
{k: "" if isinstance(v, float) and math.isnan(v) else v for k, v in d.items()}
because I think it’s more clear for code readers. I like how succinct if v==v else "" is, but I think the explicit type check is more clear to someone newer to Python (which is expected in the business context of this module). Thanks to everyone who answered!
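For reference, a self-contained sketch of the chosen approach next to the original failing comprehension. The function name `blank_nans` and the sample dictionary are assumptions for illustration only.

```python
import math


def blank_nans(d):
    """Return a copy of d with float NaN values replaced by an empty string."""
    return {
        k: "" if isinstance(v, float) and math.isnan(v) else v
        for k, v in d.items()
    }


# Hypothetical sample data with a string next to a NaN.
d = {"price": float("nan"), "name": "widget", "qty": 2}

# The unguarded version fails as soon as math.isnan sees the string.
raised = False
try:
    {k: "" if math.isnan(v) else v for k, v in d.items()}
except TypeError:
    raised = True
assert raised

# The guarded version leaves non-float values alone.
cleaned = blank_nans(d)
assert cleaned == {"price": "", "name": "widget", "qty": 2}
```

The `isinstance` check short-circuits, so `math.isnan` is only ever called on floats.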