Hi everyone.
In http.cookiejar we've got a constant, IPV4_RE, which is widely used to decide whether a string is an HDN (host domain name) or an IP address:
IPV4_RE = re.compile(r"\.\d+$", re.ASCII)
It is used like this:
def liberal_is_HDN(text):
    """Return True if text is a sort-of-like a host domain name.
    For accepting/blocking domains.
    """
    if IPV4_RE.search(text):
        return False
    return True
It seems that it only checks whether the string ends with a dot followed by some digits. IMO there are two problems:
- Firstly, a valid IPv4 address shouldn't look like 257.257.257.257, right? We could add number ranges to the regex.
- Secondly, it does not support IPv6. An address like 2001:db8::1 would be detected as an HDN, which is completely wrong.
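To make both points concrete, here is a small self-contained check. The first pattern is the one quoted above; `STRICT_IPV4_RE`, `looks_like_ip_today` and `is_strict_ipv4` are hypothetical names I made up for illustration, and the range-limited pattern is only a sketch of what "adding number ranges" could look like, not something that exists in the stdlib.

```python
import re

# Same pattern as the one currently in http.cookiejar.
IPV4_RE = re.compile(r"\.\d+$", re.ASCII)

# Hypothetical range-limited alternative: four octets, each 0-255,
# with no leading zeros.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
STRICT_IPV4_RE = re.compile(rf"(?:{_OCTET}\.){{3}}{_OCTET}", re.ASCII)


def looks_like_ip_today(text):
    """Mirror the current check: anything ending in '.<digits>' counts as an IP."""
    return IPV4_RE.search(text) is not None


def is_strict_ipv4(text):
    """Return True only for a dotted quad whose octets are all in 0-255."""
    return STRICT_IPV4_RE.fullmatch(text) is not None


for candidate in ("192.168.0.1", "257.257.257.257", "2001:db8::1", "example.com"):
    print(candidate, looks_like_ip_today(candidate), is_strict_ipv4(candidate))
```

With the current pattern, 257.257.257.257 is treated as an IP address and 2001:db8::1 is not. The range-limited pattern fixes the first case, but it still does nothing for IPv6.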
The author left this comment in the function is_HDN:
# XXX
# This may well be wrong. Which RFC is HDN defined in, if any (for
# the purposes of RFC 2965)?
# For the current implementation, what about IPv6? Remember to look
# at other uses of IPV4_RE also, if change this.
Since the stdlib has had the ipaddress module since Python 3.3, why couldn't we simply use
import ipaddress

def is_ip(text):
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False
to handle the IP address problem?
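For reference, this is roughly how I imagine it fitting together. `liberal_is_HDN_sketch` is a hypothetical name used only here and this is not a finished patch. It only shows how the helper behaves on the examples from this post, plus two edge cases that I think would need discussion.

```python
import ipaddress


def is_ip(text):
    """Return True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def liberal_is_HDN_sketch(text):
    """Hypothetical rewrite: anything that is not an IP address counts as an HDN."""
    return not is_ip(text)


candidates = (
    "192.168.0.1",      # valid IPv4
    "257.257.257.257",  # out-of-range octets, rejected by ipaddress
    "2001:db8::1",      # valid IPv6
    "example.com",      # ordinary host name
    "example.123",      # the current regex treats this as an IP; ipaddress does not
    "[2001:db8::1]",    # bracketed form as seen in URLs; ipaddress rejects it
)
for candidate in candidates:
    print(candidate, is_ip(candidate), liberal_is_HDN_sketch(candidate))
```

Two things to note: strings such as example.123 and 257.257.257.257 would change from "IP" to "HDN", which is a behaviour change, and a bracketed IPv6 literal would have to be unwrapped before the check if the host ever reaches this function in that form (I have not verified whether it does).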
I'd like to create a PR for this (replacing the IPV4_RE regex with is_ip()). Please let me know if you have any suggestions. I really appreciate your time!