Ipaddress.py exclude address speed up

Currently, address_exclude in ipaddress does a kind of recursive split, checking at each step whether the IP to exclude is contained in the subnet being split. This ends up doing a lot of checks when there is an easier way to exclude things.
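For reference, here is a small sketch of what the existing method returns. The addresses are made up for illustration; only the documented address_exclude() method is used.

```python
import ipaddress

net = ipaddress.ip_network("10.0.0.0/8")
excluded = ipaddress.ip_network("10.1.2.3/32")  # illustrative address

# address_exclude() yields the subnets of `net` that do not contain `excluded`.
remaining = sorted(net.address_exclude(excluded))

# One subnet is produced per halving step between /8 and /32.
print(len(remaining))
for subnet in remaining[:3]:
    print(subnet)
```

Together the returned subnets cover every address in the /8 except the excluded one.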

IPs in ipaddress are treated as integers behind the scenes. You can even construct an IP address from an integer:

ipaddress.IPv4Address(100000)
IPv4Address('0.1.134.160')

Given this, you can treat a network such as 10.0.0.0/8 as the inclusive integer range (167772160, 184549375). To exclude an IP or a subnet, you just remove that integer or range of integers from it. It ends up being much quicker.
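A minimal sketch of the integer-range idea, assuming inclusive (first, last) tuples. The helper names network_to_range and exclude_range are hypothetical and not part of ipaddress.

```python
import ipaddress

def network_to_range(net):
    """Return the inclusive (first, last) integer bounds of a network."""
    return int(net.network_address), int(net.broadcast_address)

def exclude_range(outer, inner):
    """Remove the inclusive range `inner` from `outer`; return what is left."""
    o_first, o_last = outer
    i_first, i_last = inner
    if i_last < o_first or i_first > o_last:
        return [outer]  # no overlap, nothing to remove
    leftover = []
    if i_first > o_first:
        leftover.append((o_first, i_first - 1))  # part below the hole
    if i_last < o_last:
        leftover.append((i_last + 1, o_last))    # part above the hole
    return leftover

net_range = network_to_range(ipaddress.ip_network("10.0.0.0/8"))
ip = int(ipaddress.ip_address("10.1.2.3"))  # illustrative address
pieces = exclude_range(net_range, (ip, ip))
print(net_range)
print(pieces)
```

This only does two integer comparisons and at most two tuple constructions, whatever the prefix lengths are. Turning the leftover ranges back into network objects is a separate step.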

I am wondering if this would be desired in the official ipaddress library and, if so, how I would go about it. Would I create a PR against cpython/ipaddress.py at 3.10 · python/cpython · GitHub?

If the address is stored in network byte order, I think that will stop this working on little-endian machines.

(Surprisingly) in the ipaddress module there is no IPv4Range class. Do you suggest to call ipaddress.summarize_address_range() on the two resulting ranges to get a list of IPv4Network objects?
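A rough sketch of that suggestion, using only documented ipaddress calls (summarize_address_range() and integer arithmetic on address objects). The networks are made up, and this is an illustration of the idea rather than a proposed patch.

```python
import ipaddress

net = ipaddress.ip_network("10.0.0.0/8")
excluded = ipaddress.ip_network("10.1.2.0/24")  # illustrative subnet

remaining = []
# Range below the excluded block, if there is one.
if excluded.network_address > net.network_address:
    remaining.extend(ipaddress.summarize_address_range(
        net.network_address, excluded.network_address - 1))
# Range above the excluded block, if there is one.
if excluded.broadcast_address < net.broadcast_address:
    remaining.extend(ipaddress.summarize_address_range(
        excluded.broadcast_address + 1, net.broadcast_address))

# Compare against the existing method.
same = sorted(remaining) == sorted(net.address_exclude(excluded))
print(len(remaining), same)
```

The two guards avoid computing an address below the network start or past its end when the excluded block sits at either edge.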

@barry-scott I have never worked with endianness in machines. The ipaddress objects can be constructed from an int, bytes, or a string.
IPv6 does the same thing with a few other checks.

In the end, self._ip should be an integer. I am happy to test this out but am not sure what to test it on. I have a desktop and a Linux server. I believe Solaris would be something that is little endian. Would a VM of Solaris be possible and preserve this?

@vbrozik That is what I was thinking. I would be happy to contribute more to ipaddress, as my new project makes extensive use of it. I am, however, new to contributing to the standard library and would need guidance on how to make my first submission. Would it be as simple as a PR to the GitHub repo above? (I cannot put more than two links, weird.) If this were changed, would it only make it into a later version of Python, or would it be available in 3.7? I imagine not.

Any new features would target 3.12.

I did some testing and I think your int approach may work for both IPv4 and IPv6 addresses.
IPv4Address._ip is a native int.

If you construct one from bytes, it assumes they are in network byte order.
If you construct one from an int, it seems the value is just copied in.

So 127.0.0.1 as a native int is 0x7f000001, and as bytes it is b'\x7f\x00\x00\x01'. On a little-endian machine that is the reverse of the int's in-memory byte order (i.e. read as a little-endian int, those bytes would be 0x0100007f).

:>>> socket.inet_aton('127.0.0.1')
b'\x7f\x00\x00\x01'
:>>> ipaddress.IPv4Address(socket.inet_aton('127.0.0.1'))._ip
2130706433
:>>> ipaddress.IPv4Address(0x7f000001)._ip
2130706433
:>>> ipaddress.IPv4Address('127.0.0.1')._ip
2130706433
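The same point can be sketched with int.from_bytes(): byte order only matters at the moment bytes are turned into an int, and the resulting Python int behaves the same on any machine. This is just a demonstration, assuming the packed bytes are in network (big-endian) order as in the session above.

```python
import ipaddress
import socket
import sys

packed = socket.inet_aton("127.0.0.1")     # network (big-endian) byte order
as_int = int.from_bytes(packed, "big")      # matches the _ip value shown above

print(sys.byteorder)                        # host byte order, for information only
print(hex(as_int))                          # 0x7f000001
print(hex(int.from_bytes(packed, "little")))  # 0x100007f, the wrong interpretation

addr = ipaddress.IPv4Address(as_int)
# Round trip: the public .packed attribute gives back network-order bytes.
round_trip_ok = addr.packed == packed and int(addr) == as_int
print(round_trip_ok)
```

Since range arithmetic would be done on int(addr) rather than on raw bytes, the host's endianness should not come into it.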

IPv6Address looks like it does the same conversion to an int.

:>>> x=ipaddress.IPv6Address('fe80::ae1f:6bff:fef6:e094')
:>>> x._ip
338288524927261089666565762675463610516
:>>> hex(x._ip)
'0xfe80000000000000ae1f6bfffef6e094'
:>>>

Seems like this should be a pretty straightforward improvement. In general, I would like to see multiple improvements to the ipaddress.py module beyond this. Should I create PRs on GitHub or is there some other location?

Our workflow is typically as follows:

  • Discuss viability on forum (done)
  • Open issue for new feature on GitHub (your next step)
  • Open PR for new feature (do this after the design is agreed upon in the issue)

Thanks for the response Guido! Will open that issue shortly.
edit: Better exclude_address function in ipaddress.py · Issue #97610 · python/cpython · GitHub