How can I get the "format" function to leave my leading ZEROs in place?

I have some binary data that I want to hash. Specifically, I am hashing a 128-bit private key, but for this discussion let’s just use a 16-bit private key. Here is my code; it works great except when my private key begins with leading zeros, specifically 4 or more of them.

10 my_data = "0000000010001100"
20 
30 hex_data = "{0:0>4X}".format(int(my_data,2))
40 bin_data = binascii.a2b_hex(hex_data)
50 chksum = hashlib.sha256(bin_data).hexdigest()

So the “my_data” variable is a string and contains the binary bits of my private key. I know that at the Linux command line I can type:

echo 0000000010001100 | shasum -a 256 -0

and it will return the hash:
bd767b4aa0cdb6b8e4e62761c3e9744353845698ca9aa44f7a1ffb525cdf5cb8
which is correct.

However, my program is returning the hash:
9defb0a9e163278be0e05aa01b312ec78cfa3726869503385e76e3a4b7950648
which is NOT correct.

After pulling some hair out, I have determined that the reason is that the leading zeros are missing from the “hex_data” that is passed to the hashing algorithm.

If I print(hex_data) after line 30 I get:
8C
When I should get:
008C

I manually set my “hex_data” to “008C” and ran the code after line 30, and it gave me the correct hash. So I know the problem is that my leading zeros are being dropped when I convert the binary string to hex.

I know I can just pad my finished output with leading zeros, and that would probably fix my problem. Or I can set up a loop and convert my binary to hex in a more manual, deliberate way, by looking at the binary digits 4 at a time and putting the equivalent hexadecimal character into my hex_data string. The zeros would then be added with no problem.

But I’m thinking there has to be a more graceful way to do this? So I’m just asking in an effort to become more elegant in my programming skills. I don’t want anyone pulling their hair out to try and find me an elegant solution. I’m just thinking some Python programming guru out there might just have the solution off the top of their heads.

I basically just want an elegant way of taking a character string that is made up of 128 “1’s” & “0’s”, a binary private key, and passing it to the sha256 hashing algorithm to get my hash so I can extract my checksum from it.

Anyway, thanks in advance for all of you who have made it all the way down to here in your reading. Your time is appreciated. Kresp.

You are converting from a string to an integer.
Leading zeros are only meaningful in a string, not in an integer.

Your bash command gives you the shasum of a string.

Why not do the same in Python? It seems the conversion to int is not required?

Maybe this will help:

from hashlib import sha256

my_data = "0000000010001100"
hex_data = "{0:0>4X}".format(int(my_data, 2))
chksum = sha256(bytearray.fromhex(hex_data)).hexdigest()
print(hex_data)
print(chksum)

Output:

008C
bd767b4aa0cdb6b8e4e62761c3e9744353845698ca9aa44f7a1ffb525cdf5cb8

I’m not seeing that. Are you sure that the code that gave you 8C is the exact code that you’ve posted here?

I cannot reproduce this:

$ python
Python 3.8.10 (default, May 26 2023, 14:05:08) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> my_data = "0000000010001100"
>>> hex_data = "{0:0>4X}".format(int(my_data,2))
>>> hex_data
'008C'
>>> 

The 4 in the format string ensures that there is the appropriate amount of padding.

If there are 128 bits, then there will be 32 hex digits, so you can use 32 instead of 4 in the format string.
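To make that concrete, here is a small sketch that derives the padding width from the length of the bit string instead of hard-coding 4 or 32. The helper name `bits_to_hex` is my own invention for illustration, and it assumes the number of bits is a multiple of 4.

```python
def bits_to_hex(bits: str) -> str:
    """Convert a string of '0'/'1' characters to zero-padded uppercase hex.

    Hypothetical helper, not part of any library.
    """
    if len(bits) % 4 != 0:
        raise ValueError("bit string length must be a multiple of 4")
    width = len(bits) // 4  # one hex digit per 4 bits
    # The nested {1} fills in the field width, e.g. "0>4X" for 16 bits.
    return "{0:0>{1}X}".format(int(bits, 2), width)


print(bits_to_hex("0000000010001100"))            # 008C
print(len(bits_to_hex("0" * 120 + "10001100")))   # 32
```

The same width calculation works for any key size, so a 128-bit string comes out as 32 hex digits without changing the code.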

That said, yes, the call to int cannot preserve leading zeros, because the type doesn’t represent leading zeros (in any base) - it only represents an integral numeric value. If you are going to involve int in this process, then you need to track separately how many bits are involved.
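A quick illustration of that point, as a sketch: two bit strings that differ only in leading zeros produce the same integer, and the padding can only be restored if the bit count is kept alongside the value. The variable names here are just for illustration.

```python
# Leading zeros do not survive the conversion to int.
a = int("0000000010001100", 2)
b = int("10001100", 2)
print(a == b, a)  # True 140

# Tracking the bit count separately lets you restore the padding later.
bits = "0000000010001100"
n_bits = len(bits)
value = int(bits, 2)
restored = format(value, "0{}b".format(n_bits))
print(restored == bits)  # True
```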

But if you are doing that, then the conversion can be much simpler. You just need to create the corresponding bytes object, and integers offer a direct conversion to the corresponding bytes, if you specify the byte length (which of course should be 16 for a 128-bit input):

>>> int(my_data, 2).to_bytes(2, 'big')
b'\x00\x8c'
>>> int(my_data, 2).to_bytes(4, 'big')
b'\x00\x00\x00\x8c'
>>> int(my_data, 2).to_bytes(16, 'big')
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x8c'
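Putting this together with the hashing step, a minimal sketch might look like the following. The function name `sha256_of_bits` is hypothetical, and it assumes the bit string is a whole number of bytes; the byte length is taken from the length of the string rather than hard-coded.

```python
import hashlib


def sha256_of_bits(bits: str) -> str:
    """Hash the bytes described by a string of '0'/'1' characters.

    Hypothetical helper: assumes len(bits) is a multiple of 8.
    """
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    n_bytes = len(bits) // 8
    # to_bytes pads with leading zero bytes up to n_bytes.
    raw = int(bits, 2).to_bytes(n_bytes, "big")
    return hashlib.sha256(raw).hexdigest()


my_data = "0000000010001100"
print(sha256_of_bits(my_data))
# Same digest as hashing the two bytes 00 8C directly:
print(sha256_of_bits(my_data) == hashlib.sha256(b"\x00\x8c").hexdigest())  # True
```

This skips the intermediate hex string entirely, so there is no format width to get wrong.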

Finally: please keep in mind that if you only have 128 bits of input to hash, then the sha256 hash is misrepresenting your cryptographic strength :)

Thank you for your response. It was helpful. And I do understand your last comment.