How to convert a strange hashing algorithm from PHP to Python?

Hi everyone - I’m trying to port a legacy hashing algorithm from PHP to Python, but I’m having some trouble. I think the problem lies in finding an equivalent of PHP’s pack function.

The purpose of doing this is so that users of a website with passwords hashed with the old algorithm will be able to authenticate with the new, rewritten site (at which point their password will be hashed with something better). I’m relatively new to Python but not new to coding in general.

Here’s the PHP code, followed by the Python equivalent I’ve written. The password and seed are just sample data, but I expect the same input to produce the same output every time.

$password = "password";
$seed = "dfa187ec";

$key  = str_pad($seed, 64, chr(0));
$ipad = str_repeat(chr(0x36), 64);
$opad = str_repeat(chr(0x5c), 64);

echo md5(($key^$opad) . pack('H*', md5(($key^$ipad) . $password)));

This produces the output e53ae584fb2ac17712785143b7bd1279, which is what I expect to get when using the same data in Python.

Here’s my Python implementation (it’s a bit more verbose than the PHP algorithm for the sake of clarity):

from binascii import unhexlify
import hashlib

password = 'password'
seed = 'dfa187ec'

key = seed.ljust(64, chr(0))
ipad = chr(54) * 64
opad = chr(92) * 64

# Equivalent of PHP $key^ipad
ipad_xor = ""
for c in [ord(a) ^ ord(b) for a,b in zip(ipad, key)]:
    ipad_xor += chr(c)

# Equivalent of PHP md5($input)
hashed = hashlib.md5((ipad_xor + password).encode('utf-8')).hexdigest()

# Equivalent of PHP pack('H*', $input)
packed = ""
for c in unhexlify(hashed):
    packed += chr(c)

# Equivalent of PHP $key^opad
opad_xor = ""
for c in [ord(a) ^ ord(b) for a,b in zip(opad, key)]:
    opad_xor += chr(c)

final_hash = hashlib.md5((opad_xor + packed).encode('utf-8')).hexdigest()

print(final_hash)

This produces the output ce7a3ced60ceb06e746665fd5d22a2a0, which is not the same as the PHP implementation’s e53ae584fb2ac17712785143b7bd1279.

I should note that I’m running the PHP on an online PHP application, which might be a factor. I haven’t confirmed if running PHP on my local system will produce different results.

I think the problem is in the call to unhexlify, since everything before that point matches, when debugging, what I expect from the equivalent steps in the PHP algorithm. How do I ensure that the output of Python’s unhexlify matches the output of PHP’s pack? Here’s a comparison of the output in both cases (which may or may not be useful):

PHP: NE��@S�Gu�$ҕC (copied from browser)
Python: NE¸¬@S´GuÔ$ҕC (copied from terminal output - note that this is 1 character longer than the PHP output)

Hmm. Not sure here. My understanding is that PHP’s pack() is similar to Perl’s pack(), and my quick testing suggests that pack('H*') is similar to unhexlify. But I suspect I’d need a Perl equivalent to follow along. I’m not sure what the PHP side is doing…

Is it possible struct.pack() and struct.unpack() are more similar to those PHP functions?

How would struct.pack() work in this context? I could do something like this (based on what I see from other examples online):

import struct

fmt = "{}s".format(len(hashed))
other_packed = struct.pack(fmt, hashed.encode('utf-8'))

But I’m not sure what to do with the result, which is just a bytes object of length 32.
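
For what it’s worth, a quick check suggests that struct.pack with an "s" format only copies the bytes it is given, so it does not decode hex the way PHP’s pack('H*', ...) does. bytes.fromhex (or unhexlify) looks like the closer match. A minimal sketch, where the hex string is just illustrative sample data:

```python
import struct
from binascii import unhexlify

hex_digest = "4e45b8ac"  # sample hex string, purely illustrative

# struct.pack with "Ns" copies the N input bytes unchanged: still hex text.
copied = struct.pack("{}s".format(len(hex_digest)), hex_digest.encode("ascii"))

# bytes.fromhex / unhexlify turn each pair of hex digits into one raw byte,
# which is what PHP's pack('H*', ...) does.
raw = bytes.fromhex(hex_digest)

print(copied)                        # b'4e45b8ac'
print(raw)                           # b'NE\xb8\xac'
print(raw == unhexlify(hex_digest))  # True
```

So the 32-byte result above is simply the hex text again, not the 16 raw digest bytes.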

Is it possible PHP uses ISO-8859-1 by default instead of utf-8?

final_hash = hashlib.md5((opad_xor + packed).encode('ISO-8859-1')).hexdigest()

This gives the same e53... result.
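
That would explain the mismatch: chr(c) builds a str, and encoding any code point above 127 as UTF-8 emits two bytes, so the MD5 input no longer matches PHP’s byte string. ISO-8859-1 maps code points 0-255 to one byte each. Here is a sketch that sidesteps the issue by staying in bytes throughout. It assumes Python 3, and the helper xor_bytes is a name I made up:

```python
import hashlib

password = b"password"
seed = b"dfa187ec"

# Equivalent of PHP str_pad($seed, 64, chr(0))
key = seed.ljust(64, b"\x00")
ipad = bytes([0x36]) * 64
opad = bytes([0x5C]) * 64


def xor_bytes(a, b):
    """XOR two equal-length byte strings (hypothetical helper)."""
    return bytes(x ^ y for x, y in zip(a, b))


# .digest() returns raw bytes, so no pack('H*') / unhexlify step is needed.
inner = hashlib.md5(xor_bytes(key, ipad) + password).digest()
final_hash = hashlib.md5(xor_bytes(key, opad) + inner).hexdigest()

print(final_hash)

# Why UTF-8 changed the result: one code point above 127 becomes two bytes.
print(len(chr(0xB8).encode("utf-8")))       # 2
print(len(chr(0xB8).encode("iso-8859-1")))  # 1
```

Working with bytes from the start means there is no encode step at the end that could silently change the data being hashed.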

Your “legacy algorithm” is HMAC-MD5.

Try this:

import hmac
print(hmac.HMAC(b"dfa187ec", b"password", 'md5').hexdigest())

This might do the trick - I wasn’t familiar with the hashing it was doing, but this looks like a much simpler approach.
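
If it helps, the login check on the new site could then look something like the sketch below. The function name verify_legacy_password, the choice of text encodings, and the idea that the stored record holds the seed plus a hex digest are all assumptions on my part, not something taken from the old code. hmac.compare_digest is used so the comparison does not depend on where the strings first differ.

```python
import hmac


def verify_legacy_password(password, seed, stored_hex):
    """Return True if password matches the legacy HMAC-MD5 hash.

    Hypothetical helper: assumes the seed is the HMAC key and the
    stored hash is a lowercase hex digest, as in the PHP example.
    The encodings below are assumptions about what the old site hashed.
    """
    computed = hmac.new(seed.encode("latin-1"),
                        password.encode("utf-8"),
                        "md5").hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(computed, stored_hex)


# Example with the sample data from the question.
stored = hmac.new(b"dfa187ec", b"password", "md5").hexdigest()
print(verify_legacy_password("password", "dfa187ec", stored))  # True
print(verify_legacy_password("wrong", "dfa187ec", stored))     # False
```

Once a check like this succeeds, the password can be rehashed with the newer scheme, as described in the original question.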