How to create a registration & user login with hashing?

Hi Pythoners
extremely new here, and I’m completely stumped on this topic.
for context I’m making a game that runs in the console, and every time the game is run, I want to ask the user if they would like to register or login.
if they register, I’m assuming Python will hash the password and save the login in a CSV.
if they click login, Python will take the username and hash the password and match it up against what’s in the CSV, if it matches the game will run, if it doesn’t they will be told invalid credentials, try again

the thing is, I’m completely stumped and have no idea what I’m doing.
I ended up hashing a hardcoded password and storing it to a CSV
I also assigned user input to variables (username and password)

but as soon as I try and tie all these things together, while making sure it fits my use case, my brain goes blank and I don’t know what to do next.

any help will be greatly appreciated
thankyou

Please share the code you have so far so we have something to review.

It can be a little confusing, if you’ve never done this kind of thing before.

Try this:

from hashlib import sha256
import csv


def login():
    name = input("Your registered name? ")
    pw = input("Your passphrase? ")
    # Hash the input once, rather than once per row of the file.
    name_hash = sha256(name.encode('utf-8')).hexdigest()
    pw_hash = sha256(pw.encode('utf-8')).hexdigest()
    check = False  # stays False if the file is empty or nothing matches
    with open("users.csv", mode="r", encoding='utf-8', newline='') as file:
        reader = csv.reader(file)
        for row in reader:
            reg_name = row[0]
            reg_pass = row[1]
            if name_hash != reg_name or pw_hash != reg_pass:
                check = False
            else:
                check = True
                break
    return check, name


check, name = login()

if check:
    print(f"Hey {name.capitalize()}. Welcome back.")
else:
    print("The login details are not valid.")

The csv file:

7f0b629cbb9d794b3daf19fcd686a30a039b47395545394dadc0574744996a87,a9d137a13239d2d4b7c10830734c7da2dbfea81bb10d8b74a7a4425a08848abe
84313ef39b0a979f0608491608870b3f2065f447d73e4373ba75ae2330aa82b5,a7fdb881ef96729565b682e3f4a9fdbd13275cc81ff8c0f9a5884612b08819bd

You should be able to figure how this works, but if not, then feel free to ask.

There’s also a way to hide what the user is typing, but I’ve refrained from including that.

You have a lot of the right ideas here. A few points, though:

  1. Usernames usually aren’t secret. So you can probably skip hashing those, and store them as is.
  2. You might have a LOT of users, so you will definitely want a quick way of looking up a user’s details. I suggest a dictionary, but that will mean saving in something better than CSV. Maybe JSON? Alternatively, look into actual databases.
  3. Hashing the password like that is a good start. It’s way better than storing cleartext passwords. But you have a few major risks that you’ll need to deal with, and the best way is to…
  4. … let someone else deal with them. Look into bcrypt. It’ll save you a ton of headaches.
  5. Or… truly let someone else deal with them. Look into OAuth, also known as “Sign in with Google/Facebook/GitHub/Twitch/Twitter/etc”. That’s a bit more effort to set up, but you get the enormous advantage that you no longer care about passwords at all.
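To illustrate point 2, here is a minimal sketch of keeping the users in a dictionary and saving it as JSON. The helper names `load_users` and `save_users` and the file name in the usage note are made up for the example, and the stored value is assumed to be whatever password hash you end up using.

```python
import json
from pathlib import Path


def load_users(path):
    """Return the saved {username: password_hash} dict, or {} if there is no file yet."""
    file = Path(path)
    if not file.exists():
        return {}
    return json.loads(file.read_text(encoding="utf-8"))


def save_users(path, users):
    """Write the whole dictionary back to disk as JSON."""
    Path(path).write_text(json.dumps(users, indent=2), encoding="utf-8")
```

With this, `users = load_users("users.json")` gives you a dictionary, so looking up one user is just `users.get(name)` instead of a loop over every row.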

But as Barry said, you need to share code before we can really help.

Hi Rob
Thanks for your help. I understand what’s happening,
I have slightly edited the code to get rid of the username hashing, I am not sure if I have done this correctly though.
here is my current code

def login():
    with open("upw.csv", mode='r',  encoding='utf-8', newline='') as file:
        userreader = csv.reader(file, delimiter=',')
        name = input("Your registered name? ")
        pw = input("Your password? ")
        for row in userreader:
            reg_name = row[0]
            reg_pass = row[1]
            pw_hash = sha256(pw.encode('utf-8')).hexdigest()
            if pw_hash != reg_pass and name != reg_name:
                check = False
            else:
                check = True
                break
    return check, name


check, name = login()

if check:
    print(f"Hey {name}. Welcome back.")
else:
    print("The login details are not valid.")

I am now trying to write a “register” function that will take a new username and password, that will then be stored in the CSV and be accessible by the login function, also finding this quite difficult…
this is what I have so far

def register():
    with open("upw.csv", mode="a", encoding='utf-8', newline='') as file:
        userwriter = csv.writer(file, delimiter=',')
        newname = input("Enter your new account name? ")
        newpw = input("Create your password? ")
        newpw_hash = sha256(newpw.encode('utf-8')).hexdigest()
        for row in userwriter:
            newname = row[0]
            newpw_hash = row[1]

any help would be greatly appreciated. I’m already understanding it a lot more, thankyou,

The condition is wrong:

if pw_hash != reg_pass and name != reg_name:
    check = False
else:
    check = True
    break

Because the condition uses and, the check fails only if neither the name nor the password matches, so it succeeds if either the name or the password matches. To require both to match, the test for failure needs or.

You can shorten the code by returning as soon as you’ve found a match. It’s safe to return from the middle of a loop and it’s also safe to return from inside a with statement, because it’s guaranteed that the file will be closed.

That means that if it finishes the loop, it hasn’t found a match, and the final return only has to deal with that.
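A rough sketch of that early-return shape, assuming the file holds the plain username in the first column and the SHA-256 password hash in the second. The function name `credentials_match` and its parameters are only for illustration; the input prompts are left out so the function can be tested on its own.

```python
import csv
from hashlib import sha256


def credentials_match(path, name, password):
    """Return True as soon as a row matches both the name and the password hash."""
    pw_hash = sha256(password.encode("utf-8")).hexdigest()
    with open(path, mode="r", encoding="utf-8", newline="") as file:
        for row in csv.reader(file):
            if len(row) < 2:
                continue  # skip blank or malformed rows
            if row[0] == name and row[1] == pw_hash:
                return True  # leaving the with block still closes the file
    return False  # the loop finished without finding a match
```

There is no `check` variable and no `break` any more: the only way to reach the last line is to run out of rows.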

You are very welcome.

I’ve not got so much time right now, but I will come back to this within the next 10hrs or so. In the meantime, keep on trying to do what you need to do, and post back to let me know how you’re doing and what (if anything) you’re still not able to get coded.

A couple things:

  1. I’m assuming that this project is simply a learning exercise, to examine the principles and is not going to be exposed to an open port on the internet, via some kind of a server, right?
  2. Don’t let anyone convince you that hashing a user name (or any other user information) is unnecessary, particularly if said database in which the information is going to be stored (be that a simple flat database, such as a CSV file, or some kind of a relational DB) is publicly exposed.

Hi Barry, Thanks,
I have changed it to

            if pw_hash != reg_pass or name != reg_name:
                check = False
            else:
                check = True
                break
    return check, name

Hi rob, hopefully I can have it done by the time you come back :slight_smile:
yes you are correct, this is plainly just a learning exercise.

I am doing quite well, I nearly have it. 2 more things to go.
I have to make it read the right username and password. at the moment the only person who can log in is the user in row 0, 1 of the CSV (hey, you)
and my register function isn’t saving the username and hashed password into the cells I was expecting.
shouldn’t be a problem if I find a different way to access the desired user and password.

csv broken

here is my code:


def register():
    newname = input("Enter your new account name? ")
    newpw = input("Create your password? ")
    newpw_hash = sha256(newpw.encode('utf-8')).hexdigest()
    with open("upw.csv", mode="a") as newFile:
        newWriter = csv.writer(newFile, dialect='excel')
        newWriter.writerow([newname])
        newWriter.writerow([newpw_hash])
        newFile.close()


def login():
    with open("upw.csv", mode='r',  encoding='utf-8', newline='') as file:
        userreader = csv.reader(file, delimiter=',')
        name = input("Your registered name? ")
        pw = input("Your password? ")
        for row in userreader:
            reg_name = row[0]
            reg_pass = row[1]
            pw_hash = sha256(pw.encode('utf-8')).hexdigest()
            if pw_hash != reg_pass and name != reg_name:
                check = False
            else:
                check = True


            if check:
                print(f"Hey {name}. Welcome back.")
                break
            else:
                print("The login details are not valid.")
                sys.exit()


no_account = True
while no_account:
    k = ""
    while k == "":
        k = input(
            "would you like to register a new account? y/n")
        if k == "y":
            register()
            no_account = False
        elif k == "n":
            login()
            no_account = False
        else:
            k = ""


for some clarity
this was my old register function that saved the username and hashed password onto the same row

def register():
    newname = input("Enter your new account name? ")
    newpw = input("Create your password? ")
    newpw_hash = sha256(newpw.encode('utf-8')).hexdigest()
    f = open("upw.csv", mode="a", encoding='utf-8', newline='')
    f.write(newname)
    f.write(newpw_hash)

There are some ways to improve the hashing scheme. One simple way is to use a random salt value for each user. So you store (for each user) the username, a random salt value, and the hash. The hash is computed as hash(salt + password). Because every user gets a different salt, identical passwords produce different hashes, and precomputed hash tables no longer help an attacker.
(See for instance: cryptography - Why are salted hashes more secure for password storage? - Information Security Stack Exchange)
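A minimal sketch of a per-user salt using only the standard library. Here the salt is prepended to the password before hashing; the function names `make_record` and `verify` and the 16-byte salt length are assumptions for the example, not a recommendation for production use.

```python
import hashlib
import hmac
import secrets


def make_record(username, password):
    """Return (username, salt_hex, hash_hex), ready to be stored as one row."""
    salt = secrets.token_bytes(16)  # a fresh random salt for every user
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return username, salt.hex(), digest


def verify(record, password):
    """Re-hash the candidate password with the stored salt and compare."""
    _, salt_hex, stored = record
    salt = bytes.fromhex(salt_hex)
    candidate = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    # compare_digest takes the same time whether or not the values match
    return hmac.compare_digest(candidate, stored)
```

Two users who pick the same password end up with different salts and therefore different stored hashes.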

While this is true and necessary, it’s still only a partial solution, which is why I will always recommend going the whole way and bcrypting the passwords (or equivalent).

Indeed - I think the OP also understands that, since they said this was a learning exercise.

In a production system you would need to be able to change the algorithm, as well as use best-practice hashing at any particular time. So the password info would be a tuple of (algorithm, hash_value).

Once the algorithm needs upgrading you can gracefully manage that process while still allowing, for a time, the use of the old algorithm.
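One way this could look, as a sketch only: each stored record names the algorithm that produced it, so old records can still be checked and then flagged for re-hashing. The registry `ALGORITHMS`, the names `CURRENT_ALGORITHM`, `hash_password` and `verify`, the added salt field, and the iteration count are all assumptions made for the example.

```python
import hashlib
import hmac
import secrets

# Hypothetical registry: algorithm name -> function(password_bytes, salt) -> hex digest
ALGORITHMS = {
    "sha256": lambda pw, salt: hashlib.sha256(salt + pw).hexdigest(),
    "pbkdf2_sha256": lambda pw, salt: hashlib.pbkdf2_hmac(
        "sha256", pw, salt, 100_000  # illustrative iteration count
    ).hex(),
}
CURRENT_ALGORITHM = "pbkdf2_sha256"


def hash_password(password, algorithm=CURRENT_ALGORITHM):
    """Return (algorithm, salt_hex, hash_value) for storage."""
    salt = secrets.token_bytes(16)
    digest = ALGORITHMS[algorithm](password.encode("utf-8"), salt)
    return algorithm, salt.hex(), digest


def verify(record, password):
    """Return (ok, needs_upgrade) for a stored record and a candidate password."""
    algorithm, salt_hex, stored = record
    candidate = ALGORITHMS[algorithm](password.encode("utf-8"), bytes.fromhex(salt_hex))
    ok = hmac.compare_digest(candidate, stored)
    # Only a successful login with an outdated algorithm should trigger a re-hash.
    return ok, ok and algorithm != CURRENT_ALGORITHM
```

When `needs_upgrade` comes back True, the program has the cleartext password in hand at that moment, so it can call `hash_password` again and overwrite the old record.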

Hi Nick,

Tbh, I think that you’re over complicating this with the dialect='excel': a .csv file is just that; it’s a file in which the record data fields are separated by a comma and each record has a newline terminator, which is why the csv sample I posted looks like this:

7f0b629cbb9d794b3daf19fcd686a30a039b47395545394dadc0574744996a87,a9d137a13239d2d4b7c10830734c7da2dbfea81bb10d8b74a7a4425a08848abe
84313ef39b0a979f0608491608870b3f2065f447d73e4373ba75ae2330aa82b5,a7fdb881ef96729565b682e3f4a9fdbd13275cc81ff8c0f9a5884612b08819bd

The format there is username,password, but that is easy to attack, as you’ll see, if you find the sha256 hashes for nick and rob. If any ‘attacker’ does that, they have 50% of the puzzle solved; now to crack the other half. For your password, that’s not so hard, as I’ve simply used the information that is clearly shown in your posts, but for mine, well, good luck with that.

What would be better? Well, how about the same information stored as:

3c1204eaac8c55fee4aa7cdc226b73023a77b32608688c2514f3fc22c317680a,
39986d061547706ad515a9999268dd97feafed570457f94500251e6b891f82af

Now it’s not so easy, right? But it can still be done, given that we know what the first two hashes are, so the entire system is still reliant on some sensible choices being made by the user, with regard to both their user name and their password, as well as some sensible choices being made by the system designer, so that it’s not so easy for an attacker to suss out the components of the table: enter what @hansgeunsmeyer has posted regarding the use of a ‘salt’, but I think that we’re getting somewhat ahead of the curve right now, as you’ve yet to get the csv read/write working correctly (unless I’ve missed something).

One way forward would be to separate all the operations, putting each of them into their own custom functions, so that you can build and test each function. Then glue all the functions together with some ‘driver code’. Indeed you have already made some good progress with regard to that approach, but you seem to have digressed a little toward the end. I could do all that for you, but I believe that you would learn a great deal more from this project if you were to do that for yourself, posting back with any questions that you may have along the way. In fact, I would go so far as to even farm out the hashing part of the script to its own function, so that the hashing can simply be switched in or out.
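To make the idea of separate functions concrete, here is one possible split, offered as a sketch rather than a finished solution. It assumes a CSV with the plain username in the first column and the hash in the second; every function name and the `path` parameter are made up for the example.

```python
import csv
from hashlib import sha256


def hash_password(password):
    """The only place that knows which hash is used, so it is easy to swap out."""
    return sha256(password.encode("utf-8")).hexdigest()


def save_user(path, name, pw_hash):
    """Append one record, with the name and the hash on the same row."""
    with open(path, mode="a", encoding="utf-8", newline="") as file:
        csv.writer(file).writerow([name, pw_hash])


def load_users(path):
    """Read every record into a {name: pw_hash} dictionary."""
    try:
        with open(path, mode="r", encoding="utf-8", newline="") as file:
            return {row[0]: row[1] for row in csv.reader(file) if len(row) >= 2}
    except FileNotFoundError:
        return {}  # no one has registered yet


def register(path, name, password):
    """Return False if the name is taken, otherwise store the user and return True."""
    if name in load_users(path):
        return False
    save_user(path, name, hash_password(password))
    return True


def login(path, name, password):
    """Return True only if the name exists and the password hash matches."""
    return load_users(path).get(name) == hash_password(password)
```

The driver code is then the only part that calls `input()` and `print()`: it collects a name and password and passes them to `register()` or `login()`, which makes each function easy to test by itself.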

Edit to add: you could simplify the question regarding the registration of a new account:

response = False
while not response:
    response = input("Would you like to register a new account? (Y/N)").upper()
    # the .upper() method is used so that the user need not bother about that detail
    if response not in ("Y", "N"):  # now it's easy to test
        response = False
    else:
        if response == "Y":
            register()
        else:  # it has to be N
            pass  # a placeholder for what to do with this event

That should work, but I’ve not tested it; I’ve simply coded that on the fly and as such it may need some tinker time.

Hi Rob, sorry I took so long to get back to you, once I got home, I completely forgot about coding LOL

I will definitely go over what you have replied to me with tomorrow when I am working on it again as it is 12am here now :slight_smile:

I ended up changing some more stuff around at the end of the day, and I think I almost have it… very very close

here is my current code:

global username
global password

def saveuser(LOGIN,PW_hash):

    fieldnames = ['Username', 'Password']
    data = [{'Username': LOGIN, 'Password': PW_hash}]

    try:

        with open('upw.csv', 'x', newline='') as file:
            writer = csv.DictWriter(file, fieldnames=fieldnames)
            writer.writeheader()
            for data in data:
                writer.writerow(data)
    except FileExistsError:
        with open('upw.csv', 'a', newline='') as file:
            writer = csv.DictWriter(file, fieldnames=fieldnames)
            for data in data:
                writer.writerow(data)

            return True

def register():

    LOGIN = input("enter an account name")
    PW = input("enter a password")
    PW_hash = hashlib.sha256(PW.encode()).hexdigest()
    reply = saveuser(LOGIN,PW_hash)

    if reply:
        print("user created")
    else:
        print("failed to create")
        sys.exit()

def login():

    with open("upw.csv", mode='r') as file:
        global LOGIN
        LOGIN = username.get()
        global PW
        PW = password.get()
        PW_hash = hashlib.sha256(PW.encode()).hexdigest()
        userreader = csv.DictReader(file)
        for row in userreader:
            reg_name = row['Username']
            reg_pass = row['Password']
            if LOGIN == reg_name:
                if PW_hash == reg_pass:
                    print(f"Hey {LOGIN}. Welcome back")
                    break
                else:
                    continue
            elif LOGIN == reg_name:
                continue
        print("please enter valid username and password")
        #sys.exit()

no_account = True
while no_account:
    k = ""
    while k == "":
        k = input(
            "would you like to register a new account? y/n")
        if k == "y":
            register()
            no_account = False
        elif k == "n":
            login()
            no_account = False
        else:
            k = ""

as you can see there is an error with the login function.
everything else works perfectly
I was tinkering with this for hours today, if you see the code in my Pycharm, there is about 100 commented out lines of code :sweat_smile:
please let me know if you see any way of getting around this, I will be working on it all day again tomorrow, Many Thanks :slight_smile:

You have

global username
...
def login():
      ...
      username.get()

This will give you a NameError when login is called, since username is not really defined (it doesn’t have any value).

  • If you want to use globals, then username should simply be defined in the toplevel as
    username = None  # or whatever default value you want to give it
    
    After that you can declare it as ‘global’ inside a function and read/write to it.
  • But…! For this kind of program it’s actually bad to have any globals, and especially bad to use globals for the username etc. In real-world scenarios this kind of code would for instance need to be thread-safe and those globals totally prevent that (the code could be called through some web API, quasi-simultaneously for multiple users). Also, the globals make it less clear what is being passed to-and-fro between functions. So I would suggest, to make username, pwd, hash value etc, arguments or return-values of the functions you are writing. This will make it easier to debug and unit test your code too.
    Also, the global for ‘LOGIN’ and “PW” are better removed - just make them input arguments to the login function. You will see, this ultimately makes it easier to reason about what your function is doing/what you want to do in this function.
  • Since your code is an exercise related to security, you also have to be aware that global variables can indirectly make the code less secure, because they disconnect source data from the actual usage inside a function.
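As a small illustration of the points above, here is a sketch of a login check with no globals at all. The name `check_login`, the `users` dictionary and the demo password are invented for the example; it assumes the stored value is a SHA-256 hex digest, as in the code posted earlier.

```python
from hashlib import sha256


def check_login(users, username, password):
    """Everything the function needs arrives as an argument; the result is returned."""
    pw_hash = sha256(password.encode("utf-8")).hexdigest()
    return users.get(username) == pw_hash


# A tiny in-memory table is enough to test the function: no file, no input().
demo_users = {"nick": sha256(b"hunter2").hexdigest()}
print(check_login(demo_users, "nick", "hunter2"))  # True
print(check_login(demo_users, "nick", "wrong"))    # False
print(check_login(demo_users, "rob", "hunter2"))   # False
```

Because nothing is read from or written to module-level state, two calls for two different users cannot interfere with each other.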

There’s a bug that I see, right off the bat:

  • saveuser()
    • The return is within the 2nd file context manager and as such, it’s not going to return True if the csv file is created with the 1st context manager.

The other things I notice are inconsistencies in the way that you’ve coded your file context managers. While one of them has mode='r' (which is certainly the way that I would code them), the others simply have 'x' and 'a', which is fine, but inconsistent. None of them, on the other hand, have encoding='utf-8', which I’m 90% sure will get you into some bother at some point, maybe not with this script, but if you get into the habit of including the encoding, then it’s not something that you’re likely to fall over in the future.
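For illustration, one way saveuser() could be reshaped so that it returns True on both paths and always states the encoding. This sketch swaps the 'x' plus FileExistsError approach for a single append that writes the header only when the file is new or empty; the extra `path` parameter and the `FIELDNAMES` constant are assumptions for the example.

```python
import csv
import os

FIELDNAMES = ["Username", "Password"]


def saveuser(path, login, pw_hash):
    """Append one user, writing the header first if the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, mode="a", encoding="utf-8", newline="") as file:
        writer = csv.DictWriter(file, fieldnames=FIELDNAMES)
        if is_new:
            writer.writeheader()
        writer.writerow({"Username": login, "Password": pw_hash})
    return True  # reached on both paths, after the file has been closed
```

The return statement sits outside the with block, so it no longer matters whether the file already existed.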

I’ll not pull this apart any more, save to say that I don’t think that the default option/route should be to ask if a user wants to create an account; it’s generally a second option, but maybe I’m being too critical at this point, given that this is early in the development stage and you’ve not had any time yet to digest what I said in my last post.

I’m a little unclear as to why you’ve made the radical change to the data structure.

Do take note of what @hansgeunsmeyer has said and avoid the use of global variables.

I trust that you’ll take my comments as constructive, which is certainly the way that they are intended.

Hi Rob, Thankyou so much for all your help.

I ended up finishing the project today.

I wrote a couple more functions that hashed the password, assigned it a value and used DictReader to read the CSV for me.
I then called those functions with my login function.
MISSION COMPLETE :smile:

thanks mate, ended up following all of your advice in my final solution