4
\$\begingroup\$

I'm new to Python, so I've been watching YouTube videos of Python projects to practice. One of them was a password manager with encryption, but the encryption part couldn't use a password, because the module used (cryptography.fernet) would need more modules to be able to use one. So I started trying to write a function to do this, and ended up with the function below.

It basically works, but I want advice about modules I can use to get the same or similar output that don't need other modules to work.

I'd also like to know which functions I could change in my project to make it faster, since I don't know many of them.

I want to know another way to store and encrypt the strings, since lists are bigger: every character of the message turns into a piece of information at least 3 times larger.

And, if you can help, I need help making the commented-out part of the code work:

Everything except this first "character scrambler" works fine, and I can't understand which function is wrong in that part, because it doesn't raise an exception, and the outcome comes out almost as it should when the password is right, but it is still scrambled. You can see it if you uncomment the commented-out code; I commented it out so the code still works when you try it.

import random
def coddecod (senha: str, texto: str | list, modo = 'd', debug = False) -> list|str:
    '''String <=> List\n|
    debug (True / False) is an option that show more information about the process if marked as True\n
    texto (String / List) is the text to be Encrypted or Decrypted.\n
    modo ( 'c' / 'd' ) is how the function will be used, it can be 'c' (Encrypt) or 'd' (Decrypt).\n
    senha (String) is the password to be used to Encrypt/Decrypt (Needs to be the same on both sides).\n
    note: Not every keyboard key was added to this function, such "\" or "/" or "§"
    This function creates an list (like string) with some caracters of the password, then a code to the password with the list made, which is allways the same for the same password.\n
    Then checks to the mode:\n
        If 'c', Encrypts:\n
            It, randomlly, makes an string with 85 caracters, making 1 in 120 options, making to it an number (data) to be identfied.\n
            Then data * password-code to be "hiden"\n
            Adds the result to the text and Encrypts the text.\n
        If 'd', Decrypts:\n
            It re-makes the caracters-string used to Encrypt using the identifier number.\n
            And Decrypt using the password code and the string.\n
            If the password is REALLY wrong:\n
                It makes an random text.\n'''
    if debug == True:
        print(f"Password used: {senha}\nText used: {texto}\nMode: {modo}\nDebug: True")
    lista = ["0bm1!d2Mafgh3TtijkcheH4lnEou5pqUsr67Svwx8yz9BCDFGHIJKLNOPQRVWXYZ! |@#$%&*()_+-=[]°ºª^~`", 0, "", [], "", [0, 0, 0, 0, 0]]
    step = 0
    for _ in range(5):
        lista[3].append(lista[0][0 + step: 17 + step])
        step += 17
    if debug == True:
        print(f"blocks of string to be used: {lista[3]}")
    #needs some fix, de-comment this to see.
    '''
    for idx, char in enumerate(senha): #this is the code i need help. look at the '#string scrambler' to see an better code than this here with same objective.
        for step in range(len(lista[3])):
            if lista[5][step] == 1:
                continue
            for idx_fromstep, char_fromstep in enumerate(lista[3][step]):
                if char == ch and idx_fromstep <= 4:
                    lista[2] += "".join(lista[3][step])
                    lista[5][step] = 1
        if idx == len(senha) - 1 and lista[5][0] + lista[5][1] + lista[5][2] + lista[5][3] + lista[5][4] != 5:
            for i, ch in enumerate(lista[5]):
                if ch == 1:
                    continue
                else:
                    lista[5][i] = 1
                    lista[2] += "".join(lista[3][i])
    lista[5] = [0, 0, 0, 0, 0]
    if debug == True:
        print(f"list of the string scrambled by the password: {lista[2]}")'''
    for idx, char in enumerate("0" + senha):
        for i2, c2 in enumerate(lista[0]): #change to lista[2] when code be fixed
            if char == c2:
                lista[1] += i2 * idx
    if lista[1] == 0:
        lista[1] += len(senha)
    if debug == True:
        print(f"code made from password: {lista[1]}")
    if modo == 'c':
        if debug == True:
            print("mode: Encrypting")
        lista[2] = ""
        senha = []
        lista[4] += str(random.randint(1, 9))
        while True: #string scrambler
            teste2 = random.randint(0, 4)
            if lista[5][teste2] == 1:
                continue
            lista[2] += "".join(lista[3][teste2])
            lista[4] += str(teste2)
            lista[5][teste2] = 1
            if lista[5][0] + lista[5][1] + lista[5][2] + lista[5][3] + lista[5][4] == 5:
                break
        data = int(lista[4]) * lista[1]
        if debug == True:
            print(f"Code of the scambled caracters: {lista[4]}\nList of scrambled caracters: {lista[2]}")
        senha.append(data)
        for _, char in enumerate(texto):
            for idx, c2 in enumerate(lista[2]):
                if char == c2:
                    senha.append(idx * lista[1])
        return senha
    if modo == 'd':
        if debug == True:
            print("mode: Decrypting")
        s = []
        data = str(texto[0] // lista[1])
        if debug == True:
            print(f"String code got from text: {data}")
        try:
            for idx, char in enumerate(data):
                if idx == 0:
                    continue
                lista[2] += "".join(lista[3][int(char)])
            if debug == True:
                print(f"list of scrambled caracters got from code: {lista[2]}")
            for i in range(len(texto) - 1):
                letra = texto[i + 1] // lista[1]
                s.append(lista[2][letra])
            senha = "".join(s)
            return senha
        except:
            if debug == True:
                print(f"Password was really wrong.")
            reallywrongpassword = ""
            for _ in range(random.randint(5, 80)):
                reallywrongpassword += random.choice(lista[0][:])
            return reallywrongpassword
test = coddecod("FeijaoTropeiro", "Elias", "c")
wrong = coddecod("54hvrvrwe4ij", test)
wrong2 = coddecod("Feijao", test, "d")
right = coddecod("FeijaoTropeiro", test, "d")
print(f"The text is: 'Elias'.\nThis is the message encrypted: {test}\nThese 2 are the outcomes with wrong passwords: '{wrong}' and '{wrong2}'.\nThis is the message with the right password: {right}.\nRun this code more times to see the encrypted message change.")
\$\endgroup\$

3 Answers

5
\$\begingroup\$

illegible source code

You may think that future maintainers of this code will be glad to read lines more than 180 characters long. Perhaps some humans fit that description, but you are needlessly limiting the pool of future teammates. Write code so people can read it. If you've gone much past 100 characters per line, then it's time to rethink things.

triple quote

print(f"The text is: 'Elias'.\nThis is the message ...

Ok, that's just silly, when we go much more than two hundred characters beyond the left margin. Python has perfectly good facilities for representing such a sequence of codepoints while keeping them legible. Prefer

print(f"""The text is: 'Elias'.
This is the message ... """)

design of Public API

Naming a function "code de-code", and accepting a mode flag, is probably a mistake. At a minimum it prevents type checkers like mypy from verifying that the caller used the proper types for the intended mode.

Prefer to present a pair of functions in your Public API, without a modo flag.
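
A minimal sketch of what such a pair could look like. The names encrypt, decrypt and _password_code are hypothetical, and the arithmetic is a deliberately simplified toy stand-in for the original scheme, not a secure cipher. The point is only that each function now has one input type and one return type, which a type checker can verify.

```python
def _password_code(password: str) -> int:
    """Toy stand-in: derive a positive integer from the password."""
    code = sum(pos * ord(char) for pos, char in enumerate(password, start=1))
    return code or 1  # avoid a zero multiplier for the empty password


def encrypt(password: str, plaintext: str) -> list[int]:
    """Return one integer per character of plaintext (toy scheme, not secure)."""
    code = _password_code(password)
    return [ord(char) * code for char in plaintext]


def decrypt(password: str, ciphertext: list[int]) -> str:
    """Invert encrypt(); only meaningful with the same password."""
    code = _password_code(password)
    return "".join(chr(number // code) for number in ciphertext)


token = encrypt("FeijaoTropeiro", "Elias")
print(decrypt("FeijaoTropeiro", token))
```

Both paths call the same small helper, so the shared setup lives in one place.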

respect the signature order

You supplied a docstring. Thank you.

def coddecod (senha: str, texto: str | list, modo = 'd', debug = False) -> list|str:
    '''String <=> List\n|
    debug (True / False) is an option that show more information about the process if marked as True\n
    texto (String / List) is the text to be Encrypted or Decrypted.\n
    modo ( 'c' / 'd' ) is how the function will be used, it can be 'c' (Encrypt) or 'd' (Decrypt).\n
    senha (String) is the password to be used to Encrypt/Decrypt (Needs to be the same on both sides).\n

What are all those \n newlines doing in there? Ok, fine, we'll just ignore them.

Rather than offer explanations in the order {debug, texto, modo, senha}, please prefer the signature order of {senha, texto, modo, debug}. It's just how the human brain works. Maintainers anticipate seeing such things appear in the same order with parallel construction, so a consistent narrative will evolve. Scrambling the order gratuitously throws a monkey wrench into that for no gain.

typo

For "caracters", read "characters". Yes, I understand the Portuguese word "caracteres" lacks "h". It's worth getting used to, if only to make sense of common abbreviations like "char" and "ch".

Also, after "Then checks to the mode" I failed to glean much of use. The signature already told me that we're going to encode and decode. But as far as the details go, all the docstring really told me was that I'd have to read the source to tell exactly what happens.

keying material

Whether you call it senha or contrasena or mot_de_passe or wachtwoord, I don't much care.

What does concern me is that you've not written down any assumptions about how many bits of entropy it should contain. As written it appears the Concept of Operations is for a person to type an "easily remembered" password or passphrase, which likely has fairly low entropy.

The whole point of crypto is to make it "hard" for Eve to recover the plaintext, and here you're not giving any hints about what "hard" means. Must she do \$2^{256}\$ work? Must she do "trivial" work, as for rot13?

During a Code Review we answer these questions:

  1. Does it work?
  2. Is it maintainable?

We can't really address whether it "works" well if we can't tell what Security Parameter (256 bits?) you're shooting for, and whether there's reason to believe the implemented code matches that spec. You need to be explicit about your security objectives.
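
One way to make such an objective explicit, sketched with the standard library's hashlib.pbkdf2_hmac. The iteration count, salt size and key size below are illustrative assumptions, not values taken from the original code, and derive_key is a hypothetical name. Stretching does not add entropy to a weak password; it only raises the cost of each guess.

```python
import hashlib
import secrets

# Assumed parameters for illustration; pick them to match your own security goals.
ITERATIONS = 200_000
KEY_BYTES = 32  # 256-bit derived key


def derive_key(password: str, salt: bytes) -> bytes:
    """Stretch a password into a fixed-size key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=KEY_BYTES
    )


# A random salt, which would be stored next to the ciphertext.
salt = secrets.token_bytes(16)
key = derive_key("FeijaoTropeiro", salt)
print(f"derived a {len(key) * 8}-bit key")
```

With the parameters written down as named constants, a reviewer can at least see what amount of work per guess the code is aiming for.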

boolean variable is a boolean expression

    if debug == True:

No.

Please just write if debug:

magic constant

    lista = ["0bm1!d2Mafgh3TtijkcheH4lnEou5pqUsr67Svwx8yz9BCDFGHIJ...

That's crazy. Why start with "0bm1" and not "1m0b" or another permutation?

Recall Kerckhoffs's principle. Eve has already read your source code. The details of lista are already known to her.

Much better to lexically sort the valid characters in the source code, and then permute them:

    lista = [" !!#$%&()*+-0123456789=@BCDEFGHHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghhijklmnopqrstuvwxyz|~ª°º", ...

Uggh, wait! Why do we have a pair of ! bangs in there? And a pair of capital H's? Also a pair of lowercase h's?!?

... makes an string with 85 ...

No, after removing duplicates it looks more like 84 distinct codepoints to me.
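
A quick way to check this rather than counting by eye. ALPHABET is a hypothetical name for the first element of lista, copied from the posted code.

```python
from collections import Counter

# First element of lista, copied from the posted code.
ALPHABET = "0bm1!d2Mafgh3TtijkcheH4lnEou5pqUsr67Svwx8yz9BCDFGHIJKLNOPQRVWXYZ! |@#$%&*()_+-=[]°ºª^~`"

counts = Counter(ALPHABET)
duplicates = sorted(char for char, n in counts.items() if n > 1)

print(f"total characters:    {len(ALPHABET)}")
print(f"distinct characters: {len(counts)}")
print(f"duplicates:          {duplicates}")
```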

BTW, kudos on using _ in for _ in range(5): to say that we won't use and don't care about the index value -- no need to name it.

more magic numbers

        lista[3].append( ... )

This is not great. The 3 is cryptic. Prefer to use a dict or @dataclass for a mutable structured tuple such as this.

tl;dr: Name your fields!
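
A sketch of what named fields could look like with a dataclass. The class name and every field name are guesses, inferred from the debug messages in the posted code, so treat them as placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class CipherState:
    """Hypothetical named replacement for the six-element lista."""

    alphabet: str                                    # was lista[0]
    password_code: int = 0                           # was lista[1]
    scrambled_alphabet: str = ""                     # was lista[2]
    blocks: list[str] = field(default_factory=list)  # was lista[3]
    block_order: str = ""                            # was lista[4]
    used_blocks: list[bool] = field(                 # was lista[5]
        default_factory=lambda: [False] * 5
    )


# Short demo alphabet, split into blocks of five characters.
state = CipherState(alphabet="abcdefghij")
state.blocks = [state.alphabet[i:i + 5] for i in range(0, len(state.alphabet), 5)]
print(state.blocks)
```

Now state.blocks.append(...) says what it is doing, where lista[3].append(...) did not.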

        step += 17

I imagine we might have a Caesar cipher going on here? It's obscure. Cite your references.

Also, you wrote some commented code. It doesn't execute; I didn't read it. Present working code on the Code Review site, and buggy code over on Stack Overflow.

another magic constant

    for idx, char in enumerate("0" + senha):

I can't imagine why we're prepending "0" there. It warrants a # comment.

If you write mysterious code, don't be surprised if a future teammate / maintainer rips it out because it's unclear it's doing anything useful.

extract helper

There seems to be some "scrambling" behavior going on near the top of the function, reading and writing several lista fields. The behavior is common to both the encode() and decode() paths.

Extract that common behavior into a small helper function, which both paths call into. Consider turning the whole codebase into a class, so methods can refer to self.mumble and self.blah when accessing those carefully scrambled variables.

cryptographically strong random numbers

        senha = []
        lista[4] += str(random.randint(1, 9))

I don't know exactly what element 4 is all about. But I'm worried that on this and on other calls you wanted unpredictable numbers, and that's not what we see here. When you read the fine docs about this generator you will find that

it is not suitable for all purposes, and is completely unsuitable for cryptographic purposes.

It goes on to explain in bold writing

Warning The pseudo-random generators of this module should not be used for security purposes. For security or cryptographic uses, see the secrets module.
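
A minimal sketch of the same two random choices made with the secrets module instead. The variable names are hypothetical, and swapping the generator does not by itself make the surrounding scheme secure.

```python
import secrets

# Counterpart of random.randint(1, 9): randbelow(9) yields 0..8, so shift by one.
leading_digit = secrets.randbelow(9) + 1

# Counterpart of the "string scrambler" loop: a random order of the five
# block indices, drawn from the operating system's randomness source.
block_order = list(range(5))
secrets.SystemRandom().shuffle(block_order)

print(leading_digit, block_order)
```

Shuffling a list of indices also removes the retry loop that keeps drawing until an unused block turns up.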

versioning

You are defining your own cryptographic primitive here, something that all the textbooks advise against doing. You will make mistakes. (Ok, further mistakes in future.) You will find occasion to make Breaking Changes to the file format, as you revise the encryption algorithm.

Write a magic number at the beginning of each encrypted output. Follow it with a version number, so your code can recognize the appropriate algorithm to use with one blob of bits or another, and so you can offer backward compatibility for historic files written in previous years.
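
A sketch of such a header, assuming the encrypted output is serialized to bytes. The signature b"PWMG", the one-byte version field and the function names wrap and unwrap are all hypothetical choices for illustration.

```python
MAGIC = b"PWMG"             # hypothetical 4-byte file signature
FORMAT_VERSION = 1          # bump on every breaking change to the algorithm
SUPPORTED_VERSIONS = {1}    # versions this code still knows how to read
HEADER_SIZE = len(MAGIC) + 1


def wrap(payload: bytes) -> bytes:
    """Prefix the encrypted payload with the signature and a one-byte version."""
    return MAGIC + bytes([FORMAT_VERSION]) + payload


def unwrap(blob: bytes) -> tuple[int, bytes]:
    """Validate the header and return (version, payload)."""
    if len(blob) < HEADER_SIZE or not blob.startswith(MAGIC):
        raise ValueError("not a recognised encrypted blob")
    version = blob[len(MAGIC)]
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported format version {version}")
    return version, blob[HEADER_SIZE:]


version, payload = unwrap(wrap(b"secret"))
print(version, payload)
```

The caller can then dispatch on the returned version to the matching decryption routine.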

\$\endgroup\$
3
\$\begingroup\$

Documentation

This large block of commented-out code should be deleted to reduce clutter:

#needs some fix, de-comment this to see.
'''
for idx, char in enumerate(senha): #this is the code i need help. look at the '#string scrambler' to see an better code than this here with same objective.
    for step in range(len(lista[3])):
        if lista[5][step] == 1:
            continue

Alternate versions of code can be stored in your version control system during development.

There is no need for long lines inside a docstring:

This function creates an list (like string) with some caracters of the password, then a code to the password with the list made, which is allways the same for the same password.\n

That line should be shortened by splitting it up into multiple lines.

Many of the lines in the docstring end in literal \n characters, which can be deleted:

Then checks to the mode:\n

Simpler

This line:

if debug == True:

is simpler as:

if debug:

This line:

print(f"Password was really wrong.")

is simpler without the f-string:

print("Password was really wrong.")

ruff identifies other similar issues.

Naming

The PEP 8 style guide recommends snake_case for function and variable names.

The function name coddecod is strange. Perhaps code_decode is more meaningful.

The variable reallywrongpassword would be really_wrong_password.

senha does not convey much meaning in English. Either choose a more descriptive name or add a comment to describe what it means. The same is true for other variables like lista, etc.

\$\endgroup\$
2
\$\begingroup\$

To restate what has already been said, it is strongly advised to use English for variable names, function names, and even comments and documentation, because that makes maintenance easier for others, especially if you work in a team or create GitHub repos that other people find useful and fork. This can be challenging, but you have already made that effort in the comments.

I find the flow and the code logic cryptic and hard to follow. What does not help is that you have a single function that is quite long, with multiple levels of indentation.

Without even understanding the internals of your code, it is obvious to me that it should be split into two functions, one for encoding, one for decoding. As the old saying goes, one function should do just one thing and do it well. You will surely notice that smaller functions are easier to manage and to debug.

The most problematic part, in my view, is lista, because it is counter-intuitive. Basically, it is a collection of parameters, but the purpose of each is not obvious without analyzing the code in depth. Working with indices is tricky and error-prone. There are different data structures that are more convenient to use, for example a class, a dataclass or even a named tuple.

Regarding the debug mode, you should get acquainted with the logging module, then use logging.debug instead of print, so that your debug messages show up depending on the requested logging level (which can be set via a command line parameter or config file). Think about it: you have plenty of if debug == True: checks in your code, and you could easily get rid of them.
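
A minimal sketch of that idea. The helper password_code and its body are hypothetical and exist only to show where the debug message goes. Only the derived number is logged, not the password itself.

```python
import logging

logger = logging.getLogger(__name__)


def password_code(password: str) -> int:
    """Hypothetical helper, used only to show the logging call."""
    code = len(password)
    # Emitted only when the logging level is DEBUG or lower; no debug flag needed.
    logger.debug("code made from password: %s", code)
    return code


if __name__ == "__main__":
    # Set the level once, at the entry point of the program.
    logging.basicConfig(level=logging.DEBUG)
    password_code("FeijaoTropeiro")
```

Changing the level to logging.INFO silences every debug message without touching the functions themselves.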

\$\endgroup\$
