
I need to build a function that takes a string as input and returns a dictionary.
The keys are numbers, and each value is a list of the unique words whose number of letters equals the key.
For example, if the function is called as follows:

n_letter_dictionary("The way you see people is the way you treat them and the Way you treat them is what they become")

The function should return:

{2: ['is'], 3: ['and', 'see', 'the', 'way', 'you'], 4: ['them', 'they', 'what'], 5: ['treat'], 6: ['become', 'people']}

The code that I have written is as follows:

def n_letter_dictionary(my_string):
    my_string=my_string.lower().split()
    sample_dictionary={}
    for word in my_string:
        words=len(word)
        sample_dictionary[words]=word
    print(sample_dictionary)
    return sample_dictionary

The function is returning a dictionary as follows:

{2: 'is', 3: 'you', 4: 'they', 5: 'treat', 6: 'become'}

The dictionary does not contain all the words with a given number of letters; it only keeps the last such word in the string.

7 Answers

7

Since you only want to store unique values, it makes more sense to use a set than a list. Your code is almost right: you just need to create a new set when words is not yet a key in your dictionary, and add to the existing set when it is. The following shows this:

def n_letter_dictionary(my_string):
    my_string=my_string.lower().split()
    sample_dictionary={}
    for word in my_string:
        words=len(word)
        if words in sample_dictionary:
            sample_dictionary[words].add(word)
        else:
            sample_dictionary[words] = {word}
    print(sample_dictionary)
    return sample_dictionary

n_letter_dictionary("The way you see people is the way you treat them and the Way you treat them is what they become")

Output

{2: set(['is']), 3: set(['and', 'the', 'see', 'you', 'way']), 
 4: set(['them', 'what', 'they']), 5: set(['treat']), 6: set(['become', 'people'])}

3 Comments

oh, this is better, our other solutions will raise KeyError...
how to sort the list ['the', 'way', 'you', 'see', 'the', 'way', 'you', 'and', 'the', 'way', 'you']
just do some_list.sort() if you want to have it alphabetically
3

The problem with your code is that you just put the latest word into the dictionary. Instead, you have to add that word to some collection of words that have the same length. In your example, that is a list, but a set seems to be more appropriate, assuming order is not important.

def n_letter_dictionary(my_string):
    my_string=my_string.lower().split()
    sample_dictionary={}
    for word in my_string:
        if len(word) not in sample_dictionary:
            sample_dictionary[len(word)] = set()
        sample_dictionary[len(word)].add(word)
    return sample_dictionary

You can make this a bit shorter by using a collections.defaultdict(set):

import collections

def n_letter_dictionary(my_string):
    my_string=my_string.lower().split()
    sample_dictionary=collections.defaultdict(set)
    for word in my_string:
        sample_dictionary[len(word)].add(word)
    return dict(sample_dictionary)

Or use itertools.groupby, but for this you first have to sort by length:

import itertools
def n_letter_dictionary(my_string):
    words_sorted = sorted(my_string.lower().split(), key=len)
    return {k: set(g) for k, g in itertools.groupby(words_sorted, key=len)}
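
The sort matters because groupby only merges adjacent items that share a key. A minimal sketch of what happens without it (the four sample words are made up for illustration):

```python
from itertools import groupby

words = ["is", "the", "to", "way"]  # hypothetical sample with lengths 2, 3, 2, 3

# Without sorting, equal lengths that are not adjacent end up in separate groups.
unsorted_groups = [(k, list(g)) for k, g in groupby(words, key=len)]

# After sorting by length, each length forms exactly one group.
sorted_groups = [(k, list(g)) for k, g in groupby(sorted(words, key=len), key=len)]

print(unsorted_groups)  # [(2, ['is']), (3, ['the']), (2, ['to']), (3, ['way'])]
print(sorted_groups)    # [(2, ['is', 'to']), (3, ['the', 'way'])]
```

In a dict comprehension, the later group for a repeated key would silently replace the earlier one, so words would be lost.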

Example (same result for each of the three implementations):

>>> n_letter_dictionary("The way you see people is the way you treat them and the Way you treat them is what they become")
{2: {'is'}, 3: {'way', 'the', 'you', 'see', 'and'}, 4: {'what', 'them', 'they'}, 5: {'treat'}, 6: {'become', 'people'}}
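
The question asks for alphabetically sorted lists rather than sets. A minimal sketch of that last step, assuming a hypothetical helper named to_sorted_lists that receives the dictionary of sets returned by any of the versions above:

```python
def to_sorted_lists(length_to_words):
    # Turn each set of words into an alphabetically sorted list.
    return {length: sorted(words) for length, words in length_to_words.items()}

result = to_sorted_lists({2: {"is"}, 3: {"way", "the", "you", "see", "and"}})
print(result)  # {2: ['is'], 3: ['and', 'see', 'the', 'way', 'you']}
```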

1 Comment

Quite right, of course it makes more sense to remove duplicates!
2

With sample_dictionary[words]=word you overwrite whatever you have stored under that key so far. You need a list that you can append to.

Instead of that assignment, you need:

if words in sample_dictionary.keys():
    sample_dictionary[words].append(word)
else:
    sample_dictionary[words]=[word]

So if the key already has a value, I append to it; otherwise I create a new list.
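
The same if/else can also be written with dict.setdefault, which returns the list already stored under a key, or stores and returns a new empty list if the key is missing. A sketch, using the hypothetical name group_by_length to keep it apart from the function in the question (like the snippet above, it still keeps duplicates):

```python
def group_by_length(my_string):
    sample_dictionary = {}
    for word in my_string.lower().split():
        # setdefault returns the list stored under this length,
        # inserting an empty list first if the key is missing.
        sample_dictionary.setdefault(len(word), []).append(word)
    return sample_dictionary

print(group_by_length("a aa bb ccc a bb"))
# {1: ['a', 'a'], 2: ['aa', 'bb', 'bb'], 3: ['ccc']}
```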

3 Comments

Yup, and you don't actually require the .keys()
Hi, thanks a lot for the help. Still, I am getting repeated values for keys already present in the dictionary. Do you know a way to prevent repeated words without using set()?
Why don't you want to use the set()? Well, there is a way, of course. Replace the else: by elif word not in sample_dictionary[words]: -- then it will check this condition
2

You can use a defaultdict found in the collections library. You can use it to create a default type for the value portion of your dictionary, in this case a list, and just append to it based on the length of your word.

from collections import defaultdict

def n_letter_dictionary(my_string):
    my_dict = defaultdict(list)
    for word in my_string.split():
        my_dict[len(word)].append(word)

    return my_dict

You could still do this without defaultdict, but it would be a little longer.

def n_letter_dictionary(my_string):
    my_dict = {}
    for word in my_string.split():
        word_length = len(word)
        if word_length in my_dict:
            my_dict[word_length].append(word)
        else:
            my_dict[word_length] = [word]

    return my_dict

To ensure there are no duplicates in the value lists without using set(), check whether the word is already in the list before appending. Be warned, though: if your value lists are large and your input data is mostly unique, performance will suffer, because the membership check scans the list and can only exit early once the word is found.

from collections import defaultdict

def n_letter_dictionary(my_string):
    my_dict = defaultdict(list)
    for word in my_string.split():
        if word not in my_dict[len(word)]:
            my_dict[len(word)].append(word)

    return my_dict

# without defaultdicts
def n_letter_dictionary(my_string):
    my_dict = {}                                  # Init an empty dict
    for word in my_string.split():                # Split the string and iterate over it
        word_length = len(word)                   # Get the length, also the key
        if word_length in my_dict:                # Check if the length is in the dict
            if word not in my_dict[word_length]:  # If the length exists as a key, but the word doesn't exist in the value list
                my_dict[word_length].append(word) # Add the word
        else:
            my_dict[word_length] = [word]         # The length/key doesn't exist, so you can safely add it without checking for its existence

    return my_dict

So if you have a high frequency of duplicates and a short list of words to scan through, this approach would be acceptable. If you had for example a list of randomly generated words with just permutations of alphabetic characters, causing the value list to bloat, scanning through them will become expensive.
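
If the lists do grow large, one way to avoid the repeated scans while still returning lists is to track the words already stored in a separate set. This is only a sketch under that assumption: the name n_letter_dictionary_fast and the seen helper are illustrative, and it does use a set internally, just not in the returned values.

```python
def n_letter_dictionary_fast(my_string):
    my_dict = {}
    seen = set()  # every word already stored; membership test is O(1) on average
    for word in my_string.split():
        if word in seen:
            continue  # duplicate, skip without scanning any list
        seen.add(word)
        my_dict.setdefault(len(word), []).append(word)
    return my_dict

print(n_letter_dictionary_fast("the way you treat them the way"))
# {3: ['the', 'way', 'you'], 5: ['treat'], 4: ['them']}
```

The lists keep the order in which each word first appeared.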

4 Comments

Thanks a lot. Still, I am getting repeated values for keys already present in the dictionary. Is there a way to remove repeated words without using set()?
I added a section on ensuring no duplicates without using set().
I am trying to do it using your first method without defaultdict, by adding an 'if word not in my_dict' after 'for word in my_string.split():', but I am still getting the same output with repeated words. Could you help me with the method without defaultdict?
I have added an example without defaultdict but with unique results in the list without using set(). If you had if word not in my_dict, that would always return True, as word only appears in the values and your statement only checks the keys of my_dict.
1

The shortest solution I came up with uses a defaultdict:

from collections import defaultdict

sentence = ("The way you see people is the way you treat them"
            " and the Way you treat them is what they become")

Now the algorithm:

wordsOfLength = defaultdict(list)
for word in sentence.split():
    wordsOfLength[len(word)].append(word)

Now wordsOfLength will hold the desired dictionary.

1

itertools.groupby is the perfect tool for this.

from itertools import groupby
def n_letter_dictionary(string):
    result = {}
    for key, group in groupby(sorted(string.split(), key = lambda x: len(x)), lambda x: len(x)):
        result[key] = list(group)
    return result

print(n_letter_dictionary("The way you see people is the way you treat them and the Way you treat them is what they become"))

# {2: ['is', 'is'], 3: ['The', 'way', 'you', 'see', 'the', 'way', 'you', 'and', 'the', 'Way', 'you'], 4: ['them', 'them', 'what', 'they'], 5: ['treat', 'treat'], 6: ['people', 'become']}

4 Comments

Indeed, let me correct that swiftly.
Also, key = lambda x: len(x) is the same as just key=len ;-)
Yes, noticed that, Thanks !
Sorting the things is unnecessary effort just to please groupby. Reconsider that aspect.
0
my_string="a aa bb ccc a bb".lower().split()
sample_dictionary={}
for word in my_string:
    words=len(word)
    if words not in sample_dictionary:
        sample_dictionary[words] = []
    sample_dictionary[words].append(word)
print(sample_dictionary)

1 Comment

Reconsider the name of the variable words. It's rather a wordLength or similar.
