61

When constructing a dictionary as follows:

dict = { True: 'yes', 1: 'No'}

When I run it in the interactive Python interpreter the dict is represented this way:

dict = {True: 'No'}

I understand that the values True and 1 are equal due to the type coercion because when comparing numeric types, the narrowed type is widened to the other type (boolean is a child of integer). So as I understood from the documentation, when we enter True == 1 Python converts True to 1 and compares them.

What I don't understand is why the True is selected as a key instead of 1.

I am missing something?

14
  • 9
    It's just the last key-value pair that wins. Since 1 == True, the second key overwrites the value for the first key. There is no reason to re-write the key for that operation however, only the value needs updating. Commented May 11, 2016 at 13:40
  • 13
    everyone stop answering this question! The OP is asking why True is kept as the key even though it keeps the value assigned to 1, this question is about "why is the mechanics of choosing the value different then choosing the key" Commented May 11, 2016 at 13:40
  • 7
    Why would it rewrite the key? Keys are immutable, True and 1 hash and compare equal, so they are equal, True was in there first, so True it is. If you swap the order of the dict-literal, putting key 1 first, it'll be 1. When a new item is supplied, if it is already represented in the dict (and it is, since True and 1 are the same), it overwrites the value - standard dict behavior. But it shouldn't overwrite the key, that makes no sense. Commented May 11, 2016 at 13:44
  • 10
    @Muposat: there is nothing 'stupid' about what is being done here, lets keep this constructive and not imply something about the programmer. Commented May 11, 2016 at 13:54
  • 3
    Python dictionary are unordered key: value but when constructed, they are processed in order @dwanderson. That's what you wanted to say? @Muposat this is a question that came next to an other question asked about why True and 1 are mapped to True and I answered that question, so some programmers was confronted to that. And I was wondering why python 3 made this choice of the True value. The only question that is stupid is the one we don't ask. Thanks for everyone's helping me to understand Commented May 11, 2016 at 15:53

6 Answers 6

43

Dictionaries are implemented as hash tables and there are two important concepts when adding keys/values here: hashing and equality.

To insert a particular key/value, Python first computes the hash value of the key. This hash value is used to determine the row of the table where Python should first attempt to put the key/value.

If the row of the hash table is empty, great: the new key/value can inserted into the dictionary, filling the empty row.

However, if there's already something in that row, Python needs to test the keys for equality. If the keys are equal (using ==) then they're deemed to be the same key and Python just needs to update the corresponding value on that row.

(If the keys are not equal Python looks at other rows in the table until it finds the key or reaches an empty row, but that's not relevant for this question.)


When you write {True: 'yes', 1: 'No'}, you are telling Python to create a new dictionary and then fill it with two key/value pairs. These are processed left to right: True: 'yes' then 1: 'No'.

We have hash(True) equals 1. The key True goes in at row 1 in the hash table and the string 'yes' is its value.

For the next pair, Python sees that hash(1) is also 1 and so looks at row 1 of the table. There's something already there, so now Python checks the keys for equality. We have 1 == True so 1 is deemed to be the same key as True and so its corresponding value is changed to the string 'No'.

This results in a dictionary with one entry: {True: 'No'}.


If you want to peer at the guts of CPython 3.5 to see what creating a dictionary looks below the surface-Python level, here's more detail.

  • The Python code {True: 'yes', 1: 'No'} is parsed into tokens and given to the compiler. Given the syntax, Python knows that a dictionary must be created using the values inside the braces. Byte code to load the four values onto the virtual machine's stack (LOAD_CONST) and then build the dictionary (BUILD_MAP) is queued up.

  • The four constant values are pushed onto the top of the stack in the order that they're seen:

    'No'
    1
    'yes'
    True
    
  • The opcode BUILD_MAP is then called with the argument 2 (Python counted two key/value pairs). This opcode is responsible for actually creating dictionary from the items on the stack. It looks like this:

    TARGET(BUILD_MAP) {
        int i;
        PyObject *map = _PyDict_NewPresized((Py_ssize_t)oparg);
        if (map == NULL)
            goto error;
        for (i = oparg; i > 0; i--) {
            int err;
            PyObject *key = PEEK(2*i);
            PyObject *value = PEEK(2*i - 1);
            err = PyDict_SetItem(map, key, value);
            if (err != 0) {
                Py_DECREF(map);
                goto error;
            }
        }
    
        while (oparg--) {
            Py_DECREF(POP());
            Py_DECREF(POP());
        }
        PUSH(map);
        DISPATCH();
    }
    

The three key steps here are as follows:

  1. An empty hashtable is created using _PyDict_NewPresized. Small dictionaries (of just a few items, like 2 in this case) need a table with eight rows.

  2. The for loop is entered, starting at 2 (in this case) and counting down to 0. PEEK(n) is a macro that points to the nth item down the stack. Therefore on the first iteration of the loop, we'll have

PyObject *key = PEEK(2*2);       /* item 4 down the stack */  
PyObject *value = PEEK(2*2 - 1); /* item 3 down the stack */

This means that *key will be True and *value will be 'yes' on the first loop through. On the second it will be 1 and 'No'.

  1. PyDict_SetItem is called in each loop to put the current *key and *value into the dictionary. This is the same function that is called when you write dictionary[key] = value. It computes the hash of the key to work out where to look first in the hash table and then, if needed, compare the key to any existing key on that row (as discussed above).
Sign up to request clarification or add additional context in comments.

7 Comments

@IljaEverilä: no, dicts do not consider types. Equality is all that is tested. Same happens with str vs. unicode objects in Python 2, since u'foo' == 'foo', the result of {u'foo': 42, 'foo': 81} is {u'foo': 81}.
Image
@MartijnPieters hash and then equality.
@MartijnPieters just read the docs with thought, right you are
@TadhgMcDonald-Jensen: objects that test equal must produce the same hash. The hash is then used to find the right slot in the hashtable, yes, but if the objects were equal to start with that's a given.
Image
@MartijnPieters float("nan") == float("nan") -> False and yet hash(float("nan")) == hash(float("nan")) -> True
|
31

Basic premise is - True and 1 have same hashes and are equal to each other - that's why they cannot be separate keys in hash table (technically inequal object with same hashes may - but hash collisions decreases performance).

>>> True == 1
True
>>> hash(1)
1
>>> hash(True)
1

Now, let's consider a bytecode:

import dis
dis.dis("Dic = { True: 'yes', 1: 'No'}")

This prints:

      0 LOAD_CONST               0 (True)
      3 LOAD_CONST               1 ('yes')
      6 LOAD_CONST               2 (1)
      9 LOAD_CONST               3 ('No')
     12 BUILD_MAP                2
     15 STORE_NAME               0 (Dic)
     18 LOAD_CONST               4 (None)
     21 RETURN_VALUE

Basically what happens is that dict literal is tokenized to keys and values, and they are pushed to stack. After that BUILD_MAP 2 coverts two pairs of (keys, values) to dictionary.

Most likely order of data on stack (which seems to be determined by order of keys in dict literal) and implementation details of BUILD_MAP decides on resulting dict keys and values.

It seems like key-value assignments are done in order defined in dict literal. Assignment behaves same as d[key] = value operation, so it's basically:

  • if key is not in dict (by equality): add key do dict
  • store value under key

Given {True: 'yes', 1: 'No'}:

  1. Start with {}
  2. Add (True, 'yes')

    1. True is not in dict -> (True, hash(True)) == (True, 1) is new key in dict
    2. Update value for key equal to 1 to 'yes'
  3. Add (1, 'no')

    1. 1 is in dict (1 == True) -> there is no need for new key in dictionary
    2. Update value for key equal to 1 (True) with value 'no'

Result: {True: 'No'}

As I commented, I do not know if this is guarranted by Python specs or if it is just CPython implementation-defined behavior, it may differ in other interpreter implementations.

Comments

21

True and 1 are different objects, but they both have the same value:

>>> True is 1
False
>>> True == 1
True

This is similar to two strings that may have the same value, but are stored in different memory locations:

>>> x = str(12345)
>>> y = str(12345)
>>> x == y 
True
>>> x is y
False

First one item is added to the dictionary; then when the other is added, that value already exists as a key. So the key is updated, key values are unique.

>>> {x: 1, y: 2}
{"12345": 2}

Comments

9

If the key is already present in the dictionary, it does not override the key only the value associated.

I believe that writing x = {True:"a", 1:"b"} is along the lines of:

x = {}
x[True] = "a"
x[1] = "b"

and by the time it reaches x[1] = "b" the key True is already in the dict so why change it? why not just override the associated value.

8 Comments

I was just wondering why when the two keys are evaluated for equity Python choose true instead of 1, because type coercion has it rules and True is evaluated to 1 to be compared to the key 1 and the answer seems to be order of the keys. And what is interesting about that is to know if it is an implementation of the CPython or it is a Python spec
Image
The absolutely only case where it makes a difference is when objects are considered equal (and produce same hash value) but act differently. I cannot think of a case other then True vs 1 or False vs 0 where this comes up, and as you have seen it makes absolutely no difference except when printing the dict out (d[1] <==> d[True]).
Image
imagine if every time you did d['a'] = 4 it changed which 'a' object was used for a key, what would be the point? This isn't a python spec so I guess that defaults it to an implementation detail but I don't see why you would want to implement it any other way...
I don't want to implement it in an other way, I wanted to understand and there is nothing wrong with this behavior. Understanding that dict are constructed the way @Rogalski explained was very interesting.
Image
I agree... But why did you comment on my question in the first place? I assumed it was to ask if I knew if it was a standard or implementation detail and I think I overreacted a bit, but did you have another reason to comment on my answer specifically?
|
6

The ordering seems to be the reason. Code example:

>>> d = {True: 'true', 1: 'one'}
>>> d
{True: 'one'}
>>> d = {1: 'one', True: 'true'}
>>> d
{1: 'true'}

3 Comments

@rll: of course that's the point. The first key sets a value, then the next key is equal to the first, and sets a new value for that key.
@MartijnPieters of course the order is not the point of the question.... as the many other answers show
@rll: swapping the order illustrates what happens. For equal keys, the last key-value wins, but only the value is used. That the keys don't look equal to someone not familiar with bools being a subclass of int is also important, but ordering is very much part of the explanation.
3

It's an ambiguous statement.

Logic: d = { True: 'no', 1: 'yes'}

When python evaluates the expression, it does it sequentially, so it's equivalent to this.

d = dict() d[True] = 'no' d[1] = 'yes'

The constant True is the key, but it evaluates to 1, so you're just setting the value for the key twice.

3 Comments

Image
did you copy this from my answer or did we come up with the same reasoning around the same time?
I was typing it out on my phone in the bathtub while you were posting your answer (I type slow on the phone). Noticed the duplicate when I posted, but kept it because we explained it in slightly different words so maybe it would help comprehension.
Image
That's... more specific then you needed to be. This still is mostly answering this other question about why True and 1 correspond to the same place in the dict. (In any case I'm glad you didn't plagiarize)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.