Let us try to find the relevant bits in the docs. From the np.array doc string:
array(...)
[...]
Parameters
[...]
dtype : data-type, optional
The desired data-type for the array. If not given, then the type will
be determined as the minimum type required to hold the objects in the
sequence. This argument can only be used to 'upcast' the array. For
downcasting, use the .astype(t) method.
[...]
(my emphasis)
It should be noted that this is not entirely accurate: for integer arrays, for example, the system (C) default integer is preferred over smaller integer types, as is evident from your example.
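We can check this directly. The sketch below assumes a 64-bit platform, where the default integer dtype is int32 or int64 depending on the OS; np.min_scalar_type shows what the "minimum type" of the docstring would actually have been:

```python
import numpy as np

# Small Python ints are stored in the platform's default integer type,
# not in the smallest type that could hold them.
a = np.array([1, 2, 3])
print(a.dtype)  # platform default int (int32 or int64), not int8

# The minimum type that could hold the value 1 would have been uint8:
print(np.min_scalar_type(1))
```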
Note that for numpy to be fast it is essential that all elements of an array be of the same size. Otherwise, how would you quickly locate the 1000th element, say? Also, mixing types wouldn't save all that much space since you would have to store the types of every single element on top of the raw data.
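The fixed element size is visible in the array's itemsize and strides: locating any element is a single multiplication, not a scan. A small illustration:

```python
import numpy as np

# With fixed-size elements, the address of element i is just
# base_address + i * itemsize -- no scanning required.
a = np.arange(2000, dtype=np.int32)
print(a.itemsize)  # 4 bytes per element
print(a.strides)   # (4,): step 4 bytes to reach the next element

# Locating the 1000th element is one multiplication:
offset = 1000 * a.itemsize  # byte offset 4000 into the buffer
```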
Re your second question: first of all, there are type promotion rules in numpy. The best doc I could find for that is the np.result_type doc string:
result_type(...)
result_type(*arrays_and_dtypes)
Returns the type that results from applying the NumPy type promotion
rules to the arguments.
Type promotion in NumPy works similarly to the rules in languages like
C++, with some slight differences. When both scalars and arrays are
used, the array's type takes precedence and the actual value of the
scalar is taken into account.
For example, calculating 3*a, where a is an array of 32-bit floats,
intuitively should result in a 32-bit float output. If the 3 is a
32-bit integer, the NumPy rules indicate it can't convert losslessly
into a 32-bit float, so a 64-bit float should be the result type. By
examining the value of the constant, '3', we see that it fits in an
8-bit integer, which can be cast losslessly into the 32-bit float.
[...]
I'm not quoting the entire thing here, refer to the doc string for more detail.
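Note, incidentally, that the value-based part of these rules (examining the value of the constant '3') was removed in NumPy 2.0 by NEP 50, so the quoted paragraph describes older versions. Promotion between plain dtypes, however, is stable across versions, and np.result_type lets you query it directly:

```python
import numpy as np

# Promotion between two dtypes (no scalar values involved) is stable
# across numpy versions:
print(np.result_type(np.int16, np.int32))    # int32
print(np.result_type(np.int32, np.float32))  # float64: int32 does not
                                             # fit losslessly in float32
print(np.result_type(np.float32, np.float64))  # float64
```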
The exact way these rules apply is complicated and appears to represent a compromise between intuitiveness and efficiency.
For example, the choice of result type is based on the inputs, not on the result:
>>> A = np.full((2, 2), 30000, 'i2')
>>>
>>> A
array([[30000, 30000],
[30000, 30000]], dtype=int16)
# 1
>>> A + 30000
array([[-5536, -5536],
[-5536, -5536]], dtype=int16)
# 2
>>> A + 60000
array([[90000, 90000],
[90000, 90000]], dtype=int32)
Here efficiency wins. It would arguably be more intuitive to have #1 behave like #2, but that would be expensive.
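The -5536 in #1 is nothing mysterious, just 16-bit wrap-around; we can reproduce it with plain modular arithmetic (a small helper, not part of numpy):

```python
# int16 arithmetic wraps modulo 2**16; values >= 2**15 are then
# reinterpreted as negative (two's complement).
def as_int16(x):
    x %= 2**16
    return x - 2**16 if x >= 2**15 else x

print(as_int16(30000 + 30000))  # -5536, matching A + 30000 above
```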
Also, and more directly related to your question, type promotion only applies out-of-place, not in-place:
# out-of-place
>>> A_new = A + 60000
>>> A_new
array([[90000, 90000],
[90000, 90000]], dtype=int32)
# in-place
>>> A += 60000
>>> A
array([[24464, 24464],
[24464, 24464]], dtype=int16)
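If you do want the larger type for an in-place update, you have to upcast explicitly first; astype allocates a fresh buffer, so the subsequent += has room to spare. A sketch:

```python
import numpy as np

A = np.full((2, 2), 30000, np.int16)

# Explicitly upcast first; astype creates a new int32 array.
B = A.astype(np.int32)
B += 60000
print(B)  # [[90000 90000]
          #  [90000 90000]]
```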
or
# out-of-place
>>> A_new = np.where([[0, 0], [0, 1]], 60000, A)
>>> A_new
array([[30000, 30000],
[30000, 60000]], dtype=int32)
# in-place
>>> A[1, 1] = 60000
>>> A
array([[30000, 30000],
[30000, -5536]], dtype=int16)
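One way to guard against such silent wrap-around on assignment is to check the value against the dtype's limits first, using np.iinfo (the helper below is hypothetical, not a numpy function):

```python
import numpy as np

def fits(value, dtype):
    """Return True if `value` is representable in integer `dtype`."""
    info = np.iinfo(dtype)
    return info.min <= value <= info.max

print(fits(60000, np.int16))  # False: 60000 > 32767
print(fits(60000, np.int32))  # True
```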
Again, this may seem rather non-intuitive. There are, however, compelling reasons for this choice.
And these should answer your second question:
Changing to a larger dtype would require allocating a larger buffer and copying over all the data. That alone would be expensive for large arrays, but there is a more fundamental problem.
Many idioms in numpy rely on views and on the fact that writing to a view directly modifies the base array (and other overlapping views). An array is therefore not free to change its data buffer whenever it feels like it. To avoid breaking the link between views, an array would have to be aware of all views into its data buffer, which would add a lot of admin overhead, and all those views would have to change their data pointers and metadata as well. And if the first array is itself a view (a slice, say) into another array, things get even worse.
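The buffer-sharing that makes in-place promotion impossible is easy to see: a slice is a view into the same memory, and writing through it changes the base array.

```python
import numpy as np

a = np.arange(4)
v = a[1:3]          # a view: shares a's data buffer

v[0] = 99           # writing through the view...
print(a)            # [ 0 99  2  3]  ...modifies the base array

print(v.base is a)  # True: v does not own its data
```

If a could silently swap its int64 buffer for a larger one, v's data pointer would be left dangling.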
I suppose we can agree that this is not worth it, and that is why types are not promoted in-place.
int32 is the default dtype for integers in numpy (on platforms where the C long is 32 bits; int64 elsewhere). However, 22222222222 is larger than 2**32//2-1 (the max value for an int32), so int64 is used instead.
The dtype of an array is fixed. Values set on the array are converted to that dtype, if possible. view and astype create new arrays.
np.array is a complex function, capable of handling a wide variety of inputs. It evaluates all input values and chooses a dtype that will accommodate all of them. Most of us accept that choice as a black-box operation - the results are usually logical.
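A few examples of that black box, plus astype's truncating downcast (outputs here are stable across platforms, unlike the default integer dtype itself):

```python
import numpy as np

# np.array inspects all inputs and picks a dtype accommodating all:
print(np.array([1, 2, 3.5]).dtype)  # float64: the float "wins"
print(np.array([1, 2**40]).dtype)   # int64: 2**40 needs 64 bits

# astype creates a new array; downcasting floats truncates toward zero:
a = np.array([1.7, 2.2]).astype(np.int32)
print(a)  # [1 2]
```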