Two python issues

Raymond Boute

2024-11-05 14:48:07 UTC

L.S.,

Python seems to suffer from a few poor design decisions regarding strings
and lists that affect the elegance of the language.

(a) An error-prone "feature" is returning -1 if a substring is not found
by "find", since -1 currently refers to the last item.

    >>> s = 'qwertyuiop'
    >>> s[s.find('r')]
    'r'
    >>> s[s.find('p')]
    'p'
    >>> s[s.find('a')]
    'p'

If "find" is unsuccessful, an error message is the only clean option.
Moreover, using index -1 for the last item is a bad choice: it should be
len(s) - 1 (no laziness!).
Negative indices should be reserved for elements preceding the element
with index 0 (currently not implemented, but a must for orthogonal
design supporting general sequences).

(b) When using assignment for slices, only lists with the same length as
the slice should be acceptable, otherwise an error should be given.
Anything that re-indexes items not covered by the slice is against the
essential idea of assignment. For changes that imply re-indexing (e.g.,
inserting a list longer than the slice), Python offers cleaner solutions.

Comments are welcome.

With best regards,

Raymond

Cameron Simpson

2024-11-05 21:06:12 UTC

Post by Raymond Boute
Python seems to suffer from a few poor design decisions regarding
strings and lists that affect the elegance of the language.
(a) An error-prone "feature" is returning -1 if a substring is not
found by "find", since -1 currently refers to the last item.

`find` is a pretty old API. It is what it is. It may take some of its
design choices from C-style calls, where returning -1 for failure was a
common idiom.

Post by Raymond Boute
If "find" is unsuccessful, an error message is the only clean option.

This is not true. Often we want to look for something, and act one way
or another depending on whether it is found. I've got plenty of loops
and other tests which more or less go "run until this is not found". It
is not an error, it is just a circumstance to accommodate.
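
A minimal sketch of the "run until this is not found" pattern described above, where the -1 result simply ends a loop instead of signalling an error. The helper name find_all is hypothetical; it is not from the thread or the standard library.

```python
def find_all(haystack: str, needle: str) -> list[int]:
    """Collect the start index of every non-overlapping occurrence of needle."""
    if not needle:
        raise ValueError("needle must be non-empty")
    positions = []
    start = haystack.find(needle)
    while start != -1:  # -1 just ends the loop; it is not an error here
        positions.append(start)
        # Resume searching right after the match that was just recorded.
        start = haystack.find(needle, start + len(needle))
    return positions


positions = find_all("abracadabra", "abra")
print(positions)
```

Note that the result of find() is compared against -1 before it is ever used as an index, which is what keeps this usage safe.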

Post by Raymond Boute
Moreover, using index -1 for the last item is a bad choice: it should
be len(s) - 1 (no laziness!).
Negative indices should be reserved for elements preceding the element
with index 0 (currently not implemented, but a must for orthogonal
design supporting general sequences).

It is _far_ too late to propose such a change.

Plenty of us are quite happy with negative indices. We just view them as
counting backwards from the end of the string or sequence instead of
forwards from the beginning.
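
For readers comparing the two views, a short sketch of how negative indices map onto non-negative ones for a built-in sequence: index -k refers to the same element as len(s) - k. The variable names are only illustrative.

```python
s = 'qwertyuiop'

# For 1 <= k <= len(s), s[-k] and s[len(s) - k] are the same element.
for k in range(1, len(s) + 1):
    assert s[-k] == s[len(s) - k]

last = s[-1]            # same element as s[len(s) - 1]
second_to_last = s[-2]  # same element as s[len(s) - 2]
print(last, second_to_last)
```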

Post by Raymond Boute
(b) When using assignment for slices, only lists with the same length
as the slice should be acceptable, otherwise an error should be
given.

There are many, many circumstances where we replace a subsequence with a
subsequence of a different length. Outlawing such a thing would
remove an extremely useful feature.

Cheers,
Cameron Simpson <***@cskk.id.au>

Jason Friedman

2024-11-05 21:08:45 UTC

Post by Raymond Boute
(a) An error-prone "feature" is returning -1 if a substring is not found
by "find", since -1 currently refers to the last item.

    >>> s = 'qwertyuiop'
    >>> s[s.find('r')]
    'r'
    >>> s[s.find('p')]
    'p'
    >>> s[s.find('a')]
    'p'

If "find" is unsuccessful, an error message is the only clean option.
Moreover, using index -1 for the last item is a bad choice: it should be
len(s) - 1 (no laziness!).

I'm not sure if this answers your objection but the note in the
documentation (https://docs.python.org/3/library/stdtypes.html#str.find)
says:

The find() method should be used only if you need to know the position of
sub.

I think the use case above is a little bit different.
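
The same documentation note goes on to recommend the in operator when only a membership test is needed. A small sketch of the difference, reusing the string from the original post; the variable names are illustrative.

```python
s = 'qwertyuiop'

# Membership test: no index is involved, so -1 cannot be misused as a position.
has_a = 'a' in s
has_r = 'r' in s

# Position lookup: check the result before using it as an index.
pos = s.find('r')
char = s[pos] if pos != -1 else None
print(has_a, has_r, pos, char)
```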

Piergiorgio Sartor

2024-11-05 21:27:53 UTC

Post by Raymond Boute
L.S.,
Python seems to suffer from a few poor design decisions regarding strings
and lists that affect the elegance of the language.
(a) An error-prone "feature" is returning -1 if a substring is not found
by "find", since -1 currently refers to the last item.

    >>> s = 'qwertyuiop'
    >>> s[s.find('r')]
    'r'
    >>> s[s.find('p')]
    'p'
    >>> s[s.find('a')]
    'p'

If "find" is unsuccessful, an error message is the only clean option.
Moreover, using index -1 for the last item is a bad choice: it should be
len(s) - 1 (no laziness!).
Negative indices should be reserved for elements preceding the element
with index 0 (currently not implemented, but a must for orthogonal
design supporting general sequences).
(b) When using assignment for slices, only lists with the same length as
the slice should be acceptable, otherwise an error should be given.
Anything that re-indexes items not covered by the slice is against the
essential idea of assignment. For changes that imply re-indexing (e.g.,
inserting a list longer than the slice), Python offers cleaner solutions.
Comments are welcome.

To write the nested expression s[s.find(...)], you
must be 200% sure of what happens when the substring
is not found.
The result could be -1 or None or [] or anything.

So the really correct thing to do, since you know
what will happen when it is not found, is *not* to
write the nested form, but to state explicitly what
will happen.

    r = s.find('a')
    if r != -1:    # str.find returns -1 when the substring is absent
        print(s[r])
    else:
        print('not found')

Which is much easier to read, to debug, etc.

To paraphrase someone: "If the length of a
program would be measured by the time needed
to understand it, some programs are too short
to be short."

bye,

--
piergiorgio

Lawrence D'Oliveiro

2024-11-05 21:56:01 UTC

Post by Piergiorgio Sartor
To write the nested expression, s[s.find(...)] it means you're 200% sure
of what happens in case of not found.

Or use s.index(...) instead of s.find(...). Then you get an exception if
the substring is not found, instead of having your program produce some
mysteriously-incorrect result.
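
A minimal sketch of that suggestion: with index(), a missing substring surfaces as a ValueError that the caller can handle or let propagate. The helper name char_at_match is hypothetical and only wraps the nested expression from the original post.

```python
s = 'qwertyuiop'


def char_at_match(text: str, sub: str) -> str:
    """Return the character at the first match; raise ValueError if there is none."""
    return text[text.index(sub)]


print(char_at_match(s, 'r'))

try:
    char_at_match(s, 'a')
except ValueError as exc:
    # The failure is explicit instead of silently yielding the last character.
    message = str(exc)
    print('lookup failed:', message)
```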

2024-11-05 21:41:59 UTC

Post by Jason Friedman

Post by Raymond Boute
(a) An error-prone "feature" is returning -1 if a substring is not found
by "find", since -1 currently refers to the last item.

    >>> s = 'qwertyuiop'
    >>> s[s.find('r')]
    'r'
    >>> s[s.find('p')]
    'p'
    >>> s[s.find('a')]
    'p'

If "find" is unsuccessful, an error message is the only clean option.
Moreover, using index -1 for the last item is a bad choice: it should be
len(s) - 1 (no laziness!).

I'm not sure if this answers your objection but the note in the
documentation (https://docs.python.org/3/library/stdtypes.html#str.find)
The find() method should be used only if you need to know the position of
sub.
I think the use case above is a little bit different.

Not really, there are two questions:

1. is x in sequence (or in this case "not in")
2. where is x within sequence (find())

There are situations where one might be used, similarly where the other
will be used, and still more where both apply.

That said, and with @Cameron's observation, the idea that a function's
return-value performs (or appears to perform) two functions is regarded as a
'code-smell' in today's world - either it indicates "found" or it
indicates "where found" (see also various APIs which return both a
boolean: success/fail, and a value: None/valid-info).

The problem with the third scenario being that purity suggests we should
use both (1) and (2) which seems like duplication - and is certainly
going to take more CPU time.
(will such be noticeable in your use-case?)
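
One way to keep "found?" and "where?" distinct without searching twice is to return None for a miss, so the result cannot be mistaken for a valid index. This is only a sketch; find_or_none is a hypothetical helper, not a str method.

```python
from typing import Optional


def find_or_none(text: str, sub: str) -> Optional[int]:
    """Return the index of the first match, or None when sub is absent."""
    pos = text.find(sub)  # a single search answers both questions
    return None if pos == -1 else pos


s = 'qwertyuiop'
pos = find_or_none(s, 'a')
if pos is None:
    result = 'not found'
else:
    result = s[pos]
print(result)
```

With this shape, accidentally writing s[find_or_none(s, 'a')] raises a TypeError rather than quietly returning the last character.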

Backward-compatibility... ('nuff said!)

With reference to the OP content about slicing:
- Python's memory-addressing is different from that of many other
languages, and thus requires study before comparison/criticism
- there are major differences in what can be accomplished with mutable
and immutable objects

--
Regards,
=dn

Stefan Ram

2024-11-06 00:26:51 UTC

Post by Raymond Boute

s[s.find('a')]

'p'

If you want exceptions, index() is your jam.

Post by Raymond Boute
Moreover, using index -1 for the last item is a bad choice: it should be
len(s) - 1 (no laziness!).

This feature is the bee's knees, man.

last_item = my_list[ -1 ]
second_to_last = my_list[ -2 ]

Way cleaner than my_list[ len( my_list )- 1 ], don't you think?

Post by Raymond Boute
(b) When using assignment for slices, only lists with the same length as
the slice should be acceptable, otherwise an error should be given.

The current setup lets you pull off some seriously gnarly code
maneuvers:

# Insert elements
my_list[ 2: 2 ]=[ 4, 5, 6 ]

# Replace a section
my_list[ 1: 4 ]=[ 10, 11 ]

# Delete elements
my_list[ 2: 5 ]= []

These moves are slick and expressive. Forcing exact length
matching would only cramp our style.
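
To make the effect of the three assignments above concrete, here is a runnable sketch that applies them in sequence to one list. The starting contents of my_list are an assumption; the post does not give any.

```python
my_list = [0, 1, 2, 3, 4, 5, 6, 7]  # assumed starting contents

# Insert elements: an empty slice is replaced by three items.
my_list[2:2] = [4, 5, 6]
after_insert = list(my_list)   # [0, 1, 4, 5, 6, 2, 3, 4, 5, 6, 7]

# Replace a section: three items (indices 1..3) are replaced by two.
my_list[1:4] = [10, 11]
after_replace = list(my_list)  # [0, 10, 11, 6, 2, 3, 4, 5, 6, 7]

# Delete elements: three items (indices 2..4) are replaced by none.
my_list[2:5] = []
after_delete = list(my_list)   # [0, 10, 3, 4, 5, 6, 7]

print(after_insert, after_replace, after_delete, sep='\n')
```

In each step the items after the slice shift to new indices, which is exactly the re-indexing the original post objects to.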

Look, I get it. You're probably fresh out of coding bootcamp,
thinking you can reinvent the wheel. But trust me, these features
are solid gold. They're not perfect, sure, but they're about
as good as finding a parking spot in San Francisco on a Saturday
night. So before you go throwing shade at Python, maybe spend
some more time with it. You might just find it grows on you like
a NorCal redwood. Keep coding, and don't let the fog get you down!

Roel Schroeven

2024-11-06 09:36:01 UTC

Post by Raymond Boute
L.S.,
Python seems to suffer from a few poor design decisions regarding
strings and lists that affect the elegance of the language.
(a) An error-prone "feature" is returning -1 if a substring is not
found by "find", since -1 currently refers to the last item.

This is IMO indeed not the best design decision. Fortunately there's an
alternative: the "index" method on strings, which raises exception
ValueError when the substring is not found, instead of returning -1. An
example:

    >>> s = 'qwertyuiop'
    >>> s[s.index('p')]
    'p'
    >>> s[s.index('a')]
    Traceback (most recent call last):
      File "<pyshell#3>", line 1, in <module>
        s[s.index('a')]
    ValueError: substring not found

Post by Raymond Boute
Moreover, using index -1 for the last item is a bad choice: it should
be len(s) - 1 (no laziness!).
Negative indices should be reserved for elements preceding the element
with index 0 (currently not implemented, but a must for orthogonal
design supporting general sequences).

I don't agree, I think this is a case of "practicality beats purity".
Being able to use negative indices to index from the end is often very
useful in my experience, and leads to code that is much easier to grok.
General sequences on the other hand is a concept that I don't see ever
implemented in Python (or most other programming languages AFAIK). I
think it would be wrong to avoid implementing a feature that's very
useful in practice in order to keep the door open for a theoretical
feature that's probably not even wanted in the language.

Post by Raymond Boute
(b) When using assignment for slices, only lists with the same length
as the slice should be acceptable, otherwise an error should be
given. Anything that re-indexes items not covered by the slice is
against the essential idea of assignment. For changes that imply
re-indexing (e.g., inserting a list longer than the slice), Python
offers cleaner solutions.

Again I don't agree. I don't see anything wrong with replacing a part of
a list with something that's longer, or shorter, or even empty. It's
very practical, and I don't see how it's against the essential idea of
assignment (or actually I'm not even sure what you mean by that).
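
As a side note to this exchange, Python does enforce an equal-length rule in one place: assignment to an extended slice (one with a step other than 1). A brief sketch contrasting the two behaviours; the list contents are arbitrary.

```python
items = [0, 1, 2, 3, 4, 5]

# Plain slice: lengths may differ, and the list grows or shrinks.
items[1:3] = ['a', 'b', 'c']
grown = list(items)  # [0, 'a', 'b', 'c', 3, 4, 5]

# Extended slice (step 2): the replacement must match the slice length.
try:
    items[::2] = ['x']  # the slice covers 4 positions, only 1 value is given
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(grown, mismatch_rejected)
```

So the stricter behaviour is available where no sensible re-indexing exists, while contiguous slices keep the flexible form defended above.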

Two closing remarks:

(1) I think that Python indeed has some warts, inconsistencies, gotchas.
I think this is unavoidable in any complex system. Python got rid of a
number of those in the transition from Python 2 to Python 3; others
might remain forever. Overall though I feel Python is more consistent
than most other programming languages I come in contact with.

(2) Design decisions are not necessarily wrong when they don't match
your mental model, or don't match what you know from other languages.
Often there are different valid options, each with their own tradeoffs.

--
"Programming today is a race between software engineers striving to build bigger
and better idiot-proof programs, and the Universe trying to produce bigger and
better idiots. So far, the Universe is winning."
-- Douglas Adams