Why are there extra empty strings at the beginning and end of the list returned by re.split?

Question

I'm trying to take a string of ints and/or floats and create a list of floats. The string contains brackets that need to be ignored. I'm using re.split, but if my string begins and ends with a bracket, I get extra empty strings. Why is that?

Code:

import re
x = "[1 2 3 4][2 3 4 5]"
y = "1 2 3 4][2 3 4 5"
p = re.compile(r'[^\d\.]+')
print(p.split(x))
print(p.split(y))

Output:

['', '1', '2', '3', '4', '2', '3', '4', '5', '']
['1', '2', '3', '4', '2', '3', '4', '5']

None of the answers here actually answer the OP's question (i.e. "Why is that?"). Some answers can be found in this Stack Overflow question.
– SpaceMonkey55, Commented Apr 29, 2019 at 13:56

Florian Winter · Accepted Answer · 2017-01-25 11:54:36Z

11

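re.split returns the pieces of the string that lie between matches of the delimiter pattern. If the pattern matches at the very beginning or end of the string, the piece before the first match or after the last match is empty, so the resulting list starts or ends with an empty string.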

If you don't want this, use re.findall with a regex that matches every sequence NOT containing delimiters.

Example:

import re

a = '[1 2 3 4]'
print(re.split(r'[^\d]+', a))
print(re.findall(r'[\d]+', a))

Output:

['', '1', '2', '3', '4', '']
['1', '2', '3', '4']
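
One way to see why the empty strings appear is to count the delimiter matches: every match cuts the string once, so n matches produce n + 1 pieces, and a match at position 0 leaves nothing in front of it. The following is only an illustrative sketch; the variable names are made up for this example, and the pattern is the one from the question.

```python
import re

text = "[1 2 3 4]"
delimiter = re.compile(r"[^\d.]+")

# Each delimiter match cuts the string once, so n matches give n + 1 pieces.
matches = [m.group() for m in delimiter.finditer(text)]
pieces = delimiter.split(text)

print(matches)  # ['[', ' ', ' ', ' ', ']']
print(pieces)   # ['', '1', '2', '3', '4', '']
print(len(pieces) == len(matches) + 1)  # True

# str.split with an explicit separator behaves the same way.
comma_pieces = ",a,b,".split(",")
print(comma_pieces)  # ['', 'a', 'b', '']
```

The first and last pieces are the (empty) text before the leading `[` and after the trailing `]`.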

As others have pointed out in their answers, this may not be the perfect solution for this problem, but it is a general answer to the problem described in the title of the question, which I also had to solve when I found this question using Google.

Kasravnd · Accepted Answer · 2015-06-18 20:05:10Z

1

A more Pythonic approach is a list comprehension that uses str.isdigit() to keep only the characters that are digits:

>>> [i for i in y if i.isdigit()]
['1', '2', '3', '4', '2', '3', '4', '5']

As for your code: first, you need to split on spaces or brackets, which can be done with [\[\] ]. To get rid of the empty strings caused by the leading and trailing brackets, you can strip the string first:

>>> y =  "1 2 3 4][2 3 4 5"
>>> re.split(r'[\[\] ]+',y)
['1', '2', '3', '4', '2', '3', '4', '5']
>>> y =  "[1 2 3 4][2 3 4 5]"
>>> re.split(r'[\[\] ]+',y)
['', '1', '2', '3', '4', '2', '3', '4', '5', '']
>>> re.split(r'[\[\] ]+',y.strip('[]'))
['1', '2', '3', '4', '2', '3', '4', '5']

You can also wrap the result in filter with bool to drop the empty strings (in Python 3, filter returns an iterator, so pass it to list to see the result):

>>> list(filter(bool, re.split(r'[\[\] ]+', y)))
['1', '2', '3', '4', '2', '3', '4', '5']

edited Jun 18, 2015 at 20:05

answered Jun 18, 2015 at 19:58

1 Comment

Mark Ransom Over a year ago

Your list comprehension only works if all the numbers are single digit. Certainly that's the case for the example in the question, but I would never assume it for the general case.
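
To illustrate the point of this comment, here is a small sketch with a multi-digit input. The input string and the bracket-to-space workaround are assumptions made for this example and are not taken from the answer above.

```python
y = "[10 2][3 40]"

# Character-wise filtering breaks multi-digit numbers apart.
digits = [ch for ch in y if ch.isdigit()]
print(digits)  # ['1', '0', '2', '3', '4', '0']

# Hypothetical alternative: turn the brackets into spaces, then split on
# whitespace. str.split() with no argument drops empty strings by itself.
tokens = y.replace("[", " ").replace("]", " ").split()
print(tokens)  # ['10', '2', '3', '40']
print([float(t) for t in tokens])  # [10.0, 2.0, 3.0, 40.0]
```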

anubhava (edited by mkrieger1) · Accepted Answer · 2021-10-20 12:58:48Z

1

You can just use filter to avoid empty results:

import re

x = "[1 2 3 4][2 3 4 5]"

print(list(filter(None, re.split(r'[^\d.]+', x))))
# => ['1', '2', '3', '4', '2', '3', '4', '5']
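
An equivalent spelling, sketched here as one possible variant, is a list comprehension with a truthiness test. It also lets you convert to float in the same pass, which is what the question ultimately asks for. The variable names are illustrative only.

```python
import re

x = "[1 2 3 4][2 3 4 5]"
pieces = re.split(r"[^\d.]+", x)

# filter(None, ...) keeps truthy items; "if p" in a comprehension does the same.
kept_by_filter = list(filter(None, pieces))
kept_by_comprehension = [p for p in pieces if p]
print(kept_by_filter == kept_by_comprehension)  # True

# Skip the empty strings and convert in one step.
floats = [float(p) for p in pieces if p]
print(floats)  # [1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0]
```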

edited Oct 20, 2021 at 12:58 by mkrieger1

answered Jun 18, 2015 at 20:02 by anubhava

Federico Piazza · Accepted Answer · 2015-06-18 19:59:08Z

0

You can use regex to capture the content you want instead of splitting the string. You can use this regex:

(\d+)

Python code:

import re
# The ur'' prefix is Python 2 only; a plain raw string works in Python 3.
p = re.compile(r'(\d+)')
test_str = "[1 2 3 4][2 3 4 5]"

print(re.findall(p, test_str))
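
Note that \d+ alone splits a float such as 2.5 into two matches. Since the question mentions floats, a pattern with an optional fractional part may be closer to what is needed. This is a sketch under assumptions: the pattern below is not from the answer, and it does not handle signs, exponents, or forms like ".5".

```python
import re

test_str = "[1 2.5 3][4 5.25]"

# Plain \d+ treats the decimal point as a separator.
digit_runs = re.findall(r"\d+", test_str)
print(digit_runs)  # ['1', '2', '5', '3', '4', '5', '25']

# Assumed pattern: digits, optionally followed by a dot and more digits.
number = re.compile(r"\d+(?:\.\d+)?")
values = [float(tok) for tok in number.findall(test_str)]
print(values)  # [1.0, 2.5, 3.0, 4.0, 5.25]
```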
