Description
import pandas as pd
from datetime import datetime, timedelta
end = datetime.utcnow()
begin = end - timedelta(minutes=1)
data_interval = 10
date_index = pd.date_range(start=begin,
                           end=end,
                           freq='{} s'.format(data_interval))
df = pd.DataFrame([], columns=['time','a','b'])
df = df.set_index('time', drop=True)
tol = timedelta(seconds=9)
df = df.reindex(date_index, method='pad', tolerance=tol)
# IndexError: index -1 is out of bounds for axis 0 with size 0
df = pd.DataFrame([], columns=['time','a','b'])
df = df.reindex(date_index, method='nearest')
# IndexError: index -1 is out of bounds for axis 0 with size 0
Calling reindex on an empty DataFrame with tolerance= or method='nearest' raises an IndexError.
This does not happen with other usages of reindex, and it can come as a surprise when reindexing an empty window of data. I would expect it to behave the same as it does without tolerance here.
Expected Output
Should be the same as reindex with no extra arguments, which in this case returns:
                              a    b
2019-07-09 22:35:05.165640  NaN  NaN
2019-07-09 22:35:15.165640  NaN  NaN
2019-07-09 22:35:25.165640  NaN  NaN
2019-07-09 22:35:35.165640  NaN  NaN
2019-07-09 22:35:45.165640  NaN  NaN
2019-07-09 22:35:55.165640  NaN  NaN
2019-07-09 22:36:05.165640  NaN  NaN
Temp Solution
A simple user-side workaround is to check the length first, but this is a problem that might surprise someone at a bad time, like it did for us.
if len(df) == 0:
    df = df.reindex(date_index)
else:
    df = df.reindex(date_index, method='pad', tolerance=tol)
Output of pd.show_versions()
pandas: 0.24.2
pytest: 5.0.1
pip: 18.0
setuptools: 40.4.1
Cython: None
numpy: 1.16.4
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 1.1.8
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: 1.3.5
pymysql: 0.9.3
psycopg2: None
jinja2: 2.10.1
s3fs: None
fastparquet: None
pandas_gbq: 0.6.1
pandas_datareader: None
gcsfs: None
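For anyone hitting this in the meantime, the length check from the Temp Solution could be wrapped in a small helper so call sites do not have to repeat it. This is only a sketch: `safe_reindex` is a hypothetical name, not part of pandas, and it assumes that a plain `reindex` is the desired result whenever the frame has no rows. The fixed timestamps are made up for the example.

```python
from datetime import timedelta

import pandas as pd


def safe_reindex(df, new_index, method=None, tolerance=None):
    """Hypothetical helper (not a pandas API): reindex with fill options,
    falling back to a plain reindex when the frame has no rows."""
    if len(df.index) == 0:
        # Nothing to fill from, so the fill options are skipped entirely
        # and every row of the result is NaN.
        return df.reindex(new_index)
    return df.reindex(new_index, method=method, tolerance=tolerance)


# Seven timestamps, 10 seconds apart (illustrative values).
date_index = pd.date_range("2019-07-09 22:35:05", periods=7, freq="10s")
tol = timedelta(seconds=9)

# Empty frame, as in the report above.
empty = pd.DataFrame([], columns=["time", "a", "b"]).set_index("time")
result = safe_reindex(empty, date_index, method="pad", tolerance=tol)

# Non-empty frame: the usual pad-with-tolerance path is still taken.
filled = pd.DataFrame({"a": [1.0], "b": [2.0]}, index=date_index[:1])
padded = safe_reindex(filled, date_index, method="pad", tolerance=tol)
```

With the empty frame, `result` has one all-NaN row per entry in `date_index`. With the one-row frame, only the first timestamp is filled, since the next one is 10 seconds away and falls outside the 9 second tolerance.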