9

I have found examples of how to drop a column when all of its values, or more than some threshold of them, are NaN, but I have not been able to find a solution to my particular problem: dropping a column if its last row is NaN. I'm using time series data in which the collection of data doesn't all start at the same time. That is fine, but if I used one of the previous solutions it would remove 95% of the dataset. I do, however, want to drop any column whose most recent value is NaN, because that means the series is defunct.

A B C
nan t x 
1 2 3
x y z
4 nan 6

Returns

A C
nan x
1 3
x z
4 6

5 Answers

5

You can also do something like this

df.loc[:, ~df.iloc[-1].isna()]
    A   C
0   NaN x
1   1   3
2   x   z
3   4   6
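
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question and splits the one-liner into steps. The names `last_row`, `keep`, `result` and `any_nan_dropped` are only illustrative. It also shows, for contrast, what a plain `dropna(axis=1)` would do on the same data.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; the mixed values make every column object dtype.
df = pd.DataFrame({
    "A": [np.nan, 1, "x", 4],
    "B": ["t", 2, "y", np.nan],
    "C": ["x", 3, "z", 6],
})

last_row = df.iloc[-1]        # Series indexed by column name
keep = ~last_row.isna()       # True where the last value is present
result = df.loc[:, keep]      # boolean mask applied along the column axis

# For contrast: dropping every column that contains any NaN also loses "A",
# even though its only gap is at the start.
any_nan_dropped = df.dropna(axis=1)

print(result.columns.tolist())
print(any_nan_dropped.columns.tolist())
```

Note that `df.iloc[-1]` raises an `IndexError` on an empty frame, so you may want to guard for that if the input can be empty.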

4

Try with dropna

df = df.dropna(axis=1, subset=[df.index[-1]], how='any')
Out[8]: 
     A  C
0  NaN  x
1    1  3
2    x  z
3    4  6
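
With `axis=1`, `subset` takes row labels rather than column names, so `df.index[-1]` restricts the NaN check to the last row. Here is a small sketch of the same call on a made-up time series frame, since that is what the question is about. The column names, dates and values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series: one starts late, one has stopped reporting.
idx = pd.date_range("2024-01-01", periods=4, freq="D")
prices = pd.DataFrame(
    {
        "late_start": [np.nan, np.nan, 1.5, 1.6],
        "defunct": [2.0, 2.1, np.nan, np.nan],
        "full": [3.0, 3.1, 3.2, 3.3],
    },
    index=idx,
)

# Only the last row label is inspected, so the early gaps in
# "late_start" do not cause that column to be dropped.
active = prices.dropna(axis=1, subset=[prices.index[-1]])

print(active.columns.tolist())
```

This relies on the last index label being unique. If the index can contain duplicate labels, the positional `iloc[-1]` approaches are probably the safer choice.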

3

You can use .iloc, .loc and .notna() to sort out your problem.

import numpy as np
import pandas as pd

df = pd.DataFrame({"A":[np.nan, 1,"x",4],  
                   "B":["t",2,"y",np.nan],
                   "C":["x",3,"z",6]})

# keep only the columns whose last row is not NaN
df = df.loc[:,df.iloc[-1,:].notna()]

2

You can use a boolean Series to select the column labels to drop

df.drop(columns=df.columns[df.iloc[-1].isna()])

Out:

     A  C
0  NaN  x
1    1  3
2    x  z
3    4  6

1
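import pandas as pd

for i in reversed(range(temp_df.shape[1])):
    if pd.isna(temp_df.iloc[-1, i]):
        temp_df = temp_df.drop(columns=temp_df.columns[i])

This will work for you. Basically what I'm doing here is looping over all columns by position and dropping a column if its last entry is NaN. temp_df.shape[1] is the number of columns. The check uses pd.isna() because NaN is a missing-value marker, not the string 'nan', and NaN never compares equal to anything with ==. Iterating in reverse keeps the remaining positions valid while columns are being dropped.

In drop(columns=temp_df.columns[i]), temp_df.columns[i] is the label of the column at position i, and the columns keyword says that you want to drop a column rather than a row. The older positional form drop(i, 1) treats i as a label rather than a position, and as far as I know pandas 2.0 and later no longer accept the axis as a positional argument.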

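EDIT: I read the other answers on this same post and it seems to me that notna would be best (I would use it), but the advantage of this method is that you can test the last value against any condition you wish. Another option I found is isnull(), which pandas provides as a top-level function (pd.isnull, an alias of pd.isna). A scalar taken out with iloc has no .isnull() method, so the function has to be called on the value:

for i in reversed(range(temp_df.shape[1])):  # reverse order keeps positions valid after a drop
    if pd.isnull(temp_df.iloc[-1, i]):
        temp_df = temp_df.drop(columns=temp_df.columns[i])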
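
If the appeal of the loop is being able to test the last value against any condition, the same idea can probably be written without a loop. The sketch below is only one way to do it: `drop_columns_where_last` is a hypothetical helper name, not a pandas function, and the sample frame is the one from the question.

```python
import numpy as np
import pandas as pd


def drop_columns_where_last(df, predicate):
    """Hypothetical helper: drop columns whose last value satisfies `predicate`.

    `predicate` is any function that takes one scalar and returns True/False.
    """
    last_row = df.iloc[-1]                        # last value of every column
    mask = last_row.map(predicate).astype(bool)   # True means "drop this column"
    return df.loc[:, ~mask]


df = pd.DataFrame({
    "A": [np.nan, 1, "x", 4],
    "B": ["t", 2, "y", np.nan],
    "C": ["x", 3, "z", 6],
})

no_nan_last = drop_columns_where_last(df, pd.isna)
no_six_last = drop_columns_where_last(df, lambda value: value == 6)

print(no_nan_last.columns.tolist())
print(no_six_last.columns.tolist())
```

The original frame is left untouched, since `loc` with a boolean mask returns a new selection rather than modifying `df` in place.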
