
This is my DataFrame:

     ID       Date    Value Final Value
0   9560    12/15/2021  30  5.0
1   9560    07/3/2021   25  5.0
2   9560    03/03/2021  20  20.0
3   9712    08/20/2021  15  5.0
4   9712    12/31/2021  10  10.0
5   9920    04/11/2021  5   5.0

I need to create another column, 'Round Date', from the 'Date' column. If the day of the month is greater than 15, the date should be rounded to the first day of the next month; otherwise it should be rounded to the first day of the same month. The expected output is given below.

     ID       Date    Value Final Value  Round Date
0   9560    12/15/2021  30  5.0          12/01/2021
1   9560    07/3/2021   25  5.0          07/01/2021
2   9560    03/03/2021  20  20.0         03/01/2021
3   9712    08/20/2021  15  5.0          09/01/2021
4   9712    12/31/2021  10  10.0         01/01/2022
5   9920    04/11/2021   5   5.0         04/01/2021 
  • Is the problem doing the rounding or adding the column or both? Commented Dec 1, 2021 at 10:44
  • I am finding difficulties in rounding sir! Commented Dec 1, 2021 at 10:45

3 Answers


The solution has three parts:

  1. Write a function that rounds a single date. An anonymous lambda would also work, but a named function can be unit tested to confirm it behaves as expected.

  2. Convert the 'Date' column to a datetime type with the built-in pandas function pd.to_datetime().

  3. Create the 'Round Date' column by calling apply() on the 'Date' column and passing it the rounding function.

Here is the code to round the Date column in pandas dataframe:

from dateutil.relativedelta import relativedelta
import pandas as pd

# round a single date to the first day of a month
def round_date(date):
    # after the 15th, move into the next month
    if date.day > 15:
        date += relativedelta(months=1)
    # snap to the first day of the month
    return date.replace(day=1)

# read data
df = pd.read_csv('your-path')
# convert the Date column to datetime
df['Date'] = pd.to_datetime(df['Date'])
# apply the rounding function to each date
df['Round Date'] = df['Date'].apply(round_date)

Input:

      ID       Date       Value    Final Value
0    9560    12/15/2021    30         5.0
1    9560    07/03/2021    25         5.0
2    9560    03/03/2021    20         20.0
3    9712    08/20/2021    15         5.0
4    9712    12/31/2021    10         10.0
5    9920    04/11/2021    5          5.0

Output:

      ID       Date       Value    Final Value    Round Date
0    9560    12/15/2021    30         5.0         12/01/2021
1    9560    07/03/2021    25         5.0         07/01/2021
2    9560    03/03/2021    20         20.0        03/01/2021
3    9712    08/20/2021    15         5.0         09/01/2021
4    9712    12/31/2021    10         10.0        01/01/2022
5    9920    04/11/2021    5          5.0         04/01/2021
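
Since the reason given for writing a named function is that it can be tested, here is a minimal sketch of such a check. It restates round_date so the block runs on its own. The sample dates come from the question's table, and passing pd.Timestamp values is an assumption about what apply() hands to the function.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta


def round_date(date):
    # Same logic as the answer's function, restated so this block is self-contained.
    if date.day > 15:
        date += relativedelta(months=1)
    return date.replace(day=1)


# (input, expected) pairs taken from the question's table
cases = [
    ("12/15/2021", "2021-12-01"),  # the 15th itself stays in the same month
    ("07/03/2021", "2021-07-01"),
    ("08/20/2021", "2021-09-01"),
    ("12/31/2021", "2022-01-01"),  # rolls over into the next year
]
for raw, expected in cases:
    assert round_date(pd.Timestamp(raw)) == pd.Timestamp(expected), raw
```

The loop raises an AssertionError naming the offending input if any case fails, which makes it easy to extend with further edge cases.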

Here is a solution using apply() and a lambda:

  1. Convert the values in the Date column to datetime using pd.to_datetime().
  2. Use apply() with a lambda that sets the day to 1 and, if the day is greater than 15, moves to the next month.

Code:

import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({'Date': ['2/1/2021', '01/31/2021', '12/31/2021', '2021-12-01']})
# note: these strings mix formats; on pandas 2.0+ this call may need format='mixed'
df.Date = pd.to_datetime(df.Date)
# first day of the next month if the day is past the 15th, else first day of this month
df['Round Date'] = df.Date.apply(lambda x: x.replace(day=1) + relativedelta(months=1)
                                 if x.day > 15
                                 else x.replace(day=1))

Input:

    Date
0   2/1/2021
1   01/31/2021
2   12/31/2021
3   2021-12-01

Output:


    Date        Round Date
0   2021-02-01  2021-02-01
1   2021-01-31  2021-02-01
2   2021-12-31  2022-01-01
3   2021-12-01  2021-12-01
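
The result above is a datetime column, whereas the question shows dates as MM/DD/YYYY. If that display format matters, one option is to render the column as text with dt.strftime. This is only a sketch: the column name 'Round Date (text)' is hypothetical, and the two sample values are copied from the output above.

```python
import pandas as pd

df = pd.DataFrame({"Round Date": pd.to_datetime(["2021-02-01", "2022-01-01"])})

# Render as MM/DD/YYYY strings; the new column holds text, not datetimes.
df["Round Date (text)"] = df["Round Date"].dt.strftime("%m/%d/%Y")
```

Keeping the datetime column alongside the text one means later date arithmetic still works.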


Using numpy's where:

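import pandas as pd
import numpy as np

# df is assumed to be the DataFrame from the question
df["Date"] = pd.to_datetime(df["Date"])
df["date.rounded"] = np.where(
    df.Date.dt.day > 15,
    # after the 15th: roll forward to the first day of the next month
    df.Date + pd.offsets.MonthBegin(0),
    # otherwise: go to the month end, then back to the first day of that month
    df.Date + pd.offsets.MonthEnd(0) - pd.offsets.MonthBegin(1)
)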

This yields:

    Date      date.rounded
0   2021-12-15  2021-12-01
1   2021-07-03  2021-07-01
2   2021-03-03  2021-03-01
3   2021-08-20  2021-09-01
4   2021-12-31  2022-01-01
5   2021-04-11  2021-04-01

2 Comments

This solution doesn't work when the day is 1, e.g. '2/1/2021' or '2021-12-01'.
Thanks for pointing that out. Fixed the logic.
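
The day-1 case raised in the comment can be checked directly against the corrected np.where logic. The sketch below builds its own small DataFrame. The sample dates are chosen here as edge cases and are not taken from the question.

```python
import numpy as np
import pandas as pd

# Edge cases: first of the month, the 15th, the 16th, and two month ends
dates = pd.to_datetime(
    ["2021-02-01", "2021-12-15", "2021-12-16", "2021-02-28", "2021-12-31"]
)
df = pd.DataFrame({"Date": dates})

df["date.rounded"] = np.where(
    df["Date"].dt.day > 15,
    df["Date"] + pd.offsets.MonthBegin(0),
    df["Date"] + pd.offsets.MonthEnd(0) - pd.offsets.MonthBegin(1),
)

expected = pd.to_datetime(
    ["2021-02-01", "2021-12-01", "2022-01-01", "2021-03-01", "2022-01-01"]
)
assert df["date.rounded"].tolist() == expected.tolist()
```

A date already on the first of the month stays put because MonthEnd(0) first moves it to the end of its own month, and subtracting MonthBegin(1) then returns to that month's first day.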
