I have a DataFrame df with a column salary_day:

            salary_day
    0       thursday
    1       friday

I'm trying to get the alternate (every other) dates for each day.

For May 2020:

Thursdays in May: 7, 14, 21, 28; Fridays in May: 1, 8, 15, 22, 29

Expected output for alternate Thursdays and Fridays in the month of May:

df

salary_day        req_dates
thursday          7,21
friday            1,15,29

For June 2020:

Thursdays in June: 4, 11, 18, 25; Fridays in June: 5, 12, 19, 26

As there are 5 Fridays in May, the first Friday in June is not an alternate day and should be excluded, so 12 and 26 should be used instead.

Expected output for alternate Thursdays and Fridays in the month of June:

df

salary_day        req_dates
thursday          4,18
friday            12,26



Edit1: For all weekdays

For the month of May:

      salary_day        req_dates
0     Monday            4,18
1     Tuesday           5,19
2     Wednesday         6,20
3     Thursday          7,21
4     Friday            1,15,29
5     Saturday          2,16,30
6     Sunday            3,17,31
  • Somewhere the year has to be specified as well. Commented May 28, 2020 at 12:08
  • Updated with current year Commented May 28, 2020 at 12:11
  • What do you exactly mean with "alternate" day? Why is friday june 12,26 ? Commented May 28, 2020 at 12:40
  • Looks like, starting with the first occurrence of Thursday, the salary must be processed on every alternate Thursday and Friday. May was an exception, as the first Friday in May fell on the 1st of the month. Commented May 28, 2020 at 12:42
  • So you also have a starting month, so it does not start on the 1st of January? That is an important detail you didn't provide. Commented May 28, 2020 at 12:43

2 Answers

I think the cleanest and most general way to do this is to create a helper table with all the days of the specified year, from the starting month onward, and add extra columns: month, day_name, day.

Then we check which day_names are in df['salary_day'].

After this we rank the occurrences of each day name with cumcount and keep only the odd-ranked ones (1st, 3rd, 5th, ...), by: rank_days % 2 != 0.

Finally we GroupBy.agg and join the days as a string with ',':

import pandas as pd

# create the salary days to get alternate dates for
days = ['monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday']
df = pd.DataFrame({'salary_day': days})

START_MONTH = 5
YEAR = 2020

def create_dates(y, month_start):
    dates = pd.date_range(f'{y}-{str(month_start).zfill(2)}-01', f'{y}-12-31')
    dates = pd.DataFrame({'dates': dates})
    dates['month'] = dates['dates'].dt.month
    dates['day_name'] = dates['dates'].dt.day_name().str.lower()
    dates['day'] = dates['dates'].dt.day
    return dates


def get_alternative_dates(salary_days, y, month_start):
    df_dates = create_dates(y, month_start)

    m = df_dates['day_name'].isin(salary_days)

    months = df_dates[m].copy()
    months['day'] = months['day'].astype(str)
    months['rank_days'] = months.groupby('day_name')['day'].cumcount().add(1)

    months = months[months['rank_days'].mod(2).ne(0)]
    df_final = months.groupby(['month', 'day_name'])['day'].agg(','.join).reset_index()

    return df_final

get_alternative_dates(df['salary_day'], YEAR, START_MONTH)

Output

    month  day_name      day
0       5    friday  1,15,29
1       5  thursday     7,21
2       6    friday    12,26
3       6  thursday     4,18
4       7    friday    10,24
5       7  thursday  2,16,30
6       8    friday     7,21
7       8  thursday    13,27
8       9    friday     4,18
9       9  thursday    10,24
10     10    friday  2,16,30
11     10  thursday     8,22
12     11    friday    13,27
13     11  thursday     5,19
14     12    friday    11,25
15     12  thursday  3,17,31
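
As a cross-check of the table above, here is a minimal standard-library sketch of the same rule. It assumes that "alternate" means every 14 days, counted from the first occurrence of the weekday on or after the 1st of the starting month. The names alternate_dates and WEEKDAYS are made up for this illustration and are not part of pandas.

```python
from datetime import date, timedelta

# Weekday names in the order used by date.weekday() (Monday == 0)
WEEKDAYS = ['monday', 'tuesday', 'wednesday', 'thursday',
            'friday', 'saturday', 'sunday']

def alternate_dates(day_name, year, start_month):
    """Return {month: [day, ...]} for every second occurrence of day_name,
    counting from its first occurrence on or after the 1st of start_month."""
    target = WEEKDAYS.index(day_name.lower())
    current = date(year, start_month, 1)
    # Move forward to the first occurrence of the target weekday
    current += timedelta(days=(target - current.weekday()) % 7)
    result = {}
    while current.year == year:
        result.setdefault(current.month, []).append(current.day)
        current += timedelta(days=14)  # skip one occurrence each step
    return result

thursdays = alternate_dates('thursday', 2020, 5)
fridays = alternate_dates('friday', 2020, 5)
print(thursdays[5], fridays[5])  # [7, 21] [1, 15, 29]
print(thursdays[6], fridays[6])  # [4, 18] [12, 26]
```

Stepping by 14 days avoids building the full table of days, at the cost of handling one weekday per call.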

2 Comments

Thanks, Erfan. How do I include all days of the week in the above code?
See the edit on top: add or remove the days you want the alternate dates for in the list days. @Zanthoxylumpiperitum
This worked for me:

# for read_clipboard()
'''
salary_day
thursday
friday
'''

import pandas as pd
df = pd.read_clipboard()
print(df)

.

  salary_day
0   thursday
1     friday

.

import calendar

c = calendar.Calendar(firstweekday=calendar.SUNDAY)

year = 2020; month = 5

monthcal = c.monthdatescalendar(year,month)
fridays = [(str(day)[-2:]) for week in monthcal for day in week if \
                day.weekday() == calendar.FRIDAY and \
                day.month == month]
thursdays = [(str(day)[-2:]) for week in monthcal for day in week if \
                day.weekday() == calendar.THURSDAY and \
                day.month == month]

# Friday will be the first salary day of the month only if it occurs on the 1st;
# if the first Thursday comes before the first Friday, skip that first Friday
if int(thursdays[0]) < int(fridays[0]):
    fridays = fridays[1:]


df['req_dates'] = ''

df.loc[df['salary_day'] == 'thursday', 'req_dates'] = ','.join(thursdays[::2])
df.loc[df['salary_day'] == 'friday', 'req_dates'] = ','.join(fridays[::2])

# print after filling req_dates so the result matches the output shown
print(df)

Output:

  salary_day req_dates
0   thursday     07,21
1     friday  01,15,29

For the month of June:

year = 2020; month = 6

Output:

  salary_day req_dates
0   thursday     04,18
1     friday     12,26

7 Comments

But for the month of June, the required output for Friday is 12,26 instead of 05,19, because there are 5 Fridays in May.
Fixed that as well, please check now.
Thanks Anshul, it worked for this use case. If I want to have it for all days of the week, what changes should be made?
All days as in? Can you clarify?
Can you share sample input and expected output?