Python/Pandas/NumPy - Directly calculate the number of business days between two dates, excluding holidays

Question:

Is there a better or more direct way to calculate this than the following?

import datetime
import numpy as np

# 1. Set up the start and end date for which you want to calculate the
# number of business days excluding holidays.

start_date = '01JAN1986'
end_date = '31DEC1987'
start_date = datetime.datetime.strptime(start_date, '%d%b%Y')
end_date = datetime.datetime.strptime(end_date, '%d%b%Y')

# 2. Generate a list of holidays over this period
from pandas.tseries.holiday import USFederalHolidayCalendar
calendar = USFederalHolidayCalendar()
holidays = calendar.holidays(start_date, end_date)
holidays

This gives a pandas DatetimeIndex (pandas.tseries.index.DatetimeIndex in older pandas versions):

DatetimeIndex(['1986-01-01', '1986-01-20', '1986-02-17', '1986-05-26',
           '1986-07-04', '1986-09-01', '1986-10-13', '1986-11-11',
           '1986-11-27', '1986-12-25', '1987-01-01', '1987-01-19',
           '1987-02-16', '1987-05-25', '1987-07-03', '1987-09-07',
           '1987-10-12', '1987-11-11', '1987-11-26', '1987-12-25'],
          dtype='datetime64[ns]', freq=None, tz=None)

But numpy's busday_count needs the holidays as a list of dates:

holiday_date_list = holidays.date.tolist()

Then, without and with the holidays, you get:

np.busday_count(start_date.date(), end_date.date())
# 521

np.busday_count(start_date.date(), end_date.date(), holidays=holiday_date_list)
# 501
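
As a sanity check on those two figures, here is a minimal pure-Python sketch of the same count. It assumes a Monday-to-Friday work week and a half-open range that includes the start date but excludes the end date, which is how np.busday_count behaves. The helper name count_business_days is hypothetical, and the holiday dates are copied by hand from the DatetimeIndex output above.

```python
import datetime


def count_business_days(start, end, holidays=()):
    """Count Mon-Fri days in the half-open range [start, end), skipping holidays."""
    holiday_set = set(holidays)
    count = 0
    day = start
    while day < end:
        # date.weekday() returns 0-4 for Monday-Friday, 5-6 for the weekend
        if day.weekday() < 5 and day not in holiday_set:
            count += 1
        day += datetime.timedelta(days=1)
    return count


# Holidays copied from the DatetimeIndex printed above (assumed complete)
holiday_strings = [
    "1986-01-01", "1986-01-20", "1986-02-17", "1986-05-26",
    "1986-07-04", "1986-09-01", "1986-10-13", "1986-11-11",
    "1986-11-27", "1986-12-25", "1987-01-01", "1987-01-19",
    "1987-02-16", "1987-05-25", "1987-07-03", "1987-09-07",
    "1987-10-12", "1987-11-11", "1987-11-26", "1987-12-25",
]
holidays = [datetime.date.fromisoformat(s) for s in holiday_strings]

start = datetime.date(1986, 1, 1)
end = datetime.date(1987, 12, 31)

without_holidays = count_business_days(start, end)
with_holidays = count_business_days(start, end, holidays)
print(without_holidays)  # 521
print(with_holidays)     # 501
```

The loop is far slower than the NumPy call and is only meant to make the counting rule explicit: all 20 listed holidays fall on weekdays, which is why the two results differ by exactly 20.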

There are some other, slightly similar questions, but they generally work with pandas Series or DataFrames (Get business days between start and end date using pandas; Counting the business days between two series).

If you put the index you created in a DataFrame, you can use resample to fill in the gaps. The offset passed to .resample() can include things like business days and even (custom) calendars:

import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

C = pd.offsets.CustomBusinessDay(calendar=USFederalHolidayCalendar())

start_date = '01JAN1986'
end_date = '31DEC1987'

(
pd.DataFrame(index=pd.to_datetime([start_date, end_date]))
    .resample(C, closed='right') 
    .asfreq()
    .index  
    .size
) - 1

The size of the resulting index, minus 1, then gives us the number of business days.
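
One way to cross-check a count like this is to generate the business days directly with pd.date_range and the same CustomBusinessDay offset. This is only a sketch, not part of the original answer, and the variable names are illustrative. It also shows an off-by-one to watch for: date_range includes the end date when it is a business day, whereas np.busday_count excludes it.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

# Business-day offset that skips weekends and US federal holidays
us_bday = pd.offsets.CustomBusinessDay(calendar=USFederalHolidayCalendar())

start = pd.Timestamp("1986-01-01")
end = pd.Timestamp("1987-12-31")

# Every business day from start to end, both endpoints included if valid
business_days = pd.date_range(start, end, freq=us_bday)
inclusive_count = len(business_days)

# Drop the end date to mirror np.busday_count's half-open [start, end) range
exclusive_count = len(business_days[business_days < end])

print(inclusive_count, exclusive_count)
```

Here the start date is itself a holiday, so the generated range begins on the next business day; the end date, 31 December 1987, is a regular working day, so the inclusive count is one higher than the exclusive one.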