Group and fill missing datetime values

Problem description:

I'm trying to group a pandas DataFrame by contract and date, and fill in the missing datetime values.

My input looks like this:

contract         datetime             value1          value2
   x       2019-01-01 00:00:00          50              60
   x       2019-01-01 01:00:00          30              60
   x       2019-01-01 02:00:00          70              80
   y       2019-01-01 00:00:00          30              100

What I want is to have every possible datetime (hourly, from 00:00:00 to 23:00:00) for each contract, with the missing values filled with NaN or None.

Thank you very much.

You can use DataFrame.reindex per group, with DataFrame.groupby and a lambda function:

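import pandas as pd

df['datetime'] = pd.to_datetime(df['datetime'])

# For each contract, reindex onto every hour from 00:00 of the first day
# to 23:00 of the last day; hours with no data become NaN rows
f = lambda x: x.reindex(pd.date_range(x.index.min().floor('D'),
                                      x.index.max().floor('D') + pd.Timedelta(hours=23),
                                      freq='h'))
df1 = (df.set_index('datetime')
         .groupby('contract')
         .apply(f)
         # depending on the pandas version, 'contract' may or may not be
         # carried into each group as a column; drop it if it is there
         .drop(columns='contract', errors='ignore')
         .reset_index())
print(df1)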
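
To try this end to end, here is a minimal self-contained sketch of the same reindex-per-group idea, using the sample rows from the question. The helper name fill_hours, the explicit selection of the value columns, and the rename_axis step (which names the new index level "datetime" instead of leaving it unnamed) are my own additions for illustration, not part of the answer above.

```python
import pandas as pd

# Sample input copied from the question.
df = pd.DataFrame({
    "contract": ["x", "x", "x", "y"],
    "datetime": ["2019-01-01 00:00:00", "2019-01-01 01:00:00",
                 "2019-01-01 02:00:00", "2019-01-01 00:00:00"],
    "value1": [50, 30, 70, 30],
    "value2": [60, 60, 80, 100],
})
df["datetime"] = pd.to_datetime(df["datetime"])


def fill_hours(group):
    """Reindex one contract's rows onto every hour of the days it covers."""
    start = group.index.min().normalize()                          # 00:00 of the first day
    end = group.index.max().normalize() + pd.Timedelta(hours=23)   # 23:00 of the last day
    return group.reindex(pd.date_range(start, end, freq="h"))


df1 = (df.set_index("datetime")
         .groupby("contract")[["value1", "value2"]]   # pass only the value columns to apply
         .apply(fill_hours)
         .rename_axis(["contract", "datetime"])       # name both index levels
         .reset_index())
print(df1.head())
```

Each contract ends up with 24 hourly rows for 2019-01-01. Because NaN is a float, the integer value columns are converted to float in the result.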