Group by and fill missing datetime values
Question:
I'm trying to group a Pandas DataFrame by contract and date, and fill in the missing datetime values.
My input looks like this:
contract  datetime             value1  value2
x         2019-01-01 00:00:00  50      60
x         2019-01-01 01:00:00  30      60
x         2019-01-01 02:00:00  70      80
y         2019-01-01 00:00:00  30      100
What I want is to have every possible datetime (from 00:00:00 to 23:00:00) for each contract, with the missing values filled with NaN or None.
Thank you very much.
Answer:
You can use DataFrame.reindex per group with DataFrame.groupby and a lambda function:
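import pandas as pd

df['datetime'] = pd.to_datetime(df['datetime'])
# Reindex each group onto every hour of the day(s) it covers; new rows are filled with NaN.
# Lowercase 'h' avoids the deprecation warning that 'H' raises in recent pandas versions.
f = lambda x: x.reindex(pd.date_range(x.index.min().floor('D'),
                                      x.index.max().floor('D') + pd.Timedelta(hours=23),
                                      freq='h'))
df1 = (df.set_index('datetime')
         .groupby('contract')
         .apply(f)
         # whether the grouping column is passed to f depends on the pandas version
         .drop(columns='contract', errors='ignore')
         .rename_axis(['contract', 'datetime'])
         .reset_index())
print(df1)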
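
For a version that runs on its own, here is a minimal sketch that rebuilds the sample input from the question and applies the same reindex-per-group idea. The names fill_hours and filled are made up for illustration. Selecting the value columns before apply is a small variation that is not part of the answer above; it is assumed here only to keep the grouping column out of the function's input regardless of pandas version.

```python
import pandas as pd

# Sample input copied from the question.
df = pd.DataFrame({
    "contract": ["x", "x", "x", "y"],
    "datetime": ["2019-01-01 00:00:00", "2019-01-01 01:00:00",
                 "2019-01-01 02:00:00", "2019-01-01 00:00:00"],
    "value1": [50, 30, 70, 30],
    "value2": [60, 60, 80, 100],
})
df["datetime"] = pd.to_datetime(df["datetime"])


def fill_hours(group):
    """Reindex one contract's rows onto every hour of the days it covers."""
    start = group.index.min().floor("D")
    end = group.index.max().floor("D") + pd.Timedelta(hours=23)
    # Hours that were not in the input become rows of NaN.
    return group.reindex(pd.date_range(start, end, freq="h"))


filled = (
    df.set_index("datetime")
      .groupby("contract")[["value1", "value2"]]  # assumption: only these value columns exist
      .apply(fill_hours)
      .rename_axis(["contract", "datetime"])      # name the new index levels
      .reset_index()
)
print(filled.head())
```

With the sample data, each contract should end up with 24 hourly rows, and value1 and value2 become float columns because NaN cannot be stored in an integer column.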