Joining dataframes - one with multiindex columns and the other without
Question:
I'm trying to join two dataframes - one with multiindex columns and the other with a single column name. They share a similar index.
I get the following warning: "UserWarning: merging between different levels can give an unintended result (3 levels on the left, 1 on the right)"
For example:
import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
df2 = pd.DataFrame(np.random.randn(3), index=['A', 'B', 'C'], columns=['w'])
df3 = df.join(df2)  # this join triggers the warning
What is the best way to join these two dataframes?
Answer:
It depends on what you want! Do you want the column from df2 to be aligned with the first or the second level of columns from df?
You have to add a level to the columns of df2. One way is with pd.concat:
df.join(pd.concat([df2], axis=1, keys=['a']))
A better way:
df2.columns = pd.MultiIndex.from_product([['a'], df2.columns])
df.join(df2)
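
Which level the existing column name ends up in follows from the order of the lists passed to MultiIndex.from_product. The following is a minimal sketch of both placements, assuming a smaller two-level frame than the one in the question. The top-level key 'a' and the empty second-level label '' are placeholder choices for illustration, not something pandas requires.

```python
import numpy as np
import pandas as pd

# A small frame with two column levels (a cut-down version of the question's df)
cols = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.arange(12).reshape(3, 4), index=["A", "B", "C"], columns=cols)
df2 = pd.DataFrame({"w": [10.0, 20.0, 30.0]}, index=["A", "B", "C"])

level_names = list(df.columns.names)

# Placement 1: 'w' sits in the second level, under a placeholder top-level key 'a'
right_second = df2.copy()
right_second.columns = pd.MultiIndex.from_product(
    [["a"], df2.columns], names=level_names
)
joined_second = df.join(right_second)

# Placement 2: 'w' sits in the first level, with a placeholder empty second-level label
right_first = df2.copy()
right_first.columns = pd.MultiIndex.from_product(
    [df2.columns, [""]], names=level_names
)
joined_first = df.join(right_first)

print(joined_second.columns.tolist())
print(joined_first.columns.tolist())
```

Because both sides have the same number of column levels before the join, the level-mismatch warning should not be triggered in either case. Copying df2 first keeps the original frame unchanged, unlike assigning to df2.columns directly.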