Join dataframes: one with MultiIndex columns and the other without

Question:

I'm trying to join two dataframes: one with MultiIndex columns and the other with a single column name. They share a similar index.

I get the following warning: "UserWarning: merging between different levels can give an unintended result (3 levels on the left, 1 on the right)"

For example:

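import numpy as np
import pandas as pd

# Two-level column labels: ('bar', 'one'), ('bar', 'two'), ...
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
# Single-level column 'w' on the same row index
df2 = pd.DataFrame(np.random.randn(3), index=['A', 'B', 'C'], columns=['w'])
df3 = df.join(df2)  # the join that hits the level-mismatch warning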

What is the best way to join these two dataframes?

It depends on what you want! Do you want the column from df2 to be aligned with the first or the second level of columns from df?
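To make that choice concrete, here is a minimal sketch of both placements. The small zero-filled frame, the placeholder label 'a', and the empty-string filler on the second level are assumptions chosen for illustration; they are not part of the question.

```python
import numpy as np
import pandas as pd

# A reduced version of the question's frames (values are illustrative).
columns = pd.MultiIndex.from_product(
    [['bar', 'baz'], ['one', 'two']], names=['first', 'second']
)
df = pd.DataFrame(np.zeros((3, 4)), index=['A', 'B', 'C'], columns=columns)
df2 = pd.DataFrame({'w': [1.0, 2.0, 3.0]}, index=['A', 'B', 'C'])

# Option 1: 'w' sits on the second level, under a placeholder first-level label 'a'.
second_level = df2.copy()
second_level.columns = pd.MultiIndex.from_tuples([('a', 'w')])

# Option 2: 'w' sits on the first level, with an empty string filling the second level.
first_level = df2.copy()
first_level.columns = pd.MultiIndex.from_tuples([('w', '')])

# Both frames now have two column levels, matching df, so the join lines up.
joined_second = df.join(second_level)
joined_first = df.join(first_level)
```

Either way the joined frame keeps two column levels; the only difference is which level carries the label 'w'.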

You have to add a level to the columns of df2.

Using pd.concat:

df.join(pd.concat([df2], axis=1, keys=['a']))

A better way:

df2.columns = pd.MultiIndex.from_product([['a'], df2.columns])

df.join(df2)
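For anyone who wants to check that the two approaches give the same result, here is a self-contained sketch. It rebuilds the frames from the question with a seeded generator (an assumption made for reproducibility) and works on a copy of df2, so the original single-level frame is left unchanged.

```python
import numpy as np
import pandas as pd

# Rebuild the frames from the question; the seed is only there for reproducibility.
rng = np.random.default_rng(0)
columns = pd.MultiIndex.from_product(
    [['bar', 'baz', 'foo', 'qux'], ['one', 'two']], names=['first', 'second']
)
df = pd.DataFrame(rng.standard_normal((3, 8)), index=['A', 'B', 'C'], columns=columns)
df2 = pd.DataFrame(rng.standard_normal(3), index=['A', 'B', 'C'], columns=['w'])

# Approach 1: wrap df2 in pd.concat so that keys=['a'] becomes a new outer column level.
joined_concat = df.join(pd.concat([df2], axis=1, keys=['a']))

# Approach 2: build the two-level columns explicitly on a copy, leaving df2 untouched.
df2_multi = df2.copy()
df2_multi.columns = pd.MultiIndex.from_product([['a'], df2.columns])
joined_product = df.join(df2_multi)

# Both results should carry the same nine column tuples, ending with ('a', 'w').
same_columns = joined_concat.columns.tolist() == joined_product.columns.tolist()
```

Working on a copy is a design choice, not a requirement: assigning to df2.columns directly, as in the answer, is fine when df2 is not needed in its single-level form afterwards.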