Convert a row of a pandas DataFrame into column headers
The data I have to work with is a bit messy: it has header names inside its data. How can I choose a row from an existing pandas DataFrame and make it (rename it to) the column header?
I want to do something like:
header = df[df['old_header_name1'] == 'new_header_name1']
df.columns = header
In [21]: df = pd.DataFrame([(1,2,3), ('foo','bar','baz'), (4,5,6)])
In [22]: df
Out[22]:
     0    1    2
0    1    2    3
1  foo  bar  baz
2    4    5    6
Set the column labels to the values in the second row (index location 1):
In [23]: df.columns = df.iloc[1]
If the index has unique labels, you can drop the second row using:
In [24]: df.drop(df.index[1])
Out[24]:
1 foo bar baz
0   1   2   3
2   4   5   6
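Put together as a standalone script, the two steps above might look like the following. This is only a sketch: the final tidy-up (clearing the leftover columns name and renumbering the rows) is an optional addition that is not part of the steps above, and the variable name result is just for illustration.

```python
import pandas as pd

df = pd.DataFrame([(1, 2, 3), ("foo", "bar", "baz"), (4, 5, 6)])

# Use the values of the second row (position 1) as the column labels.
df.columns = df.iloc[1]

# Drop that row by label; this is safe here because the index labels are unique.
result = df.drop(df.index[1])

# Optional tidy-up (an assumption about what you want, not required):
# clear the leftover columns name (1) and renumber the remaining rows from 0.
result.columns.name = None
result = result.reset_index(drop=True)

print(result)
```

Because the original frame mixed numbers and strings, the remaining columns keep the object dtype; converting them back to numeric types is a separate step if you need it.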
If the index is not unique, you could use:
In [133]: df.iloc[pd.RangeIndex(len(df)).drop(1)]
Out[133]:
1 foo bar baz
0   1   2   3
2   4   5   6
Note that df.drop(df.index[1]) removes all rows with the same label as the second row. Because non-unique indexes can lead to stumbling blocks (or potential bugs) like this, it's often better to make sure the index is unique (even though pandas does not require it).
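To see the difference concretely, here is a small sketch using a made-up frame whose index repeats the label "b" (the index labels and the variable names by_label and by_position are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical frame: the second and third rows share the index label "b".
df = pd.DataFrame(
    [(1, 2, 3), ("foo", "bar", "baz"), (4, 5, 6)],
    index=["a", "b", "b"],
)
df.columns = df.iloc[1]

# Label-based drop: removes every row labelled "b", including the data row.
by_label = df.drop(df.index[1])

# Position-based selection: removes only the row at position 1.
by_position = df.iloc[pd.RangeIndex(len(df)).drop(1)]

print(df.index.is_unique)  # False, so label-based drops are risky here
print(len(by_label))       # 1 row left
print(len(by_position))    # 2 rows left
```

Checking df.index.is_unique before dropping by label is a cheap way to catch this case early.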