How do I replace values in a pandas DataFrame with the column name, based on a condition?
I have a dataframe that looks something like this:
I want to replace all 1's in the range A:D with the name of the column, so that the final result should resemble:
How can I do this?
You can recreate my dataframe with this:
import pandas as pd

dfz = pd.DataFrame({'A': [1, 0, 0, 1, 0, 0],
                    'B': [1, 0, 0, 1, 0, 1],
                    'C': [1, 0, 0, 1, 3, 1],
                    'D': [1, 0, 0, 1, 0, 0],
                    'E': [22.0, 15.0, None, 10.0, None, 557.0]})
One way is to use replace and pass in a Series that maps column labels to replacement values (here, the labels themselves):
>>> dfz.loc[:, 'A':'D'].replace(1, pd.Series(dfz.columns, dfz.columns))
   A  B  C  D
0  A  B  C  D
1  0  0  0  0
2  0  0  0  0
3  A  B  C  D
4  0  0  3  0
5  0  B  C  0
To make the change permanent, assign the returned DataFrame back to dfz.loc[:, 'A':'D'].
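As a minimal sketch of making the change stick, the snippet below rebuilds the example frame, applies the same replace call to columns A to D, and stores the result. The names cols and replaced are only illustrative. It assigns with dfz[cols] = ... rather than through .loc, on the assumption that replacing the columns outright is acceptable; depending on the pandas version, writing strings into integer columns through .loc may trigger a dtype warning.

```python
import pandas as pd

dfz = pd.DataFrame({'A': [1, 0, 0, 1, 0, 0],
                    'B': [1, 0, 0, 1, 0, 1],
                    'C': [1, 0, 0, 1, 3, 1],
                    'D': [1, 0, 0, 1, 0, 0],
                    'E': [22.0, 15.0, None, 10.0, None, 557.0]})

# Illustrative name: the columns whose 1s should become the column label.
cols = ['A', 'B', 'C', 'D']

# Map each column label to its replacement value (the label itself).
replaced = dfz[cols].replace(1, pd.Series(cols, index=cols))

# Store the result; column E is left untouched.
dfz[cols] = replaced

print(dfz)
```

Only values equal to 1 are swapped, so the 3 in column C and everything in column E stay as they were.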
Solutions aside, it's useful to keep in mind that you may lose a lot of performance benefits when you mix numeric and string types in columns, as pandas is forced to use the generic 'object' dtype to hold the values.
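The dtype change is easy to check. The following sketch (again with illustrative names cols and replaced) prints the dtypes of columns A to D before and after the replacement; on a typical install the columns start out with an integer dtype and end up as object.

```python
import pandas as pd

dfz = pd.DataFrame({'A': [1, 0, 0, 1, 0, 0],
                    'B': [1, 0, 0, 1, 0, 1],
                    'C': [1, 0, 0, 1, 3, 1],
                    'D': [1, 0, 0, 1, 0, 0],
                    'E': [22.0, 15.0, None, 10.0, None, 557.0]})

cols = ['A', 'B', 'C', 'D']
replaced = dfz[cols].replace(1, pd.Series(cols, index=cols))

# Before: purely numeric columns.
print(dfz[cols].dtypes)

# After: each column now mixes strings and numbers.
print(replaced.dtypes)
```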