Retrieve the indices of NaN values in a pandas DataFrame

Question:

For each row that contains NaN values, I am trying to retrieve the indices of all the columns in which the NaN occurs.

import numpy as np
import pandas as pd
d = [[11.4, 1.3, 2.0, np.nan], [11.4, 1.3, np.nan, np.nan], [11.4, 1.3, 2.8, 0.7], [np.nan, np.nan, 2.8, 0.7]]
df = pd.DataFrame(data=d, columns=['A', 'B', 'C', 'D'])
print(df)

      A    B    C    D
0  11.4  1.3  2.0  NaN
1  11.4  1.3  NaN  NaN
2  11.4  1.3  2.8  0.7
3   NaN  NaN  2.8  0.7

I have already done the following:

Added a column with the NaN count for each row
Retrieved the index of each row that contains NaN values
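
The two steps listed above can be sketched roughly as follows. The post does not show this code, so the column name `nan_count` and the variable `rows_with_nan` are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

d = [[11.4, 1.3, 2.0, np.nan],
     [11.4, 1.3, np.nan, np.nan],
     [11.4, 1.3, 2.8, 0.7],
     [np.nan, np.nan, 2.8, 0.7]]
df = pd.DataFrame(data=d, columns=['A', 'B', 'C', 'D'])

# Boolean mask of the same shape as df: True where the value is NaN.
mask = df.isnull()

# Step 1: NaN count per row ('nan_count' is a hypothetical column name).
df['nan_count'] = mask.sum(axis=1)

# Step 2: index labels of the rows that contain at least one NaN.
rows_with_nan = df.index[mask.any(axis=1)].tolist()
print(rows_with_nan)
```

The mask is computed before the count column is added, so the extra column does not affect the NaN test.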

What I want is a list like this (ideally with the column names):

[ ['D'],['C','D'],['A','B'] ]

I hope there is a way to do this without testing every column of every row, like this:

if pd.isnull(df.loc[i, column]):

I'm looking for a pandas way of doing this that can cope with my huge dataset.

Thanks in advance.

Answer

Another way is to extract the entries that are NaN:

In [11]: df_null = df.isnull().unstack()

In [12]: t = df_null[df_null]

In [13]: t
Out[13]:
A  3    True
B  3    True
C  1    True
D  0    True
   1    True
dtype: bool

This gets you most of the way and may be enough.
It may be easier, though, to work with a Series indexed by row label, with the column names as values:

In [14]: s = pd.Series(t.index.get_level_values(0), index=t.index.get_level_values(1)).sort_index(kind='mergesort')

In [15]: s
Out[15]:
0    D
1    C
1    D
3    A
3    B
dtype: object

For example, if you want the lists (though I don't think you would need them):

In [16]: s.groupby(level=0).apply(list)
Out[16]:
0       [D]
1    [C, D]
3    [A, B]
dtype: object
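
For reference, here is a self-contained sketch that goes from the example frame to the list of lists asked for in the question. It is not the code from the answer above: it swaps the `unstack` step for `np.nonzero` on the boolean mask, which returns the NaN positions in row-major order. The variable names (`mask`, `rows`, `cols`, `nan_columns`) are made up for this illustration.

```python
import numpy as np
import pandas as pd

d = [[11.4, 1.3, 2.0, np.nan],
     [11.4, 1.3, np.nan, np.nan],
     [11.4, 1.3, 2.8, 0.7],
     [np.nan, np.nan, 2.8, 0.7]]
df = pd.DataFrame(data=d, columns=['A', 'B', 'C', 'D'])

# Boolean mask: True where the value is NaN.
mask = df.isnull()

# Positional (row, column) coordinates of every NaN, in row-major order.
rows, cols = np.nonzero(mask.to_numpy())

# Series of column names indexed by row label, one entry per NaN.
s = pd.Series(df.columns[cols], index=df.index[rows])

# Group by row label and collect the column names of each row into a list.
nan_columns = s.groupby(level=0).apply(list).tolist()
print(nan_columns)  # [['D'], ['C', 'D'], ['A', 'B']]
```

Rows without any NaN (row 2 here) do not appear in the result. Drop the final `.tolist()` to keep the row labels alongside each list.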
