Dropping duplicates in pandas while preserving NaNs

Problem description:

When I use the drop_duplicates() method, duplicates are removed, but all NaNs are also merged into a single entry. How can I drop duplicates while preserving every row with an empty entry (such as np.nan, None or '')?

import numpy as np
import pandas as pd
df = pd.DataFrame({'col':['one','two',np.nan,np.nan,np.nan,'two','two']})

Out[]: 
   col
0  one
1  two
2  NaN
3  NaN
4  NaN
5  two
6  two


df.drop_duplicates(['col'])

Out[]: 
   col
0  one
1  two
2  NaN

Try:

df[(~df.duplicated()) | (df['col'].isnull())]

The result is:

   col
0  one
1  two
2  NaN
3  NaN
4  NaN
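
The mask above covers np.nan and None, since isnull() flags both, but not the empty string '' mentioned in the question. A minimal sketch that extends the same idea, assuming '' should also count as an empty entry; the helper name drop_duplicates_keep_empty is hypothetical, not part of pandas:

```python
import numpy as np
import pandas as pd


def drop_duplicates_keep_empty(df, col):
    """Hypothetical helper: drop duplicate values in `col`,
    but keep every row whose value is NaN, None or ''."""
    # isna() is True for both np.nan and None; '' needs its own check.
    is_empty = df[col].isna() | (df[col] == '')
    # duplicated() marks every repeat after the first occurrence.
    is_repeat = df.duplicated(subset=[col])
    return df[~is_repeat | is_empty]


df = pd.DataFrame({'col': ['one', 'two', np.nan, np.nan, np.nan, 'two', 'two']})
result = drop_duplicates_keep_empty(df, 'col')

# Same idea with empty strings and None mixed in.
df2 = pd.DataFrame({'col': ['a', '', '', None, None, 'a']})
result2 = drop_duplicates_keep_empty(df2, 'col')
```

For the original df, this keeps rows 0 through 4, the same as the one-liner above. Passing subset=[col] restricts the duplicate check to that column, which matters once the frame has more than one column.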