How to replace all non-NaN entries of a dataframe with 1 and all NaN with 0

Problem description:

I have a dataframe with 71 columns and 30597 rows. I want to replace all non-NaN entries with 1 and the NaN values with 0.

Initially I tried a for-loop over each value of the dataframe, which took too much time.

Then I used data_new = data.subtract(data), which was meant to subtract every value of the dataframe from itself so that all the non-null values would become 0. But an error occurred because the dataframe has multiple string entries.

You can take the return value of df.notnull(), which is False where the DataFrame contains NaN and True otherwise, and cast it to integer, giving you 0 where the DataFrame is NaN and 1 otherwise:

newdf = df.notnull().astype('int')
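
As a minimal sketch of the line above, here is the same call on a small made-up frame (the column names and values are hypothetical, chosen only to include both numbers and strings, as in the question):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: numeric and string columns with missing values.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, "z"],
    "c": [np.nan, np.nan, 2.5],
})

# notnull() gives a boolean frame; astype(int) maps True -> 1 and False -> 0.
newdf = df.notnull().astype(int)
print(newdf)
```

Because notnull() only tests for missing values and does no arithmetic, the string column does not cause the error that subtract() ran into. The original df is left unchanged.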

If you really want to write into your original DataFrame, you can use:

df.loc[~df.isnull()] = 1  # not nan
df.loc[df.isnull()] = 0   # nan
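
If boolean-mask assignment misbehaves on a frame with mixed column types in your pandas version, one possible alternative (a sketch, not part of the original answer) is to overwrite the columns one at a time. The sample frame below is hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with both numeric and string columns.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, "z"],
})

# Replace each column of the same DataFrame object with its 0/1 indicator.
for col in df.columns:
    df[col] = df[col].notnull().astype(int)

print(df)
```

This loops over the columns rather than over individual cells, so it stays fast on a frame with 71 columns, and each column ends up with an integer dtype.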