Delete a column in Pandas if a row contains a specific value

Question:

I am starting to learn Pandas. I have seen a lot of questions here on SO where people ask how to delete a row if a column matches a certain value.

In my case it is the opposite. Imagine having this dataframe:

What I want is this: if any column has the value salty in any of its rows, that column should be deleted, giving as a result:

I have tried several variations of this:

if df.loc[df['A'] == 'salty']:
   df.drop(df.columns[0], axis=1, inplace=True)

But I am quite lost trying to find documentation on how to delete columns based on a row value of that column. That code is a mix of finding a specific column and always deleting the first column (my idea was to search for the value in the rows of each column, going through ALL columns in a for loop).

Answer:

Compare the whole dataframe against the value, then use DataFrame.any to get a per-column boolean mask to index with:

df.loc[:, ~(df == 'Salty').any()]
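
Note that == is an exact, case-sensitive comparison, so 'Salty' does not match the lowercase salty written in the question. If the casing in your data varies, one option is to lower-case the values before comparing. The following is a minimal sketch rather than a definitive recipe: the target variable and the sample frame are assumptions for illustration, and astype(str) is used so that non-string columns do not break the .str accessor.

```python
import pandas as pd

# Sample data, assumed for illustration.
df = pd.DataFrame({
    'A': ['Mountain', 'Salty'], 'B': ['Lake', 'Hotty'], 'C': ['River', 'Coldy']})

target = 'salty'  # hypothetical search value; its case differs from the data

# The exact comparison finds no match, so no column would be dropped.
exact_mask = (df == target).any()

# Convert every column to lower-case strings, then compare.
# astype(str) turns missing values into the text 'nan', which is harmless
# here unless the target itself is 'nan'.
folded = df.apply(lambda col: col.astype(str).str.lower())
folded_mask = folded.eq(target.lower()).any()

# Keep only the columns where no row matched.
result = df.loc[:, ~folded_mask]
print(result)
```

With this sample frame, only columns B and C remain in result.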

If you insist on using drop, this is how you need to do it. Pass the labels of the columns to remove:

df.drop(columns=df.columns[(df == 'Salty').any()])

import pandas as pd

df = pd.DataFrame({
    'A': ['Mountain', 'Salty'], 'B': ['Lake', 'Hotty'], 'C': ['River', 'Coldy']})
df
          A      B      C
0  Mountain   Lake  River
1     Salty  Hotty  Coldy

(df == 'Salty').any()
A     True
B    False
C    False
dtype: bool

df.loc[:, ~(df == 'Salty').any()]
       B      C
0   Lake  River
1  Hotty  Coldy

df.columns[(df == 'Salty').any()]
# Index(['A'], dtype='object')

df.drop(columns=df.columns[(df == 'Salty').any()])
       B      C
0   Lake  River
1  Hotty  Coldy
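
For comparison, the for loop described in the question can also be written out explicitly. The original attempt fails because using a DataFrame directly in an if statement raises a ValueError (its truth value is ambiguous); reducing each column's comparison with .any() gives a single True or False instead. This is only a sketch: the helper name drop_columns_containing is hypothetical, and it assumes the column labels are unique.

```python
import pandas as pd


def drop_columns_containing(frame, value):
    """Return a copy of frame without the columns that contain value.

    Hypothetical helper for illustration; assumes unique column labels.
    """
    to_drop = []
    for name in frame.columns:
        # (frame[name] == value) is a boolean Series; .any() reduces it
        # to one bool, which is safe to use in an if statement.
        if (frame[name] == value).any():
            to_drop.append(name)
    # drop returns a new DataFrame, so the input is left unchanged.
    return frame.drop(columns=to_drop)


df = pd.DataFrame({
    'A': ['Mountain', 'Salty'], 'B': ['Lake', 'Hotty'], 'C': ['River', 'Coldy']})
result = drop_columns_containing(df, 'Salty')
print(result)
```

The loop does the same work as the mask-based one-liners above, just one column at a time, so the vectorized forms are usually the shorter choice.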