Delete a column in Pandas if any of its rows contains a specific value
I am starting to learn Pandas. I have seen a lot of questions here on SO where people ask how to delete a row if a column matches a certain value.
In my case it is the opposite. Imagine having this dataframe:
What I want is this: if any column has the value salty in any of its rows, that column should be deleted, giving as a result:
I have tried several variations of this:
if df.loc[df['A'] == 'salty']:
    df.drop(df.columns[0], axis=1, inplace=True)
But I am quite lost trying to find documentation on how to delete columns based on a row value in that column. That code is a mix of finding a specific column and always deleting the first column (my idea was to search for the value in the rows of that column, across ALL columns, in a for loop).
Perform a comparison across your values, then use DataFrame.any to get a mask to index with:
df.loc[:, ~(df == 'Salty').any()]
If you insist on using drop, this is how you need to do it. Pass the labels of the columns to drop:
df.drop(columns=df.columns[(df == 'Salty').any()])
import pandas as pd
df = pd.DataFrame({
    'A': ['Mountain', 'Salty'], 'B': ['Lake', 'Hotty'], 'C': ['River', 'Coldy']})
df
          A      B      C
0  Mountain   Lake  River
1     Salty  Hotty  Coldy
(df == 'Salty').any()
A     True
B    False
C    False
dtype: bool
df.loc[:, ~(df == 'Salty').any()]
       B      C
0   Lake  River
1  Hotty  Coldy
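The question writes the value as salty while the example data uses Salty. If several spellings should count as a match, the same mask pattern seems to carry over with DataFrame.isin. This is only a sketch: the lowercase entry in column C and the name unwanted are assumptions added for illustration and are not part of the original data.

```python
import pandas as pd

# Toy frame based on the answer's example; the lowercase 'salty' in column C
# is a hypothetical change made only for this illustration.
df = pd.DataFrame({
    'A': ['Mountain', 'Salty'],
    'B': ['Lake', 'Hotty'],
    'C': ['River', 'salty'],
})

# isin() flags every cell equal to any of the listed values;
# any() then collapses each column to a single boolean.
unwanted = ['Salty', 'salty']  # hypothetical list of values to match
mask = df.isin(unwanted).any()

# Keep only the columns where no cell matched.
result = df.loc[:, ~mask]
print(result)
```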
df.columns[(df == 'Salty').any()]
# Index(['A'], dtype='object')
df.drop(columns=df.columns[(df == 'Salty').any()])
       B      C
0   Lake  River
1  Hotty  Coldy
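For comparison, the for loop idea from the question could be sketched roughly as follows. It checks each column separately, collects the labels that contain the value, and drops them in one call. The names to_drop, looped and vectorized are placeholders chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['Mountain', 'Salty'],
    'B': ['Lake', 'Hotty'],
    'C': ['River', 'Coldy'],
})

# Collect the labels of columns that contain the value in at least one row.
to_drop = []
for col in df.columns:
    # .any() reduces the per-row comparison to a single True/False,
    # which is what an `if` statement needs.
    if (df[col] == 'Salty').any():
        to_drop.append(col)

# Drop them in one call; drop() returns a new frame and leaves df untouched.
looped = df.drop(columns=to_drop)

# The vectorized mask should select the same columns.
vectorized = df.loc[:, ~(df == 'Salty').any()]
print(looped.equals(vectorized))
```

The explicit .any() matters here: using a whole DataFrame or Series as an if condition raises a ValueError, because its truth value is ambiguous. The vectorized mask is usually the shorter option, and the loop is mainly useful for seeing what the mask does column by column.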