Python Pandas DataFrame: filtering groups by a condition
Question:
My question is simple: I have a DataFrame, I group it by a column,
and I get the size of each group like this:
df.groupby('column').size()
Now the problem is that I only want the groups whose size is greater than X. Can I do this with a lambda function or something similar? I have already tried this:
df.groupby('column').size() > X
but that only prints out a series of True and False values.
Answer:
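The result of groupby(...).size() is a regular pandas Series indexed by group key, and the comparison you tried produces a boolean mask over it. Use that mask to index the Series, just as you would filter any other Series: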
>>> import pandas as pd
>>> df = pd.DataFrame({'a': ['a', 'b', 'a', 'a', 'b', 'c', 'd']})
>>> after = df.groupby('a').size()
>>> after
a
a    3
b    2
c    1
d    1
dtype: int64
>>> after[after > 2]
a
a    3
dtype: int64
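
Since the question asks specifically about a lambda, here is a minimal sketch of two lambda-based variants, reusing the same example data. The threshold name X and the variable names large and rows are placeholders chosen for illustration, not part of the original answer. The first variant passes a callable to .loc so the filter can be chained without an intermediate variable; the second uses GroupBy.filter, which returns the original rows belonging to the qualifying groups rather than the group sizes.

```python
import pandas as pd

df = pd.DataFrame({'a': ['a', 'b', 'a', 'a', 'b', 'c', 'd']})
X = 2  # hypothetical threshold, stands in for the X in the question

# Variant 1: keep only the group sizes greater than X.
# .loc accepts a callable that receives the Series and returns a boolean mask.
large = df.groupby('a').size().loc[lambda s: s > X]

# Variant 2: keep the original rows of every group with more than X rows.
# GroupBy.filter calls the lambda once per group and expects a single bool.
rows = df.groupby('a').filter(lambda g: len(g) > X)

print(large)
print(rows)
```

Which variant fits depends on whether you need the counts themselves or the underlying rows; on large frames the boolean-mask approach is usually the cheaper of the two, because GroupBy.filter runs a Python function for each group.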