Counting values in a dataframe by category
Question:
I have a dataframe of the form:
category | value
cat a    | x
cat a    | x
cat a    | y
cat b    | w
cat b    | z
I'd like to be able to return something like the following, showing the most common value in each category and its frequency:
category | freq of most common value | most common value
cat a    | 2                         | x
cat b    | 1                         | w   # (it doesn't matter whether this is w or z)
Answer:
Use Series.head to keep only the top entry of value_counts() within each group:
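# value_counts() sorts by frequency (descending), so head(1) is the most common value.
# rename_axis and reset_index(name=...) label the columns explicitly, so the result
# does not depend on how a given pandas version names the inner index level.
df = (df.groupby('category', sort=False)['value']
        .apply(lambda x: x.value_counts().head(1))
        .rename_axis(['category', 'most_common_value'])
        .reset_index(name='freq of most common value'))
print(df)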
   category  most_common_value  freq of most common value
0     cat a                  x                          2
1     cat b                  w                          1
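For comparison, here is a self-contained sketch of an alternative that avoids apply. It is not part of the original answer: it rebuilds the example data from the question, counts each (category, value) pair with groupby(...).size(), and keeps the highest-count row per category. The name summary and the spaced column label "most common value" are choices made for this illustration.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "category": ["cat a", "cat a", "cat a", "cat b", "cat b"],
    "value": ["x", "x", "y", "w", "z"],
})

# Count how often each (category, value) pair occurs.
counts = (df.groupby(["category", "value"], sort=False)
            .size()
            .reset_index(name="freq of most common value"))

# Put the largest counts first, then keep the first row seen for each category.
# Ties (such as w and z in "cat b") are resolved arbitrarily, which the
# question says is acceptable.
summary = (counts.sort_values("freq of most common value",
                              ascending=False, kind="mergesort")
                 .drop_duplicates("category")
                 .sort_values("category")
                 .rename(columns={"value": "most common value"})
                 .reset_index(drop=True))

print(summary)
```

Because this works on a flat table of counts rather than calling a Python function per group, it may scale better on frames with many categories, though that depends on the data and is worth measuring.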