Multiple aggregations by group in a Pandas DataFrame
Question:
SQL: SELECT MAX(A), MIN(B), C FROM Table GROUP BY C
I want to do the same operation in pandas on a DataFrame. The closest I got was:
DF2 = DF1.groupby(by=['C']).max()
This gives me the max of both columns. How do I apply more than one operation while grouping?
Answer:
Try the agg() function:
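import numpy as np
import pandas as pd

# 20 rows of random integers in [0, 5) for columns A, B and C
df = pd.DataFrame(np.random.randint(0, 5, size=(20, 3)), columns=list('ABC'))
print(df)

# For each group of C: maximum of A and minimum of B
print(df.groupby('C').agg({'A': 'max', 'B': 'min'}))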
Output:
    A  B  C
0   2  3  0
1   2  2  1
2   4  0  1
3   0  1  4
4   3  3  2
5   0  4  3
6   2  4  2
7   3  4  0
8   4  2  2
9   3  2  1
10  2  3  1
11  4  1  0
12  4  3  2
13  0  0  1
14  3  1  1
15  4  1  1
16  0  0  0
17  4  0  1
18  3  4  0
19  0  2  4

   A  B
C
0  4  0
1  4  0
2  4  2
3  0  4
4  0  1
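If you want the result to look more like the SQL output, with C as a regular column and explicit names for the aggregated columns, named aggregation may help. The following is only a sketch: the small data set and the output names max_A and min_B are made up for illustration, and it assumes pandas 0.25 or later, where named aggregation is available.

```python
import pandas as pd

# Small hand-made frame (illustrative data, not taken from the post above)
df = pd.DataFrame({
    "A": [2, 4, 0, 3],
    "B": [3, 0, 1, 2],
    "C": [0, 1, 1, 0],
})

# Named aggregation: output_column=(input_column, function_name).
# as_index=False keeps C as an ordinary column, like the SQL result set.
result = df.groupby("C", as_index=False).agg(
    max_A=("A", "max"),   # hypothetical output name
    min_B=("B", "min"),   # hypothetical output name
)
print(result)
```

The dictionary form shown earlier keeps the original column names A and B, so named aggregation is mainly useful when you want the output names to say which function was applied.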
Alternatively, you may want to check the pandas.read_sql_query() function, which runs a SQL query against a database connection and returns the result as a DataFrame.
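As a rough illustration of the read_sql_query() route, the sketch below copies a small DataFrame into an in-memory SQLite database and runs the original query against it. The table name tbl, the sample data, and the aliases max_A and min_B are assumptions made for this example. In practice you would pass a connection to your own database.

```python
import sqlite3

import pandas as pd

# Illustrative data, not taken from the post above
df = pd.DataFrame({
    "A": [2, 4, 0, 3],
    "B": [3, 0, 1, 2],
    "C": [0, 1, 1, 0],
})

# Throwaway in-memory SQLite database; "tbl" is a hypothetical table name
conn = sqlite3.connect(":memory:")
df.to_sql("tbl", conn, index=False)

# Same shape as the SQL in the question, with aliases for readability
query = """
    SELECT MAX(A) AS max_A, MIN(B) AS min_B, C
    FROM tbl
    GROUP BY C
    ORDER BY C
"""
result_sql = pd.read_sql_query(query, conn)
conn.close()
print(result_sql)
```

This is mostly worthwhile when the data already lives in a database. For data that is already in a DataFrame, groupby with agg() avoids the round trip.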