python - How to sum across many columns with pandas groupby?

May 15, 2012

I have a dataframe that looks like this:

day  type  col  d_1  d_2  d_3  d_4  d_5...
1    a     1    1    0    1    0
1    a     2    1    0    1    0
2    b     1    1    1    0    0

That is, it has one normal column (col) and many columns prefixed with d_.

I need to perform a groupby on day and type, and I want to compute the sum of the values in every d_ column for every day-type combination. I also need to apply other aggregation functions to other columns in the data (such as col in the example).

I can use:

agg_df=df.groupby(['day','type']).agg({'d_1': 'sum', 'col': 'mean'})

But this computes the sum for only one d_ column. How can I specify all the d_ columns in my data?

In other words, I would like to write something like:

agg_df=df.groupby(['day','type']).agg({'d_*': 'sum', 'col': 'mean'})

So the expected output is:

day  type  col  d_1  d_2  d_3  d_4  d_5...
1    a     1.5  2    0    2    0    ...
2    b     1    1    1    0    0

As you can see, col is aggregated by mean, while the d_ columns are summed.

Thanks for the help!

You can use filter:

In [23]: df.groupby(['day','type'], as_index=False)[df.filter(regex='d_.*').columns].sum()
Out[23]:
   day type  d_1  d_2  d_3  d_4
0    1    a    2    0    2    0
1    2    b    1    1    0    0
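For reference, here is a self-contained sketch of the same idea. The sample frame rebuilds the three rows from the question (the type labels 'a' and 'b' and the name d_cols are assumptions for illustration). The anchored pattern '^d_' is swapped in for 'd_.*' because filter(regex=...) matches with re.search, so an unanchored pattern would also pick up a column that merely contains d_ somewhere in its name.

```python
import pandas as pd

# Sample data rebuilt from the question (type labels are assumed).
df = pd.DataFrame({
    "day":  [1, 1, 2],
    "type": ["a", "a", "b"],
    "col":  [1, 2, 1],
    "d_1":  [1, 1, 1],
    "d_2":  [0, 0, 1],
    "d_3":  [1, 1, 0],
    "d_4":  [0, 0, 0],
})

# Collect every column whose name starts with "d_".
d_cols = df.filter(regex=r"^d_").columns.tolist()

# Select only those columns on the groupby object, then sum per (day, type).
sums = df.groupby(["day", "type"], as_index=False)[d_cols].sum()
print(sums)
```

Note that col is not part of this result, since only the d_ columns were selected before summing.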

If you want to apply all the functions in one shot:

import numpy as np

# Map every d_ column to a sum, then add the mean for col.
dic = {}
dic.update({i: np.sum for i in df.filter(regex='d_.*').columns})
dic.update({'col': np.mean})

In [48]: df.groupby(['day','type'], as_index=False).agg(dic)
#Out[48]:
#   day type  d_2  d_3  d_1  col  d_4
#0    1    a    0    2    2  1.5    0
#1    2    b    1    0    1  1.0    0
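The same one-shot aggregation can also be sketched with string function names ('sum', 'mean') instead of NumPy callables, which recent pandas releases seem to prefer. The sample frame, the assumed type labels, and the name agg_spec are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question (type labels are assumed).
df = pd.DataFrame({
    "day":  [1, 1, 2],
    "type": ["a", "a", "b"],
    "col":  [1, 2, 1],
    "d_1":  [1, 1, 1],
    "d_2":  [0, 0, 1],
    "d_3":  [1, 1, 0],
    "d_4":  [0, 0, 0],
})

# One entry per d_ column mapped to "sum", plus the mean for col.
agg_spec = {name: "sum" for name in df.columns if name.startswith("d_")}
agg_spec["col"] = "mean"

# Apply every aggregation in a single pass over the groups.
agg_df = df.groupby(["day", "type"], as_index=False).agg(agg_spec)
print(agg_df)
```

The result columns follow the insertion order of the dict, so col ends up last here. If it should sit before the d_ columns, reorder afterwards, for example with agg_df[['day', 'type', 'col'] + the list of d_ names].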
