Python 3.x - Summing up certain rows in a pandas DataFrame
I have a pandas DataFrame with 1000 rows and 10 columns. I am looking to aggregate rows 100-1000 and replace them with one row whose index value is '>100' and whose column values are the sum of rows 100-1000 of each column. Any ideas on a simple way of doing this? Thanks in advance.
Say I have the below:
       a   b    c
    0  1  10  100
    1  2  20  100
    2  3  60  100
    3  5  80  100
and I want it replaced with:
        a    b    c
    0   1   10  100
    1   2   20  100
    >1  8  140  200
You could use ix or loc, but it shows a SettingWithCopyWarning (ix has been removed from recent pandas releases, so prefer loc there):
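    ind = 1
    mask = df.index > ind              # True for the rows to be summed
    df1 = df[~mask]                    # rows to keep as they are
    df1.loc['>1', :] = df[mask].sum()  # add the summary row (triggers the warning)

    In [69]: df1
    Out[69]:
        a    b    c
    0   1   10  100
    1   2   20  100
    >1  8  140  200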
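If the warning is the only concern, one option is to take an explicit copy of the kept rows before adding the summary row. This is a minimal sketch and not part of the original answer; it assumes the sample frame from the question, with the first column named a.

```python
import pandas as pd

# Sample frame from the question (the column name 'a' is an assumption).
df = pd.DataFrame({'a': [1, 2, 3, 5],
                   'b': [10, 20, 60, 80],
                   'c': [100, 100, 100, 100]})

ind = 1
mask = df.index > ind          # True for the rows to be summed

# .copy() gives df1 its own data, so the assignment below
# does not write into a slice of df.
df1 = df[~mask].copy()

# Setting a label that does not exist yet appends it as a new row.
df1.loc['>{}'.format(ind)] = df[mask].sum()

print(df1)
```

The original df is left untouched, and df1 ends with a row labelled '>1' holding the column sums of the masked rows.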
To set it without the warning, you could use pd.concat. It may not be elegant because of the two transposes, but it worked:
    ind = 1
    mask = df.index > ind
    df1 = pd.concat([df[~mask].T, df[mask].sum()], axis=1).T
    df1.index = df1.index.tolist()[:-1] + ['>{}'.format(ind)]

    In [36]: df1
    Out[36]:
        a    b    c
    0   1   10  100
    1   2   20  100
    >1  8  140  200
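A variant of the pd.concat idea that avoids the transposes is to wrap the sums in a one-row DataFrame that already carries the new label, then stack it under the kept rows. This is a sketch under the same assumptions as the question's sample data (the column name a is assumed), not the answer's own code.

```python
import pandas as pd

# Sample frame from the question (the column name 'a' is an assumption).
df = pd.DataFrame({'a': [1, 2, 3, 5],
                   'b': [10, 20, 60, 80],
                   'c': [100, 100, 100, 100]})

ind = 1
mask = df.index > ind
label = '>{}'.format(ind)

# One-row frame holding the column sums, indexed by the new label.
total = pd.DataFrame([df[mask].sum()], index=[label])

# Stack the kept rows and the summary row vertically (default axis=0).
df1 = pd.concat([df[~mask], total])

print(df1)
```

Because the label is set when the summary row is built, there is no need to patch df1.index afterwards.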
Some demonstrations of the intermediate steps:
    In [37]: df.index > ind
    Out[37]: array([False, False,  True,  True], dtype=bool)

    In [38]: df[mask].sum()
    Out[38]:
    a      8
    b    140
    c    200
    dtype: int64

    In [40]: pd.concat([df[~mask].T, df[mask].sum()], axis=1).T
    Out[40]:
       a    b    c
    0  1   10  100
    1  2   20  100
    0  8  140  200
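For the size mentioned in the question (1000 rows, 10 columns, threshold 100), the same steps could be wrapped in a small helper. The function name collapse_rows_above, the all-ones data, and the column names are hypothetical and only there to make the result easy to check. Note that the mask uses a strict greater-than comparison, matching the '>100' label, so the row with index 100 is kept.

```python
import numpy as np
import pandas as pd


def collapse_rows_above(df, ind):
    """Sum all rows whose index is greater than ind into one row labelled '>ind'.

    Hypothetical helper for illustration; assumes a numeric index.
    """
    mask = df.index > ind
    total = pd.DataFrame([df[mask].sum()], index=['>{}'.format(ind)])
    return pd.concat([df[~mask], total])


# Made-up data: 1000 rows x 10 columns of ones, index 0..999.
big = pd.DataFrame(np.ones((1000, 10), dtype=int), columns=list('abcdefghij'))

result = collapse_rows_above(big, 100)
print(result.tail(3))
```

With an index running from 0 to 999, rows 0 through 100 are kept and the 899 rows from 101 to 999 are folded into the final '>100' row.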