python 3.x - Summing up certain rows in a pandas DataFrame


I have a pandas DataFrame with 1000 rows and 10 columns. I am looking to aggregate rows 100-1000 and replace them with a single row whose index value is '>100' and whose column values are the sums of rows 100-1000 for each column. Any ideas on a simple way of doing this? Thanks in advance.

Say I have the below:

         a    b    c
    0    1   10  100
    1    2   20  100
    2    3   60  100
    3    5   80  100

and I want it replaced with:

         a    b    c
    0    1   10  100
    1    2   20  100
    >1   8  140  200
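The post never shows how the sample frame is built, so here is a minimal sketch that reconstructs it and checks the target row. The first column name `a` is an assumption (only `b` and `c` are readable in the post), and `tail_sum` is just an illustrative variable name.

```python
import pandas as pd

# Sample data from the question; the first column name "a" is an assumption.
df = pd.DataFrame({
    "a": [1, 2, 3, 5],
    "b": [10, 20, 60, 80],
    "c": [100, 100, 100, 100],
})

# Rows whose index label is greater than 1 are the ones to collapse.
tail_sum = df[df.index > 1].sum()
print(tail_sum)
```

The sums for the collapsed rows come out as 8, 140 and 200, which matches the '>1' row in the desired output.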

You can use ix or loc, but this shows a SettingWithCopyWarning:

    ind = 1
    mask = df.index > ind
    df1 = df[~mask]
    # .ix was removed in pandas 1.0; .loc does the same job here
    df1.loc['>1', :] = df[mask].sum()

    In [69]: df1
    Out[69]:
        a    b    c
    0   1   10  100
    1   2   20  100
    >1  8  140  200
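The warning appears because `df[~mask]` may be treated as a slice of `df`. One common way around it, not part of the original answer, is to take an explicit copy before assigning. A minimal sketch, again assuming the first column is named `a`:

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 5],
    "b": [10, 20, 60, 80],
    "c": [100, 100, 100, 100],
})

ind = 1
mask = df.index > ind

# An explicit copy makes df1 an independent frame, so the assignment
# below is no longer a write to a slice of df.
df1 = df[~mask].copy()

# Assigning to a label that does not exist yet appends a new row.
df1.loc[">{}".format(ind)] = df[mask].sum()
print(df1)
```

The original `df` is left untouched, and `df1` ends with the '>1' row holding the sums.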

To set it without the warning, you can use pd.concat. It may not be elegant because of the two transposes, but it worked:

    ind = 1
    mask = df.index > ind
    df1 = pd.concat([df[~mask].T, df[mask].sum()], axis=1).T
    df1.index = df1.index.tolist()[:-1] + ['>{}'.format(ind)]

    In [36]: df1
    Out[36]:
        a    b    c
    0   1   10  100
    1   2   20  100
    >1  8  140  200

Some demonstrations:

    In [37]: df.index > ind
    Out[37]: array([False, False,  True,  True], dtype=bool)

    In [38]: df[mask].sum()
    Out[38]:
    a      8
    b    140
    c    200
    dtype: int64

    In [40]: pd.concat([df[~mask].T, df[mask].sum()], axis=1).T
    Out[40]:
       a    b    c
    0  1   10  100
    1  2   20  100
    0  8  140  200
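The two transposes of the kept rows can probably be avoided by turning the Series of sums into a one-row frame first and concatenating row-wise. This is a sketch of that variation, not the answer's own code: `collapse_tail` is a hypothetical helper name and the column name `a` is assumed.

```python
import pandas as pd


def collapse_tail(df, ind):
    """Keep rows with index <= ind and append one row summing the rest."""
    mask = df.index > ind
    # Turn the column sums into a one-row frame labelled '>ind'.
    total = df[mask].sum().to_frame(">{}".format(ind)).T
    # Row-wise concat, so the kept rows are never transposed.
    return pd.concat([df[~mask], total])


df = pd.DataFrame({
    "a": [1, 2, 3, 5],
    "b": [10, 20, 60, 80],
    "c": [100, 100, 100, 100],
})

result = collapse_tail(df, 1)
print(result)
```

For the frame in the question (1000 rows, default integer index), the same call with `ind=100` should leave rows 0 to 100 in place and add a single '>100' row.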
