dataframe - Summing values after every third position in a data frame in R


I am new to R. I have a data frame like the following:

    df = data.frame(
      id=c("entry_1","entry_1","entry_1","entry_2","entry_2","entry_2","entry_3","entry_4","entry_4","entry_4","entry_4"),
      start=c(20,20,20,37,37,37,68,10,10,10,10),
      end=c(50,50,50,78,78,78,200,94,94,94,94),
      pos=c(14,34,21,50,18,70,101,35,2,56,67),
      hits=c(12,34,17,89,45,87,1,5,6,3,26))

    id       start end pos hits
    entry_1     20  50  14   12
    entry_1     20  50  34   34
    entry_1     20  50  21   17
    entry_2     37  78  50   89
    entry_2     37  78  18   45
    entry_2     37  78  70   87
    entry_3     68 200 101    1
    entry_4     10  94  35    5
    entry_4     10  94   2    6
    entry_4     10  94  56    3
    entry_4     10  94  67   26

For each entry I want to iterate over the data frame in 3 different modes. For example, for entry_1, mode_1 = seq(20,50,3), mode_2 = seq(21,50,3), and mode_3 = seq(22,50,3). I want to sum the values in column "hits" whose corresponding value in column "pos" falls in mode_1, mode_2, or mode_3, and generate a data frame like the following:

    id       mode_1   mode_2   mode_3
    entry_1   0        17       34
    entry_2   87       89        0
    entry_3   1         0        0
    entry_4   26        8        0

I tried the following code:

    mode_1=0
    mode_2=0
    mode_3=0
    mode_1_sum=0
    mode_2_sum=0
    mode_3_sum=0
    for(i in dim(df)[1])
    {
      if(df$pos[i] %in% seq(df$start[i],df$end[i],3))
      {
        mode_1_sum=mode_1_sum+df$hits[i]
        print(mode_1_sum)
      }
      mode_1=mode_1_sum+counts
      print(mode_1)
      ifelse(df$pos[i] %in% seq(df$start[i]+1,df$end[i],3))
      {
        mode_2_sum=mode_2_sum+df$hits[i]
        print(mode_2_sum)
      }
      mode_2_sum=mode_2_sum+counts
      print(mode_2)
      ifelse(df$pos[i] %in% seq(df$start[i]+2,df$end[i],3))
      {
        mode_3_sum=mode_3_sum+df$hits[i]
        print(mode_3_sum)
      }
      mode_3_sum=mode_3_sum+counts
      print(mode_3_sum)
    }

But the above code only prints 26. Can anyone guide me on how to generate the desired output, please? I can provide more details if needed. Thanks in advance.

It's not the most elegant solution, but it works.

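    m <- 3 # number of modes you want
    # mode index (1..m) for rows inside the interval, 0 for rows outside
    foo <- ((df$pos - df$start)%%m + 1) * (df$start < df$pos) * (df$end > df$pos)
    # one column per mode, holding each row's hits in the column of its mode
    tab <- matrix(0,nrow(df),m)
    for(i in 1:m) tab[foo==i,i] <- df$hits[foo==i]
    # sum the columns per id
    aggregate(tab,list(df$id),FUN=sum)
    #   Group.1 V1 V2 V3
    # 1 entry_1  0 17 34
    # 2 entry_2 87 89  0
    # 3 entry_3  1  0  0
    # 4 entry_4 26  8  0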

-- explanation --

First, find the indices of df$pos that are both bigger than df$start and smaller than df$end. These comparisons should return 1 if true and 0 if false. Next, take the difference between df$pos and df$start, take it mod 3 (which gives a vector of 0s, 1s, and 2s), and add 1 to get the right mode. Multiply these two things together, so that values that fall within the interval retain the right mode and values that fall outside the interval become 0.
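To make the multiplication step concrete, here is a small sketch on four rows copied from the question's data. The names `inside` and `mode` are just illustrative helpers and do not appear in the code above.

```r
# Four rows taken from the question's data frame
start <- c(20, 20, 20, 37)
end   <- c(50, 50, 50, 78)
pos   <- c(14, 34, 21, 70)
m <- 3

# 1 if pos lies strictly between start and end, 0 otherwise
inside <- (start < pos) * (end > pos)

# offset from start, mod m, shifted so the modes are numbered 1..m
mode <- (pos - start) %% m + 1

# rows outside the interval are zeroed out, the rest keep their mode
foo <- mode * inside
print(foo)
# [1] 0 3 2 1
```

The first row (pos 14) falls before start, so it ends up as 0 even though its raw mode would be 1.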

Next, create an empty matrix that will contain the values. Then, use a for-loop to fill in the matrix. Finally, aggregate the matrix.

I tried looking for a quicker solution, but the main problem I cannot work around is the varying intervals for each row.
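For what it's worth, one possible way to skip the explicit loop is to treat the mode as a factor and let `xtabs` do the summing. This is only a sketch and it rests on one assumption: the bounds are treated as inclusive, because `seq(start, end, 3)` in the question contains `start` itself. The `mode_` labels are taken from the desired output in the question.

```r
df <- data.frame(
  id    = c("entry_1","entry_1","entry_1","entry_2","entry_2","entry_2",
            "entry_3","entry_4","entry_4","entry_4","entry_4"),
  start = c(20,20,20,37,37,37,68,10,10,10,10),
  end   = c(50,50,50,78,78,78,200,94,94,94,94),
  pos   = c(14,34,21,50,18,70,101,35,2,56,67),
  hits  = c(12,34,17,89,45,87,1,5,6,3,26)
)
m <- 3

# Assumption: inclusive bounds, since seq(start, end, 3) includes start
inside <- df$pos >= df$start & df$pos <= df$end

# Mode as a factor with fixed levels, so an empty mode still gets a column
df$mode <- factor((df$pos - df$start) %% m + 1,
                  levels = 1:m, labels = paste0("mode_", 1:m))

# Sum hits per id and mode, using only the rows inside their interval
res <- xtabs(hits ~ id + mode, data = df[inside, ])
print(res)
```

On this data the sums should match the desired table from the question. Note that an id with no rows inside its interval would drop out of the result unless `id` is turned into a factor first.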

