create count matrix in R -


this question has answer here:

i have large dataframe shown below few rows , column:

    id1 id2 id3 id4 s1  2   4   2   6 s2  2   1   3   2 s3  2   2   2   2 s4  3   0   2   2 

for each row need matrix count of each number in range of id value. since largest 6 in id values, creates matrix 7 columns i.e. 0 6 , fill count values.

sample output:

    0   1   2   3    4    5    6 s1  0   0   2   0    1    0    1 s2  0   1   2   1    0    0    0 s3  0   0   4   0    0    0    0 s4  1   0   2   1    0    0    0 

is there way of doing in r.

this perfect situation use apply + tabulate, except inclusion of zeroes in data , need include them.

since need include tabulation of zeroes, make small modification tabulate start @ 0 instead of 1.

here's function puts approach in place:

dftabulate <- function(indf) {   nbins <- max(indf)   `colnames<-`(t(apply(indf + 1, 1, tabulate, nbins = nbins + 1)), 0:nbins) } 

here applied sample data.

dftabulate(mydf) #    0 1 2 3 4 5 6 # s1 0 0 2 0 1 0 1 # s2 0 1 2 1 0 0 0 # s3 0 0 4 0 0 0 0 # s4 1 0 2 1 0 0 0 

you specify have "large" data.frame don't describe how large, i'm not sure how relevant following benchmark is.

however, share logic behind using approach: tabulate fast function, thought make use of efficiency.

here's benchmark:

set.seed(1) nrow = 10000 ncol = 100 min = 0 max = 500 mydf <- data.frame(   matrix(sample(min:max, nrow*ncol, true),           nrow = nrow, ncol = ncol,           dimnames = list(paste0("s", 1:nrow), paste0("id", 1:ncol))))  fun2 <- function(df1 = mydf) {   tbl <- table(c(row(df1)), factor(unlist(df1), levels=0:max))   dimnames(tbl)[[1]] <- row.names(df1)   tbl }  fun3 <- function(df1 = mydf) mtabulate(as.data.frame(t(df1)))  system.time(dftabulate(mydf)) #    user  system elapsed  #   0.000   0.000   0.154  system.time(fun2(mydf)) #    user  system elapsed  #   0.000   0.000   1.018  system.time(fun3(mydf)) #    user  system elapsed  #   4.560   0.000   3.081  

Comments

Popular posts from this blog

javascript - jQuery: Add class depending on URL in the best way -

caching - How to check if a url path exists in the service worker cache -

Redirect to a HTTPS version using .htaccess -