Faster way to run a loop in R
I have 3 data frames: a, b and c.
a has 18,000 rows and 18,000 columns; b has 150,000 rows and 5 columns.
I want to fill elements of a with values from b.
The loop takes a long time. How can I run the loop faster?
Example of a:

      entrez_gene_id 2324 34345 4345 1234 3453
    1 entrez_gene_id    0     0    0    0    0
    2          23040    0     0    0    0    0
    3           7249    0     0    0    0    0
    4          64478    0     0    0    0    0
    5           4928    0     0    0    0    0
    6          58191    0     0    0    0    0
Example of b:

    head(b)
      v1 gene1 gene2      weight   newweight
    1  1  4171  4172 2.01676494  0.020420929
    2  2  2237  5111 1.933298567 0.015300857
    3  4   506   509 2.439170425 0.020577243
    4  7  6635  6636 2.255316779 0.081088975
    5  8  6133  6210 3.427969232 0.021132906
    6 10 23521  6217 1.607247743 0.027792961
And the code:

    b <- data.frame(lapply(c, as.character), stringsAsFactors = FALSE)
    for (i in 1:nrow(b)) {
      rname <- b[i, 2]          # gene1: row name in a
      cname <- b[i, 3]          # gene2: column name in a
      a[rname, cname] <- b[i, 5]  # newweight
      print(i)
    }
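It seems as though you are trying to fill a full matrix from a matrix stored in sparse notation. You can use the dgCMatrix class from the Matrix package for this:

    library(Matrix)
    # i and j must be positive integers, so convert the ID columns
    # in case they are stored as character
    b_mat <- sparseMatrix(i = as.integer(b[, 2]),
                          j = as.integer(b[, 3]),
                          x = as.numeric(b[, 5]))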
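Note that passing the gene IDs straight in as `i` and `j` treats them as row and column positions, so the result has as many rows as the largest ID. If the matrix should instead be labelled by gene ID, one option is to map IDs to positions with `match()`. The following is only a sketch on made-up data: the toy `b`, the `genes` vector, and the assumption that rows and columns share the same ID set are all illustrative and not from the question.

```r
library(Matrix)

# Toy stand-in for b (hypothetical values, character IDs as in the question)
b <- data.frame(
  gene1     = c("4171", "2237", "506"),
  gene2     = c("4172", "5111", "509"),
  newweight = c(0.0204, 0.0153, 0.0206),
  stringsAsFactors = FALSE
)

# Assumption: the same set of IDs labels both rows and columns
genes <- sort(unique(c(b$gene1, b$gene2)))

b_mat <- sparseMatrix(
  i        = match(b$gene1, genes),  # ID -> row position
  j        = match(b$gene2, genes),  # ID -> column position
  x        = b$newweight,
  dims     = c(length(genes), length(genes)),
  dimnames = list(genes, genes)
)

# Look up a value by gene ID rather than by position
b_mat["4171", "4172"]
```

By default `sparseMatrix()` sums the values of duplicated (i, j) pairs, whereas the loop keeps the last one; passing `use.last.ij = TRUE` should give the loop's behaviour if b contains repeated pairs.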
This leaves the matrix in sparse format. To convert it to the full 18,000 x 18,000 form:

    as.data.frame(as.matrix(b_mat))
Edit: I'd suggest leaving the as.data.frame call out of the conversion above, since a matrix is easier to work with considering the number of columns you have.
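If a is kept as a dense matrix with gene IDs as dimnames, the loop can also be replaced by a single assignment, because R accepts a two-column character matrix as an index and matches it against the row and column names. This is a sketch on toy data, not the original poster's code: the small `a`, the toy `b` and the `keep` filter are assumptions made for illustration.

```r
# Toy stand-ins (hypothetical values): a is a named matrix, b holds the pairs
genes <- c("4171", "4172", "2237", "5111")
a <- matrix(0, nrow = length(genes), ncol = length(genes),
            dimnames = list(genes, genes))

b <- data.frame(
  gene1     = c("4171", "2237", "9999"),   # "9999" is not in a
  gene2     = c("4172", "5111", "4172"),
  newweight = c(0.020, 0.015, 0.500),
  stringsAsFactors = FALSE
)

# Drop pairs whose IDs are missing from a; an unmatched name
# would otherwise raise a "subscript out of bounds" error
keep <- b$gene1 %in% rownames(a) & b$gene2 %in% colnames(a)

# One vectorised assignment: each row of the index matrix is (row name, column name)
a[cbind(b$gene1[keep], b$gene2[keep])] <- b$newweight[keep]

a["4171", "4172"]
```

Keep in mind that a dense 18,000 x 18,000 numeric matrix takes roughly 2.6 GB of memory, which is one reason to stay with the sparse form when most entries are zero.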