R - Function to evaluate columns with different names within a data frame
I have a data frame (total) like so:

                id    pos ori cont ma1 nma1 bda1 ma2 nma2 bda2 mb1 nmb1 bdb1 mb2 nmb2 bdb2
         1: chrm      5   +  ccg   0    1    2   0    1    2   0    4    5   0    5    6
         2: chrm      6   +  cgt   0    1    2   2    0    0   2    2    2   1    6    3
         3: chrm      7   -  cgg   0    1    2   0    6    7   0    3    4   1    3    2
         4: chrm     10   +  cga   0    2    3   2    1    2   2    3    2   1    7    3
         5: chrm     11   -  cga   0    1    2   2    6    2   0    3    4   1    8    3
        ---
    164264: chrm 366914   +  caa   0    1    2   0    2    3   0    1    2   0    8    9
    164265: chrm 366918   +  ccg   0    1    2   0    2    3   0    0    1   0    7    8
    164266: chrm 366919   +  cgg   0    1    2   0    2    3   0    0    1   0    7    8
    164267: chrm 366920   -  cgg   1    2    2   0    5    6   0    1    2   0    4    5
    164268: chrm 366921   -  ccg   0    3    4   0    3    4   0    0    1   0    4    5

and I want a function to evaluate a couple of criteria. When doing just one, I used

    total$crita <- as.numeric((total$ma1 + total$nma1 >= 4) & (total$nma1 >= total$bda1))

so the result is 1 if true or 0 if false. I'd like to apply this to all the treatments (a1 (m, nm, bd), a2, a3, etc.)

I'm new to R and haven't figured out how to do a bunch of stuff yet, so any help is appreciated. Thanks!
I think this should do it. (If you share your data with dput() I can copy/paste it and test. See here for other tips on writing good, reproducible R questions.)

    add_crit = function(data, treatment) {
        # build the column names for this treatment, e.g. "ma1", "nma1", "bda1"
        m_name = paste0("m", treatment)
        nm_name = paste0("nm", treatment)
        bd_name = paste0("bd", treatment)
        crit_name = paste0("crit", treatment)
        # 1 where both conditions hold, 0 otherwise
        data[crit_name] = as.numeric(
            (data[m_name] + data[nm_name] >= 4) & (data[nm_name] >= data[bd_name])
        )
        return(data)
    }

    treatments = c("a1", "a2", "b1", "b2")
    data_with_crit = total
    for (trt in treatments) {
        data_with_crit = add_crit(data_with_crit, trt)
    }

I build the column names I need as strings with paste0. When you have column names stored in variables, you need to use [ rather than $, but otherwise it should work well.
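If total is actually a data.table (the printed row labels and the --- separator suggest it might be), single-bracket indexing with a string may not select a column the way it does for a plain data frame. As a hedged sketch, here is a variant that pulls each column out with [[ instead. The name add_crit2 and the toy data frame are made up for illustration; the toy values are copied from the first five rows shown in the question.

```r
# Hypothetical variant of add_crit: extract each column as a plain vector with `[[`
add_crit2 <- function(data, treatment) {
  m  <- data[[paste0("m",  treatment)]]
  nm <- data[[paste0("nm", treatment)]]
  bd <- data[[paste0("bd", treatment)]]
  # 1 where both conditions hold, 0 otherwise
  data[[paste0("crit", treatment)]] <- as.numeric((m + nm >= 4) & (nm >= bd))
  data
}

# Toy data (assumption): treatments a2 and b1 from the first five rows of the question
toy <- data.frame(
  ma2 = c(0, 2, 0, 2, 2), nma2 = c(1, 0, 6, 1, 6), bda2 = c(2, 0, 7, 2, 2),
  mb1 = c(0, 2, 0, 2, 0), nmb1 = c(4, 2, 3, 3, 3), bdb1 = c(5, 2, 4, 2, 4)
)

for (trt in c("a2", "b1")) {
  toy <- add_crit2(toy, trt)
}

toy$crita2  # 0 0 0 0 1
toy$critb1  # 0 1 0 1 0
```

On a plain data frame this should give the same columns as the single-bracket version.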
    > fortunes::fortune(343)

    Sooner or later most R beginners are bitten by this all too convenient shortcut.
    As an R newbie, think of R as your bank account: overuse of $-extraction can
    lead to undesirable consequences. It's best to acquire the '[[' and '[' habit early.
       -- Peter Ehlers (about the use of $-extraction)
          R-help (March 2013)
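To make the point about $ concrete, here is a minimal sketch of what happens when the column name lives in a variable. The data frame df and the variable col are made-up names for illustration only.

```r
# Made-up two-row data frame, just for illustration
df <- data.frame(ma1 = c(0, 2), nma1 = c(1, 3))
col <- "ma1"

by_dollar  <- df$col    # NULL: `$` looks for a column literally named "col"
by_double  <- df[[col]] # the ma1 column as a plain vector
by_single  <- df[col]   # a one-column data frame containing ma1
```

So $ is fine when you type the column name yourself, but [[ or [ is what you want once the name is built with paste0.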
The other (more generalizable) way to handle this problem would be to "melt" your data into a long format, so that you have a single treatment column with values a1, a2, ..., and single columns for m, nm, bd, and crit. You would have multiple rows per id (one row per treatment per id). This would lend itself well to a data.table or dplyr solution. Perhaps someone else will post an example.
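As a rough sketch of the long-format idea, here is one way to do it in base R only (not the data.table or dplyr version mentioned above). The toy data frame is an assumption: it uses pos as the row identifier and copies the values from the first five rows shown in the question. The name long is made up.

```r
# Toy data (assumption): first five rows of the question, keyed by pos
total <- data.frame(
  pos = c(5, 6, 7, 10, 11),
  ma1 = c(0, 0, 0, 0, 0), nma1 = c(1, 1, 1, 2, 1), bda1 = c(2, 2, 2, 3, 2),
  ma2 = c(0, 2, 0, 2, 2), nma2 = c(1, 0, 6, 1, 6), bda2 = c(2, 0, 7, 2, 2),
  mb1 = c(0, 2, 0, 2, 0), nmb1 = c(4, 2, 3, 3, 3), bdb1 = c(5, 2, 4, 2, 4),
  mb2 = c(0, 1, 1, 1, 1), nmb2 = c(5, 6, 3, 7, 8), bdb2 = c(6, 3, 2, 3, 3)
)
treatments <- c("a1", "a2", "b1", "b2")

# Build one block of rows per treatment, then stack the blocks
long <- do.call(rbind, lapply(treatments, function(trt) {
  data.frame(
    pos       = total$pos,
    treatment = trt,
    m         = total[[paste0("m",  trt)]],
    nm        = total[[paste0("nm", trt)]],
    bd        = total[[paste0("bd", trt)]]
  )
}))

# The criterion is now one vectorized expression covering every treatment
long$crit <- as.numeric((long$m + long$nm >= 4) & (long$nm >= long$bd))
```

With the data in this shape, adding a new treatment only means adding rows, and the crit line does not change.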