bioinformatics - Convert missing values (-9) to NAs in a PLINK PED file when reading into R

January 15, 2012

I have two files: pedigree.ped and pedigree.map. Both are file formats used by PLINK.

To use them in R, I think I need to convert them to a format R understands. For example, missing values in PLINK are coded differently from missing values in R.

How can I convert these two files so I can use them in R? And how can I change the missing values to NA?

Sample of the data:

PED file:

1 1 0 0 1.02  a   g g   0 0
1 2 0 0 0.51  t g   c c   a
2 3 1 2 -9    0 0   g   t t ...

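The first column is the family ID (id_family), the second is the individual ID (id_individual), the third and fourth are the father and mother of id_individual, the fifth is a quantitative trait (-9 means a missing value), and the remaining columns are genotypes (SNP alleles). Missing values are coded as 0 in every column except the quantitative trait, where they are coded as -9.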

MAP file:

1 rs1 0 100000
1 rs2 0 100100
1 rs3 0 100200

The first column is the chromosome (1-22, X, Y, or 0 if unplaced), the second is the rs# or SNP identifier, the third is the genetic distance (in morgans), and the fourth is the base-pair position (in bp units).
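As a minimal sketch of the reading step, both files can be loaded with base R's read.table. The sample rows written to temporary files below are hypothetical (the alleles are made up to give complete genotype pairs), and the MAP column names chr, snp, morgans and bp are labels chosen here for illustration, not anything PLINK defines. Reading every PED column as character is a deliberate choice: otherwise a column containing only the allele "T" could be converted to the logical value TRUE.

```r
# Hypothetical sample files in a temporary directory, so the sketch runs
# on its own; point these paths at the real pedigree.ped / pedigree.map.
ped.file <- file.path(tempdir(), "pedigree.ped")
map.file <- file.path(tempdir(), "pedigree.map")
writeLines(c("1 1 0 0 1.02 A A G G 0 0",
             "1 2 0 0 0.51 T G C C A A",
             "2 3 1 2 -9 0 0 G G T T"), ped.file)
writeLines(c("1 rs1 0 100000",
             "1 rs2 0 100100",
             "1 rs3 0 100200"), map.file)

# Read all PED columns as character so alleles such as "T" stay as text.
ped <- read.table(ped.file, header = FALSE, colClasses = "character")

# The fifth column is the quantitative trait, so convert it to numeric.
ped$V5 <- as.numeric(ped$V5)

# MAP file: the column names are illustrative assumptions.
map <- read.table(map.file, header = FALSE,
                  col.names = c("chr", "snp", "morgans", "bp"),
                  colClasses = c("character", "character",
                                 "numeric", "numeric"))
```

At this point the trait column still contains -9 and the genotype columns still contain "0"; recoding them to NA is a separate step.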

Assuming the data in the PED file has been read into an R data frame:

> my.dataframe
  V1 V2 V3 V4    V5 V6 V7 V8 V9 V10 V11
1  1  1  0  0  1.02    g  g   0   0
2  1  2  0  0  0.51  t  g  c  c
3  2  3  1  2 -9.00  0  0   g   t   t

Now check each column for invalid/missing values and assign NA. For example, take the 5th column:

> my.dataframe[my.dataframe[,5] == -9, 5] <- NA
> my.dataframe
  V1 V2 V3 V4   V5 V6 V7 V8 V9 V10 V11
1  1  1  0  0 1.02    g  g   0   0
2  1  2  0  0 0.51  t  g  c  c
3  2  3  1  2   NA  0  0   g   t   t

Similarly, assign NA to the other entries as required.
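One way to apply the same recoding to the remaining columns is sketched below. It assumes, as described in the question, that 0 marks a missing value in the parent columns (3 and 4) and in the genotype columns (6 onward). The data frame here is hypothetical and built inline so the example runs on its own; its alleles are invented.

```r
# Hypothetical data frame shaped like the PED sample (alleles are made up).
my.dataframe <- data.frame(
  V1 = c(1, 1, 2), V2 = c(1, 2, 3), V3 = c(0, 0, 1), V4 = c(0, 0, 2),
  V5 = c(1.02, 0.51, -9),
  V6 = c("A", "T", "0"), V7 = c("A", "G", "0"),
  V8 = c("G", "C", "G"), V9 = c("G", "C", "G"),
  V10 = c("0", "A", "T"), V11 = c("0", "A", "T"),
  stringsAsFactors = FALSE
)

# Quantitative trait: -9 means missing. which() drops any NA comparisons,
# so this line can be rerun safely after some values are already NA.
my.dataframe$V5[which(my.dataframe$V5 == -9)] <- NA

# Parents (columns 3-4) and genotypes (columns 6 onward): 0 means missing.
zero.cols <- c(3, 4, 6:ncol(my.dataframe))
for (j in zero.cols) {
  # Compare as text so numeric and character columns are handled alike.
  is.zero <- which(as.character(my.dataframe[[j]]) == "0")
  my.dataframe[[j]][is.zero] <- NA
}
```

The family and individual ID columns are left alone on purpose, since the question only describes 0 as a missing code for the other columns.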

Note: R functions treat NAs in a special way, so check the relevant arguments of each function you use. Related keywords to watch for: na.rm, na.pass, na.fail, na.omit, etc.
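A short illustration of two of these, using a hypothetical trait vector that stands in for column 5 after -9 has been recoded to NA:

```r
# Hypothetical trait values after recoding -9 to NA.
trait <- c(1.02, 0.51, NA)

# By default, a missing value propagates through the calculation.
mean.default <- mean(trait)               # NA

# na.rm = TRUE drops the NAs before computing.
mean.narm <- mean(trait, na.rm = TRUE)    # 0.765

# na.omit() removes every row of a data frame that contains an NA.
df <- data.frame(id = 1:3, trait = trait)
complete <- na.omit(df)
nrow(complete)                            # 2
```

Which behaviour is appropriate depends on the analysis; silently dropping rows with na.omit can change the sample size, so it is worth checking how many rows remain.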
