R code challenge: retrieving the values in matching columns and summing them over matching rows

- July 15, 2010

I have a problem I am trying to solve in R. I have a data frame called testa (dput included below). I need to match the letters in the ref and alt columns against the column names (a, c, g, t, n), take the corresponding values from those columns, and combine the ref value and the alt value into the result ad.new (my code below does this job).

However, I need to expand the code to handle lines where the type column has "flat" at the end. For a flat row, I need to match its start ID (chr10:102053031) against the other IDs in the start column. Where they match, I need to sum the corresponding alt values from the a, c, g, t, n columns and put that sum into ad.new for the flat line, alongside the ref value.

If you run the dput and the code, you will be able to follow it. Basically, I want to match the letters in the ref and alt columns to the corresponding values in the columns (a, c, g, t, n), and separate the ref and alt values with a comma. In this example, for the flat line I want to take the value in column a from a row whose start ID matches the start ID of the flat line (the value in this case is 6) and the value from the other match (in this case 7, in the g column), and sum them to give 13. The flat line result should be 0,13.

The expected result is shown below.

my incomplete code:

testa[is.na(testa)] <- 0

ref.counts <- testa[, testa[, "ref"]]
ref.counts <- as.matrix(ref.counts)
ref.counts[is.na(ref.counts)] <- 0
ref.counts <- diag(ref.counts)

alt.counts <- testa[, testa[, "alt"]]
alt.counts <- as.matrix(alt.counts)
alt.counts[is.na(alt.counts)] <- 0
alt.counts <- diag(alt.counts)

#############
## need to extend the code here
#############
ad.new <- paste(ref.counts, alt.counts, sep = ",")

dput of testa:

structure(c("chr10:101544447", "chr10:102053031", "chr10:102778767",
"chr10:102789831", "chr10:102989480", "chr10:102053031", "chr10:102053031",
"0", "6", "0", "0", "0", "0", "0", "0", "34", "24", "0", "0",
"34", "34", "0", "0", "0", "0", "0", "0", "7", "53", "0", "0",
"30", "12", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"chr10", "chr10", "chr10", "chr10", "chr10", "chr10", "chr10",
"101544447", "102053031", "102778767", "102789831", "102989480",
"102053031", "102053031", "a", "c", "c", "c", "c", "c", "c",
"t", "a", "t", "t", "t", "g", "g", "snp", "snp", "snp", "snp",
"snp", "snp:102053031:flat", "snp", "nonsynonymous snv",
"intronic", "nonsynonymous snv", "nonsynonymous snv", "ncrna_exonic",
"intronic", "intronic", "abcc2:nm_000392:exon2:c.a116t:p.y39f,",
"pkd2l1", "pdzd7:nm_024895:exon8:c.g1136a:p.r379q,pdzd7:nm_001195263:exon8:c.g1136a:p.r379q,",
"pdzd7:nm_024895:exon2:c.g146a:p.r49q,pdzd7:nm_001195263:exon2:c.g146a:p.r49q,",
"lbx1-as1", "pkd2l1", "pkd2l1"), .Dim = c(7L, 15L), .Dimnames = list(
    c("1", "2", "3", "4", "5", "6", "7"), c("start", "a", "c",
    "g", "t", "n", "=", "-", "chr", "end", "ref", "alt", "type",
    "refgene::location", "refgene::type")))

Expected result:

 ad.new "0,53" "34,6" "24,0" "0,30" "0,12" "0,13"  "34,7"

Something like this should work:

# apply the "normal" rule (not considering the flat exceptions)
alts <- as.numeric(diag(testa[, testa[, "alt"]]))
refs <- as.numeric(diag(testa[, testa[, "ref"]]))
res <- paste(refs, alts, sep = ",")

# replace the lines whose type ends with "flat"
flats <- grep('.*flat$', testa[, "type"])
res[flats] <- unlist(lapply(flats, function(x) {
    startid <- testa[x, "start"]
    # other rows sharing the flat row's start id (the flat row itself is excluded)
    selection <- setdiff(which(testa[, "start"] == startid), x)
    paste0("0,", sum(alts[selection]))
}))

ad.new <- as.matrix(res)

> ad.new
     [,1]
[1,] "0,53"
[2,] "34,6"
[3,] "24,0"
[4,] "0,30"
[5,] "0,12"
[6,] "0,13"
[7,] "34,7"
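As a possible variation on the same idea, the per-row lookup can also be done with a two-column (row, column) index matrix rather than building a 7 x 7 intermediate matrix and taking its diagonal. The sketch below is not part of the original answer. It rebuilds a trimmed copy of testa containing only the columns the calculation uses (an assumption made to keep the example self-contained), introduces a hypothetical helper called pick, and keeps the hard-coded 0 for the ref part of flat lines so that it matches the expected result in the question.

```r
# Trimmed copy of the question's character matrix: only the columns used here.
testa <- cbind(
  start = c("chr10:101544447", "chr10:102053031", "chr10:102778767",
            "chr10:102789831", "chr10:102989480", "chr10:102053031",
            "chr10:102053031"),
  a    = c("0", "6", "0", "0", "0", "0", "0"),
  c    = c("0", "34", "24", "0", "0", "34", "34"),
  g    = c("0", "0", "0", "0", "0", "0", "7"),
  t    = c("53", "0", "0", "30", "12", "0", "0"),
  n    = rep("0", 7),
  ref  = c("a", "c", "c", "c", "c", "c", "c"),
  alt  = c("t", "a", "t", "t", "t", "g", "g"),
  type = c("snp", "snp", "snp", "snp", "snp", "snp:102053031:flat", "snp")
)

rows <- seq_len(nrow(testa))

# Hypothetical helper: for each row i, take the cell in the column named cols[i].
pick <- function(m, cols) {
  as.numeric(m[cbind(rows, match(cols, colnames(m)))])
}

refs <- pick(testa, testa[, "ref"])
alts <- pick(testa, testa[, "alt"])

# "Normal" rule: ref count and alt count separated by a comma.
res <- paste(refs, alts, sep = ",")

# Flat rule: sum the alt counts of the other rows sharing the same start id.
flats <- grep("flat$", testa[, "type"])
for (i in flats) {
  same <- setdiff(which(testa[, "start"] == testa[i, "start"]), i)
  res[i] <- paste0("0,", sum(alts[same]))
}

ad.new <- res
print(ad.new)
```

With the sample data, the flat row (row 6) shares its start id with rows 2 and 7, whose alt counts are 6 and 7, which gives the 13 described in the question.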
