R - Updating only certain values of a data frame based on a match


I'm trying to update a variable (popsnp) in a higher scope from within lapply, on the basis of a match. I can't quite figure out the syntax for updating only the matched values though; what I have overwrites the existing values with NA:

    library(plyr)  # assumption: count() here is plyr::count, which takes a vector of column names

    lapply(1:22, function(i) {
      in.name  <- paste("/data/mdp14aps/ld/chr", i, ".ld", sep = "")
      out.name <- paste("/data/mdp14aps/r/ldatachr", i, ".rda", sep = "")
      ldata <- read.csv(in.name, sep = "", header = TRUE,
                        colClasses = c(NA, NA, NA, NA, NA, NA, "NULL"))
      freq <- count(ldata, c("snp_a", "chr_a", "bp_a"))

      # the part I'm not sure about
      popsnp$chrom    <<- freq[match(popsnp$marker, freq$snp_a), 2]
      popsnp$position <<- freq[match(popsnp$marker, freq$snp_a), 3]
      popsnp$freq     <<- freq[match(popsnp$marker, freq$snp_a), 4]

      save(ldata, file = out.name)
      rm(ldata, freq)
    })

I want to preserve the values I'm setting between iterations of lapply, so that I end up with popsnp containing all the values of chrom, position and freq, not just those from the last iteration.

I feel this should be straightforward, but I'm still unfamiliar with R.

A toy example:

    test  <- data.frame(a = c("a", "b", "c", "d", "e"), b = c(rep(NA, 5)))
    test1 <- data.frame(a = c("a", "b"), b = c(1, 2))
    test2 <- data.frame(a = c("c", "d", "e"), b = c(3, 4, 5))

    test$b <- test1[match(test$a, test1$a), 2]
    test$b <- test2[match(test$a, test2$a), 2]  # overwrites the values set on the previous line

I want test$b to have the values 1-5 in it.

Update for the toy example

You need to subset both sides of the assignment, and convert the match conditions to logical subsetting vectors.

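    # position of each test$a value in the lookup tables (NA where there is no match)
    idx1 <- match(test$a, test1$a)
    idx2 <- match(test$a, test2$a)

    # TRUE/FALSE: which rows of test have a match
    t1 <- !is.na(idx1)
    t2 <- !is.na(idx2)

    # update only the TRUE rows; unmatched rows keep their current values
    test$b[t1] <- test1$b[idx1[t1]]
    test$b[t2] <- test2$b[idx2[t2]]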

I recommend looking at each element individually so you can see what's happening.
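As a rough sketch of that inspection, the snippet below prints each intermediate piece for a variant of the toy data. The names idx1 and has_match are illustrative only, test$b is pre-filled for rows c-e as if an earlier iteration had already set them, and stringsAsFactors = FALSE is used just so the example behaves the same across R versions.

```r
test  <- data.frame(a = c("a", "b", "c", "d", "e"),
                    b = c(NA, NA, 3, 4, 5),   # rows c-e assumed already filled
                    stringsAsFactors = FALSE)
test1 <- data.frame(a = c("a", "b"), b = c(1, 2), stringsAsFactors = FALSE)

idx1 <- match(test$a, test1$a)   # position of each test$a in test1$a, NA when absent
print(idx1)                      # 1 2 NA NA NA

has_match <- !is.na(idx1)        # TRUE only for rows that have a match
print(has_match)                 # TRUE TRUE FALSE FALSE FALSE

# Assigning this whole vector to test$b is what would wipe rows c-e with NA
print(test1$b[idx1])             # 1 2 NA NA NA

# Subsetting both sides updates the matched rows and leaves the rest alone
test$b[has_match] <- test1$b[idx1[has_match]]
print(test$b)                    # 1 2 3 4 5
```

The third print is the one that explains the overwriting: the unsubsetted right-hand side already carries NA for every row without a match.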


Previously...

I'm not sure I understand what your example is trying to accomplish, so I'm going to provide a toy example of subsetting:

    dat <- data.frame(
      a = sample(letters[3:26], 26, replace = TRUE),
      b = runif(26)
    )

    # replaces the values in column b where column a == "c"
    dat[dat$a == "c", "b"] <- 1

    # dat$a == "c" returns a TRUE/FALSE vector, "b" selects column "b".

It's best practice to use TRUE/FALSE conditions while subsetting, to avoid future errors. You can subset by row number, but that gets messy.

It's important to note that the use of <<- pushes the change of the variable to the parent environment, outside of the scope of the function. This can lead to unexpected results in the future. It's better to supply the variable you want to change as an argument and return it again at the end of the manipulation function. That way you have a clear sequence of events.

    myfun <- function(x, y) {
      # ... do stuff to y
      return(y)
    }

    y <- myfun(x, y)
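Applied to the toy data, that pattern might look like the sketch below. The helper update_matched and its key/value arguments are hypothetical names introduced for illustration, not part of the original code, and the list of lookup tables stands in for whatever each iteration produces.

```r
# Hypothetical helper: fill target[[value]] from lookup wherever the keys match,
# leave unmatched rows as they were, and return the modified copy.
update_matched <- function(target, lookup, key = "a", value = "b") {
  idx <- match(target[[key]], lookup[[key]])   # NA where there is no match
  hit <- !is.na(idx)
  target[[value]][hit] <- lookup[[value]][idx[hit]]
  target
}

test <- data.frame(a = c("a", "b", "c", "d", "e"), b = NA_real_,
                   stringsAsFactors = FALSE)
lookups <- list(
  data.frame(a = c("a", "b"), b = c(1, 2), stringsAsFactors = FALSE),
  data.frame(a = c("c", "d", "e"), b = c(3, 4, 5), stringsAsFactors = FALSE)
)

# A plain for loop keeps the sequence of updates explicit, with no <<- involved.
for (lk in lookups) {
  test <- update_matched(test, lk)
}
print(test$b)  # 1 2 3 4 5
```

A for loop is used here instead of lapply because each step depends on the result of the previous one; the accumulated data frame is simply reassigned on every pass.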

Final update

Lastly, with respect to dropping unnecessary columns: typical practice is to drop them after import, either by name (best practice) or by reference number (changes in the data will break this).

    ldata[c('col1','col2',...)] <- NULL # drop
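A minimal sketch of both ways of dropping, on a made-up data frame (the column names keep1, col1, col2 and keep2 are placeholders, not columns from the real data):

```r
ldata <- data.frame(keep1 = 1:3,
                    col1  = c("x", "y", "z"),
                    col2  = c(TRUE, FALSE, TRUE),
                    keep2 = c(0.1, 0.2, 0.3),
                    stringsAsFactors = FALSE)
by_position <- ldata               # copy to show the positional variant

# Drop by name: still correct if the column order changes later
ldata[c("col1", "col2")] <- NULL
print(names(ldata))                # "keep1" "keep2"

# Drop by position: silently removes the wrong columns if the layout changes
by_position[c(2, 3)] <- NULL
print(names(by_position))          # "keep1" "keep2"
```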
