csv - How can I compare two columns in two different rows in Python

- April 15, 2014

I want to go through each line of a CSV file and check whether the first field of line 1 is the same as the first field of the next line, and so on. If a match is found, both lines that contain the same field should be ignored, and only the lines with no match should be kept.

Here is an example dataset (no_dup.txt):

    ac_gene_id          m_gene_id
    ensgmog00000015632  ensorlg00000010573
    ensgmog00000015632  ensorlg00000010585
    ensgmog00000003747  ensorlg00000006947
    ensgmog00000003748  ensorlg00000004636

Basically, I want to exclude lines 1 and 2, since they contain the same first field (ensgmog00000015632), and keep lines 3 and 4.

Here is the code I have tried, but I couldn't finish it:

    prev = None
    with open("no_dup.txt", 'r') as fh_in:
        for line in fh_in:
            line = line.strip()
            # Only data rows start with "e"; this skips the header line.
            if line.startswith("e"):
                line1 = line.split()
                print("initial gene =", line1[0])
                if prev is not None or prev != line1[0]:
                    prev = line1[0]
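One way the consecutive-line comparison could be completed is with `itertools.groupby`, which bundles adjacent rows that share a key. The following is only a sketch: the function name `unique_first_field` and the in-memory `sample` list are made up for illustration, and it assumes that rows with the same first field always sit next to each other, as in the example data.

```python
from itertools import groupby


def unique_first_field(lines):
    """Return rows whose first field is not shared with an adjacent row."""
    # Drop blank lines so that split()[0] never fails.
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    kept = []
    # groupby only merges *consecutive* rows that have the same first field.
    for _, group in groupby(rows, key=lambda row: row.split()[0]):
        group = list(group)
        if len(group) == 1:
            kept.append(group[0])
    return kept


# Hypothetical stand-in for the contents of no_dup.txt.
sample = [
    "ac_gene_id  m_gene_id\n",
    "ensgmog00000015632  ensorlg00000010573\n",
    "ensgmog00000015632  ensorlg00000010585\n",
    "ensgmog00000003747  ensorlg00000006947\n",
    "ensgmog00000003748  ensorlg00000004636\n",
]

for row in unique_first_field(sample):
    print(row)
```

The header line is kept as well, because its first field occurs only once; filtering on `startswith("e")` as in the attempt above would drop it.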

I think the clean way of doing this is to make a map of each entry -> list of lines.

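    entries = {}
    with open('no_dup.txt', 'r') as fh_in:
        for line in fh_in:
            # The first whitespace-separated field is the key.
            entry = line.split()[0]
            if entry in entries:
                entries[entry].append(line)
            else:
                entries[entry] = [line]

    # Iterate over the lists of lines (the values), not the (key, value) pairs.
    for matches in entries.values():
        if len(matches) == 1:
            print(matches[0].rstrip())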

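You should note that this does not preserve the order of the entries on Python versions where a plain dict is unordered (before 3.7); there the lines may be printed in a different order than they appear in the file.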
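If the input order matters regardless of Python version, a two-pass variant is one option: count how often each first field occurs, then walk the rows again in their original order. This is a sketch rather than part of the original answer; `drop_repeated_keys` and `sample` are hypothetical names, and unlike a comparison of neighbouring lines it also drops repeats that are not adjacent.

```python
from collections import Counter


def drop_repeated_keys(lines):
    """Keep, in input order, the rows whose first field occurs exactly once."""
    # Drop blank lines so that split()[0] never fails.
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    # First pass: count occurrences of each first field.
    counts = Counter(row.split()[0] for row in rows)
    # Second pass: keep rows in their original order if the key is unique.
    return [row for row in rows if counts[row.split()[0]] == 1]


# Hypothetical data with a repeated key that is NOT on adjacent lines.
sample = [
    "ensgmog00000015632  ensorlg00000010573\n",
    "ensgmog00000003747  ensorlg00000006947\n",
    "ensgmog00000015632  ensorlg00000010585\n",
    "ensgmog00000003748  ensorlg00000004636\n",
]

for row in drop_repeated_keys(sample):
    print(row)
```

To run it against the file from the question, the open file object can be passed directly, for example `drop_repeated_keys(open("no_dup.txt"))`, since the function only iterates over lines.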
