csv - How can I compare two columns in two different rows in Python?


I want to go through each line of a CSV file and check whether the first field of line 1 is the same as the first field of the next line, and so on. If there is a match, I want to ignore both lines that contain the same field and keep only the lines for which there is no match.

Here is an example dataset (no_dup.txt):

    ac_gene_id          m_gene_id
    ensgmog00000015632  ensorlg00000010573
    ensgmog00000015632  ensorlg00000010585
    ensgmog00000003747  ensorlg00000006947
    ensgmog00000003748  ensorlg00000004636

Basically, I want to exclude lines 1 and 2, since they contain the same first field (ensgmog00000015632), and keep lines 3 and 4.

Here is the code I have tried, but I couldn't finish it:

    prev = None
    with open("no_dup.txt", 'r') as fh_in:
        for line in fh_in:
            line = line.strip()
            if line.startswith("e"):
                line1 = line.split()
                print("initial gene = " + line1[0])
                if prev is not None or prev != line1[0]:
                    prev = line1[0]

I think a clean way of doing this is to make a map of each entry -> list of lines.

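    entries = {}
    with open('no_dup.txt', 'r') as fh_in:
        for line in fh_in:
            # Group lines by their first whitespace-separated field
            entry = line.split()[0]
            if entry in entries:
                entries[entry].append(line)
            else:
                entries[entry] = [line]
    # Print only the lines whose first field occurs exactly once
    # (the header line is printed too, since its first field is unique)
    for matches in entries.values():
        if len(matches) == 1:
            print(matches[0].rstrip("\n"))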

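You should note that this does not preserve the order of the entries on Python versions where a plain dict is unordered (before 3.7); collections.OrderedDict can be used there if the order matters.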
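If the original order matters, one possible variation is to count the first fields first and then filter the lines in input order. The following is only a minimal sketch in Python 3, not part of the answer above: the function name lines_with_unique_first_field and the skip_header parameter are made up for illustration, and it assumes that the first non-blank line is a header and that fields are separated by whitespace.

```python
from collections import Counter


def lines_with_unique_first_field(lines, skip_header=True):
    """Return the lines whose first field occurs exactly once, in input order."""
    # Drop blank lines and trailing newlines so split()[0] is always safe
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    if skip_header and rows:
        rows = rows[1:]  # assumption: the first non-blank line is a header
    # Count how often each first field appears
    counts = Counter(row.split()[0] for row in rows)
    # Keep only the rows whose first field was seen exactly once
    return [row for row in rows if counts[row.split()[0]] == 1]


# Sample data copied from the question
sample = [
    "ac_gene_id  m_gene_id\n",
    "ensgmog00000015632  ensorlg00000010573\n",
    "ensgmog00000015632  ensorlg00000010585\n",
    "ensgmog00000003747  ensorlg00000006947\n",
    "ensgmog00000003748  ensorlg00000004636\n",
]
kept = lines_with_unique_first_field(sample)
for row in kept:
    print(row)
```

To run it on the file instead of the sample list, the open file handle can be passed directly, for example by calling lines_with_unique_first_field(fh_in) inside a with open("no_dup.txt") as fh_in: block. Unlike comparing each line only with the previous one, counting does not require the duplicates to be adjacent.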

