linux - Diff 2 long strings and write result in 3rd file


I am working on my first bash scripts and got stuck at a point where I need the forum's help.

How can I implement the following in a shell script? (Any suggestions/pointers appreciated!)

Requirement:

Compare 2 files by a matching key, where each line is one long string containing the key, and persist in a 3rd file the long strings that differ in other attributes (say the value of user is different). Some attributes should be skipped in the comparison.

Input file1:

    aautox=y;acct=;action=c;aprice=99.975;aqty=5541;user=sam,bpl;confirm=y;key=29976dye4;dept=myna-clcd -- same
    aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=todd,chr;confirm=n;key=29976dye5;dept=myna-clcd -- diff (user=todd,chr) write in result file

Input file2:

    aautox=y;acct=;action=c;aprice=99.975;aqty=5541;user=sam,bpl;confirm=y;key=29976dye4;dept=myna-clcd -- same
    aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=alan,ncr;confirm=n;key=29976dye5;dept=myna-clcd -- diff (user=alan,ncr) write in result file
    aautox=y;acct=;action=c;aprice=17.000;aqty=6453;user=todd,chr;confirm=n;key=29976dye6;dept=myna-clcd -- no match (key) found write in result file

Output file3:

    file1:aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=todd,chr;confirm=n;key=29976dye5;dept=myna-clcd
    file2:aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=alan,ncr;confirm=n;key=29976dye5;dept=myna-clcd

    file1:
    file2: aautox=y;acct=;action=c;aprice=17.000;aqty=6453;user=todd,chr;confirm=n;key=29976dye6;dept=myna-clcd

and so on for each line that differs.

The approach I have in mind (it's a first cut, could be improved later):

  • Read file1 line by line (awk or read?). For each line:
    • a) Read the line in file2 with the matching unique "key" (which command should I use here? Can awk read a file based on a key? If I grep the key in file2, how do I break the line into fields for comparison?)
    • b) Compare each field of the file1 line with the file2 line; if they differ, write to the 3rd result file (awk breaks a line into fields $1, $2 to compare, though I'm not sure how to do that if I use the "read" command?)
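The steps above can be sketched in plain shell roughly as follows. This is only a sketch under assumptions: the file names, the trimmed sample records and the choice of "confirm" as the attribute to skip are all made up for illustration, keys are assumed to contain no regex metacharacters, and keys that exist only in file2 are not reported because the loop only walks file1.

```shell
#!/bin/sh
# Sketch only: file names, the trimmed sample data and the skipped
# attribute ("confirm") are assumptions made for illustration.
dir=$(mktemp -d)
printf '%s\n' 'user=sam,bpl;confirm=y;key=29976dye4' \
              'user=todd,chr;confirm=n;key=29976dye5' > "$dir/file1"
printf '%s\n' 'user=sam,bpl;confirm=n;key=29976dye4' \
              'user=alan,ncr;confirm=n;key=29976dye5' > "$dir/file2"

# Drop the attribute that should not take part in the comparison.
strip_skipped() {
    printf '%s\n' "$1" | sed -E 's/(^|;)confirm=[^;]*//'
}

while IFS= read -r line1; do
    # (a) extract "key=..." from the file1 line, then find that key in file2
    key=$(printf '%s\n' "$line1" | tr ';' '\n' | grep '^key=')
    line2=$(grep -E "(^|;)$key(;|\$)" "$dir/file2")
    # (b) compare what is left after removing the skipped attribute
    if [ "$(strip_skipped "$line1")" != "$(strip_skipped "$line2")" ]; then
        printf 'file1: %s\nfile2: %s\n\n' "$line1" "$line2"
    fi
done < "$dir/file1" > "$dir/file3"

cat "$dir/file3"
```

A loop like this starts several processes per input line, so it is likely to be slow on large files; doing the whole comparison in a single awk pass avoids that.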

This uses GNU awk 4.* for sorted_in (see http://www.gnu.org/software/gawk/manual/gawk.html#controlling-array-traversal); with other awks you can pipe the output to sort or otherwise determine the key order:

    $ cat tst.awk
    BEGIN { FS="[;=]" }    # split each record into alternating names and values
    {
        delete name2val
        for (i=1; i<=NF; i+=2) { name2val[$i] = $(i+1) }
        key = name2val["key"]
        keys[key]
        recs[key,FILENAME] = $0
        for (name in name2val) { vals[key,FILENAME,name] = name2val[name] }
    }
    END {
        PROCINFO["sorted_in"] = "@ind_str_asc"    # gawk only: visit keys in sorted order
        file1 = ARGV[1]
        file2 = ARGV[2]
        for (key in keys) {
            state = "same"
            if ( (key,file1) in recs ) {
                if ( (key,file2) in recs ) {
                    # name2val still holds the names of the last record read
                    for (name in name2val) {
                        if (name != "confirm") {    # attribute skipped in the comparison
                            if (vals[key,file1,name] != vals[key,file2,name]) {
                                state = "diff"
                            }
                        }
                    }
                }
                else { state = "file1_only" }
            }
            else { state = "file2_only" }

            if (state != "same") {
                print file1":", recs[key,file1]
                print file2":", recs[key,file2]
                print ""
            }
        }
    }


    $ gawk -f tst.awk file1 file2
    file1: aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=todd,chr;confirm=n;key=29976dye5;dept=myna-clcd -- diff (user=todd,chr) write in result file
    file2: aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=alan,ncr;confirm=n;key=29976dye5;dept=myna-clcd -- diff (user=alan,ncr) write in result file

    file1:
    file2: aautox=y;acct=;action=c;aprice=17.000;aqty=6453;user=todd,chr;confirm=n;key=29976dye6;dept=myna-clcd -- no match (key) found write in result file
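For awks without sorted_in, the "pipe to sort" idea mentioned above could look something like the sketch below. It is not the script from this answer but a variant written under assumptions: the sample records are trimmed, the file names are hypothetical, and records are assumed to contain no tab characters, because each differing key is emitted as one tab-separated line so that sort can order the groups before a second awk splits them back into lines.

```shell
#!/bin/sh
# Sketch only: a sorted_in-free variant for POSIX awk. Sample data is trimmed,
# file names are hypothetical, and records are assumed to contain no tabs.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf '%s\n' 'user=todd,chr;confirm=n;key=29976dye5' \
              'user=sam,bpl;confirm=y;key=29976dye4' > file1
printf '%s\n' 'user=todd,chr;confirm=n;key=29976dye6' \
              'user=alan,ncr;confirm=n;key=29976dye5' \
              'user=sam,bpl;confirm=y;key=29976dye4' > file2

awk -F'[;=]' '
{
    split("", name2val)                      # portable way to clear an array
    for (i = 1; i < NF; i += 2) name2val[$i] = $(i+1)
    key = name2val["key"]
    keys[key]
    recs[key, FILENAME] = $0
    for (name in name2val) {
        names[name]                          # every attribute name seen so far
        vals[key, FILENAME, name] = name2val[name]
    }
}
END {
    file1 = ARGV[1]; file2 = ARGV[2]
    for (key in keys) {
        # a key missing from either file always counts as a difference
        differs = !((key, file1) in recs && (key, file2) in recs)
        for (name in names)
            if (name != "confirm" && vals[key, file1, name] != vals[key, file2, name])
                differs = 1
        # one tab-separated line per key, with the key first so sort orders by it
        if (differs)
            printf "%s\t%s: %s\t%s: %s\n", key, file1, recs[key, file1], file2, recs[key, file2]
    }
}' file1 file2 |
sort |
awk -F'\t' '{ print $2; print $3; print "" }' > file3

cat file3
```

With this sample data, file3 should hold the dye5 pair first and then the dye6 record that exists only in file2, with the matching dye4 records left out.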
