linux - Diff 2 long strings and write result in 3rd file
I am working on my first bash scripts and got stuck at a point where I need the forum's help.
How can I implement the below in a shell script? (Any suggestions/pointers appreciated!)
Requirement:
Compare 2 files by matching on a key; each line is one long string. Persist in a 3rd file the long strings that differ in any other attribute (say the value of user is different). Some attributes should be skipped in the comparison.
Input file1:
aautox=y;acct=;action=c;aprice=99.975;aqty=5541;user=sam,bpl;confirm=y;key=29976dye4;dept=myna-clcd -- same
aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=todd,chr;confirm=n;key=29976dye5;dept=myna-clcd -- diff (user=todd,chr) write in result file
Input file2:
aautox=y;acct=;action=c;aprice=99.975;aqty=5541;user=sam,bpl;confirm=y;key=29976dye4;dept=myna-clcd -- same
aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=alan,ncr;confirm=n;key=29976dye5;dept=myna-clcd -- diff (user=alan,ncr) write in result file
aautox=y;acct=;action=c;aprice=17.000;aqty=6453;user=todd,chr;confirm=n;key=29976dye6;dept=myna-clcd -- no match (key) found write in result file
Output file3:
file1:aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=todd,chr;confirm=n;key=29976dye5;dept=myna-clcd
file2:aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=alan,ncr;confirm=n;key=29976dye5;dept=myna-clcd

file1:
file2: aautox=y;acct=;action=c;aprice=17.000;aqty=6453;user=todd,chr;confirm=n;key=29976dye6;dept=myna-clcd

and so on for each differing line....
Approach in mind (it's a first cut, could improve later):
- Read file1 line by line (awk or read??). For each line:
- a) Read file2 for the line matching the unique "key" (which command to use here??? awk to read the file based on the key??? grep key file2, but then how to break the line into fields for comparison??)
- b) Compare each field of file1.line1 with file2.line; if different, write to the 3rd result file (awk breaks the line into fields $1, $2 to compare, though not sure how to do that if I use the "read" command???)
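On the "how to break the line into fields" part of the approach above, here is a minimal sketch of one way to do it, assuming the records are always `name=value` pairs separated by `;`. The helper name `get_field` is hypothetical and is not part of any existing tool.

```shell
#!/bin/sh
# get_field NAME: read "name=value;name=value" records on stdin and print
# the value of attribute NAME for each record (hypothetical helper).
get_field() {
    awk -v want="$1" -F';' '
    {
        for (i = 1; i <= NF; i++) {
            # Split each "name=value" pair at the first "=" only.
            eq = index($i, "=")
            if (eq > 0 && substr($i, 1, eq - 1) == want) {
                print substr($i, eq + 1)
                break
            }
        }
    }'
}

# Example: pull the key and the user out of one record from the question.
rec='aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=todd,chr;confirm=n;key=29976dye5;dept=myna-clcd'
printf '%s\n' "$rec" | get_field key     # prints 29976dye5
printf '%s\n' "$rec" | get_field user    # prints todd,chr
```

Letting awk do the splitting avoids a shell `read` loop that would rescan file2 once per line of file1.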
This uses GNU awk 4.* for sorted_in
(see http://www.gnu.org/software/gawk/manual/gawk.html#controlling-array-traversal); with other awks you can pipe the output to sort or otherwise determine the key order:
$ cat tst.awk
BEGIN { FS="[;=]" }
{
    # Split the record into name/value pairs.
    delete name2val
    for (i=1; i<=NF; i+=2) {
        name2val[$i] = $(i+1)
    }
    key = name2val["key"]
    keys[key]
    recs[key,FILENAME] = $0
    for (name in name2val) {
        vals[key,FILENAME,name] = name2val[name]
    }
}
END {
    # Visit the keys in ascending string order (gawk 4.* only).
    PROCINFO["sorted_in"] = "@ind_str_asc"
    file1 = ARGV[1]
    file2 = ARGV[2]
    for (key in keys) {
        state = "same"
        if ( (key,file1) in recs ) {
            if ( (key,file2) in recs ) {
                for (name in name2val) {
                    # "confirm" is skipped in the comparison.
                    if (name != "confirm") {
                        if (vals[key,file1,name] != vals[key,file2,name]) {
                            state = "diff"
                        }
                    }
                }
            }
            else {
                state = "file1_only"
            }
        }
        else {
            state = "file2_only"
        }
        if (state != "same") {
            print file1":", recs[key,file1]
            print file2":", recs[key,file2]
            print ""
        }
    }
}
$ gawk -f tst.awk file1 file2
file1: aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=todd,chr;confirm=n;key=29976dye5;dept=myna-clcd -- diff (user=todd,chr) write in result file
file2: aautox=y;acct=;action=c;aprice=05.975;aqty=3451;user=alan,ncr;confirm=n;key=29976dye5;dept=myna-clcd -- diff (user=alan,ncr) write in result file

file1:
file2: aautox=y;acct=;action=c;aprice=17.000;aqty=6453;user=todd,chr;confirm=n;key=29976dye6;dept=myna-clcd -- no match (key) found write in result file
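For awks without sorted_in, the "pipe to sort" idea mentioned above could look roughly like the following. This is only a sketch under some assumptions: the function name `diff_by_key` is hypothetical, records never contain a tab character (tab is used as a temporary separator), and `confirm` is the only skipped attribute, as in the gawk script. Unlike that script, it compares every attribute name seen in either file rather than only the names from the last record read.

```shell
#!/bin/sh
# diff_by_key FILE1 FILE2: report records that differ or exist in one file only.
# Usage: diff_by_key file1 file2 > file3
diff_by_key() {
    awk -v skip="confirm" '
    BEGIN { FS = "[;=]" }
    {
        # Find the key first, then store every name/value pair under it.
        key = ""
        for (i = 1; i < NF; i += 2) if ($i == "key") key = $(i+1)
        keys[key]
        recs[key,FILENAME] = $0
        for (i = 1; i < NF; i += 2) {
            names[$i]
            vals[key,FILENAME,$i] = $(i+1)
        }
    }
    END {
        file1 = ARGV[1]; file2 = ARGV[2]
        for (key in keys) {
            differs = 0
            if (!((key,file1) in recs) || !((key,file2) in recs)) differs = 1
            else for (name in names)
                if (name != skip && vals[key,file1,name] != vals[key,file2,name])
                    differs = 1
            # One tab-separated line per reported key, so sort can order them.
            if (differs) print key "\t" recs[key,file1] "\t" recs[key,file2]
        }
    }' "$1" "$2" |
    LC_ALL=C sort |
    awk -F'\t' -v f1="$1" -v f2="$2" '{ print f1 ":", $2; print f2 ":", $3; print "" }'
}
```

Sorting happens outside awk on one line per key, and a second small awk restores the two-line `file1:` / `file2:` layout, so nothing here depends on gawk extensions.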