linux - Calculate variance in bash
I want to compute the running variance of the second column of an input text file like this one:
1, 5
2, 5
3, 5
4, 10
and I want output like:
1, 0
2, 0
3, 0
4, 4.6875
I've used this line:
awk '{c[NR]=$2; s=s+c[NR]; avg= s / NR; var=var+(($2 - avg)^2 / (NR )); print var }' inputfile > outputfile
The standard deviation formula is described at http://www.mathsisfun.com/data/standard-deviation.html
So you need to say:
for item in items: sum += [(item - average)^2]/#items
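Written out, and assuming the average is recomputed over the first n values each time a new line is read (which is what the expected output in the question implies), the value printed for line n is the population variance of those n values. The symbols below are labels introduced for this note, with x_i standing for the second field of line i:

```latex
\sigma_n^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}_n\right)^2,
\qquad
\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i
```

Because the average changes with every new line, every earlier term has to be re-evaluated against the new average; adding only the newest term to a running total is not enough.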
Doing it on your sample input:
5   av=5/1=5       var=(5-5)^2/1=0
5   av=10/2=5      var=((5-5)^2+(5-5)^2)/2=0
5   av=15/3=5      var=3*(5-5)^2/3=0
10  av=25/4=6.25   var=(3*(5-6.25)^2+(10-6.25)^2)/4=4.6875
So in awk you can say:
$ awk 'BEGIN {FS=OFS=","}      # set comma as field input/output separator
       {a[NR]=$2               # store the data in an array
        sum+=a[NR]             # keep track of the sum
        av=sum/NR              # calculate the average so far
        v=0                    # reset the variance counter
        for (i=1;i<=NR;i++)    # loop through all the values seen so far
            v+=(a[i]-av)^2     # accumulate the squared deviations
        print $1, v/NR}        # print the 1st field + the result
      ' file
Test
$ awk 'BEGIN {FS=OFS=","} {a[NR]=$2; sum+=a[NR]; av=sum/NR; v=0; for (i=1;i<=NR;i++) v+=(a[i]-av)^2; print $1, v/NR}' file
1,0
2,0
3,0
4,4.6875
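The loop above revisits every stored value on each line, so the work grows quadratically with the file length. As a possible alternative (not part of the original answer), the same population variance can be obtained in a single pass from the identity var = mean(x^2) - mean(x)^2. This is only a sketch: the function name `running_variance` is made up for illustration, and the sample data from the question is fed on stdin.

```shell
# One-pass running population variance: var = mean(x^2) - mean(x)^2.
# running_variance is a hypothetical helper name, not from the original post.
running_variance() {
  awk 'BEGIN {FS=OFS=","}
       {sum   += $2            # running sum of the values
        sumsq += $2 * $2       # running sum of the squared values
        av = sum / NR          # average so far
        print $1, sumsq / NR - av * av}'
}

# Sample input from the question
printf '1, 5\n2, 5\n3, 5\n4, 10\n' | running_variance
```

This needs no array, but subtracting two large, nearly equal numbers can lose precision in floating point, so for data with a large mean and a small spread the array-based loop is the safer choice.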