linux - Calculate variance in bash


I want to compute the variance of an input text file like this one:

1, 5
2, 5
3, 5
4, 10

and I want output like this:

1, 0
2, 0
3, 0
4, 4.6875

I've used this line:

awk '{c[NR]=$2; s=s+c[NR]; avg= s / NR; var=var+(($2 - avg)^2 / (NR )); print var }' inputfile > outputfile

The standard deviation formula (whose square is the variance) is described at http://www.mathsisfun.com/data/standard-deviation.html

So you need to say:

for item in items
    sum += [(item - average)^2]/#items

Doing it on your sample input:

5   av=5/1=5       var=(5-5)^2/1=0
5   av=10/2=5      var=((5-5)^2+(5-5)^2)/2=0
5   av=15/3=5      var=3*(5-5)^2/3=0
10  av=25/4=6.25   var=(3*(5-6.25)^2+(10-6.25)^2)/4=4.6875
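Written as a formula, each row of the walk-through above is the population variance (dividing by n, not n-1) of the first n values. The symbols x_i, \mu_n and \sigma_n^2 are notation assumed here for illustration; they do not appear in the original post.

```latex
\mu_n = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\sigma_n^2 = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \mu_n\right)^2

% Fourth row of the sample input (5, 5, 5, 10):
\mu_4 = \frac{25}{4} = 6.25,
\qquad
\sigma_4^2 = \frac{3\,(5 - 6.25)^2 + (10 - 6.25)^2}{4}
           = \frac{4.6875 + 14.0625}{4}
           = 4.6875
```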

So in awk you can say:

$ awk 'BEGIN {FS=OFS=","}      # set comma as field input/output separator
       {a[NR]=$2               # store data in array
        sum+=a[NR]             # keep track of the sum
        av=sum/NR              # calculate average so far
        v=0                    # reset counter for variance
        for (i=1;i<=NR;i++)    # loop through all the values
             v+=(a[i]-av)^2    # accumulate squared deviations
        print $1, v/NR}        # print the 1st field + result
  ' file

Test:

$ awk 'BEGIN {FS=OFS=","} {a[NR]=$2; sum+=a[NR]; av=sum/NR; v=0; for (i=1;i<=NR;i++) v+=(a[i]-av)^2; print $1, v/NR}' file
1,0
2,0
3,0
4,4.6875
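The command above re-reads every stored value on each row, so the work grows quadratically with the number of rows. As a possible alternative (not part of the original answer), the same running population variance can be sketched in a single pass using the identity var = mean(x^2) - mean(x)^2. The function name running_variance is a hypothetical helper introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: running population variance in one pass,
# using var = (sum of squares)/n - (mean)^2 instead of storing all values.
running_variance() {
    awk 'BEGIN { FS = OFS = "," }     # comma as input/output separator
         {
             n++                      # number of rows seen so far
             sum   += $2              # running sum of the values
             sumsq += $2 * $2         # running sum of the squared values
             mean = sum / n           # average so far
             print $1, sumsq / n - mean * mean
         }' "$@"
}

# Usage with the sample input from the question
printf '1, 5\n2, 5\n3, 5\n4, 10\n' | running_variance
```

This avoids the array and the inner loop, but subtracting two large, nearly equal numbers can lose precision when the values are big relative to their spread, so the two-loop version may still be the safer choice for such data.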
