I have a column of numbers in my database. How can I compute the standard deviation? I do not want to use the stddev function.
2 Answers
Just because I was curious, I decided to test a manual calculation against the actual STDEV(). I could not exactly match the built-in function.
I was close... 0.000141009220002264 or 0.00748% off.
Also, the total average and count have to be converted to float (the difference was greater with decimal).
The example below runs against my Treasury Rates table for the 10-year yield (not that it matters).
Select SQLFunction = Stdev([TR_Y10])
      ,ManualCalc  = Sqrt(Sum(Power(cast([TR_Y10] as float) - B.TotalAvg, 2) / B.TotalCnt))
      -- "Variance" here is the gap between the two results, not the statistical variance
      ,Variance    = Stdev([TR_Y10]) - Sqrt(Sum(Power(cast([TR_Y10] as float) - B.TotalAvg, 2) / B.TotalCnt))
From  [Chinrus-Shared].[dbo].[DS_Treasury_Rates]
Join  (Select TotalAvg = Avg(cast([TR_Y10] as float))
             ,TotalCnt = count(*)
       From [Chinrus-Shared].[dbo].[DS_Treasury_Rates]) B
  on 1=1
Returns
SQLFunction       ManualCalc        Variance
1.88409468982299  1.88395368060299  0.000141009220002264
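One possible explanation for the remaining gap, offered as a sketch and not verified against this table: in SQL Server, STDEV() returns the sample standard deviation (divisor n - 1), while STDEVP() returns the population standard deviation (divisor n), which is what the manual formula above computes. The reported 0.0075% relative difference is about what a factor of sqrt(n / (n - 1)) would give for a table of several thousand rows. Under that assumption, the query below should show the manual results lining up with the matching built-ins to within floating-point rounding. The table and column names are taken from the query above; the output column names are made up for illustration, and the query assumes [TR_Y10] has no NULLs and the table has more than one row.

```sql
-- Sketch: compare the manual calculations with both built-in functions.
-- Assumes no NULLs in [TR_Y10] (count(*) counts NULL rows, Stdev ignores them)
-- and at least two rows (otherwise TotalCnt - 1 is zero).
Select SampleBuiltIn     = Stdev([TR_Y10])     -- divisor n - 1
      ,PopulationBuiltIn = Stdevp([TR_Y10])    -- divisor n
      ,ManualSample      = Sqrt(Sum(Power(cast([TR_Y10] as float) - B.TotalAvg, 2) / (B.TotalCnt - 1)))
      ,ManualPopulation  = Sqrt(Sum(Power(cast([TR_Y10] as float) - B.TotalAvg, 2) / B.TotalCnt))
From  [Chinrus-Shared].[dbo].[DS_Treasury_Rates]
Cross Join (Select TotalAvg = Avg(cast([TR_Y10] as float))
                  ,TotalCnt = count(*)
            From [Chinrus-Shared].[dbo].[DS_Treasury_Rates]) B
```

The division stays inside Sum(), as in the original query, so that B.TotalCnt is only referenced within an aggregate and no GROUP BY is needed.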
Comments
The standard deviation is the square root of the variance.
The (population) variance is the sum of the squares of the differences between the average and each observed value, divided by n.
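Written out as a formula, with x̄ for the average and n for the number of rows (notation introduced here for illustration, not taken from the answer), the population standard deviation is:

```latex
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```

The sample standard deviation uses the same sum but divides by n - 1 instead of n.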
So, in most databases, you can use window functions:
select sqrt(avg(var))
from (select square(t.x - avg(t.x) over ()) as var
      from t
     ) t;
Notes:
- The square() function might have some other name (such as power()).
- The sqrt() function might have some other name.
- This is not a good way to calculate the standard deviation in general. In particular, this is a numerically unstable algorithm (it will work just fine for a finite set of ordinary numbers).
- The subquery is needed because window functions cannot be the arguments to aggregation functions.
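As a small worked illustration of the query and notes above, here is a sketch that assumes PostgreSQL syntax (a VALUES list inside a CTE, and power() in place of square()). The inline data set and the names t, x, s, and population_stddev are made up for the example. The eight values have an average of 5, and their squared deviations sum to 32, so the population standard deviation works out to sqrt(32 / 8) = 2.

```sql
-- Sketch, assuming PostgreSQL: population standard deviation without stddev().
with t(x) as (
    values (2.0), (4.0), (4.0), (4.0), (5.0), (5.0), (7.0), (9.0)
)
select sqrt(avg(var)) as population_stddev
from (
    -- the window function computes the overall average once per row;
    -- power(..., 2) squares each deviation from that average
    select power(x - avg(x) over (), 2) as var
    from t
) s;
```

For the sample standard deviation, the outer expression would presumably become sqrt(sum(var) / (count(*) - 1)), which needs at least two rows.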
stddev function? And which DBMS are you using? Postgres? Oracle?
GROUP BY unless your group is the entire SELECTed set. You've got aggregate functions, e.g., SUM and COUNT