I have some data in a sql database and I\'d like to calculate the slope. The data has this layout:
Date | Keyword | Score
2012-01-10 | ipad
The cleanest one I could make:
SELECT
Scores.Date, Scores.Keyword, Scores.Score,
(N * Sum_XY - Sum_X * Sum_Y)/(N * Sum_X2 - Sum_X * Sum_X) AS Slope
FROM Scores
INNER JOIN (
SELECT
Keyword,
COUNT(*) AS N,
SUM(CAST(Date as float)) AS Sum_X,
SUM(CAST(Date as float) * CAST(Date as float)) AS Sum_X2,
SUM(Score) AS Sum_Y,
SUM(Score*Score) AS Sum_Y2,
SUM(CAST(Date as float) * Score) AS Sum_XY
FROM Scores
GROUP BY Keyword
) G ON G.Keyword = Scores.Keyword;
It uses Simple Linear Regression to calculate the slope.
Result:
Date Keyword Score Slope
2012-01-22 water bottle 0,010885442 0,000334784345222076
2012-01-23 water bottle 0,011203949 0,000334784345222076
2012-01-24 water bottle 0,008460835 0,000334784345222076
2012-01-25 water bottle 0,010363991 0,000334784345222076
2012-01-26 water bottle 0,011800716 0,000334784345222076
2012-01-27 water bottle 0,012948411 0,000334784345222076
2012-01-28 water bottle 0,012732459 0,000334784345222076
2012-01-29 water bottle 0,011682568 0,000334784345222076
Every database system seems to have a different approach to converting dates to numbers:
TO_SECONDS(date) or TO_DAYS(date)TO_NUMBER(TO_CHAR(date, 'J')) or date - TO_DATE('1','yyyy')CAST(date AS float) (or equivalent CONVERT)