Question
I have a simple table called LOGENTRY with fields called "DATE" and "COST". Example:
+--------------+-------+
| DATE | COST |
+--------------+-------+
| MAY 1 2013 | 0.8 |
| SEP 1 2013 | 0.4 |
| NOV 1 2013 | 0.6 |
| DEC 1 2013 | 0.2 |
+--------------+-------+
I would like to find the slope of the COST field over time (across a selected range of rows). For the rows above the result should be SLOPE=-0.00216 per day. This is equivalent to Excel's SLOPE function, i.e. simple linear regression.
Is there a simple way to SELECT the slope of COST? If I do the math in the calling language (PHP), I can find the slope as:
SLOPE = (N * Sum_XY - Sum_X * Sum_Y)/(N * Sum_X2 - Sum_X * Sum_X);
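As a sanity check on the formula, here is a minimal sketch in Python (not PHP) that applies it to the four sample rows. It assumes X is measured in whole days, using `date.toordinal()` as a stand-in for Excel's date serial numbers; the variable names are illustrative only.

```python
from datetime import date

# The four sample rows from the table above: (DATE, COST).
rows = [
    (date(2013, 5, 1), 0.8),
    (date(2013, 9, 1), 0.4),
    (date(2013, 11, 1), 0.6),
    (date(2013, 12, 1), 0.2),
]

# Assumption: X is a day count (proleptic Gregorian ordinal), Y is COST.
xs = [d.toordinal() for d, _ in rows]
ys = [cost for _, cost in rows]

n = len(rows)
sum_x = sum(xs)
sum_y = sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))

# Same expression as the SLOPE formula above.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x * sum_x)
print(f"{slope:.5f}")  # prints -0.00216
```

Because the slope depends only on differences between X values, any day-based numbering of the dates gives the same result.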
I saw some similar questions posted, but they are more complex. I'm trying to strip this example down to the simplest case so I can understand the answer :) Here's as close as I got, but MySQL complains about the syntax near: 'float)) AS Sum_X, SUM(CAST(LOGENTRY.DATE as float) * CAST(LOGENTRY.DATE'
SELECT
COUNT( * ) AS N,
SUM( CAST( LOGENTRY.DATE AS FLOAT ) ) AS Sum_X,
SUM( CAST( LOGENTRY.DATE AS FLOAT ) * CAST( LOGENTRY.DATE AS FLOAT ) ) AS Sum_X2,
SUM( LOGENTRY.COST ) AS Sum_Y, SUM( LOGENTRY.COST * LOGENTRY.COST ) AS Sum_Y2,
SUM( CAST( LOGENTRY.DATE AS FLOAT ) * LOGENTRY.COST ) AS Sum_XY
FROM LOGENTRY
Answer 1:
It seems that MySQL cannot cast a date to FLOAT the way the other examples on Stack Overflow do; perhaps those examples target a different database. By converting the dates to Unix timestamps with UNIX_TIMESTAMP instead, I am able to get the sums and then do the final calculation in PHP. If this is wrong, please post and I will remove this answer.
SELECT
COUNT(*) AS N,
SUM(UNIX_TIMESTAMP(LOGENTRY.DATE)) AS Sum_X,
SUM(UNIX_TIMESTAMP(LOGENTRY.DATE) * UNIX_TIMESTAMP(LOGENTRY.DATE)) AS Sum_X2,
SUM(LOGENTRY.COST) AS Sum_Y,
SUM(LOGENTRY.COST*LOGENTRY.COST) AS Sum_Y2,
SUM(UNIX_TIMESTAMP(LOGENTRY.DATE) * LOGENTRY.COST) AS Sum_XY
FROM LOGENTRY
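The final step can be sketched as follows, in Python here rather than PHP. The function name `slope_from_sums` and the in-memory `rows` list are hypothetical stand-ins for the single row returned by the query above. The timestamps are computed in UTC, whereas UNIX_TIMESTAMP uses the MySQL session time zone, so results can differ slightly around daylight-saving changes. Since UNIX_TIMESTAMP counts seconds, the raw slope is cost per second; multiplying by 86400 converts it to cost per day, which is the unit Excel's SLOPE uses for dates.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def slope_from_sums(n, sum_x, sum_x2, sum_y, sum_xy):
    """Least-squares slope from the aggregates the query returns."""
    denominator = n * sum_x2 - sum_x * sum_x
    if denominator == 0:
        # Happens with fewer than two rows or when all dates are equal.
        raise ValueError("slope is undefined: need at least two distinct dates")
    return (n * sum_xy - sum_x * sum_y) / denominator


# Hypothetical stand-in for the LOGENTRY rows (assumption: UTC midnight).
rows = [("2013-05-01", 0.8), ("2013-09-01", 0.4),
        ("2013-11-01", 0.6), ("2013-12-01", 0.2)]
xs = [
    int(datetime.strptime(d, "%Y-%m-%d").replace(tzinfo=timezone.utc).timestamp())
    for d, _ in rows
]
ys = [cost for _, cost in rows]

# These mirror the columns N, Sum_X, Sum_X2, Sum_Y, Sum_XY of the query.
sums = {
    "n": len(rows),
    "sum_x": sum(xs),
    "sum_x2": sum(x * x for x in xs),
    "sum_y": sum(ys),
    "sum_xy": sum(x * y for x, y in zip(xs, ys)),
}

slope_per_second = slope_from_sums(**sums)
slope_per_day = slope_per_second * SECONDS_PER_DAY
print(f"{slope_per_day:.5f}")  # prints -0.00216
```

Sum_Y2 is not needed for the slope itself; it would only matter for something like the correlation coefficient.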
Source: https://stackoverflow.com/questions/14899823/mysql-slope-trend-of-single-field-line-of-best-fit