How to improve wind data SQL query performance

自作多情 提交于 2019-12-25 02:57:55

问题


I'm looking for help on how to optimize (if possible) the performance of a SQL query used for reading wind information (see below) by changing the e.g. the database structure, query or something else?

I use a hosted database to store a table with more than 800,000 rows with wind information (speed and direction). New data is added each minute from an anemometer. The database is accessed using a PHP script which creates a web page for plotting the data using Google's visualization API.

The web page takes approximately 15 seconds to load. I've added some time measurements in both the PHP and Javascript part to profile the code and find possible areas for improvements.

One part where I hope to improve is the following query which takes approximately 4 seconds to execute. The purpose of the query is to group 15 minutes of wind speed (min/max/mean) and calculate the mean value and total min/max during this period of measurements.

SELECT  AVG(d_mean) AS group_mean, 
        MAX(d_max) as group_max, 
        MIN(d_min) AS
        group_min, 
        dir, 
        FROM_UNIXTIME(MAX(dt),'%Y-%m-%d %H:%i') AS group_dt 
FROM    (
    SELECT  @i:=@i+1, 
            FLOOR(@i/15) AS group_id, 
            CAST(mean AS DECIMAL(3,1)) AS d_mean, 
            CAST(min AS DECIMAL(3,1)) AS d_min, 
            CAST(max AS DECIMAL(3,1)) AS d_max, 
            dir, 
            UNIX_TIMESTAMP(STR_TO_DATE(dt, '%Y-%m-%d %H:%i')) AS dt 
            FROM table, (SELECT @i:=-1) VAR_INIT 
            ORDER BY id DESC
) AS T 
GROUP BY group_id
LIMIT 0, 360

...

$oResult = mysql_query($sSQL);

The table has the following structure:

1   ID      int(11)     AUTO_INCREMENT
2   mean    varchar(5)  utf8_general_ci
3   max     varchar(5)  utf8_general_ci
4   min     varchar(5)  utf8_general_ci
5   dt      varchar(20) utf8_general_ci    // Date and time
6   dir     varchar(5)  utf8_general_ci

The following setup is used:

  • Database: MariaDB, 5.5.42-MariaDB-1~wheezy
  • Database client version: libmysql - 5.1.66
  • PHP version: 5.6
  • PHP extension: mysqli

回答1:


I strongly agree with the comments so far -- Cleanse the data as you put it into the table.

Once you have done the cleansing, let's avoid the subquery by doing...

SELECT  MIN(dt) as 'Start of 15 mins',
        FORMAT(AVG(mean), 1) as 'Avg wind speed',
        ...
    FROM table
    GROUP BY FLOOR(UNIX_TIMESTAMP(dt) / 900)
    ORDER BY FLOOR(UNIX_TIMESTAMP(dt) / 900);

I don't understand the purpose of the LIMIT. I'll guess that you want to a few days at a time. For that, I recommend you add (after cleansing) between the FROM and the GROUP BY.

    WHERE dt >= '2015-04-10'
      AND dt  < '2015-04-10' + INTERVAL 7 DAY

That would show 7 days, starting '2015-04-10' morning.

In order to handle a table of 800K, you would decidedly need (again, after cleansing):

INDEX(dt)

To cleanse the 800K rows, there are multiple approaches. I suggest creating a new table, copy the data in, test, and eventually swap over. Something like...

CREATE TABLE new (
    dt DATETIME, 
    mean FLOAT,
    ...
    PRIMARY KEY(dt)  -- assuming you have only one row per minute?
) ENGINE=InnoDB;

INSERT INTO new (dt, mean, ...)
    SELECT str_to_date(...),
           mean, -- I suspect that the CAST is not needed
           ...;

Write the new select and test it.

By now new is missing the newer rows. You can either rebuild it and hope to finish everything in your one minute window, or play some other game. Let us know if you want help there.



来源:https://stackoverflow.com/questions/29806192/how-to-improve-wind-data-sql-query-performance

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!