Calculate number of concurrent events in SQL

攒了一身酷 2020-12-15 07:52

I have a table that holds phone calls, with the following fields:

  • ID
  • STARTTIME
  • ENDTIME
  • STATUS
  • CALL_FROM
  • CALL_TO
4 Answers
  •  萌比男神i
    2020-12-15 08:25

    I'm assuming that you want to know the number of active calls at any given time. The other answers tell you how many other calls were active while the current call was active, which can give very high numbers for very long calls. One of your comments on the other answers suggested that the number of active calls is what you are after (I also work in telecom). Unfortunately, I don't have enough reputation to comment on that answer yet, as I created my account to answer this question.

    To get the number of active calls, you can use a variable that increases by one when a call starts and decreases by one when it ends. I have tested this on a MySQL database with 50+ million calls. Sorry about any syntax differences between MySQL and pgsql.

    I added temporary tables for speed, but with only 2 million rows and indexes they may not be needed. MySQL cannot reference the same temporary table twice in one query, so I had to create two.

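    -- Copy one day of calls into a temporary table.
    CREATE TEMPORARY TABLE a
    SELECT sid, StartTime, EndTime
    FROM calls_nov
    WHERE StartTime BETWEEN '2011-11-02' AND '2011-11-03';

    -- Second copy, because MySQL cannot use one temporary table twice in a query.
    CREATE TEMPORARY TABLE b
    SELECT *
    FROM a;

    -- Running counter of active calls.
    SET @i := 0;

    SELECT *, @i := @i + c.delta AS concurrent
    FROM (
      -- One +1 event per call start and one -1 event per call end, sorted by time.
      SELECT StartTime AS time, 1 AS delta
      FROM a
      UNION ALL
      SELECT EndTime AS time, -1 AS delta
      FROM b
      ORDER BY time
    ) AS c
    ORDER BY concurrent DESC;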
    

    The inner SELECT returns two columns. The time column holds every StartTime and every EndTime from the original table (so twice the number of rows), and the delta column is +1 or -1 depending on which of the two was put in time. This set is ordered by time, so the outer SELECT can iterate through it and keep a running total.
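    To make the event list and running counter concrete, here is a minimal sketch of the same idea outside the database. The function name `concurrent_counts` and the sample calls are made up for illustration, and sorting end events before start events at the same instant is my own assumption, not something the query above guarantees.

```python
from itertools import accumulate


def concurrent_counts(calls):
    """Return [(time, active_calls), ...] for an iterable of (start, end) pairs."""
    # One +1 event per call start and one -1 event per call end.
    events = [(start, 1) for start, _ in calls] + [(end, -1) for _, end in calls]
    # Sort by time. At equal times -1 sorts before +1, so a call that ends
    # exactly when another starts is not counted as overlapping (assumption).
    events.sort()
    # Running sum of the deltas is the number of active calls after each event.
    running = accumulate(delta for _, delta in events)
    return [(time, active) for (time, _), active in zip(events, running)]


# Hypothetical sample data: three overlapping calls.
calls = [("10:00", "10:30"), ("10:10", "10:20"), ("10:15", "10:45")]
timeline = concurrent_counts(calls)
peak = max(active for _, active in timeline)
```

    Each row of `timeline` plays the role of one row of the query result: the event time and the counter value after that event.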

    Instead of "ORDER BY concurrent DESC" as in your query, I would wrap this in an additional outer SELECT, where I could get MAX, MIN etc. values and also GROUP BY date, hour etc. I did not actually test the ORDER BY concurrent DESC part of the query. I used my own suggestion with an additional outer query, because ORDER BY does not behave as expected in MySQL when ordering by a variable that is set in the same SELECT: it orders by the previous value of the variable instead. If you absolutely need to order by concurrent calls (and pgsql has the same problem), I believe you could get around this by doing the ordering in an additional outer SELECT.
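    A rough sketch of what such an outer query could look like, grouping the peak by hour. This is not the query I ran: it swaps the user variable for a window function (SUM ... OVER), which as far as I know is available in PostgreSQL, MySQL 8.0+ and SQLite 3.25+. It runs against an in-memory SQLite database through Python's sqlite3 module only so that it is self-contained; the table `calls`, the aliases `ts` and `hour_bucket`, and the sample rows are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls (id INTEGER PRIMARY KEY, starttime TEXT, endtime TEXT)"
)
# Hypothetical sample rows, not real call data.
conn.executemany(
    "INSERT INTO calls (starttime, endtime) VALUES (?, ?)",
    [
        ("2011-11-02 10:00:00", "2011-11-02 10:30:00"),
        ("2011-11-02 10:10:00", "2011-11-02 10:20:00"),
        ("2011-11-02 10:15:00", "2011-11-02 11:05:00"),
        ("2011-11-02 11:00:00", "2011-11-02 11:10:00"),
    ],
)

query = """
SELECT strftime('%Y-%m-%d %H', ts) AS hour_bucket,
       MAX(concurrent)             AS peak
FROM (
    -- Running total of the deltas, ordered by event time (ends before starts on ties).
    SELECT ts,
           SUM(delta) OVER (
               ORDER BY ts, delta
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS concurrent
    FROM (
        SELECT starttime AS ts, 1 AS delta FROM calls
        UNION ALL
        SELECT endtime AS ts, -1 AS delta FROM calls
    ) AS events
) AS running
GROUP BY hour_bucket
ORDER BY peak DESC
"""
peaks = conn.execute(query).fetchall()
```

    Because the ordering happens in the outermost SELECT, on an already computed column, it does not depend on the order in which a variable gets assigned.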

    The query I ran was very fast. It scans each temporary table once, then the combination of the two once (with less data per row), and in my own version with an additional outer query it scans the combination once more and groups it. No table is scanned more than once. This will all be done in RAM if your configuration and hardware allow it. Other answers (or questions) will help you if they do not.
