Complex SUM from multiple tables

北战南征 submitted on 2019-12-12 01:27:36

Question


Here are my tables:

CREATE TABLE component
                            (id INTEGER PRIMARY KEY AUTOINCREMENT,
                            name TEXT UNIQUE);

CREATE TABLE file
                            (id INTEGER PRIMARY KEY AUTOINCREMENT,
                            component_id INTEGER,
                            name TEXT UNIQUE);

CREATE TABLE function
                            (id INTEGER PRIMARY KEY AUTOINCREMENT,
                            file_id INTEGER,
                            name TEXT,
                            FOREIGN KEY(file_id) REFERENCES file(id),
                            UNIQUE(file_id, name));

CREATE TABLE version
                            (id INTEGER PRIMARY KEY AUTOINCREMENT,
                            version TEXT UNIQUE);

CREATE TABLE data
                            (id INTEGER PRIMARY KEY AUTOINCREMENT,
                            file_id INTEGER,
                            version_id INTEGER,
                            function_id INTEGER,
                            errors INTEGER,
                            ...,
                            FOREIGN KEY(file_id) REFERENCES file(id),
                            FOREIGN KEY(version_id) REFERENCES version(id),
                            FOREIGN KEY(function_id) REFERENCES function(id),
                            UNIQUE(file_id, version_id, function_id));

I need two queries:

  • One to SUM data.errors for a single file: given a file id, I need the total of all its errors.
  • One to SUM data.errors for all functions in all files inside a specific component.

In both cases, ALL of the data.errors rows MUST belong to the most recent version_id.

Example of the version MAX requirement above:

DATA
id  file_id  version_id  function_id  errors
1   1        3           1            40
2   1        3           2            231
3   1        2           3            19

Here I need the query to return ids 1 and 2 and disregard id 3. Version 2 is the most recent version for that specific function, but it does not match the most recent version among the functions belonging to that file. Imagine a real-world scenario where a function is removed from a file in a new version.

The only requirement is that the query be as fast as possible. The database schema and constraints should change as little as possible (preferably not at all). If this can be done in the Django ORM, where I intend to use it, that would be great, but it is not required.


Answer 1:


The most recent version of a file can be computed like this:

SELECT MAX(version_id)
FROM data
WHERE file_id = ?

This can simply be plugged into another query to get the sum:

SELECT SUM(errors)
FROM data
WHERE file_id = ?
  AND version_id = (SELECT MAX(version_id)
                    FROM data
                    WHERE file_id = ?)
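
As a quick check, the file-level query can be run against the example rows from the question. The following is a minimal sketch using Python's built-in sqlite3 module; the helper name file_error_sum, the in-memory database, and the trimmed-down data table (only the columns the query touches) are assumptions made for illustration, not part of the original schema or answer.

```python
import sqlite3

# In-memory database holding only the columns the query needs (illustrative).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    file_id INTEGER,
    version_id INTEGER,
    function_id INTEGER,
    errors INTEGER,
    UNIQUE(file_id, version_id, function_id)
);
-- The three example rows from the question.
INSERT INTO data (file_id, version_id, function_id, errors) VALUES
    (1, 3, 1, 40),
    (1, 3, 2, 231),
    (1, 2, 3, 19);
""")

# Same query as above, with a named parameter so the file id is passed once.
FILE_SUM_SQL = """
SELECT SUM(errors)
FROM data
WHERE file_id = :file_id
  AND version_id = (SELECT MAX(version_id)
                    FROM data
                    WHERE file_id = :file_id)
"""


def file_error_sum(conn, file_id):
    """Sum of errors for the file's highest version_id (hypothetical helper)."""
    row = conn.execute(FILE_SUM_SQL, {"file_id": file_id}).fetchone()
    return row[0]  # SUM() yields NULL (None in Python) when no rows match


print(file_error_sum(conn, 1))  # 40 + 231 = 271; the version 2 row is ignored
```

SQLite creates an index for the UNIQUE(file_id, version_id, function_id) constraint, so the MAX(version_id) lookup for one file_id should usually be served from that index without schema changes; running EXPLAIN QUERY PLAN on the real database is the way to confirm this.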

To extend this to a component, another subquery is needed to look up the component's files:

SELECT SUM(errors)
FROM data
WHERE file_id IN (SELECT id
                  FROM file
                  WHERE component_id = ?)
  AND version_id = (SELECT MAX(version_id)
                    FROM data
                    WHERE file_id IN (SELECT id
                                      FROM file
                                      WHERE component_id = ?))
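
The component-level query can be exercised the same way. This is a sketch under assumptions: the sample rows, the file names, and the helper name component_error_sum are invented for illustration. Note that, as written, the query takes a single MAX(version_id) across all files of the component, so a file whose newest rows carry a lower version_id contributes nothing to the sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    component_id INTEGER,
    name TEXT UNIQUE
);
CREATE TABLE data (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    file_id INTEGER,
    version_id INTEGER,
    function_id INTEGER,
    errors INTEGER,
    UNIQUE(file_id, version_id, function_id)
);
-- Hypothetical sample data: files 1 and 2 are in component 1, file 3 in component 2.
INSERT INTO file (id, component_id, name) VALUES
    (1, 1, 'a.c'),
    (2, 1, 'b.c'),
    (3, 2, 'c.c');
INSERT INTO data (file_id, version_id, function_id, errors) VALUES
    (1, 3, 1, 40),
    (1, 3, 2, 231),
    (1, 2, 3, 19),
    (2, 3, 4, 5),
    (2, 2, 4, 7),
    (3, 4, 5, 100);
""")

# Same query as above, with a named parameter so the component id is passed once.
COMPONENT_SUM_SQL = """
SELECT SUM(errors)
FROM data
WHERE file_id IN (SELECT id
                  FROM file
                  WHERE component_id = :component_id)
  AND version_id = (SELECT MAX(version_id)
                    FROM data
                    WHERE file_id IN (SELECT id
                                      FROM file
                                      WHERE component_id = :component_id))
"""


def component_error_sum(conn, component_id):
    """Sum of errors at the component-wide highest version_id (hypothetical helper)."""
    row = conn.execute(COMPONENT_SUM_SQL, {"component_id": component_id}).fetchone()
    return row[0]  # None when the component has no data rows


print(component_error_sum(conn, 1))  # 40 + 231 + 5 = 276 (only version 3 rows)
print(component_error_sum(conn, 2))  # 100
```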


Source: https://stackoverflow.com/questions/24719716/complex-sum-from-multiple-tables
