Fully matching sets of records of two many-to-many tables

Posted by 隐身守侯 on 2019-12-22 07:58:13

Question


I have Users, Positions and Licenses.

Relations are:

  • users may have many licenses
  • positions may require many licenses

So I can easily get the license requirements per position as well as the effective licenses per user.

But what would be the best way to match the two sets? The logic is that a user needs at least the licenses required by a given position. The user may hold more, but the extra ones are not relevant.

I would like a result set that pairs each user with the positions they are eligible for:

PersonID PositionID
1        1          -> user 1 is eligible to work on position 1
1        2          -> user 1 is eligible to work on position 2
2        1          -> user 2 is eligible to work on position 1
3        2          -> user 3 is eligible to work on position 2
4        ...

As you can see, I need a result for all users at once. Handling a single user per call would be much easier, but that is not what I need.


There are actually 5 tables here:

create table Person ( PersonID, ...)
create table Position (PositionID, ...)
create table License (LicenseID, ...)

and relations

create table PersonLicense (PersonID, LicenseID, ...)
create table PositionLicense (PositionID, LicenseID, ...)
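To experiment with the matching problem, the two relation tables can be filled with a handful of rows. The following is only a sketch: the INT column types, the primary keys and the sample values are assumptions and not part of the original schema.

```sql
-- Hypothetical sample data; column types and keys are assumed.
CREATE TABLE PersonLicense (
    PersonID  INT NOT NULL,
    LicenseID INT NOT NULL,
    PRIMARY KEY (PersonID, LicenseID)
);

CREATE TABLE PositionLicense (
    PositionID INT NOT NULL,
    LicenseID  INT NOT NULL,
    PRIMARY KEY (PositionID, LicenseID)
);

-- User 1 holds licenses 1, 2, 3; user 2 holds 1; user 3 holds 2, 3; user 4 holds 2.
INSERT INTO PersonLicense (PersonID, LicenseID) VALUES
    (1, 1), (1, 2), (1, 3),
    (2, 1),
    (3, 2), (3, 3),
    (4, 2);

-- Position 1 requires license 1; position 2 requires licenses 2 and 3.
INSERT INTO PositionLicense (PositionID, LicenseID) VALUES
    (1, 1),
    (2, 2), (2, 3);
```

With these rows, users 1 and 2 qualify for position 1, users 1 and 3 qualify for position 2, and user 4 qualifies for nothing.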

So basically I need to find the positions that each person is licensed to work on. The real problem is more complex because other factors are involved, but the main objective is the same:

How do I match multiple records of one relation table to multiple records of the other? This could also be described as an inner join per set of records rather than per single record, as is usually done in TSQL.

I'm thinking of TSQL language constructs:

  • rowsets, although I've never used them and don't know how they would apply here
  • intersect statements, although these probably only work over whole sets and not over groups
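For what it's worth, EXCEPT (the counterpart of INTERSECT) can be applied per group when it sits inside a correlated subquery: a user qualifies for a position when "required licenses minus held licenses" is an empty set. A minimal sketch, assuming only the PersonLicense and PositionLicense tables described above; it is not taken from any of the answers and the aliases are illustrative:

```sql
-- Candidate users and positions are derived from the relation tables (an assumption),
-- so users without any license and positions without requirements do not appear.
SELECT per.PersonID, pos.PositionID
FROM (SELECT DISTINCT PersonID FROM PersonLicense) AS per
CROSS JOIN (SELECT DISTINCT PositionID FROM PositionLicense) AS pos
WHERE NOT EXISTS (
    -- Licenses the position requires ...
    SELECT pl.LicenseID
    FROM PositionLicense AS pl
    WHERE pl.PositionID = pos.PositionID
    EXCEPT
    -- ... minus the licenses the user holds.
    SELECT ul.LicenseID
    FROM PersonLicense AS ul
    WHERE ul.PersonID = per.PersonID
)
ORDER BY per.PersonID, pos.PositionID;
```

This form states the "at least the required licenses" rule directly, at the price of evaluating the subquery for every user and position pair.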

Answer 1:


Final solution (for future reference)

While you fellow developers were answering my question, I came up with a solution of my own. It uses CTEs and partitioning, both of which are available on SQL Server 2008 R2. I had never used result partitioning before, so I had to learn something new (which is a plus). Here's the code:

with CTEPositionLicense as (
    select
        PositionID,
        LicenseID,
        checksum_agg(LicenseID) over (partition by PositionID) as RequiredHash
    from PositionLicense
)
select per.PersonID, pos.PositionID
from CTEPositionLicense pos
    join PersonLicense per
    on (per.LicenseID = pos.LicenseID)
group by pos.PositionID, pos.RequiredHash, per.PersonID
having pos.RequiredHash = checksum_agg(per.LicenseID)
order by per.PersonID, pos.PositionID;
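One caveat: a checksum is a hash, so two different license sets can in principle produce the same value. The sketch below keeps the same query shape but compares counts instead of checksums. It is a variant, not the author's original method, and it assumes that (PositionID, LicenseID) and (PersonID, LicenseID) pairs are unique; RequiredCount is an illustrative name.

```sql
WITH CTEPositionLicense AS (
    SELECT
        PositionID,
        LicenseID,
        -- Number of licenses each position requires, repeated on every row.
        COUNT(*) OVER (PARTITION BY PositionID) AS RequiredCount
    FROM PositionLicense
)
SELECT per.PersonID, pos.PositionID
FROM CTEPositionLicense AS pos
    JOIN PersonLicense AS per
    ON per.LicenseID = pos.LicenseID
GROUP BY pos.PositionID, pos.RequiredCount, per.PersonID
-- The join only yields required licenses the person holds,
-- so an equal count means every required license is covered.
HAVING COUNT(*) = pos.RequiredCount
ORDER BY per.PersonID, pos.PositionID;
```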

I then compared the three techniques, which I named as follows:

  1. Cross join (by Andriy M)
  2. Table variable (by Petar Ivanov)
  3. Checksum - this one here (by Robert Koritnik, me)

Mine already orders the results by person and position, so I added the same ordering to the other two so that all three return identical results.

Resulting estimated execution plan

  1. Checksum: 7%
  2. Table variable: 2% (table creation) + 9% (execution) = 11%
  3. Cross join: 82%

I also changed the Table variable version into a CTE version (a CTE replaces the table variable), removed the order by at the end, and compared the estimated execution plans. Just for reference: the CTE version came in at 43%, while the original version had 53% (10% + 43%).




Answer 2:


One way to write this efficiently is to join PositionLicense with PersonLicense on LicenseID. Then count the matched rows grouped by position and person, and compare that with the count of all licenses for the position. If the two are equal, then that person qualifies:

-- Number of licenses each position requires
DECLARE @tmp TABLE (PositionID INT, LicenseCount INT);

INSERT INTO @tmp (PositionID, LicenseCount)
SELECT  PositionID,
        COUNT(*) AS LicenseCount
FROM PositionLicense
GROUP BY PositionID;

-- Keep the (person, position) pairs whose matched licenses cover every required one
SELECT  per.PersonID, pos.PositionID
FROM    PositionLicense AS pos
INNER JOIN PersonLicense AS per ON (pos.LicenseID = per.LicenseID)
GROUP BY pos.PositionID, per.PersonID
HAVING COUNT(*) = (
    SELECT t.LicenseCount FROM @tmp AS t WHERE t.PositionID = pos.PositionID
);



Answer 3:


I would approach the problem like this:

  1. Get all the (distinct) users from PersonLicense.

  2. Cross join them with PositionLicense.

  3. Left join the resulting set with PersonLicense using PersonID and LicenseID.

  4. Group the results by PersonID and PositionID.

  5. Filter out those (PersonID, PositionID) pairs where the number of licenses in PositionLicense does not match the number of those in PersonLicense.

And here's my implementation:

SELECT
  u.PersonID,
  pl.PositionID
FROM (SELECT DISTINCT PersonID FROM PersonLicense) u
  CROSS JOIN PositionLicense pl
  LEFT JOIN PersonLicense ul ON u.PersonID = ul.PersonID
                            AND pl.LicenseID = ul.LicenseID
GROUP BY
  u.PersonID,
  pl.PositionID
HAVING COUNT(pl.LicenseID) = COUNT(ul.LicenseID)
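The HAVING clause relies on COUNT(column) skipping NULLs: after the left join, ul.LicenseID is NULL for every required license the person lacks, so the two counts only agree when nothing is missing. A tiny standalone illustration of that behaviour, using made-up values and illustrative column names:

```sql
-- COUNT(*) counts rows; COUNT(x) counts only the rows where x is not NULL.
SELECT
    COUNT(*)   AS AllRows,      -- 3
    COUNT(v.x) AS NonNullRows   -- 2
FROM (VALUES (1), (NULL), (3)) AS v(x);
```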


Source: https://stackoverflow.com/questions/6725635/fully-matching-sets-of-records-of-two-many-to-many-tables
