I've inherited a database and I'm having trouble constructing a working SQL query.
Suppose this is the data:

[Products]

| Id | DisplayId | Version | Company | Description |
|----|-----------|---------|---------|-------------|
| 1  | 12345     | 0       | 16      | Random      |
| 2  | 12345     | 0       | 2       | Random 2    |
| 3  | AB123     | 0       | 1       | Random 3    |
| 4  | 12345     | 1       | 16      | Random 4    |
| 5  | 12345     | 1       | 2       | Random 5    |
| 6  | AB123     | 0       | 5       | Random 6    |
| 7  | 12345     | 2       | 16      | Random 7    |
| 8  | XX45      | 0       | 5       | Random 8    |
| 9  | XX45      | 0       | 7       | Random 9    |
| 10 | XX45      | 1       | 5       | Random 10   |
| 11 | XX45      | 1       | 7       | Random 11   |

[Companies]

| Id | Code  |
|----|-------|
| 1  | 'ABC' |
| 2  | '456' |
| 5  | 'XYZ' |
| 7  | 'XYZ' |
| 16 | '456' |

The `Version` column is a version number. Higher numbers indicate more recent versions. The `Company` column is a foreign key referencing the `Companies` table on the `Id` column. There's another table called `ProductData` with a `ProductId` column referencing `Products.Id`.

Now I need to find duplicates based on the `DisplayId` and the corresponding `Companies.Code`. The `ProductData` table should be joined to show a title (`ProductData.Title`), and only the most recent version of each duplicate entry should be included in the results. So the expected results are:

| Id | DisplayId | Version | Company | Description | ProductData.Title |
|----|-----------|---------|---------|-------------|-------------------|
| 5  | 12345     | 1       | 2       | Random 5    | Title 5           |
| 7  | 12345     | 2       | 16      | Random 7    | Title 7           |
| 10 | XX45      | 1       | 5       | Random 10   | Title 10          |
| 11 | XX45      | 1       | 7       | Random 11   | Title 11          |

- because XX45 has 2 "entries": one with Company 5 and one with Company 7, but both companies share the same code.
- because 12345 has 2 "entries": one with Company 2 and one with Company 16, but both companies share the same code. Note that the most recent version differs between the two (version 2 for company 16's entry and version 1 for company 2's entry).
- AB123 should not be included, as its 2 entries have different company codes.
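
For reference, the rules above can be sketched as a single query. This is only a sketch under assumptions, not a known-good solution for the real database: it loads the sample rows into an in-memory SQLite database through Python's `sqlite3` module, the column types are guessed, the `ProductData` rows are invented placeholders (that table's contents are not shown here), and an "entry" is taken to mean one `DisplayId` + `Company` pair. Other database engines may need slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Column types are assumptions; only the column names come from the post.
CREATE TABLE Companies (Id INTEGER PRIMARY KEY, Code TEXT);
CREATE TABLE Products (
    Id INTEGER PRIMARY KEY,
    DisplayId TEXT,
    Version INTEGER,
    Company INTEGER REFERENCES Companies(Id),
    Description TEXT
);
CREATE TABLE ProductData (ProductId INTEGER REFERENCES Products(Id), Title TEXT);

INSERT INTO Companies VALUES (1,'ABC'),(2,'456'),(5,'XYZ'),(7,'XYZ'),(16,'456');
INSERT INTO Products VALUES
    (1,'12345',0,16,'Random'),    (2,'12345',0,2,'Random 2'),
    (3,'AB123',0,1,'Random 3'),   (4,'12345',1,16,'Random 4'),
    (5,'12345',1,2,'Random 5'),   (6,'AB123',0,5,'Random 6'),
    (7,'12345',2,16,'Random 7'),  (8,'XX45',0,5,'Random 8'),
    (9,'XX45',0,7,'Random 9'),    (10,'XX45',1,5,'Random 10'),
    (11,'XX45',1,7,'Random 11');

-- Placeholder titles (hypothetical): one 'Title <Id>' row per product.
INSERT INTO ProductData SELECT Id, 'Title ' || Id FROM Products;
""")

QUERY = """
SELECT p.Id, p.DisplayId, p.Version, p.Company, p.Description, d.Title
FROM Products p
JOIN Companies c ON c.Id = p.Company
LEFT JOIN ProductData d ON d.ProductId = p.Id
-- keep only the most recent version per (DisplayId, Company)
WHERE p.Version = (SELECT MAX(v.Version)
                   FROM Products v
                   WHERE v.DisplayId = p.DisplayId
                     AND v.Company = p.Company)
-- keep only rows where a different company with the same Code
-- has a product with the same DisplayId
  AND EXISTS (SELECT 1
              FROM Products o
              JOIN Companies oc ON oc.Id = o.Company
              WHERE o.DisplayId = p.DisplayId
                AND o.Company <> p.Company
                AND oc.Code = c.Code)
ORDER BY p.Id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The correlated `MAX(Version)` subquery picks the latest row per entry, and the `EXISTS` clause filters down to entries whose company code is shared by another company for the same `DisplayId`. The `LEFT JOIN` keeps a product even when it has no `ProductData` row; whether that is the desired behaviour is also an assumption.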

Any insights would be appreciated.