SQL Server 2016 - Inserting remaining rows into a table leading to duplicates of existing rows

Question

I have 4 tables: People, PeopleStatus, Codes and PeopleStatusCodes, with the following schemas:

People:

[ID]         INT IDENTITY (1, 1) CONSTRAINT [PK_People_ID] PRIMARY KEY,
[PersonCode] VARCHAR(MAX) NOT NULL,
[FirstName]  VARCHAR(MAX) NOT NULL,
[LastName]   VARCHAR(MAX) NOT NULL

PeopleStatus:

[ID]         INT IDENTITY (1, 1) CONSTRAINT [PK_PeopleStatus_ID] PRIMARY KEY,
[PeopleID] VARCHAR(MAX) NOT NULL FOREIGN KEY REFERENCES [People]([ID]),
[Status]     INT NOT NULL

Codes:

[ID]         INT IDENTITY (1, 1) CONSTRAINT [PK_Codes_ID] PRIMARY KEY,
[CodeNumber] VARCHAR(MAX) NOT NULL,
[Name]       VARCHAR(MAX) NOT NULL

PeopleStatusCodes:

[ID]             INT IDENTITY (1, 1) CONSTRAINT [PK_PeopleStatusCodes_ID] PRIMARY KEY,
[PeopleStatusID] INT NOT NULL FOREIGN KEY REFERENCES [PeopleStatus]([ID]),
[CodeID]         INT NOT NULL FOREIGN KEY REFERENCES [Codes]([ID]),
[Result]         INT NOT NULL -- success = 1, fail = 0

I am attempting to insert 3 rows of data into the PeopleStatusCodes table - 1 row where the Result = 1, and the remaining rows where Result = 0.

The code below declares 2 table variables: one to store the person's PeopleStatus ID (@peopleStatus) and one to store the data (@data). It then checks whether the person already has an entry in the PeopleStatus table. If not, a new entry is created in PeopleStatus and its ID is inserted into @peopleStatus. If an entry already exists, the ID of that entry is inserted into @peopleStatus.

An entry is then inserted into PeopleStatusCodes table based off @data, with Result = 1. After that, entries for the remaining Codes which do not have matching data are inserted with Result = 0.

--declare table variables
DECLARE @peopleStatus TABLE (peopleStatusID INT)
DECLARE @data TABLE (FirstName VARCHAR (100), LastName VARCHAR (100), Codename VARCHAR (100))

--insert data into @data
INSERT INTO @data(
    [FirstName]
   ,[LastName]
   ,[Codename]
)
VALUES(
    'John'
   ,'Smith'
   ,'02 - Code2'
)

--check if entry exists inside PeopleStatus and insert into @peopleStatus based on that
IF NOT EXISTS (SELECT [ps].[PersonCode]
               FROM PeopleStatus [ps], People [p], @data [d]
               WHERE [ps].[PersonCode] = [p].[PersonCode]
                 AND [p].[FirstName] = [d].[FirstName]
                 AND [p].[LastName] = [d].[LastName])
    INSERT INTO PeopleStatus (
           [PersonCode]
          ,[Status]
    )
    OUTPUT inserted.[ID]
    INTO @peopleStatus
    SELECT
           [p].[PersonCode]
          ,1
    FROM [People] [p], @data [d]
    WHERE [p].[FirstName] = [d].[FirstName]
      AND [p].[LastName] = [d].[LastName]
ELSE
    INSERT INTO @peopleStatus (peopleStatusID)
    SELECT [ps].[ID]
    FROM PeopleStatus [ps], People [p], @data [d]
    WHERE [ps].[PersonCode] = [p].[PersonCode]
      AND [p].[FirstName] = [d].[FirstName]
      AND [p].[LastName] = [d].[LastName]

--insert into PeopleStatusCodes a row of data with Result = 1 based off data stored in @data
INSERT INTO [dbo].[PeopleStatusCodes] (
     [PeopleStatusID]
    ,[CodeID]
    ,[Result]
)
SELECT
     [temp].[peopleStatusID]
    ,(SELECT ID FROM Codes WHERE CodeNumber + ' - ' + Name = [d].[Codename])
    ,1
FROM @peopleStatus [temp], @data [d]

--for every remaining Code in the Codes table which did not have a match with the data, insert into PeopleStatusCodes a row of data with Result = 0
DECLARE @IDColumn INT
SELECT @IDColumn = MIN(c.ID) 
FROM Codes [c], PeopleStatusCodes [psc], @peopleStatus [temp]
WHERE [psc].CodeID != [c].ID 
AND [psc].PeopleStatusID = [temp].peopleStatusID
WHILE @IDColumn IS NOT NULL
BEGIN
    INSERT INTO [dbo].[PeopleStatusCodes] (
         [PeopleStatusID]
        ,[CodeID]
        ,[Result]
    )
    SELECT
         [temp].peopleStatusID
        ,@IDColumn
        ,0
    FROM @peopleStatus [temp]

    SELECT @IDColumn = MIN(c.ID) 
    FROM Codes [c], PeopleStatusCodes [psc], @peopleStatus [temp]
    WHERE [psc].CodeID != [c].ID 
    AND [psc].PeopleStatusID = [temp].peopleStatusID
    AND c.ID > @IDColumn
END

My problem is that when I run the code, I get 4 entries in the PeopleStatusCodes table instead of 3: CodeID 2 is inserted a second time, with Result = 0.

What I get:

+----+----------------+--------+--------+
| ID | PeopleStatusID | CodeID | Result |
+----+----------------+--------+--------+
|  1 |              1 |      2 |      1 |
|  2 |              1 |      1 |      0 |
|  3 |              1 |      2 |      0 |
|  4 |              1 |      3 |      0 |
+----+----------------+--------+--------+

What I want:

+----+----------------+--------+--------+
| ID | PeopleStatusID | CodeID | Result |
+----+----------------+--------+--------+
|  1 |              1 |      2 |      1 |
|  2 |              1 |      1 |      0 |
|  3 |              1 |      3 |      0 |
+----+----------------+--------+--------+

Update: I managed to solve it by going about it in a more straightforward way: insert all rows first, then update rows where necessary.

Answer 1:

In the last part, you could use a row number to remove duplicates:

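;WITH RankedRows AS (
    --number the rows within each (PeopleStatusID, CodeID) pair;
    --ordering by Result DESC puts a Result = 1 row first
    SELECT [psc].[ID], [psc].[PeopleStatusID], [psc].[CodeID], [psc].[Result],
           ROW_NUMBER() OVER (PARTITION BY [psc].[PeopleStatusID], [psc].[CodeID]
                              ORDER BY [psc].[Result] DESC, [psc].[ID]) AS RowNum
    FROM PeopleStatusCodes [psc]
    INNER JOIN @peopleStatus [temp]
        ON [psc].[PeopleStatusID] = [temp].[peopleStatusID]
)
--keep only the first row of each pair
SELECT * FROM RankedRows WHERE RowNum = 1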
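
An alternative to filtering duplicates after the fact is to avoid inserting them. Tracing the loop with the sample data suggests where the extra row comes from: the condition [psc].CodeID != [c].ID is true for a code as soon as the person has any row with a different CodeID, so once the CodeID = 1 row has been inserted, code 2 qualifies again. The sketch below replaces the loop with a single INSERT that uses NOT EXISTS. It is a minimal illustration, not the asker's code: the table variables @Codes and @PeopleStatusCodes are hypothetical stand-ins for the real tables, and the names of codes 1 and 3 are made up.

```sql
-- Hypothetical stand-ins for the real tables, so the sketch runs on its own.
DECLARE @Codes TABLE (ID INT PRIMARY KEY, CodeNumber VARCHAR(10), Name VARCHAR(100));
DECLARE @PeopleStatusCodes TABLE (
    ID             INT IDENTITY (1, 1) PRIMARY KEY,
    PeopleStatusID INT NOT NULL,
    CodeID         INT NOT NULL,
    Result         INT NOT NULL
);
DECLARE @peopleStatus TABLE (peopleStatusID INT);

-- Sample data; only '02 - Code2' comes from the question, the rest is invented.
INSERT INTO @Codes (ID, CodeNumber, Name)
VALUES (1, '01', 'Code1'), (2, '02', 'Code2'), (3, '03', 'Code3');
INSERT INTO @peopleStatus (peopleStatusID) VALUES (1);

-- The Result = 1 row that the earlier step has already inserted.
INSERT INTO @PeopleStatusCodes (PeopleStatusID, CodeID, Result) VALUES (1, 2, 1);

-- Insert Result = 0 only for codes that have no row yet for this PeopleStatusID.
INSERT INTO @PeopleStatusCodes (PeopleStatusID, CodeID, Result)
SELECT [temp].peopleStatusID, [c].ID, 0
FROM @peopleStatus [temp]
CROSS JOIN @Codes [c]
WHERE NOT EXISTS (
    SELECT 1
    FROM @PeopleStatusCodes [psc]
    WHERE [psc].PeopleStatusID = [temp].peopleStatusID
      AND [psc].CodeID = [c].ID
)
ORDER BY [c].ID;

SELECT ID, PeopleStatusID, CodeID, Result
FROM @PeopleStatusCodes
ORDER BY ID;
```

With this sample data the final SELECT should return three rows for PeopleStatusID 1: CodeID 2 with Result = 1, then CodeID 1 and CodeID 3 with Result = 0. Running the second INSERT again adds nothing, because every code already has a row.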

Answer 2:

I managed to solve it by going about it a different way. Instead of inserting one row with Result = 1 followed by the remaining rows, I inserted ALL rows with a default Result = 0, then updated the row that matched the data to have Result = 1.

--Inserts a row for every Code into PeopleStatusCodes
DECLARE @IDColumn INT
SELECT @IDColumn = MIN(c.ID)
FROM Codes [c]
WHILE @IDColumn IS NOT NULL
BEGIN
    INSERT INTO [dbo].[PeopleStatusCodes] (
         [PeopleStatusID]
        ,[CodeID]
        ,[Result]
    )
    SELECT
         [temp].[peopleStatusID]
        ,@IDColumn
        ,0
    FROM @peopleStatus [temp]

    SELECT @IDColumn = MIN(c.ID)
    FROM Codes [c]
    WHERE c.ID > @IDColumn
END

--If the row matching the data does not already have Result = 1, update it.
--The update is limited to the current PeopleStatusID so that other people's rows are left untouched.
IF NOT EXISTS (SELECT [psc].ID
               FROM PeopleStatusCodes [psc], @peopleStatus [temp]
               WHERE [psc].PeopleStatusID = [temp].peopleStatusID
                 AND [psc].CodeID = (SELECT [c].ID FROM Codes [c], @data [d] WHERE [c].CodeNumber + ' - ' + [c].Name = [d].[Codename])
                 AND [psc].Result = 1)
    UPDATE [dbo].[PeopleStatusCodes]
    SET Result = 1
    WHERE PeopleStatusID IN (SELECT [temp].peopleStatusID FROM @peopleStatus [temp])
      AND CodeID = (SELECT [c].ID FROM Codes [c], @data [d] WHERE [c].CodeNumber + ' - ' + [c].Name = [d].[Codename])
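
The insert-then-update approach can also be written without a loop. The following is a rough set-based sketch built on the question's sample data: one INSERT creates a row per code, and a CASE expression sets Result to 1 for the code named in @data. @Codes and @PeopleStatusCodes are hypothetical table-variable stand-ins for the real tables, and the names of codes 1 and 3 are invented for the example.

```sql
-- Hypothetical stand-ins for the real tables, so the sketch runs on its own.
DECLARE @Codes TABLE (ID INT PRIMARY KEY, CodeNumber VARCHAR(10), Name VARCHAR(100));
DECLARE @PeopleStatusCodes TABLE (
    ID             INT IDENTITY (1, 1) PRIMARY KEY,
    PeopleStatusID INT NOT NULL,
    CodeID         INT NOT NULL,
    Result         INT NOT NULL
);
DECLARE @peopleStatus TABLE (peopleStatusID INT);
DECLARE @data TABLE (FirstName VARCHAR(100), LastName VARCHAR(100), Codename VARCHAR(100));

-- Sample data; only the @data row comes from the question, the codes are invented.
INSERT INTO @Codes (ID, CodeNumber, Name)
VALUES (1, '01', 'Code1'), (2, '02', 'Code2'), (3, '03', 'Code3');
INSERT INTO @peopleStatus (peopleStatusID) VALUES (1);
INSERT INTO @data (FirstName, LastName, Codename) VALUES ('John', 'Smith', '02 - Code2');

-- One row per code; Result is 1 when the code matches @data, otherwise 0.
INSERT INTO @PeopleStatusCodes (PeopleStatusID, CodeID, Result)
SELECT [temp].peopleStatusID,
       [c].ID,
       CASE WHEN EXISTS (SELECT 1
                         FROM @data [d]
                         WHERE [d].Codename = [c].CodeNumber + ' - ' + [c].Name)
            THEN 1 ELSE 0 END
FROM @peopleStatus [temp]
CROSS JOIN @Codes [c]
ORDER BY [c].ID;

SELECT ID, PeopleStatusID, CodeID, Result
FROM @PeopleStatusCodes
ORDER BY ID;
```

This should produce the same three (PeopleStatusID, CodeID, Result) combinations as the desired output, although the identity values follow CodeID order, so the Result = 1 row is not the first one. The sketch assumes no rows exist yet for the PeopleStatusID; if they might, a NOT EXISTS filter on the target table would be needed as well.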

Source: https://stackoverflow.com/questions/48201111/sql-server-2016-inserting-remaining-rows-into-a-table-leading-to-duplicates-of

Tags

sql-server

database

duplicates

sql-insert