I'm trying to figure out the best way to insert a record into a single table but only if the item doesn't already exist. The KEY in this case is an NVARCHAR(400) field. Fo
I had a similar problem, and this is how I solved it:
INSERT INTO Words (selectWord, Fixword)
SELECT o.word, 'theFixword'
FROM OldWordsTable o
WHERE
(
    (o.word LIKE 'junk%') OR
    (o.word LIKE 'orSomething')
)
-- skip any word that is already present in the target table
AND o.word NOT IN
(
    SELECT w.selectWord FROM Words w WHERE w.selectWord = o.word
)
Your solution:
INSERT INTO Words (Word)
SELECT @Word
WHERE NOT EXISTS (SELECT WordID FROM Words WHERE Word = @Word)
...is about as good as it gets. You could simplify it to this:
INSERT INTO Words (Word)
SELECT @Word
WHERE NOT EXISTS (SELECT * FROM Words WHERE Word = @Word)
...because EXISTS doesn't actually need to return any records, so the query optimiser won't bother looking at which fields you asked for.
As you mention, this isn't particularly performant, because it'll lock the whole table during the INSERT. If you add a unique index on Word (it doesn't need to be the primary key), though, it'll only need to lock the relevant pages.
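To make the unique-index suggestion concrete, here is a minimal sketch. The index name `UX_Words_Word` and the sample value are made up for illustration. The `UPDLOCK, HOLDLOCK` hints are an addition that is not in the original query: they are one commonly used way to stop two sessions from both passing the NOT EXISTS check for the same word, and whether you need them depends on your concurrency.

```sql
-- Hypothetical index name; Words(Word) is the table and column from the question.
CREATE UNIQUE NONCLUSTERED INDEX UX_Words_Word ON Words (Word);

DECLARE @Word NVARCHAR(400) = N'example';  -- placeholder value

-- Insert only when no matching row exists.
-- UPDLOCK + HOLDLOCK (an assumption, not part of the original query) hold a
-- range lock on the index key until the statement's transaction ends.
INSERT INTO Words (Word)
SELECT @Word
WHERE NOT EXISTS (
    SELECT * FROM Words WITH (UPDLOCK, HOLDLOCK)
    WHERE Word = @Word
);
```

With the unique index in place, the existence check becomes an index seek rather than a scan, and the index also acts as a safety net: a duplicate that slips through raises an error instead of being stored.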
Your best option is to simulate the expected load and look at the performance with SQL Server Profiler. As with any other field, premature optimisation is a bad thing. Define acceptable performance metrics, and then measure before doing anything else.
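As a lighter first check before a full Profiler session (a substitute technique, not the Profiler approach itself), you could run the statement with I/O and timing statistics switched on and compare the numbers before and after adding an index. A rough sketch, with a placeholder value for `@Word`:

```sql
-- Report logical reads and CPU/elapsed time for the statements that follow.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @Word NVARCHAR(400) = N'example';  -- placeholder value

INSERT INTO Words (Word)
SELECT @Word
WHERE NOT EXISTS (SELECT * FROM Words WHERE Word = @Word);

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

This only measures a single statement in isolation, so it complements a simulated load rather than replacing it.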
If that's still not giving you adequate performance, then there are a number of techniques from the data warehousing field that could help.
declare @Error int
begin transaction
    INSERT INTO Words (Word) values (@Word)
    set @Error = @@ERROR
    if @Error <> 0 -- the insert raised an error (for example, a duplicate key)
    begin
        goto LogError
    end
commit transaction
goto ProcEnd
LogError:
    rollback transaction
ProcEnd:
    return
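On SQL Server 2005 and later, the same "just insert and handle the failure" idea can also be sketched with TRY...CATCH. This assumes a unique index or constraint already exists on Words.Word, and that a duplicate should be silently skipped. Error numbers 2601 and 2627 are the duplicate-key errors for unique indexes and for unique or primary key constraints. `THROW` requires SQL Server 2012 or later; `@Word` here is a placeholder.

```sql
DECLARE @Word NVARCHAR(400) = N'example';  -- placeholder value

BEGIN TRY
    INSERT INTO Words (Word) VALUES (@Word);
END TRY
BEGIN CATCH
    -- 2601: duplicate key in a unique index
    -- 2627: violation of a UNIQUE or PRIMARY KEY constraint
    -- Anything else is unexpected, so re-raise it to the caller.
    IF ERROR_NUMBER() NOT IN (2601, 2627)
        THROW;
END CATCH;
```

A single INSERT statement is atomic on its own, so this variant does not need an explicit transaction.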
If you are using MS SQL Server, you can create a unique index on your table's columns that need to be unique (documented here):
CREATE UNIQUE [ CLUSTERED | NONCLUSTERED ] INDEX <index_name>
ON Words ( word [ ASC | DESC ])
Specify CLUSTERED or NONCLUSTERED, depending on your case. Also, if you want it sorted (to enable faster seeking), specify ASC or DESC for the sort order.
See here if you want to learn more about index architecture.
Otherwise, you could use a UNIQUE constraint, as documented here:
ALTER TABLE Words
ADD CONSTRAINT UniqueWord
UNIQUE (Word);
I think I've found a better (or at least faster) answer to this. Create an index like:
CREATE UNIQUE NONCLUSTERED INDEX [IndexTableUniqueRows] ON [dbo].[table]
(
[Col1] ASC,
[Col2] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = ON, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
Include all the columns that define uniqueness. The important part is IGNORE_DUP_KEY = ON. That turns duplicate-key inserts into warnings: the duplicate rows are discarded and the rest of the statement succeeds. SSIS ignores these warnings, and you can still use fastload too.
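A small way to see the effect for yourself, using a throwaway temp table instead of the real one (the table name `#Words`, the index name, and the sample values are all made up for this sketch):

```sql
-- Scratch table so the experiment does not touch real data.
CREATE TABLE #Words (Word NVARCHAR(400) NOT NULL);

CREATE UNIQUE NONCLUSTERED INDEX UX_Words_Word
    ON #Words (Word)
    WITH (IGNORE_DUP_KEY = ON);

-- The repeated N'alpha' is discarded with a "Duplicate key was ignored."
-- warning; the statement itself does not fail.
INSERT INTO #Words (Word)
VALUES (N'alpha'), (N'beta'), (N'alpha');

-- Inspect what was actually stored.
SELECT Word FROM #Words ORDER BY Word;

DROP TABLE #Words;
```

Without IGNORE_DUP_KEY = ON, the same INSERT would fail as a whole with a duplicate-key error and store none of the three rows.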
I can't speak to the particulars of MS SQL, but one point of a primary key in SQL is to ensure uniqueness. So by definition, in generic SQL terms, a primary key is one or more fields whose combined value is unique within a table. While there are different ways to enforce this behavior (replace the old entry with the new one vs. reject the new one), I would be surprised if MS SQL lacked a mechanism for enforcing it, or if that mechanism did anything other than reject the new entry. Just make sure you set the primary key to the Word field and it should work.
Once again, though, this all comes from my knowledge of MySQL programming and my databases class, so apologies if I'm off on the intricacies of MS SQL.