Friday, March 30, 2012

Puzzling "NOT EXISTS" behavior

CASE CLOSED: removing the FROM clause noted below handles the problem ...
I've used "NOT EXISTS" in inserts before and thought I understood how they work, butI'm puzzled why I get three rows from the following sub-query, which I want to use to prevent errors from duplicate keys. Since it gives me three rows, it will actually try to insert three duplicate records and cause a primary-key fault, which is a twin to the very thing I'm trying to avoid. (of course, there are three records already in the table, none having the key of 20050810)
The sub-query is acting like it's using "WHERE DemoDate <> 20050810" instead of a "NOT EXISTS".
Attempts to use this statement causes a termination with no records inserted.
DemoDate is the primary key and an INT field in SQL Server. Four other int columns for this table have default values.
<code>
INSERT INTO DemoStats (DemoDate)
SELECT 20050810 AS Expr1
FROM DemoStats <-- remove this line so the effective table has only one row, when the NOT EXISTS is TRUE
WHERE (NOT EXISTS
(SELECT *
FROM DemoStats
WHERE DemoDate = 20050810))
</code>I read that as saying...
For every row in DemoStats that != 20050810 please insert 20050810
So surely you'd get loads inserts and therefore duplicate key bangs?
Why not simply
<code>
If not exists(xxxxx) insert <yyyy>
</code>
granted it introduces a condition but it's a lot easier to understand your intentions, plus it works ;)

|||I don't think that will work, since this is a web-based app, so someone else could easily insert the record between checking for exists and doing the insert, thus potentially causing the error anyway. This needs to be handled within a single sql statement.
I found the following (and several other examples that maybe suggest that EXISTS should work a little more like I'm thinking). In this case I wondering if "dual" is a reserved word with a special function, which I haven't seen before.
<code>

If you wanted to insert a single record, you could use the following statement:

INSERT INTO clients
(client_id, client_name, client_type)
SELECT 10345, 'IBM', 'advertising'
FROM dual
WHERE not exists (select * from clients
where clients.client_id = 10345);

The use of thedual table allows you to enter your values in a select statement, even though the values are not currently stored in a table.


</code>|||

Edited:
If you really want to use your approach you could do simply like this :
INSERT INTO clients
(client_id, client_name, client_type)
SELECT 10345, 'IBM', 'advertising'
WHERE not exists (select * from clients
where clients.client_id = 10345);
without dual.

|||My theory (I'd love to have someone confirm or deny this): I think it's a bug in SQL Server or maybe even SQL specs. As I understand it, EXISTS is supposed to be a single-valued boolean: TRUE ifone or more rows satisfy the sub-query, and FALSE ifno rows satsify the sub-query. I think the bug depends on the fact that the INSERT, the SELECT, and the sub-query all deal with the same table. I have other UPDATES and INSERTS that use NOT EXISTS, and they're working correctly, but they don't have the same table in the INSERT, SELECT, and sub-query. If I'm wrong I need to go back and re-examine them, but need some confidence of knowing what is right.
When I add DISTINCT below it adds just one record (doesn't try to add three identical-key records), so I'm out of trouble, but I don't know why it otherwise tries to insert three records and triggers an error and termination.
<code>
INSERT INTO DemoStats (DemoDate)
SELECT DISTINCT 20050810 AS Expr1
FROM DemoStats
WHERE (NOT EXISTS
(SELECT *
FROM DemoStats
WHERE DemoDate = 20050810))
</code>
|||

BBradshaw wrote:

I don't think that will work, since this is a web-based app, so someone else could easily insert the record between checking for exists and doing the insert,


That's what isolation levels are for or you'll probably get away with just sticking it a transaction.

No comments:

Post a Comment