We recently looked at looked at comparing schemas using Azure Data Studio. What if we need to compare tables by using a query? For this post we’ll compare using EXCEPT, NOT IN, and NOT EXISTS to find differences between two tables.
Our two tables to compare will be Comic and Comic_Copy. Based on counts, we have 48 more records in Comic than we do in Comic_Copy. Let’s find the differences.
Let’s start by comparing with EXCEPT. When using EXCEPT, we’ll want the same number and order of columns as well as the same data types. We fit that criteria for our example so let’s run our query and see the results:
SELECT Id,Title FROM Comic EXCEPT SELECT Id,Title FROM Comic_Copy; GO
We see 48 rows which makes sense based on the counts we looked at earlier.
Next up, we’ll use NOT IN to compare tables. We’ll select from the Comic table and use a subquery to pull the Id’s in Comic_Copy:
SELECT Id, Title FROM Comic WHERE Id NOT IN ( SELECT Id FROM Comic_Copy ); GO
We have our 48 rows again.
Finally, let’s test with NOT EXISTS.
SELECT Id, Title FROM Comic AS c WHERE NOT EXISTS ( SELECT 1 FROM Comic_Copy AS cc WHERE cc.Id = c.Id ); GO
Which Performs Best?
If we run all three together, what performs the best?
EXCEPT doesn’t appear to be the best option while the other other two options are showing the same plan. We’re working with a relatively small set of data so each is finishing in under a second with the same amount of logical reads.
NOT IN and NOT EXISTS do not work the same, though. For NULL values, NOT IN will not return records since NULL values are not treated as a match. When using NOT EXISTS, NULL values will be returned.
If we update one of the title values to NULL:
UPDATE Comic SET Title = NULL WHERE ID = 2210; GO
And run queries to compare based on Title, we’ll get different results. NOT IN will give us 47 rows:
While NOT EXISTS still gives us 48:
Go with NOT EXISTS
We’ve looked at three different methods to find differences between tables. Using EXCEPT, NOT IN, and NOT EXISTS can all get the job done but in different ways. NOT EXISTS would be my “go to” to optimally find differences but keep in mind the inclusion of NULL records compared to NOT IN.
Thanks for reading!
3 thoughts on “Options to Compare Table Records”