I am a consultant in Austin who can help make your data go fast, be secure and highly available. When I am engaged in a performance tuning project priority #1 isn’t to make sure your data go faster. Priority #1 is to make sure we get the same result sets while making your data go faster.
Free SQL Data Compare with T-SQL?
There are several tools out there that can be used to compare data. Today, I want to share how you can quickly do this on your own with T-SQL!
Let’s simplify the process. Our goal is to check two temp tables and validate if any of the data is different. This would include inserts, updates, and deletes. For this example, I will just do a dump of Sales.SalesOrderDetail in AdventureWorks into two temp tables as shown below.
SELECT *
INTO #Tmp1
FROM Sales.SalesOrderDetail
SELECT *
INTO #Tmp2
FROM Sales.SalesOrderDetail
Now we shouldn’t see any differences since we used the same table to create both temp tables. We are going to use two different SQL operators to compare these two temp tables while applying some data changes. We will focus on the UNION ALL and EXCEPT operators.
The Power of EXCEPT
Except is an underrated and underused SQL operation. In a nutshell, it will give you the results of the first query that are different from the next query. So, if the data of any column in #tmp1 is different from #tmp2 or if the row doesn’t exist in #tmp2 but is in #tmp1 it will get returned.
SELECT * FROM #Tmp1
EXCEPT
SELECT * FROM #Tmp2
Let’s go ahead and modify a column in #Tmp1 so you can see how this works. We are going to set OrderQty to five when SalesOrderId is 45313 and SalesOrderDetailId is 6210. This will change just one column in one row. We will then select these columns from both temp tables to see the change.
This is how most people would start using T-SQL to identify changes in data.
UPDATE #Tmp1 SET OrderQty = 5
WHERE SalesOrderID = 45313
AND SalesOrderDetailID = 6210
SELECT SalesOrderId, SalesOrderDetailID,
OrderQty FROM #Tmp1
WHERE SalesOrderID = 45313
AND SalesOrderDetailID = 6210
SELECT SalesOrderId, SalesOrderDetailID,
OrderQty FROM #Tmp2
WHERE SalesOrderID = 45313
AND SalesOrderDetailID = 6210
Finding Data Changes The Easy Way
Selecting the two tables is easy if we know what change occurred and there aren’t many changes. This can get complicated quickly. Therefore, if we just want to quickly know if we have differences lets take a look at my goto method using EXCEPT. To make this example easier to read instead of using “SELECT *” I will just focus on columns that are changing. In a real example, I would want to know if any columns changed.
SELECT SalesOrderId, SalesOrderDetailID, OrderQty
FROM #Tmp1
EXCEPT
SELECT SalesOrderId, SalesOrderDetailID, OrderQty
FROM #Tmp2
If an insert or a column change occurs in #tmp1 we will see it in our EXCEPT SQL statement. This isn’t true if the change is only in #tmp2.
For example, an insert in #tmp2 or delete in #tmp1 would not be shown. To see this we would have to switch the temp tables in the EXCEPT clause as shown below.
INSERT INTO #tmp2 (SalesOrderId, ProductID,
SpecialOfferID, OrderQty, UnitPrice,
UnitPriceDiscount,LineTotal,
rowguid, ModifiedDate)
VALUES (45313, 1, 3, 1,1.25,0,
1.25*1, NEWID(), GETDATE())
DELETE FROM #Tmp1
WHERE SalesOrderID = 45313
AND SalesOrderDetailID = 6211
SELECT SalesOrderId, SalesOrderDetailID, OrderQty
FROM #Tmp1
EXCEPT
SELECT SalesOrderId, SalesOrderDetailID, OrderQty
FROM #Tmp2
/* We will now see our insert and delete */
SELECT SalesOrderId, SalesOrderDetailID, OrderQty
FROM #Tmp2
EXCEPT
SELECT SalesOrderId, SalesOrderDetailID, OrderQty
FROM #Tmp1
Our first except shows us data in #tmp1 that is not in #tmp2 because the OrderQty column changed in #tmp1. The second EXCEPT shows us data in #tmp2 that isn’t in #tmp1 because of our insert into #tmp2 and also our delete from #tmp1 would be found in #tmp2 but not #tmp1.
UNION ALL for the Win!
To wrap this up now we can include a UNION ALL operation between the two EXCEPT operations. This would get us any data changes to the columns selected from the temp tables.
SELECT SalesOrderId, SalesOrderDetailID, OrderQty
FROM #Tmp1
EXCEPT
SELECT SalesOrderId, SalesOrderDetailID, OrderQty
FROM #Tmp2
UNION ALL
SELECT SalesOrderId, SalesOrderDetailID, OrderQty
FROM #Tmp2
EXCEPT
SELECT SalesOrderId, SalesOrderDetailID, OrderQty
FROM #Tmp1
Typically, I need to verify is the data before and after is the same. This is a quick and easy way to get that answer. Now I know you might want to take this to the next level. You might be thinking how do I just get the unique key for the table and columns that changed. I will leave that as an exercise for you.
If you enjoyed this post subscribe to get more free SQL tips.