Union v/s Union All
Often many ask, what is the difference between union and union all?
Union operator combines multiple resultsets into one resultset. Union also removes the duplicates and returns a sorted result set.
Union All operator combine multiple resultsets into one resultset, but it does not eliminate the duplicate rows. Since the duplicate records are preserved Union all runs faster than union and this does not return a sorted resultset.
As multiple resultsets are combined, indeed the Union and Union All should abide to few basic ground rules
- The resultsets should be identical in terms of number of columns and compatible data types
- The column names of the combined resultset will be same as the column names in the first resultset
Consider the recordsets that are mutually exclusive, Run the below queries in Management studio to create the temporary tables and populate those with some data.
USE DEMODB; CREATE TABLE #temp1 (id INT); CREATE TABLE #temp2 (id INT); GO INSERT INTO #temp1 SELECT number FROM master..spt_values WHERE type = 'p' AND number BETWEEN 1 AND 499; INSERT INTO #temp2 SELECT number FROM master..spt_values WHERE type = 'p' AND number BETWEEN 500 AND 1000;
Now run the below query in the management studio and capture the execution plan
SELECT *FROM #temp2 UNION SELECT *FROM #temp1 SELECT *FROM #temp2 UNION ALL SELECT *FROM #temp1
From the execution plan it is evident that Union is costlier than union all (using 82% of total batch time). You can notice that the union does a distinct sort to get rid of the duplicate records. In case of union all it just combines the resultset.
So if you don’t care about duplicates in the final resultset or know in hand that the resultsets you combine are mutually exclusive, it is better to use union all as Union all would yield better performance.