Often many ask, what is the difference between union and union all?

Union operator combines multiple resultsets into one resultset. Union also removes the duplicates and returns a sorted result set.

Union All  operator combine multiple resultsets into one resultset, but it does not eliminate the duplicate rows.  Since the duplicate records are preserved Union all runs faster than union and this does not return a sorted resultset.

As multiple resultsets are combined, indeed the Union and  Union All should abide to few basic ground rules

  • The resultsets should be identical in terms of number of columns and compatible data types
  • The column names of the combined resultset will be same as the column names in the first resultset 

Consider the recordsets that are mutually exclusive, Run the below queries in Management studio to create the temporary tables and populate those with some data.

USE DEMODB;
CREATE TABLE #temp1 (id INT);
CREATE TABLE #temp2 (id INT);
GO
INSERT INTO #temp1
SELECT number
FROM master..spt_values
WHERE type = 'p'
	AND number BETWEEN  1 AND 499;

INSERT INTO #temp2
SELECT number
FROM master..spt_values
WHERE type = 'p'
	AND number BETWEEN 500 AND 1000;

Now run the below query in the management studio and capture the execution plan

SELECT *FROM #temp2
UNION
SELECT *FROM #temp1

SELECT *FROM #temp2
UNION ALL
SELECT *FROM #temp1

From the execution plan it is evident that Union is costlier than union all (using 82% of total batch time). You can notice that the union does a distinct sort to get rid of the duplicate records. In case of union all it just combines the resultset.

Union-vs-UnionAll-1

 

So if you don’t care about duplicates in the final resultset or know in hand that the resultsets you combine are mutually exclusive, it is better to use union all as Union all would yield better performance.

 

Leave a Reply

Your email address will not be published. Required fields are marked *