集合运算符Set operators

将两个输入关系合二为一。Combines two input relations into a single one. Spark SQL 支持 3 种类型的 set 运算符:Spark SQL supports three types of set operators:

  • EXCEPTMINUSEXCEPT or MINUS
  • INTERSECT
  • UNION

输入关系必须具有相同数量的列,并且各个列的数据类型兼容。Input relations must have the same number of columns and compatible data types for the respective columns.

EXCEPTEXCEPT

EXCEPTEXCEPT ALL 返回在一个关系中存在,但在另一关系中不存在的行。EXCEPT and EXCEPT ALL return the rows that are found in one relation but not the other. EXCEPT(或 EXCEPT DISTINCT)仅采用不同的行,而 EXCEPT ALL 不会从结果行中删除重复项。EXCEPT (alternatively, EXCEPT DISTINCT) takes only distinct rows while EXCEPT ALL does not remove duplicates from the result rows. 请注意,MINUSEXCEPT 的别名。Note that MINUS is an alias for EXCEPT.

语法Syntax

[ ( ] relation [ ) ] EXCEPT | MINUS [ ALL | DISTINCT ] [ ( ] relation [ ) ]

示例Examples

-- Use number1 and number2 tables to demonstrate set operators in this page.
SELECT * FROM number1;
+---+
|  c|
+---+
|  3|
|  1|
|  2|
|  2|
|  3|
|  4|
+---+

SELECT * FROM number2;
+---+
|  c|
+---+
|  5|
|  1|
|  2|
|  2|
+---+

SELECT c FROM number1 EXCEPT SELECT c FROM number2;
+---+
|  c|
+---+
|  3|
|  4|
+---+

SELECT c FROM number1 MINUS SELECT c FROM number2;
+---+
|  c|
+---+
|  3|
|  4|
+---+

SELECT c FROM number1 EXCEPT ALL (SELECT c FROM number2);
+---+
|  c|
+---+
|  3|
|  3|
|  4|
+---+

SELECT c FROM number1 MINUS ALL (SELECT c FROM number2);
+---+
|  c|
+---+
|  3|
|  3|
|  4|
+---+

INTERSECTINTERSECT

INTERSECTINTERSECT ALL 返回在两个关系中都存在的行。INTERSECT and INTERSECT ALL return the rows that are found in both relations. INTERSECT(或 INTERSECT DISTINCT)仅采用不同的行,而 INTERSECT ALL 不会从结果行中删除重复项。INTERSECT (alternatively, INTERSECT DISTINCT) takes only distinct rows while INTERSECT ALL does not remove duplicates from the result rows.

语法Syntax

[ ( ] relation [ ) ] INTERSECT [ ALL | DISTINCT ] [ ( ] relation [ ) ]

示例Examples

(SELECT c FROM number1) INTERSECT (SELECT c FROM number2);
+---+
|  c|
+---+
|  1|
|  2|
+---+

(SELECT c FROM number1) INTERSECT DISTINCT (SELECT c FROM number2);
+---+
|  c|
+---+
|  1|
|  2|
+---+

(SELECT c FROM number1) INTERSECT ALL (SELECT c FROM number2);
+---+
|  c|
+---+
|  1|
|  2|
|  2|
+---+

UNIONUNION

UNIONUNION ALL 返回在任一关系中存在的行。UNION and UNION ALL return the rows that are found in either relation. UNION(或 UNION DISTINCT)仅采用不同的行,而 UNION ALL 不会从结果行中删除重复项。UNION (alternatively, UNION DISTINCT) takes only distinct rows while UNION ALL does not remove duplicates from the result rows.

语法Syntax

[ ( ] relation [ ) ] UNION [ ALL | DISTINCT ] [ ( ] relation [ ) ]

示例Examples

(SELECT c FROM number1) UNION (SELECT c FROM number2);
+---+
|  c|
+---+
|  1|
|  3|
|  5|
|  4|
|  2|
+---+

(SELECT c FROM number1) UNION DISTINCT (SELECT c FROM number2);
+---+
|  c|
+---+
|  1|
|  3|
|  5|
|  4|
|  2|
+---+

SELECT c FROM number1 UNION ALL (SELECT c FROM number2);
+---+
|  c|
+---+
|  3|
|  1|
|  2|
|  2|
|  3|
|  4|
|  5|
|  1|
|  2|
|  2|
+---+