UPDATE(Azure Databricks 上的 Delta Lake)UPDATE (Delta Lake on Azure Databricks)

更新与谓词匹配的行的列值。Updates the column values for the rows that match a predicate. 如果未提供谓词,则更新所有行的列值。When no predicate is provided, update the column values for all rows.


UPDATE table_identifier [AS alias] SET col1 = value1 [, col2 = value2 ...] [WHERE predicate]
  • table_identifiertable_identifier

    • [database_name.] table_name:表名,可选择使用数据库名称进行限定。[database_name.] table_name: A table name, optionally qualified with a database name.
    • delta.`<path-to-table>`:现有 Delta 表的位置。delta.`<path-to-table>`: The location of an existing Delta table.
  • AS 别名AS alias

    定义表别名。Define a table alias.


    按谓词筛选行。Filter rows by predicate.


UPDATE events SET eventType = 'click' WHERE eventType = 'clk'

UPDATEWHERE 谓词支持子查询,包括 INNOT INEXISTSNOT EXISTS 和标量子查询UPDATE supports subqueries in the WHERE predicate, including IN, NOT IN, EXISTS, NOT EXISTS, and scalar subqueries.

子查询示例Subquery Examples

UPDATE all_events
  SET session_time = 0, ignored = true
  WHERE session_time < (SELECT min(session_time) FROM good_events)

UPDATE orders AS t1
  SET order_status = 'returned'
  WHERE EXISTS (SELECT oid FROM returned_orders WHERE t1.oid = oid)

UPDATE events
  SET category = 'undefined'
  WHERE category NOT IN (SELECT category FROM events2 WHERE date > '2001-01-01')


不支持以下类型的子查询:The following types of subqueries are not supported:

  • 嵌套子查询,即一个子查询内的另一个子查询Nested subqueries, that is, a subquery inside another subquery
  • OR 中的 NOT IN 子查询,例如 a = 3 OR b NOT IN (SELECT c from t)A NOT IN subquery inside an OR, for example, a = 3 OR b NOT IN (SELECT c from t)

在大多数情况下,可以使用 NOT EXISTS 重写 NOT IN 子查询。In most cases, you can rewrite NOT IN subqueries using NOT EXISTS. 建议尽可能使用 NOT EXISTS,因为执行带有 NOT IN 子查询的 UPDATE 可能会速度较慢。We recommend using NOT EXISTS whenever possible, as UPDATE with NOT IN subqueries can be slow.