Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Applies to: Databricks SQL
Databricks Runtime
A table reference is an intermediate result table within SQL. It can be derived from other operators, such as functions, joins or a subquery, reference a base table directly, or be constructed as an inline table.
table_reference
{ table_name [ TABLESAMPLE clause ] [ table_alias ] |
{ STREAM table_name [ table_alias ] | STREAM ( table_name ) [ table_alias ] } |
view_name [ table_alias ] |
table_reference JOIN clause |
table_reference PIVOT clause |
table_reference UNPIVOT clause |
[ STREAM ] table_valued_function [ table_alias ] |
[ LATERAL ] table_valued_function [ table_alias ] |
VALUES clause |
[ LATERAL ] ( query ) [ TABLESAMPLE clause ] [ table_alias ] }
-
Identifies a table that may contain a temporal specification. See Work with Delta Lake table history for details.
If the table cannot be found Azure Databricks raises a TABLE_OR_VIEW_NOT_FOUND error.
See Column, field, parameter, and variable resolution for more information on name resolution.
Applies to: **
Databricks Runtime 16.0 and later
table_name
may include an options specification. -
Identifies a view or a common table expression (CTE). If the view cannot be found Azure Databricks raises a TABLE_OR_VIEW_NOT_FOUND error.
See Column, field, parameter, and variable resolution for more information on name resolution.
STREAM
Returns a table or a table valued function as a streaming source. Tables cannot be provided with a temporal specification when used with the
STREAM
keyword. Streaming sources are most commonly used in the definitions of streaming tables.-
Combines two or more relations using a join.
-
Applies to:
Databricks SQL
Databricks Runtime 12.2 LTS and above.
Used for data perspective; you can get the aggregated values based on specific column value.
Prior to Databricks Runtime 12.0 PIVOT is limited to SELECT following the
FROM
clause. -
Applies to:
Databricks SQL
Databricks Runtime 12.2 LTS and above.
Used for data perspective; you can split multiple column groups into rows.
[LATERAL] table_valued_function_invocation
Invokes a table valued function. To refer to columns exposed by a preceding
table_reference
in the sameFROM
clause you must specifyLATERAL
.-
Defines an inline table.
[LATERAL] ( query )
Computes a table reference using a query. A query prefixed by
LATERAL
may reference columns exposed by a precedingtable_reference
in the sameFROM
clause. Such a construct is called a correlated or dependent query.-
Optionally reduce the size of the result set by only sampling a fraction of the rows.
-
Optionally specifies a label for the
table_reference
. If thetable_alias
includescolumn_identifier
s their number must match the number of columns in thetable_reference
.
In addition to the standard SELECT
options, Delta tables support the time travel options described in this section. For details, see Work with Delta Lake table history.
table_identifier TIMESTAMP AS OF timestamp_expression
table_identifier VERSION AS OF version
timestamp_expression
can be any one of:'2018-10-18T22:15:12.013Z'
, that is, a string that can be cast to a timestampcast('2018-10-18 13:36:32 CEST' as timestamp)
'2018-10-18'
, that is, a date stringcurrent_timestamp() - interval 12 hours
date_sub(current_date(), 1)
- Any other expression that is or can be cast to a timestamp
version
is a long value that can be obtained from the output ofDESCRIBE HISTORY table_spec
.
Neither timestamp_expression
nor version
can be subqueries.
> SELECT * FROM events TIMESTAMP AS OF '2018-10-18T22:15:12.013Z'
> SELECT * FROM events VERSION AS OF 123
Use the @
syntax to specify the timestamp or version. The timestamp must be in yyyyMMddHHmmssSSS
format. You can specify a version after @
by prepending a v
to the version. For example, to query version 123
for the table events
, specify events@v123
.
> SELECT * FROM events@20190101000000000
> SELECT * FROM events@v123
-- Return a data set from a storage location using a credential.
> SELECT * FROM `csv`.`spreadsheets/data.csv` WITH(CREDENTIAL some_credential);
-- Select from a jdbc data source while controlling the fetch size.
> SELECT * FROM jdbcTable WITH(fetchSize = 0);
-- select all referencable columns from all tables
> SELECT * FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
1 2 3 4
-- select all referencable columns from one table
> SELECT t2.* FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
3 4
-- select all referencable columns from all tables except t2.c4
> SELECT * EXCEPT(c4) FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
1 2 3
-- select all referencable columns from a table, except a nested field.
> SELECT * EXCEPT(c2.b) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
1 { "a" : 2 }
-- Removing all fields results in an empty struct
> SELECT * EXCEPT(c2.b, c2.a) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
1 { }
-- Overlapping names result in an error
> SELECT * EXCEPT(c2, c2.a) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
Error: EXCEPT_OVERLAPPING_COLUMNS