Parameter markers

Parameter markers are named or unnamed typed placeholder variables used to supply values from the API invoking the SQL statement.

Using parameter markers protects your code from SQL injection attacks since it clearly separates provided values from the SQL statements.

You cannot mix named and unnamed parameter markers in the same SQL statement.

You must not reference a parameter marker in a DDL statement, such as a generated column or DEFAULT definition, a view, or a SQL function.

Exceptions are references to parameter markers in the IDENTIFIER clause, which can be used to parameterize table or column names in certain DDL statements. See IDENTIFIER clause.

Parameter markers can be provided by:

Named parameter markers

Applies to: Databricks Runtime check marked yes 12.1 and above

Named parameter markers are typed placeholder variables. The API invoking the SQL statement must supply name-value pairs to associate each parameter marker with a value.

Syntax

 :parameter_name

Parameters

  • named_parameter_name

    A reference to a supplied parameter marker in form of an unqualified identifier.

Notes

You can reference the same parameter marker multiple times within the same SQL Statement. If no value has been bound to the parameter marker an UNBOUND_SQL_PARAMETER error is raised. You are not required to reference all supplied parameter markers.

The mandatory preceding : (colon) differentiates the namespace of named parameter markers from that of column names and SQL parameters.

Examples

The following example defines two parameter markers:

  • later: An INTERVAL HOUR with value 3.
  • x: A DOUBLE with value 15.0

x is referenced multiple times, while later is referenced once.

SQL

> DECLARE stmtStr = 'SELECT current_timestamp() + :later, :x * :x AS square';
> EXECUTE IMMEDIATE stmtStr USING INTERVAL '3' HOURS AS later, 15.0 AS x;
  2024-01-19 16:17:16.692303  225.00

Scala

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("Spark named parameter marker example")
  .getOrCreate()

val argMap = Map("later" -> java.time.Duration.ofHours(3), "x" -> 15.0)
spark.sql(
  sqlText = "SELECT current_timestamp() + :later, :x * :x AS square",
  args = argMap).show()
// +----------------------------------------+------+
// |current_timestamp() + INTERVAL '03' HOUR|square|
// +----------------------------------------+------+
// |                    2023-02-27 17:48:...|225.00|
// +----------------------------------------+------+

Java

import org.apache.spark.sql.*;
import static java.util.Map.entry;

SparkSession spark = SparkSession
  .builder()
  .appName("Java Spark named parameter marker example")
  .getOrCreate();

Map<String, String> argMap = Map.ofEntries(
  entry("later", java.time.Duration.ofHours(3)),
  entry("x", 15.0)
);

spark.sql(
  sqlText = "SELECT current_timestamp() + :later, :x * :x AS square",
  args = argMap).show();
// +----------------------------------------+------+
// |current_timestamp() + INTERVAL '03' HOUR|square|
// +----------------------------------------+------+
// |                    2023-02-27 17:48:...|225.00|
// +----------------------------------------+------+

Python

spark.sql("SELECT :x * :y * :z AS volume", args = { "x" : 3, "y" : 4, "z"  : 5 }).show()
// +------+
// |volume|
// +------+
// |    60|
// +------+

Unnamed parameter markers

Applies to: Databricks Runtime check marked yes 13.3 and above

Unnamed parameter markers are typed placeholder variables. The API invoking the SQL statement must supply an array of arguments to associate each parameter marker with a value in the order in which they appear.

Syntax

 ?

Parameters

  • ?: A reference to a supplied parameter marker in form of a question mark.

Notes

Each occurrence of an unnamed parameter marker consumes a value provided by the API invoking the SQL statement in order. If no value has been bound to the parameter marker, an UNBOUND_SQL_PARAMETER error is raised. You are not required to consume all provided values.

Examples

The following example defines three parameter markers:

  • An INTERVAL HOUR with value 3.
  • Two DOUBLE with value 15.0 each.

Since the parameters are unnamed each provided value is consumed by at most one parameter.

SQL

> DECLARE stmtStr = 'SELECT current_timestamp() + ?, ? * ? AS square';
> EXECUTE IMMEDIATE stmtStr USING INTERVAL '3' HOURS, 15.0, 15.0;
  2024-01-19 16:17:16.692303  225.00

Scala

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("Spark unnamed parameter marker example")
  .getOrCreate()

val argArr = Array(java.time.Duration.ofHours(3), 15.0, 15.0)

spark.sql(
  sqlText = "SELECT current_timestamp() + ?, ? * ? AS square", args = argArr).show()
// +----------------------------------------+------+
// |current_timestamp() + INTERVAL '03' HOUR|square|
// +----------------------------------------+------+
// |                    2023-02-27 17:48:...|225.00|
// +----------------------------------------+------+

Java

import org.apache.spark.sql.*;

SparkSession spark = SparkSession
  .builder()
  .appName("Java Spark unnamed parameter marker example")
  .getOrCreate();

Object[] argArr = new Object[] { java.time.Duration.ofHours(3), 15.0, 15.0 }

spark.sql(
  sqlText = "SELECT current_timestamp() + ?, ? * ? AS square",
  args = argArr).show();
// +----------------------------------------+------+
// |current_timestamp() + INTERVAL '03' HOUR|square|
// +----------------------------------------+------+
// |                    2023-02-27 17:48:...|225.00|
// +----------------------------------------+------+

Python

spark.sql("SELECT ? * ? * ? AS volume", args = { 3, 4, 5 }).show()
// +------+
// |volume|
// +------+
// |    60|
// +------+