Notebook isolation
Notebook isolation refers to the visibility of variables and classes between notebooks. Azure Databricks supports two types of isolation:
- Variable and class isolation
- Spark session isolation
Note
Azure Databricks manages user isolation using access modes configured on clusters.
- No isolation shared: Multiple users can use the same cluster. Users share credentials set at the cluster level. No data access controls are enforced.
- Single User: Only the named user can use the cluster. All commands run with that user's privileges. Table ACLs in the Hive metastore are not enforced. This access mode supports Unity Catalog.
- Shared: Multiple users can use the same cluster. Users are fully isolated from one another, and each user runs commands with their own privileges. Table ACLs in the Hive metastore are enforced. This access mode supports Unity Catalog.
Variable and class isolation
Variables and classes are available only in the current notebook. For example, two notebooks attached to the same cluster can define variables and classes with the same name, but these objects are distinct.
To define a class that is visible to all notebooks attached to the same cluster, define the class in a package cell. Then you can access the class by using its fully qualified name, which is the same as accessing a class in an attached Scala or Java library.
Spark session isolation
Every notebook attached to a cluster has a pre-defined variable named spark that represents a SparkSession. SparkSession is the entry point for using Spark APIs as well as setting runtime configurations.
Spark session isolation is enabled by default, so temporary views defined in one notebook are not visible in other notebooks. To share temporary views across notebooks, use global temporary views. See CREATE VIEW. To disable Spark session isolation, set spark.databricks.session.share to true in the Spark configuration.
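As a rough illustration of sharing data through a global temporary view while session isolation stays enabled, the following sketch assumes the code runs in notebooks attached to the same cluster, where spark is the pre-defined SparkSession. The helper names and the view name shared_ids are hypothetical; global_temp is the default database that Spark uses for global temporary views.

```python
# Sketch only: helper and view names are hypothetical, not part of any
# Databricks API. `spark` is the SparkSession predefined in each notebook.

GLOBAL_TEMP_DB = "global_temp"  # default database for global temporary views


def publish_global_view(df, view_name):
    """Register `df` as a global temporary view and return its qualified name."""
    df.createOrReplaceGlobalTempView(view_name)
    return f"{GLOBAL_TEMP_DB}.{view_name}"


def read_global_view(spark, view_name):
    """Read a global temporary view from another notebook on the same cluster."""
    return spark.table(f"{GLOBAL_TEMP_DB}.{view_name}")


# Notebook A:
#   publish_global_view(spark.range(5), "shared_ids")
# Notebook B (same cluster, separate Spark session):
#   read_global_view(spark, "shared_ids").show()
```

An ordinary temporary view created with createOrReplaceTempView in notebook A would not be visible in notebook B, because each notebook has its own session.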
Important
Setting spark.databricks.session.share to true breaks the monitoring used by both streaming notebook cells and streaming jobs. Specifically:
- The graphs in streaming cells are not displayed.
- Jobs do not block while a stream is running. Instead, they finish "successfully", which stops the stream.
- Streams in jobs are not monitored for termination. Instead, you must manually call awaitTermination().
- The Create a new visualization action doesn't work on streaming DataFrames.
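When session sharing is enabled and a job must stay alive for the lifetime of a stream, the manual awaitTermination() call might look like the sketch below. The function name, the Delta sink format, and the path parameters are assumptions for illustration; df is assumed to be a streaming DataFrame.

```python
# Sketch only: function name, sink format, and paths are hypothetical.

def run_stream_until_done(df, sink_path, checkpoint_path):
    """Start a streaming write and block until the stream terminates."""
    query = (
        df.writeStream
        .format("delta")  # assumed sink format
        .option("checkpointLocation", checkpoint_path)
        .start(sink_path)
    )
    # Without this call, the job can finish while the stream is still running,
    # which stops the stream.
    query.awaitTermination()
    return query
```

Called as the last step of a job notebook, the function returns only after the stream stops or fails.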
Cells that trigger commands in other languages (that is, cells using %scala, %python, %r, and %sql) and cells that include other notebooks (that is, cells using %run) are part of the current notebook. Thus, these cells are in the same session as other notebook cells. By contrast, a notebook workflow runs a notebook with an isolated SparkSession, which means temporary views defined in such a notebook are not visible in other notebooks.
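One way to pass results out of a notebook run as a workflow is to have it register a global temporary view and return the view's qualified name to the caller. The sketch below assumes the child notebook ends with dbutils.notebook.exit("global_temp.<view name>"). The function name and the 60-second timeout are hypothetical; dbutils and spark are the objects available in a Databricks notebook.

```python
# Sketch only: the function name and default timeout are hypothetical.
# Assumes the child notebook creates a global temporary view and passes its
# qualified name (for example "global_temp.my_view") to dbutils.notebook.exit.

def run_child_and_load(dbutils, spark, notebook_path, timeout_seconds=60):
    """Run a child notebook and load the global temporary view it reports."""
    qualified_name = dbutils.notebook.run(notebook_path, timeout_seconds)
    # An ordinary temporary view from the child would not be visible here,
    # because the child runs with its own isolated SparkSession.
    return spark.table(qualified_name)
```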