Databricks Runtime maintenance updates

This page lists maintenance updates issued for Databricks Runtime releases. To add a maintenance update to an existing cluster, restart the cluster.
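
A restart can also be scripted. The following is a minimal sketch, assuming the Clusters REST API endpoint POST /api/2.0/clusters/restart and a personal access token. The host, token, and cluster ID are hypothetical placeholders, and the helper name build_restart_request is illustrative only. The sketch builds the request without sending it.

```python
import json
import urllib.request


def build_restart_request(host, token, cluster_id):
    """Build (but do not send) a POST request that restarts one cluster."""
    body = json.dumps({"cluster_id": cluster_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/clusters/restart",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical placeholder values; replace them with real ones before sending.
request = build_restart_request(
    host="https://example.cloud.databricks.com",
    token="dapi-PLACEHOLDER",
    cluster_id="0000-000000-example0",
)
print(request.get_method(), request.full_url)
# To actually restart the cluster: urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes it possible to inspect the URL and payload before anything is restarted.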

Note

This article contains references to the term whitelist, a term that Azure Databricks does not use. When the term is removed from the software, we'll remove it from this article.

Databricks Runtime releases

Maintenance updates by release:

For the original release notes, follow the link below the subheading.

Databricks Runtime 11.1

See Databricks Runtime 11.1.

  • September 22, 2022

    • [SPARK-40315][SQL] Add hashCode() for Literal of ArrayBasedMapData
    • [SPARK-40380][SQL] Fix constant-folding of InvokeLike to avoid non-serializable literal embedded in the plan
    • [SPARK-40089][SQL] Fix sorting for some Decimal types
    • [SPARK-39887][SQL] RemoveRedundantAliases should keep aliases that make the output of projection nodes unique
    • [SPARK-40152][SQL] Fix split_part codegen compilation issue
  • September 6, 2022

    • We have updated the permission model in Table Access Controls (Table ACLs) so that only MODIFY permissions are needed to change a table's schema or table properties with ALTER TABLE. Previously, these operations required a user to own the table. Ownership is still required to grant permissions on a table, change its owner, change its location, or rename it. This change makes the permission model for Table ACLs more consistent with Unity Catalog.
    • [SPARK-40235][CORE] Use interruptible lock instead of synchronized in Executor.updateDependencies()
    • [SPARK-40212][SQL] SparkSQL castPartValue does not properly handle byte, short, or float
    • [SPARK-40218][SQL] GROUPING SETS should preserve the grouping columns
    • [SPARK-39976][SQL] ArrayIntersect should handle null in left expression correctly
    • [SPARK-40053][CORE][SQL][TESTS] Add assume to dynamic cancel cases which requiring Python runtime environment
    • [SPARK-35542][CORE][ML] Fix: Bucketizer created for multiple columns with parameters splitsArray, inputCols and outputCols can not be loaded after saving it
    • [SPARK-40079][CORE] Add Imputer inputCols validation for empty input case
  • August 24, 2022

    • Shares, providers, and recipients now support SQL commands for changing owners, adding comments, and renaming.
    • [SPARK-39983][CORE][SQL] Do not cache unserialized broadcast relations on the driver
    • [SPARK-39912][SPARK-39828][SQL] Refine CatalogImpl
    • [SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas
    • [SPARK-39806] Fixed an issue where queries that access the METADATA struct crash on partitioned tables
    • [SPARK-39867][SQL] Global limit should not inherit OrderPreservingUnaryNode
    • [SPARK-39962][PYTHON][SQL] Apply projection when group attributes are empty
    • [SPARK-39839][SQL] Handle special case of null variable-length Decimal with non-zero offsetAndSize in UnsafeRow structural integrity check
    • [SPARK-39713][SQL] ANSI mode: add suggestion of using try_element_at for INVALID_ARRAY_INDEX error
    • [SPARK-39847][SS] Fix race condition in RocksDBLoader.loadLibrary() if caller thread is interrupted
    • [SPARK-39731][SQL] Fix issue in CSV and JSON data sources when parsing dates in "yyyyMMdd" format with CORRECTED time parser policy
    • Operating system security updates.
  • August 10, 2022

    • [SPARK-39889] Enhance the error message of division by 0
    • [SPARK-39795][SQL] New SQL function: try_to_timestamp
    • [SPARK-39749] Always use plain string representation on casting decimal as string under ANSI mode
    • [SPARK-39625] Rename df.as to df.to
    • [SPARK-39787][SQL] Use error class in the parsing error of function to_timestamp
    • [SPARK-39625][SQL] Add Dataset.as(StructType)
    • [SPARK-39689] Support 2-chars lineSep in CSV datasource
    • [SPARK-39579][SQL][PYTHON][R] Make ListFunctions/getFunction/functionExists compatible with 3 layer namespace
    • [SPARK-39702][CORE] Reduce memory overhead of TransportCipher$EncryptedMessage by using a shared byteRawChannel
    • [SPARK-39575][AVRO] add ByteBuffer#rewind after ByteBuffer#get in AvroDeserializer
    • [SPARK-39265][SQL] Fix test failure when SPARK_ANSI_SQL_MODE is enabled
    • [SPARK-39441][SQL] Speed up DeduplicateRelations
    • [SPARK-39497][SQL] Improve the analysis exception of missing map key column
    • [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/Double or from Integer to Float
    • [SPARK-39434][SQL] Provide runtime error query context when array index is out of bounding
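
The Table ACLs change in the September 6 update above can be summarized as a small lookup. This is only an illustrative model of the rule as stated in the note, not Databricks code. The operation names, the is_allowed helper, and the assumption that a table owner can perform every listed operation are hypothetical.

```python
# Operations named in the September 6 note, mapped to what they require.
REQUIRED_FOR = {
    "change_schema": "MODIFY",
    "change_table_properties": "MODIFY",
    "grant_permissions": "OWNERSHIP",
    "change_owner": "OWNERSHIP",
    "change_location": "OWNERSHIP",
    "rename": "OWNERSHIP",
}


def is_allowed(operation, is_owner, privileges):
    """Return True if the user may run the operation under the updated model."""
    # Assumption: the owner can perform every operation listed here.
    if is_owner:
        return True
    # Non-owners qualify only for MODIFY-level operations they hold MODIFY for.
    return REQUIRED_FOR[operation] == "MODIFY" and "MODIFY" in privileges


print(is_allowed("change_schema", False, {"MODIFY"}))  # True
print(is_allowed("rename", False, {"MODIFY"}))  # False
```

Before this update, the first call would also have required ownership, which is the behavior the note describes as changed.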

Databricks Runtime 11.0

See Databricks Runtime 11.0.

  • September 22, 2022
    • [SPARK-40315][SQL] Add hashCode() for Literal of ArrayBasedMapData
    • [SPARK-40380][SQL] Fix constant-folding of InvokeLike to avoid non-serializable literal embedded in the plan
    • [SPARK-40089][SQL] Fix sorting for some Decimal types
    • [SPARK-39887][SQL] RemoveRedundantAliases should keep aliases that make the output of projection nodes unique
    • [SPARK-40152][SQL] Fix split_part codegen compilation issue
  • September 6, 2022
    • [SPARK-40235][CORE] Use interruptible lock instead of synchronized in Executor.updateDependencies()
    • [SPARK-40212][SQL] SparkSQL castPartValue does not properly handle byte, short, or float
    • [SPARK-40218][SQL] GROUPING SETS should preserve the grouping columns
    • [SPARK-39976][SQL] ArrayIntersect should handle null in left expression correctly
    • [SPARK-40053][CORE][SQL][TESTS] Add assume to dynamic cancel cases which requiring Python runtime environment
    • [SPARK-35542][CORE][ML] Fix: Bucketizer created for multiple columns with parameters splitsArray, inputCols and outputCols can not be loaded after saving it
    • [SPARK-40079][CORE] Add Imputer inputCols validation for empty input case
  • August 24, 2022
    • [SPARK-39983][CORE][SQL] Do not cache unserialized broadcast relations on the driver
    • [SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas
    • [SPARK-39806] Fixed an issue where queries that access the METADATA struct crash on partitioned tables
    • [SPARK-39867][SQL] Global limit should not inherit OrderPreservingUnaryNode
    • [SPARK-39962][PYTHON][SQL] Apply projection when group attributes are empty
    • Operating system security updates.
  • August 9, 2022
    • [SPARK-39713][SQL] ANSI mode: add suggestion of using try_element_at for INVALID_ARRAY_INDEX error
    • [SPARK-39847] Fix race condition in RocksDBLoader.loadLibrary() if caller thread is interrupted
    • [SPARK-39731][SQL] Fix issue in CSV and JSON data sources when parsing dates in "yyyyMMdd" format with CORRECTED time parser policy
    • [SPARK-39889] Enhance the error message of division by 0
    • [SPARK-39795][SQL] New SQL function: try_to_timestamp
    • [SPARK-39749] Always use plain string representation on casting decimal as string under ANSI mode
    • [SPARK-39625][SQL] Add Dataset.to(StructType)
    • [SPARK-39787][SQL] Use error class in the parsing error of function to_timestamp
    • Operating system security updates.
  • July 27, 2022
    • [SPARK-39689] Support 2-chars lineSep in CSV datasource
    • [SPARK-39104][SQL] InMemoryRelation#isCachedColumnBuffersLoaded should be thread-safe
    • [SPARK-39702][CORE] Reduce memory overhead of TransportCipher$EncryptedMessage by using a shared byteRawChannel
    • [SPARK-39575][AVRO] add ByteBuffer#rewind after ByteBuffer#get in AvroDeserializer
    • [SPARK-39497][SQL] Improve the analysis exception of missing map key column
    • [SPARK-39441][SQL] Speed up DeduplicateRelations
    • [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/Double or from Integer to Float
    • [SPARK-39434][SQL] Provide runtime error query context when array index is out of bounding
    • [SPARK-39570][SQL] Inline table should allow expressions with alias
    • Operating system security updates.
  • July 13, 2022
    • Make Delta MERGE operation results consistent when source is non-deterministic.
    • Fixed an issue for the cloud_files_state TVF when running on non-DBFS paths.
    • [SPARK-38796][SQL] Update to_number and try_to_number functions to allow PR with positive numbers
    • [SPARK-39272][SQL] Increase the start position of query context by 1
    • [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null
    • Operating system security updates.
  • July 5, 2022
    • Improved error messages for a range of error classes.
    • [SPARK-39451][SQL] Support casting intervals to integrals in ANSI mode
    • [SPARK-39361] Don't use Log4J2's extended throwable conversion pattern in default logging configurations
    • [SPARK-39354][SQL] Ensure show Table or view not found even if there are dataTypeMismatchError related to Filter at the same time
    • [SPARK-38675][CORE] Fix race during unlock in BlockInfoManager
    • [SPARK-39392][SQL] Refine ANSI error messages for try_* function hints
    • [SPARK-39214][SQL][3.3] Improve errors related to CAST
    • [SPARK-37939][SQL] Use error classes in the parsing errors of properties
    • [SPARK-39085][SQL] Move the error message of INCONSISTENT_BEHAVIOR_CROSS_VERSION to error-classes.json
    • [SPARK-39376][SQL] Hide duplicated columns in star expansion of subquery alias from NATURAL/USING JOIN
    • [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator
    • [SPARK-39285][SQL] Spark should not check field names when reading files
    • Operating system security updates.

Databricks Runtime 10.5

See Databricks Runtime 10.5.

  • September 22, 2022
    • [SPARK-40315][SQL] Add hashCode() for Literal of ArrayBasedMapData
    • [SPARK-40213][SQL] Support ASCII value conversion for Latin-1 characters
    • [SPARK-40380][SQL] Fix constant-folding of InvokeLike to avoid non-serializable literal embedded in the plan
    • [SPARK-38404][SQL] Improve CTE resolution when a nested CTE references an outer CTE
    • [SPARK-40089][SQL] Fix sorting for some Decimal types
    • [SPARK-39887][SQL] RemoveRedundantAliases should keep aliases that make the output of projection nodes unique
  • September 6, 2022
    • [SPARK-40235][CORE] Use interruptible lock instead of synchronized in Executor.updateDependencies()
    • [SPARK-39976][SQL] ArrayIntersect should handle null in left expression correctly
    • [SPARK-40053][CORE][SQL][TESTS] Add assume to dynamic cancel cases which requiring Python runtime environment
    • [SPARK-35542][CORE][ML] Fix: Bucketizer created for multiple columns with parameters splitsArray, inputCols and outputCols can not be loaded after saving it
    • [SPARK-40079][CORE] Add Imputer inputCols validation for empty input case
  • August 24, 2022
    • [SPARK-39983][CORE][SQL] Do not cache unserialized broadcast relations on the driver
    • [SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas
    • [SPARK-39806] Fixed an issue where queries that access the METADATA struct crash on partitioned tables
    • [SPARK-39962][PYTHON][SQL] Apply projection when group attributes are empty
    • [SPARK-37643][SQL] when charVarcharAsString is true, for char datatype predicate query should skip rpadding rule
    • Operating system security updates.
  • August 9, 2022
    • [SPARK-39847] Fix race condition in RocksDBLoader.loadLibrary() if caller thread is interrupted
    • [SPARK-39731][SQL] Fix issue in CSV and JSON data sources when parsing dates in "yyyyMMdd" format with CORRECTED time parser policy
    • Operating system security updates.
  • July 27, 2022
    • [SPARK-39625][SQL] Add Dataset.as(StructType)
    • [SPARK-39689] Support 2-chars lineSep in CSV datasource
    • [SPARK-39104][SQL] InMemoryRelation#isCachedColumnBuffersLoaded should be thread-safe
    • [SPARK-39570][SQL] Inline table should allow expressions with alias
    • [SPARK-39702][CORE] Reduce memory overhead of TransportCipher$EncryptedMessage by using a shared byteRawChannel
    • [SPARK-39575][AVRO] add ByteBuffer#rewind after ByteBuffer#get in AvroDeserializer
    • [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/Double or from Integer to Float
    • Operating system security updates.
  • July 13, 2022
    • Make Delta MERGE operation results consistent when source is non-deterministic.
    • [SPARK-39355][SQL] Single column uses quoted to construct UnresolvedAttribute
    • [SPARK-39548][SQL] CreateView Command with a window clause query hit a wrong window definition not found issue
    • [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null
    • Operating system security updates.
  • July 5, 2022
    • [SPARK-39376][SQL] Hide duplicated columns in star expansion of subquery alias from NATURAL/USING JOIN
    • Operating system security updates.
  • June 15, 2022
    • [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator
    • [SPARK-39285][SQL] Spark should not check field names when reading files
    • [SPARK-34096][SQL] Improve performance for nth_value ignore nulls over offset window
    • [SPARK-36718][SQL][FOLLOWUP] Fix the isExtractOnly check in CollapseProject
  • June 2, 2022
    • [SPARK-39166][SQL] Provide runtime error query context for binary arithmetic when WSCG is off
    • [SPARK-39093][SQL] Avoid codegen compilation error when dividing year-month intervals or day-time intervals by an integral
    • [SPARK-38990][SQL] Avoid NullPointerException when evaluating date_trunc/trunc format as a bound reference
    • Operating system security updates.
  • May 18, 2022
    • Fixes a potential native memory leak in Auto Loader.
    • [SPARK-38868][SQL] Don't propagate exceptions from filter predicate when optimizing outer joins
    • [SPARK-38796][SQL] Implement the to_number and try_to_number SQL functions according to a new specification
    • [SPARK-38918][SQL] Nested column pruning should filter out attributes that do not belong to the current relation
    • [SPARK-38929][SQL] Improve error messages for cast failures in ANSI
    • [SPARK-38926][SQL] Output types in error messages in SQL style
    • [SPARK-39084][PYSPARK] Fix df.rdd.isEmpty() by using TaskContext to stop iterator on task completion
    • [SPARK-32268][SQL] Add ColumnPruning in injectBloomFilter
    • [SPARK-38908][SQL] Provide query context in runtime error of Casting from String to Number/Date/Timestamp/Boolean
    • [SPARK-39046][SQL] Return an empty context string if TreeNode.origin is wrongly set
    • [SPARK-38974][SQL] Filter registered functions with a given database name in list functions
    • [SPARK-38762][SQL] Provide query context in Decimal overflow errors
    • [SPARK-38931][SS] Create root dfs directory for RocksDBFileManager with unknown number of keys on 1st checkpoint
    • [SPARK-38992][CORE] Avoid using bash -c in ShellBasedGroupsMappingProvider
    • [SPARK-38716][SQL] Provide query context in map key not exists error
    • [SPARK-38889][SQL] Compile boolean column filters to use the bit type for MSSQL data source
    • [SPARK-38698][SQL] Provide query context in runtime error of Divide/Div/Remainder/Pmod
    • [SPARK-38823][SQL] Make NewInstance non-foldable to fix aggregation buffer corruption issue
    • [SPARK-38809][SS] Implement option to skip null values in symmetric hash implementation of stream-stream joins
    • [SPARK-38676][SQL] Provide SQL query context in runtime error message of Add/Subtract/Multiply
    • [SPARK-38677][PYSPARK] Python MonitorThread should detect deadlock due to blocking I/O
    • Operating system security updates.

Databricks Runtime 10.4

See Databricks Runtime 10.4 LTS.

  • September 22, 2022
    • Users can run spark.conf.set("spark.databricks.io.listKeysWithPrefix.azure.enabled", "true") to re-enable native listing for Auto Loader on ADLS Gen2. Native listing was previously turned off due to performance issues, but turning it off may have increased storage costs for customers.
    • [SPARK-40315][SQL] Add hashCode() for Literal of ArrayBasedMapData
    • [SPARK-40213][SQL] Support ASCII value conversion for Latin-1 characters
    • [SPARK-40380][SQL] Fix constant-folding of InvokeLike to avoid non-serializable literal embedded in the plan
    • [SPARK-38404][SQL] Improve CTE resolution when a nested CTE references an outer CTE
    • [SPARK-40089][SQL] Fix sorting for some Decimal types
    • [SPARK-39887][SQL] RemoveRedundantAliases should keep aliases that make the output of projection nodes unique
  • September 6, 2022
    • [SPARK-40235][CORE] Use interruptible lock instead of synchronized in Executor.updateDependencies()
    • [SPARK-40218][SQL] GROUPING SETS should preserve the grouping columns
    • [SPARK-39976][SQL] ArrayIntersect should handle null in left expression correctly
    • [SPARK-40053][CORE][SQL][TESTS] Add assume to dynamic cancel cases which requiring Python runtime environment
    • [SPARK-35542][CORE][ML] Fix: Bucketizer created for multiple columns with parameters splitsArray, inputCols and outputCols can not be loaded after saving it
    • [SPARK-40079][CORE] Add Imputer inputCols validation for empty input case
  • August 24, 2022
    • [SPARK-39983][CORE][SQL] Do not cache unserialized broadcast relations on the driver
    • [SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas
    • [SPARK-39962][PYTHON][SQL] Apply projection when group attributes are empty
    • [SPARK-37643][SQL] when charVarcharAsString is true, for char datatype predicate query should skip rpadding rule
    • Operating system security updates.
  • August 9, 2022
    • [SPARK-39847] Fix race condition in RocksDBLoader.loadLibrary() if caller thread is interrupted
    • [SPARK-39731][SQL] Fix issue in CSV and JSON data sources when parsing dates in "yyyyMMdd" format with CORRECTED time parser policy
    • Operating system security updates.
  • July 27, 2022
    • [SPARK-39625][SQL] Add Dataset.as(StructType)
    • [SPARK-39689] Support 2-chars lineSep in CSV datasource
    • [SPARK-39104][SQL] InMemoryRelation#isCachedColumnBuffersLoaded should be thread-safe
    • [SPARK-39570][SQL] Inline table should allow expressions with alias
    • [SPARK-39702][CORE] Reduce memory overhead of TransportCipher$EncryptedMessage by using a shared byteRawChannel
    • [SPARK-39575][AVRO] add ByteBuffer#rewind after ByteBuffer#get in AvroDeserializer
    • [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/Double or from Integer to Float
    • [SPARK-38868][SQL] Don't propagate exceptions from filter predicate when optimizing outer joins
    • Operating system security updates.
  • July 20, 2022
    • Make Delta MERGE operation results consistent when source is non-deterministic.
    • [SPARK-39355][SQL] Single column uses quoted to construct UnresolvedAttribute
    • [SPARK-39548][SQL] CreateView Command with a window clause query hit a wrong window definition not found issue
    • [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null
    • Disabled Auto Loader's use of native cloud APIs for directory listing on Azure.
    • Operating system security updates.
  • July 5, 2022
    • [SPARK-39376][SQL] Hide duplicated columns in star expansion of subquery alias from NATURAL/USING JOIN
    • Operating system security updates.
  • June 15, 2022
    • [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator
    • [SPARK-39285][SQL] Spark should not check field names when reading files
    • [SPARK-34096][SQL] Improve performance for nth_value ignore nulls over offset window
    • [SPARK-36718][SQL][FOLLOWUP] Fix the isExtractOnly check in CollapseProject
  • June 2, 2022
    • [SPARK-39093][SQL] Avoid codegen compilation error when dividing year-month intervals or day-time intervals by an integral
    • [SPARK-38990][SQL] Avoid NullPointerException when evaluating date_trunc/trunc format as a bound reference
    • Operating system security updates.
  • May 18, 2022
    • Fixes a potential native memory leak in Auto Loader.
    • [SPARK-38918][SQL] Nested column pruning should filter out attributes that do not belong to the current relation
    • [SPARK-37593][CORE] Reduce default page size by LONG_ARRAY_OFFSET if G1GC and ON_HEAP are used
    • [SPARK-39084][PYSPARK] Fix df.rdd.isEmpty() by using TaskContext to stop iterator on task completion
    • [SPARK-32268][SQL] Add ColumnPruning in injectBloomFilter
    • [SPARK-38974][SQL] Filter registered functions with a given database name in list functions
    • [SPARK-38931][SS] Create root dfs directory for RocksDBFileManager with unknown number of keys on 1st checkpoint
    • Operating system security updates.
  • April 19, 2022
    • Upgraded Java AWS SDK from version 1.11.655 to 1.12.1899.
    • Fixed an issue with notebook-scoped libraries not working in batch streaming jobs.
    • [SPARK-38616][SQL] Keep track of SQL query text in Catalyst TreeNode
    • Operating system security updates.
  • April 6, 2022
    • The following Spark SQL functions are now available with this release:
      • timestampadd() and dateadd(): Add a time duration in a specified unit to a timestamp expression.
      • timestampdiff() and datediff(): Calculate the time difference in a specified unit between two timestamp expressions.
    • Parquet-MR has been upgraded to 1.12.2.
    • Improved support for wide schemas in Parquet files.
    • [SPARK-38631][CORE] Uses Java-based implementation for un-tarring at Utils.unpack
    • [SPARK-38509][SPARK-38481] Cherry-pick 3 timestampadd/diff related changes
    • [SPARK-38523][SQL] Fix referring to the corrupt record column from CSV
    • [SPARK-38237][SQL][SS] Allow ClusteredDistribution to require full clustering keys
    • [SPARK-38437][SQL] Lenient serialization of datetime from datasource
    • [SPARK-38180][SQL] Allow safe up-cast expressions in correlated equality predicates
    • [SPARK-38155][SQL] Disallow distinct aggregate in lateral subqueries with unsupported predicates
    • Operating system security updates.
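
The most recent update in the list above names a Spark configuration key that re-enables native listing for Auto Loader on ADLS Gen2. A minimal sketch of applying it from Python follows. The configuration key is taken from the note. The helper name enable_native_listing is hypothetical, and the sketch assumes a session object that exposes conf.set and conf.get, as the predefined spark object in a notebook does.

```python
# Configuration key quoted in the maintenance note above.
NATIVE_LISTING_KEY = "spark.databricks.io.listKeysWithPrefix.azure.enabled"


def enable_native_listing(spark_session):
    """Turn native listing back on for this session and return the stored value."""
    spark_session.conf.set(NATIVE_LISTING_KEY, "true")
    # Read the value back so the caller can confirm what the session now holds.
    return spark_session.conf.get(NATIVE_LISTING_KEY)


# In a notebook, where `spark` is predefined:
# enable_native_listing(spark)
```

The setting applies to the current Spark session only, so it needs to run before the Auto Loader stream starts.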

Databricks Runtime 10.3 (Unsupported)

See Databricks Runtime 10.3 (Unsupported).

  • July 27, 2022
    • [SPARK-39689] Support 2-chars lineSep in CSV datasource
    • [SPARK-39104][SQL] InMemoryRelation#isCachedColumnBuffersLoaded should be thread-safe
    • [SPARK-39702][CORE] Reduce memory overhead of TransportCipher$EncryptedMessage by using a shared byteRawChannel
    • Operating system security updates.
  • July 20, 2022
    • Make Delta MERGE operation results consistent when source is non-deterministic.
    • [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/Double or from Integer to Float
    • [SPARK-39548][SQL] CreateView Command with a window clause query hit a wrong window definition not found issue
    • [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null
    • Operating system security updates.
  • July 5, 2022
    • [SPARK-39376][SQL] Hide duplicated columns in star expansion of subquery alias from NATURAL/USING JOIN
    • Operating system security updates.
  • June 15, 2022
    • [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator
    • [SPARK-39285][SQL] Spark should not check field names when reading files
    • [SPARK-34096][SQL] Improve performance for nth_value ignore nulls over offset window
    • [SPARK-36718][SQL][FOLLOWUP] Fix the isExtractOnly check in CollapseProject
  • June 2, 2022
    • [SPARK-38990][SQL] Avoid NullPointerException when evaluating date_trunc/trunc format as a bound reference
    • Operating system security updates.
  • May 18, 2022
    • Fixes a potential native memory leak in Auto Loader.
    • [SPARK-38918][SQL] Nested column pruning should filter out attributes that do not belong to the current relation
    • [SPARK-37593][CORE] Reduce default page size by LONG_ARRAY_OFFSET if G1GC and ON_HEAP are used
    • [SPARK-39084][PYSPARK] Fix df.rdd.isEmpty() by using TaskContext to stop iterator on task completion
    • [SPARK-32268][SQL] Add ColumnPruning in injectBloomFilter
    • [SPARK-38974][SQL] Filter registered functions with a given database name in list functions
    • [SPARK-38889][SQL] Compile boolean column filters to use the bit type for MSSQL data source
    • Operating system security updates.
  • May 4, 2022
    • Upgraded Java AWS SDK from version 1.11.655 to 1.12.1899.
  • April 19, 2022
    • [SPARK-38616][SQL] Keep track of SQL query text in Catalyst TreeNode
    • Operating system security updates.
  • April 6, 2022
    • [SPARK-38631][CORE] Uses Java-based implementation for un-tarring at Utils.unpack
    • Operating system security updates.
  • March 22, 2022
    • Changed the current working directory of notebooks on High Concurrency clusters with either table access control or credential passthrough enabled to the user's home directory. Previously, the working directory was /databricks/driver.
    • [SPARK-38437][SQL] Lenient serialization of datetime from datasource
    • [SPARK-38180][SQL] Allow safe up-cast expressions in correlated equality predicates
    • [SPARK-38155][SQL] Disallow distinct aggregate in lateral subqueries with unsupported predicates
    • [SPARK-38325][SQL] ANSI mode: avoid potential runtime error in HashJoin.extractKeyExprAt()
  • March 14, 2022
    • Improved transaction conflict detection for empty transactions in Delta Lake.
    • [SPARK-38185][SQL] Fix data incorrect if aggregate function is empty
    • [SPARK-38318][SQL] regression when replacing a dataset view
    • [SPARK-38236][SQL] Absolute file paths specified in create/alter table are treated as relative
    • [SPARK-35937][SQL] Extracting date field from timestamp should work in ANSI mode
    • [SPARK-34069][SQL] Kill barrier tasks should respect SPARK_JOB_INTERRUPT_ON_CANCEL
    • [SPARK-37707][SQL] Allow store assignment between TimestampNTZ and Date/Timestamp
  • February 23, 2022
    • [SPARK-27442][SQL] Remove check field name when reading/writing data in parquet

Databricks Runtime 10.2 (Unsupported)

See Databricks Runtime 10.2 (Unsupported).

  • June 15, 2022
    • [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator
    • [SPARK-39285][SQL] Spark should not check field names when reading files
    • [SPARK-34096][SQL] Improve performance for nth_value ignore nulls over offset window
  • June 2, 2022
    • [SPARK-38918][SQL] Nested column pruning should filter out attributes that do not belong to the current relation
    • [SPARK-38990][SQL] Avoid NullPointerException when evaluating date_trunc/trunc format as a bound reference
    • Operating system security updates.
  • May 18, 2022
    • Fixes a potential native memory leak in Auto Loader.
    • [SPARK-39084][PYSPARK] Fix df.rdd.isEmpty() by using TaskContext to stop iterator on task completion
    • [SPARK-38889][SQL] Compile boolean column filters to use the bit type for MSSQL data source
    • [SPARK-38931][SS] Create root dfs directory for RocksDBFileManager with unknown number of keys on 1st checkpoint
    • Operating system security updates.
  • May 4, 2022
    • Upgraded Java AWS SDK from version 1.11.655 to 1.12.1899.
  • April 19, 2022
    • Operating system security updates.
    • Miscellaneous bug fixes.
  • April 6, 2022
    • [SPARK-38631][CORE] Uses Java-based implementation for un-tarring at Utils.unpack
    • Operating system security updates.
  • March 22, 2022
    • Changed the current working directory of notebooks on High Concurrency clusters with either table access control or credential passthrough enabled to the user's home directory. Previously, the working directory was /databricks/driver.
    • [SPARK-38437][SQL] Lenient serialization of datetime from datasource
    • [SPARK-38180][SQL] Allow safe up-cast expressions in correlated equality predicates
    • [SPARK-38155][SQL] Disallow distinct aggregate in lateral subqueries with unsupported predicates
    • [SPARK-38325][SQL] ANSI mode: avoid potential runtime error in HashJoin.extractKeyExprAt()
  • March 14, 2022
    • Improved transaction conflict detection for empty transactions in Delta Lake.
    • [SPARK-38185][SQL] Fix data incorrect if aggregate function is empty
    • [SPARK-38318][SQL] regression when replacing a dataset view
    • [SPARK-38236][SQL] Absolute file paths specified in create/alter table are treated as relative
    • [SPARK-35937][SQL] Extracting date field from timestamp should work in ANSI mode
    • [SPARK-34069][SQL] Kill barrier tasks should respect SPARK_JOB_INTERRUPT_ON_CANCEL
    • [SPARK-37707][SQL] Allow store assignment between TimestampNTZ and Date/Timestamp
  • February 23, 2022
    • [SPARK-37577][SQL] Fix ClassCastException: ArrayType cannot be cast to StructType for Generate Pruning
  • February 8, 2022
    • [SPARK-27442][SQL] Remove check field name when reading/writing data in parquet.
    • Operating system security updates.
  • February 1, 2022
    • Operating system security updates.
  • January 26, 2022
    • Fixed a bug where concurrent transactions on Delta tables could commit in a non-serializable order under certain rare conditions.
    • Fixed a bug where the OPTIMIZE command could fail when the ANSI SQL dialect was enabled.
  • January 19, 2022
    • Introduced support for inlining temporary credentials in COPY INTO to load the source data without requiring SQL ANY_FILE permissions.
    • Bug fixes and security enhancements.
  • December 20, 2021
    • Fixed a rare bug with Parquet column index based filtering.

Databricks Runtime 10.1 (Unsupported)

See Databricks Runtime 10.1 (Unsupported).

  • June 15, 2022
    • [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator
    • [SPARK-39285][SQL] Spark should not check field names when reading files
    • [SPARK-34096][SQL] Improve performance for nth_value ignore nulls over offset window
  • June 2, 2022
    • Operating system security updates.
  • May 18, 2022
    • Fixes a potential native memory leak in Auto Loader.
    • [SPARK-39084][PYSPARK] Fix df.rdd.isEmpty() by using TaskContext to stop iterator on task completion
    • [SPARK-38889][SQL] Compile boolean column filters to use the bit type for MSSQL data source
    • Operating system security updates.
  • April 19, 2022
    • [SPARK-37270][SQL] Fix push foldable into CaseWhen branches if elseValue is empty
    • Operating system security updates.
  • April 6, 2022
    • [SPARK-38631][CORE] Uses Java-based implementation for un-tarring at Utils.unpack
    • Operating system security updates.
  • March 22, 2022
    • [SPARK-38437][SQL] Lenient serialization of datetime from datasource
    • [SPARK-38180][SQL] Allow safe up-cast expressions in correlated equality predicates
    • [SPARK-38155][SQL] Disallow distinct aggregate in lateral subqueries with unsupported predicates
    • [SPARK-38325][SQL] ANSI mode: avoid potential runtime error in HashJoin.extractKeyExprAt()
  • March 14, 2022
    • Improved transaction conflict detection for empty transactions in Delta Lake.
    • [SPARK-38185][SQL] Fix data incorrect if aggregate function is empty
    • [SPARK-38318][SQL] regression when replacing a dataset view
    • [SPARK-38236][SQL] Absolute file paths specified in create/alter table are treated as relative
    • [SPARK-35937][SQL] Extracting date field from timestamp should work in ANSI mode
    • [SPARK-34069][SQL] Kill barrier tasks should respect SPARK_JOB_INTERRUPT_ON_CANCEL
    • [SPARK-37707][SQL] Allow store assignment between TimestampNTZ and Date/Timestamp
  • February 23, 2022
    • [SPARK-37577][SQL] Fix ClassCastException: ArrayType cannot be cast to StructType for Generate Pruning
  • February 8, 2022
    • [SPARK-27442][SQL] Remove check field name when reading/writing data in parquet.
    • Operating system security updates.
  • February 1, 2022
    • Operating system security updates.
  • January 26, 2022
    • Fixed a bug where concurrent transactions on Delta tables could commit in a non-serializable order under certain rare conditions.
    • Fixed a bug where the OPTIMIZE command could fail when the ANSI SQL dialect was enabled.
  • January 19, 2022
    • Introduced support for inlining temporary credentials to COPY INTO for loading the source data without requiring SQL ANY_FILE permissions.
    • Fixed an out of memory issue with query result caching under certain conditions.
    • Fixed an issue with USE DATABASE when a user switches the current catalog to a non-default catalog.
    • Bug fixes and security enhancements.
    • Operating system security updates.
  • December 20, 2021
    • Fixed a rare bug with Parquet column index based filtering.

Databricks Runtime 10.0 (Unsupported)

See Databricks Runtime 10.0 (Unsupported).

  • April 19, 2022

    • [SPARK-37270][SQL] Fix push foldable into CaseWhen branches if elseValue is empty
    • Operating system security updates.
  • April 6, 2022

    • [SPARK-38631][CORE] Uses Java-based implementation for un-tarring at Utils.unpack
    • Operating system security updates.
  • March 22, 2022

    • [SPARK-38437][SQL] Lenient serialization of datetime from datasource
    • [SPARK-38180][SQL] Allow safe up-cast expressions in correlated equality predicates
    • [SPARK-38155][SQL] Disallow distinct aggregate in lateral subqueries with unsupported predicates
    • [SPARK-38325][SQL] ANSI mode: avoid potential runtime error in HashJoin.extractKeyExprAt()
  • March 14, 2022

    • Improved transaction conflict detection for empty transactions in Delta Lake.
    • [SPARK-38185][SQL] Fix data incorrect if aggregate function is empty
    • [SPARK-38318][SQL] regression when replacing a dataset view
    • [SPARK-38236][SQL] Absolute file paths specified in create/alter table are treated as relative
    • [SPARK-35937][SQL] Extracting date field from timestamp should work in ANSI mode
    • [SPARK-34069][SQL] Kill barrier tasks should respect SPARK_JOB_INTERRUPT_ON_CANCEL
    • [SPARK-37707][SQL] Allow store assignment between TimestampNTZ and Date/Timestamp
  • February 23, 2022

    • [SPARK-37577][SQL] Fix ClassCastException: ArrayType cannot be cast to StructType for Generate Pruning
  • February 8, 2022

    • [SPARK-27442][SQL] Remove check field name when reading/writing data in parquet.
    • [SPARK-36905][SQL] Fix reading hive views without explicit column names
    • [SPARK-37859][SQL] Fix issue that SQL tables created with JDBC with Spark 3.1 are not readable with 3.2
    • Operating system security updates.
  • February 1, 2022

    • Operating system security updates.
  • January 26, 2022

    • Fixed a bug where concurrent transactions on Delta tables could commit in a non-serializable order under certain rare conditions.
    • Fixed a bug where the OPTIMIZE command could fail when the ANSI SQL dialect was enabled.
  • January 19, 2022

    • Bug fixes and security enhancements.
    • Operating system security updates.
  • December 20, 2021

    • Fixed a rare bug with Parquet column index based filtering.
  • November 9, 2021

    • Introduced additional configuration flags to enable fine grained control of ANSI behaviors.
  • November 4, 2021

    • Fixed a bug that could cause Structured Streaming streams to fail with an ArrayIndexOutOfBoundsException.
    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: No FileSystem for scheme or that might cause modifications to sparkContext.hadoopConfiguration to not take effect in queries.
  • November 30, 2021

    • Fixed an issue with timestamp parsing where a timezone string without a colon was considered invalid.
    • Fixed an out of memory issue with query result caching under certain conditions.
    • Fixed an issue with USE DATABASE when a user switches the current catalog to a non-default catalog.
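
The November 30, 2021 entry above concerns timezone strings written without a colon. As a rough illustration of why both spellings should be accepted, the sketch below uses Python's standard library (not Spark's timestamp parser) to show that +0530 and +05:30 denote the same UTC offset. The helper name parse_with_offset and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

# Python's %z directive accepts UTC offsets with or without a colon (Python 3.7+).
FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def parse_with_offset(text: str) -> datetime:
    """Parse a timestamp whose UTC offset may or may not contain a colon."""
    return datetime.strptime(text, FORMAT)


with_colon = parse_with_offset("2021-11-30T10:00:00+05:30")
without_colon = parse_with_offset("2021-11-30T10:00:00+0530")

# Both spellings describe the same instant and the same offset.
assert with_colon == without_colon
assert without_colon.utcoffset() == timedelta(hours=5, minutes=30)
```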

Databricks Runtime 9.1 LTS

See Databricks Runtime 9.1 LTS.

  • September 22nd, 2022

    • Users can set spark.conf.set("spark.databricks.io.listKeysWithPrefix.azure.enabled", "true") to re-enable native listing for Auto Loader on ADLS Gen2. Native listing was previously turned off due to performance issues, but turning it off may have led to an increase in storage costs for customers.
    • [SPARK-40315][SQL] Add hashCode() for Literal of ArrayBasedMapData
    • [SPARK-40089][SQL] Fix sorting for some Decimal types
    • [SPARK-39887][SQL] RemoveRedundantAliases should keep aliases that make the output of projection nodes unique
  • September 6th, 2022

    • [SPARK-40235][CORE] Use interruptible lock instead of synchronized in Executor.updateDependencies()
    • [SPARK-35542][CORE][ML] Fix: Bucketizer created for multiple columns with parameters splitsArray, inputCols and outputCols can not be loaded after saving it
    • [SPARK-40079][CORE] Add Imputer inputCols validation for empty input case
  • August 24, 2022

    • [SPARK-39666][SQL] Use UnsafeProjection.create to respect spark.sql.codegen.factoryMode in ExpressionEncoder
    • [SPARK-39962][PYTHON][SQL] Apply projection when group attributes are empty
    • Operating system security updates.
  • August 9, 2022

    • Operating system security updates.
  • July 27, 2022

    • Make Delta MERGE operation results consistent when source is non-deterministic.
    • [SPARK-39689] Support 2-chars lineSep in CSV datasource
    • [SPARK-39575][AVRO] add ByteBuffer#rewind after ByteBuffer#get in AvroDeserializer
    • [SPARK-37392][SQL] Fix the performance bug when inferring constraints for Generate
    • Operating system security updates.
  • July 13, 2022

    • [SPARK-39419][SQL] Fix ArraySort to throw an exception when the comparator returns null
    • Disabled Auto Loader's use of native cloud APIs for directory listing on Azure.
    • Operating system security updates.
  • July 5, 2022

    • Operating system security updates.
    • Miscellaneous bug fixes.
  • June 15, 2022

    • [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator
  • June 2, 2022

    • [SPARK-34554][SQL] Implement the copy() method in ColumnarMap
    • Operating system security updates.
  • May 18, 2022

    • Fixes a potential native memory leak in Auto Loader.
    • Upgrade AWS SDK version from 1.11.655 to 1.11.678.
    • [SPARK-38918][SQL] Nested column pruning should filter out attributes that do not belong to the current relation
    • [SPARK-39084][PYSPARK] Fix df.rdd.isEmpty() by using TaskContext to stop iterator on task completion
    • Operating system security updates.
  • April 19, 2022

    • Operating system security updates.
    • Miscellaneous bug fixes.
  • April 6, 2022

    • [SPARK-38631][CORE] Uses Java-based implementation for un-tarring at Utils.unpack
    • Operating system security updates.
  • March 22, 2022

    • Changed the current working directory of notebooks on High Concurrency clusters with either table access control or credential passthrough enabled to the user's home directory. Previously, the working directory was /databricks/driver.
    • [SPARK-38437][SQL] Lenient serialization of datetime from datasource
    • [SPARK-38180][SQL] Allow safe up-cast expressions in correlated equality predicates
    • [SPARK-38155][SQL] Disallow distinct aggregate in lateral subqueries with unsupported predicates
    • [SPARK-27442][SQL] Remove check field name when reading/writing data in parquet
  • March 14, 2022

    • [SPARK-38236][SQL] Absolute file paths specified in create/alter table are treated as relative
    • [SPARK-34069][SQL] Kill barrier tasks should respect SPARK_JOB_INTERRUPT_ON_CANCEL
  • February 23, 2022

    • [SPARK-37859][SQL] Do not check for metadata during schema comparison
  • February 8, 2022

    • [SPARK-27442][SQL] Remove check field name when reading/writing data in parquet.
    • Operating system security updates.
  • February 1, 2022

    • Operating system security updates.
  • January 26, 2022

    • Fixed a bug where concurrent transactions on Delta tables could commit in a non-serializable order under certain rare conditions.
    • Fixed a bug where the OPTIMIZE command could fail when the ANSI SQL dialect was enabled.
  • January 19, 2022

    • Bug fixes and security enhancements.
    • Operating system security updates.
  • November 4, 2021

    • Fixed a bug that could cause Structured Streaming streams to fail with an ArrayIndexOutOfBoundsException.
    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: No FileSystem for scheme or that might cause modifications to sparkContext.hadoopConfiguration to not take effect in queries.
  • October 20, 2021

    • Upgraded BigQuery connector from 0.18.1 to 0.22.2. This adds support for BigNumeric type.

Databricks Runtime 9.0 (Unsupported)

See Databricks Runtime 9.0 (Unsupported).

  • February 8, 2022

    • Operating system security updates.
  • February 1, 2022

    • Operating system security updates.
  • January 26, 2022

    • Fixed a bug where the OPTIMIZE command could fail when the ANSI SQL dialect was enabled.
  • January 19, 2022

    • Bug fixes and security enhancements.
    • Operating system security updates.
  • November 4, 2021

    • Fixed a bug that could cause Structured Streaming streams to fail with an ArrayIndexOutOfBoundsException.
    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: No FileSystem for scheme or that might cause modifications to sparkContext.hadoopConfiguration to not take effect in queries.
  • September 22, 2021

    • Fixed a bug in cast Spark array with null to string
  • September 15, 2021

    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_x_piecey of broadcast_x.
  • September 8, 2021

    • Added support for schema name (databaseName.schemaName.tableName format) as the target table name for Azure Synapse Connector.
    • Added geometry and geography JDBC types support for Spark SQL.
    • [SPARK-33527][SQL] Extended the function of decode to be consistent with mainstream databases.
    • [SPARK-36532][CORE][3.1] Fixed deadlock in CoarseGrainedExecutorBackend.onDisconnected to prevent executor shutdown hang.
  • August 25, 2021

    • SQL Server driver library was upgraded to 9.2.1.jre8.
    • Snowflake connector was upgraded to 2.9.0.
    • Fixed broken link to best trial notebook on AutoML experiment page.

Databricks Runtime 8.4 (Unsupported)

See Databricks Runtime 8.4 (Unsupported).

  • January 19, 2022

    • Operating system security updates.
  • November 4, 2021

    • Fixed a bug that could cause Structured Streaming streams to fail with an ArrayIndexOutOfBoundsException.
    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: No FileSystem for scheme or that might cause modifications to sparkContext.hadoopConfiguration to not take effect in queries.
  • September 22, 2021

    • Spark JDBC driver was upgraded to 2.6.19.1030
    • [SPARK-36734][SQL] Upgrade ORC to 1.5.1
  • September 15, 2021

    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_x_piecey of broadcast_x.
    • Operating system security updates.
  • September 8, 2021

    • [SPARK-36532][CORE][3.1] Fixed deadlock in CoarseGrainedExecutorBackend.onDisconnected to prevent executor shutdown hang.
  • August 25, 2021

    • SQL Server driver library was upgraded to 9.2.1.jre8.
    • Snowflake connector was upgraded to 2.9.0.
    • Fixes a bug in credential passthrough caused by the new Parquet prefetch optimization, where user's passthrough credential might not be found during file access.
  • August 11, 2021

    • Fixes a RocksDB incompatibility problem that prevents older Databricks Runtime 8.4. This fixes forward compatibility for Auto Loader, COPY INTO, and stateful streaming applications.
    • Fixes a bug when using Auto Loader to read CSV files with mismatching header files. If column names do not match, the column would be filled in with nulls. Now, if a schema is provided, it assumes the schema is the same and will only save column mismatches if rescued data columns are enabled.
    • Adds a new option called externalDataSource into the Azure Synapse connector to remove the CONTROL permission requirement on the database for PolyBase reading.
  • July 29, 2021

    • [SPARK-36034][BUILD] Rebase datetime in pushed down filters to Parquet
    • [SPARK-36163][BUILD] Propagate correct JDBC properties in JDBC connector provider and add connectionProvider option

Databricks Runtime 8.3 (Unsupported)

See Databricks Runtime 8.3 (Unsupported).

  • January 19, 2022
    • Operating system security updates.
  • November 4, 2021
    • Fixed a bug that could cause Structured Streaming streams to fail with an ArrayIndexOutOfBoundsException.
    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: No FileSystem for scheme or that might cause modifications to sparkContext.hadoopConfiguration to not take effect in queries.
  • September 22, 2021
    • Spark JDBC driver was upgraded to 2.6.19.1030
  • September 15, 2021
    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_x_piecey of broadcast_x.
    • Operating system security updates.
  • September 8, 2021
    • [SPARK-35700][SQL][WARMFIX] Read char/varchar orc table when created and written by external systems.
    • [SPARK-36532][CORE][3.1] Fixed deadlock in CoarseGrainedExecutorBackend.onDisconnected to prevent executor shutdown hang.
  • August 25, 2021
    • SQL Server driver library was upgraded to 9.2.1.jre8.
    • Snowflake connector was upgraded to 2.9.0.
    • Fixes a bug in credential passthrough caused by the new Parquet prefetch optimization, where user's passthrough credential might not be found during file access.
  • August 11, 2021
    • Fixes a bug when using Auto Loader to read CSV files with mismatching header files. If column names do not match, the column would be filled in with nulls. Now, if a schema is provided, it assumes the schema is the same and will only save column mismatches if rescued data columns are enabled.
  • July 29, 2021
    • Upgrade Databricks Snowflake Spark connector to 2.9.0-spark-3.1
    • [SPARK-36034][BUILD] Rebase datetime in pushed down filters to Parquet
    • [SPARK-36163][BUILD] Propagate correct JDBC properties in JDBC connector provider and add connectionProvider option
  • July 14, 2021
    • Fixed an issue when using column names with dots in Azure Synapse connector.
    • Introduced database.schema.table format for Synapse Connector.
    • Added support to provide databaseName.schemaName.tableName format as the target table instead of only schemaName.tableName or tableName.
  • June 15, 2021
    • Fixed a NoSuchElementException bug in Delta Lake optimized writes that can happen when writing large amounts of data and encountering executor losses.
    • Adds SQL CREATE GROUP, DROP GROUP, ALTER GROUP, SHOW GROUPS, and SHOW USERS commands. For details, see Security statements and Show statements.
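
The July 14, 2021 entry above lists three accepted shapes for a target table name: tableName, schemaName.tableName, and databaseName.schemaName.tableName. A minimal sketch of how such a name could be split into its parts follows. The helper split_table_name and the TableName type are hypothetical and are not the connector's actual parser; the sketch also assumes identifiers are unquoted and contain no dots themselves.

```python
from typing import NamedTuple, Optional


class TableName(NamedTuple):
    database: Optional[str]
    schema: Optional[str]
    table: str


def split_table_name(name: str) -> TableName:
    """Split a one-, two-, or three-part table name into its components.

    Assumption: parts are unquoted identifiers without embedded dots.
    """
    parts = name.split(".")
    if not 1 <= len(parts) <= 3 or any(not part for part in parts):
        raise ValueError(f"Expected 1 to 3 non-empty dot-separated parts: {name!r}")
    # Left-pad with None so the last element is always the table name.
    padded = [None] * (3 - len(parts)) + parts
    return TableName(*padded)


print(split_table_name("orders"))
print(split_table_name("dbo.orders"))
print(split_table_name("sales.dbo.orders"))
```

Missing leading parts come back as None, so a caller can fall back to whatever default database or schema applies in its own context.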

Databricks Runtime 8.2 (Unsupported)

See Databricks Runtime 8.2 (Unsupported).

  • September 22, 2021

    • Operating system security updates.
  • September 15, 2021

    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_x_piecey of broadcast_x.
  • September 8, 2021

    • [SPARK-35700][SQL][WARMFIX] Read char/varchar orc table when created and written by external systems.
    • [SPARK-36532][CORE][3.1] Fixed deadlock in CoarseGrainedExecutorBackend.onDisconnected to prevent executor shutdown hang.
  • August 25, 2021

    • Snowflake connector was upgraded to 2.9.0.
  • August 11, 2021

    • [SPARK-36034][SQL] Rebase datetime in pushed down filters to parquet.
  • July 29, 2021

    • Upgrade Databricks Snowflake Spark connector to 2.9.0-spark-3.1
    • [SPARK-36163][BUILD] Propagate correct JDBC properties in JDBC connector provider and add connectionProvider option
  • July 14, 2021

    • Fixed an issue when using column names with dots in Azure Synapse connector.
    • Introduced database.schema.table format for Synapse Connector.
    • Added support to provide databaseName.schemaName.tableName format as the target table instead of only schemaName.tableName or tableName.
    • Fixed a bug that prevents users from time traveling to older available versions with Delta tables.
  • June 15, 2021

    • Fixes a NoSuchElementException bug in Delta Lake optimized writes that can happen when writing large amounts of data and encountering executor losses.
  • May 26, 2021

    • Updated Python with security patch to fix Python security vulnerability (CVE-2021-3177).
  • April 30, 2021

    • Operating system security updates.
    • [SPARK-35227][BUILD] Update the resolver for spark-packages in SparkSubmit
    • [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state
    • Fixed an OOM issue when Auto Loader reports Structured Streaming progress metrics.

Databricks Runtime 8.1 (Unsupported)

See Databricks Runtime 8.1 (Unsupported).

  • September 22, 2021

    • Operating system security updates.
  • September 15, 2021

    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_x_piecey of broadcast_x.
  • September 8, 2021

    • [SPARK-35700][SQL][WARMFIX] Read char/varchar orc table when created and written by external systems.
    • [SPARK-36532][CORE][3.1] Fixed deadlock in CoarseGrainedExecutorBackend.onDisconnected to prevent executor shutdown hang.
  • August 25, 2021

    • Snowflake connector was upgraded to 2.9.0.
  • August 11, 2021

    • [SPARK-36034][SQL] Rebase datetime in pushed down filters to parquet.
  • July 29, 2021

    • Upgrade Databricks Snowflake Spark connector to 2.9.0-spark-3.1
    • [SPARK-36163][BUILD] Propagate correct JDBC properties in JDBC connector provider and add connectionProvider option
  • July 14, 2021

    • Fixed an issue when using column names with dots in Azure Synapse connector.
    • Fixed a bug that prevents users from time traveling to older available versions with Delta tables.
  • June 15, 2021

    • Fixes a NoSuchElementException bug in Delta Lake optimized writes that can happen when writing large amounts of data and encountering executor losses.
  • May 26, 2021

    • Updated Python with security patch to fix Python security vulnerability (CVE-2021-3177).
  • April 30, 2021

    • Operating system security updates.
    • [SPARK-35227][BUILD] Update the resolver for spark-packages in SparkSubmit
    • Fixed an OOM issue when Auto Loader reports Structured Streaming progress metrics.
  • April 27, 2021

    • [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state
    • [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type
    • [SPARK-35014] Fix the PhysicalAggregation pattern to not rewrite foldable expressions
    • [SPARK-34769][SQL] AnsiTypeCoercion: return narrowest convertible type among TypeCollection
    • [SPARK-34614][SQL] ANSI mode: Casting String to Boolean will throw exception on parse error
    • [SPARK-33794][SQL] ANSI mode: Fix NextDay expression to throw runtime IllegalArgumentException when receiving invalid input under

Databricks Runtime 8.0 (Unsupported)

See Databricks Runtime 8.0 (Unsupported).

  • September 15, 2021

    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_x_piecey of broadcast_x.
  • August 25, 2021

    • Snowflake connector was upgraded to 2.9.0.
  • August 11, 2021

    • [SPARK-36034][SQL] Rebase datetime in pushed down filters to parquet.
  • July 29, 2021

    • [SPARK-36163][BUILD] Propagate correct JDBC properties in JDBC connector provider and add connectionProvider option
  • July 14, 2021

    • Fixed an issue when using column names with dots in Azure Synapse connector.
    • Fixed a bug that prevents users from time traveling to older available versions with Delta tables.
  • May 26, 2021

    • Updated Python with security patch to fix Python security vulnerability (CVE-2021-3177).
  • April 30, 2021

    • Operating system security updates.
    • [SPARK-35227][BUILD] Update the resolver for spark-packages in SparkSubmit
    • [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state
  • March 24, 2021

    • [SPARK-34681][SQL] Fix bug for full outer shuffled hash join when building left side with non-equal condition
    • [SPARK-34534] Fix blockIds order when use FetchShuffleBlocks to fetch blocks
    • [SPARK-34613][SQL] Fix view does not capture disable hint config
  • March 9, 2021

    • [SPARK-34543][SQL] Respect the spark.sql.caseSensitive config while resolving partition spec in v1 SET LOCATION
    • [SPARK-34392][SQL] Support ZoneOffset +h:mm in DateTimeUtils.getZoneId
    • [UI] Fix the href link of Spark DAG Visualization
    • [SPARK-34436][SQL] DPP support LIKE ANY/ALL expression

Databricks Runtime 7.6 (Unsupported)

See Databricks Runtime 7.6 (Unsupported).

  • August 11, 2021
    • [SPARK-36034][SQL] Rebase datetime in pushed down filters to parquet.
  • July 29, 2021
    • [SPARK-32998][BUILD] Add ability to override default remote repos with internal repos only
  • July 14, 2021
    • Fixed a bug that prevents users from time traveling to older available versions with Delta tables.
  • May 26, 2021
    • Updated Python with security patch to fix Python security vulnerability (CVE-2021-3177).
  • April 30, 2021
    • Operating system security updates.
    • [SPARK-35227][BUILD] Update the resolver for spark-packages in SparkSubmit
    • [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state
  • March 24, 2021
    • [SPARK-34768][SQL] Respect the default input buffer size in Univocity
    • [SPARK-34534] Fix blockIds order when use FetchShuffleBlocks to fetch blocks
  • March 9, 2021
    • (Azure only) Fixed an Auto Loader bug that can cause NullPointerException when using Databricks Runtime 7.6 to run an old Auto Loader stream created in Databricks Runtime 7.2.
    • [UI] Fix the href link of Spark DAG Visualization
    • Unknown leaf-node SparkPlan is not handled correctly in SizeInBytesOnlyStatsSparkPlanVisitor
    • Restore the output schema of SHOW DATABASES
    • [Delta][8.0, 7.6] Fixed calculation bug in file size auto-tuning logic
    • Disable staleness check for Delta table files in disk cache
    • [SQL] Use correct dynamic pruning build key when range join hint is present
    • Disable char type support in non-SQL code path
    • Avoid NPE in DataFrameReader.schema
    • Fix NPE when EventGridClient response has no entity
    • Fix a read closed stream bug in Azure Auto Loader
    • [SQL] Do not generate shuffle partition number advice when AOS is enabled
  • February 24, 2021
    • Upgraded the Spark BigQuery connector to v0.18, which introduces various bug fixes and support for Arrow and Avro iterators.
    • Fixed a correctness issue that caused Spark to return incorrect results when the Parquet file's decimal precision and scale are different from the Spark schema.
    • Fixed reading failure issue on Microsoft SQL Server tables that contain spatial data types, by adding geometry and geography JDBC types support for Spark SQL.
    • Introduced a new configuration spark.databricks.hive.metastore.init.reloadFunctions.enabled. This configuration controls the built in Hive initialization. When set to true, Azure Databricks reloads all functions from all databases that users have into FunctionRegistry. This is the default behavior in Hive Metastore. When set to false, Azure Databricks disables this process for optimization.
    • [SPARK-34212] Fixed issues related to reading decimal data from Parquet files.
    • [SPARK-34260][SQL] Fix UnresolvedException when creating temp view twice.
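
The February 24, 2021 entry above mentions incorrect results when a Parquet file's decimal precision and scale differ from the Spark schema. Parquet stores a decimal as an unscaled integer, with precision and scale kept in metadata, so decoding the same stored integer with a different scale yields a different number. The sketch below illustrates only that effect with Python's decimal module. It is not the fix that shipped, and decode_decimal and the sample values are hypothetical.

```python
from decimal import Decimal


def decode_decimal(unscaled: int, scale: int) -> Decimal:
    """Interpret an unscaled integer as a decimal with `scale` fractional digits."""
    return Decimal(unscaled).scaleb(-scale)


stored_unscaled = 12345  # hypothetical value as written to the file
file_scale = 2           # scale recorded in the file metadata
table_scale = 3          # scale declared by a mismatched reader schema

correct = decode_decimal(stored_unscaled, file_scale)
wrong = decode_decimal(stored_unscaled, table_scale)

print(correct)  # 123.45
print(wrong)    # 12.345
```

A reader therefore has to decode with the file's own scale first and only then convert the value to the scale of the target schema.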

Databricks Runtime 7.5 (Unsupported)

See Databricks Runtime 7.5 (Unsupported).

  • May 26, 2021
    • Updated Python with security patch to fix Python security vulnerability (CVE-2021-3177).
  • April 30, 2021
    • Operating system security updates.
    • [SPARK-35227][BUILD] Update the resolver for spark-packages in SparkSubmit
    • [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state
  • March 24, 2021
    • [SPARK-34768][SQL] Respect the default input buffer size in Univocity
    • [SPARK-34534] Fix blockIds order when use FetchShuffleBlocks to fetch blocks
  • March 9, 2021
    • (Azure only) Fixed an Auto Loader bug that can cause NullPointerException when using Databricks Runtime 7.5 to run an old Auto Loader stream created in Databricks Runtime 7.2.
    • [UI] Fix the href link of Spark DAG Visualization
    • Unknown leaf-node SparkPlan is not handled correctly in SizeInBytesOnlyStatsSparkPlanVisitor
    • Restore the output schema of SHOW DATABASES
    • Disable staleness check for Delta table files in disk cache
    • [SQL] Use correct dynamic pruning build key when range join hint is present
    • Disable char type support in non-SQL code path
    • Avoid NPE in DataFrameReader.schema
    • Fix NPE when EventGridClient response has no entity
    • Fix a read closed stream bug in Azure Auto Loader
  • February 24, 2021
    • Upgraded the Spark BigQuery connector to v0.18, which introduces various bug fixes and support for Arrow and Avro iterators.
    • Fixed a correctness issue that caused Spark to return incorrect results when the Parquet file's decimal precision and scale are different from the Spark schema.
    • Fixed reading failure issue on Microsoft SQL Server tables that contain spatial data types, by adding geometry and geography JDBC types support for Spark SQL.
    • Introduced a new configuration spark.databricks.hive.metastore.init.reloadFunctions.enabled. This configuration controls the built in Hive initialization. When set to true, Azure Databricks reloads all functions from all databases that users have into FunctionRegistry. This is the default behavior in Hive Metastore. When set to false, Azure Databricks disables this process for optimization.
    • [SPARK-34212] Fixed issues related to reading decimal data from Parquet files.
    • [SPARK-34260][SQL] Fix UnresolvedException when creating temp view twice.
  • February 4, 2021
    • Fixed a regression that prevents the incremental execution of a query that sets a global limit such as SELECT * FROM table LIMIT nrows. The regression was experienced by users running queries via ODBC/JDBC with Arrow serialization enabled.
    • Introduced write time checks to the Hive client to prevent the corruption of metadata in the Hive metastore for Delta tables.
    • Fixed a regression that caused DBFS FUSE to fail to start when cluster environment variable configurations contain invalid bash syntax.
  • January 20, 2021
    • Fixed a regression in the January 12, 2021 maintenance release that can cause an incorrect AnalysisException stating that the column is ambiguous in a self join. This regression happens when a user joins a DataFrame with its derived DataFrame (a so-called self-join) under the following conditions:
      • These two DataFrames have common columns, but the output of the self join does not have common columns. For example, df.join(df.select($"col" as "new_col"), cond)
      • The derived DataFrame excludes some columns via select, groupBy, or window.
      • The join condition, or a transformation that follows the joined DataFrame, refers to the non-common columns. For example, df.join(df.drop("a"), df("a") === 1)
  • January 12, 2021
    • Upgrade Azure Storage SDK from 2.3.8 to 2.3.9.
    • [SPARK-33593][SQL] Vector reader got incorrect data with binary partition value
    • [SPARK-33480][SQL] updates the error message of char/varchar table insertion length check

Databricks Runtime 7.3 LTS

See Databricks Runtime 7.3 LTS.

  • September 22nd, 2022

  • September 6th, 2022

    • [SPARK-35542][CORE][ML] Fix: Bucketizer created for multiple columns with parameters splitsArray, inputCols and outputCols can not be loaded after saving it
    • [SPARK-40079][CORE] Add Imputer inputCols validation for empty input case
  • August 24, 2022

    • [SPARK-39962][PYTHON][SQL] Apply projection when group attributes are empty
    • Operating system security updates.
  • August 9, 2022

    • Operating system security updates.
  • July 27, 2022

    • Make Delta MERGE operation results consistent when source is non-deterministic.
    • Operating system security updates.
    • Miscellaneous bug fixes.
  • July 13, 2022

    • [SPARK-32680][SQL] Don't Preprocess V2 CTAS with Unresolved Query
    • Disabled Auto Loader's use of native cloud APIs for directory listing on Azure.
    • Operating system security updates.
  • July 5, 2022

    • Operating system security updates.
    • Miscellaneous bug fixes.
  • June 2, 2022

    • [SPARK-38918][SQL] Nested column pruning should filter out attributes that do not belong to the current relation
    • Operating system security updates.
  • May 18, 2022

    • Upgrade AWS SDK version from 1.11.655 to 1.11.678.
    • Operating system security updates.
    • Miscellaneous bug fixes.
  • April 19, 2022

    • Operating system security updates.
    • Miscellaneous bug fixes.
  • April 6, 2022

    • Operating system security updates.
    • Miscellaneous bug fixes.
  • March 14, 2022

    • Remove vulnerable classes from log4j 1.2.17 jar
    • Miscellaneous bug fixes.
  • February 23, 2022

    • [SPARK-37859][SQL] Do not check for metadata during schema comparison
  • February 8, 2022

    • Upgrade Ubuntu JDK to 1.8.0.312.
    • Operating system security updates.
  • February 1, 2022

    • Operating system security updates.
  • January 26, 2022

    • Fixed a bug where the OPTIMIZE command could fail when the ANSI SQL dialect was enabled.
  • January 19, 2022

    • Conda defaults channel is removed from 7.3 ML LTS
    • Operating system security updates.
  • December 7, 2021

    • Operating system security updates.
  • November 4, 2021

    • Fixed a bug that could cause Structured Streaming streams to fail with an ArrayIndexOutOfBoundsException.
    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: No FileSystem for scheme or that might cause modifications to sparkContext.hadoopConfiguration to not take effect in queries.
  • September 15, 2021

    • Fixed a race condition that might cause a query failure with an IOException like java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_x_piecey of broadcast_x.
    • Operating system security updates.
  • September 8, 2021

    • [SPARK-35700][SQL][WARMFIX] Read char/varchar orc table when created and written by external systems.
    • [SPARK-36532][CORE][3.1] Fixed deadlock in CoarseGrainedExecutorBackend.onDisconnected to prevent executor shutdown hang.
  • August 25, 2021

    • Snowflake connector was upgraded to 2.9.0.
  • July 29, 2021

    • [SPARK-36034][BUILD] Rebase datetime in pushed down filters to Parquet
    • [SPARK-34508][BUILD] Skip HiveExternalCatalogVersionsSuite if network is down
  • July 14, 2021

    • Introduced database.schema.table format for Azure Synapse connector.
    • Added support to provide databaseName.schemaName.tableName format as the target table instead of only schemaName.tableName or tableName.
    • Fixed a bug that prevents users from time traveling to older available versions with Delta tables.
  • June 15, 2021

    • Fixed a NoSuchElementException bug in Delta Lake optimized writes that can happen when writing large amounts of data and encountering executor losses.
    • Updated Python with security patch to fix Python security vulnerability (CVE-2021-3177).
  • April 30, 2021

    • Operating system security updates.
    • [SPARK-35227][BUILD] Update the resolver for spark-packages in SparkSubmit
    • [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state
    • [SPARK-35045][SQL] Add an internal option to control input buffer in univocity
  • March 24, 2021

    • [SPARK-34768][SQL] Respect the default input buffer size in Univocity
    • [SPARK-34534] Fix blockIds order when use FetchShuffleBlocks to fetch blocks
    • [SPARK-33118][SQL] CREATE TEMPORARY TABLE fails with location
  • March 9, 2021

    • The updated Azure Blob File System driver for Azure Data Lake Storage Gen2 is now enabled by default. It brings multiple stability improvements.
    • Fix path separator on Windows for databricks-connect get-jar-dir
    • [UI] Fix the href link of Spark DAG Visualization
    • [DBCONNECT] Add support for FlatMapCoGroupsInPandas in Databricks Connect 7.3
    • Restore the output schema of SHOW DATABASES
    • [SQL] Use correct dynamic pruning build key when range join hint is present
    • Disable staleness check for Delta table files in disk cache
    • [SQL] Do not generate shuffle partition number advice when AOS is enabled
  • February 24, 2021

    • Upgraded the Spark BigQuery connector to v0.18, which introduces various bug fixes and support for Arrow and Avro iterators.
    • Fixed a correctness issue that caused Spark to return incorrect results when the Parquet file's decimal precision and scale are different from the Spark schema.
    • Fixed reading failure issue on Microsoft SQL Server tables that contain spatial data types, by adding geometry and geography JDBC types support for Spark SQL.
    • Introduced a new configuration spark.databricks.hive.metastore.init.reloadFunctions.enabled. This configuration controls the built in Hive initialization. When set to true, Azure Databricks reloads all functions from all databases that users have into FunctionRegistry. This is the default behavior in Hive Metastore. When set to false, Azure Databricks disables this process for optimization.
    • [SPARK-34212] Fixed issues related to reading decimal data from Parquet files.
    • [SPARK-33579][UI] Fix executor blank page behind proxy.
    • [SPARK-20044][UI] Support Spark UI behind front-end reverse proxy using a path prefix.
    • [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
  • February 4, 2021

    • Fixed a regression that prevents the incremental execution of a query that sets a global limit such as SELECT * FROM table LIMIT nrows. The regression was experienced by users running queries via ODBC/JDBC with Arrow serialization enabled.
    • Fixed a regression that caused DBFS FUSE to fail to start when cluster environment variable configurations contain invalid bash syntax.
  • January 20, 2021

    • Fixed a regression in the January 12, 2021 maintenance release that can cause an incorrect AnalysisException reporting that a column is ambiguous in a self join. This regression happens when a user joins a DataFrame with a DataFrame derived from it (a so-called self-join) under the following conditions:
      • The two DataFrames have common columns, but the output of the self join does not have common columns. For example, df.join(df.select($"col" as "new_col"), cond)
      • The derived DataFrame excludes some columns via select, groupBy, or window.
      • The join condition or a transformation applied after the join refers to the non-common columns. For example, df.join(df.drop("a"), df("a") === 1)
  • January 12, 2021

    • Operating system security updates.
    • [SPARK-33593][SQL] Vector reader got incorrect data with binary partition value
    • [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar
    • [SPARK-33592][ML][PYTHON] Pyspark ML Validator params in estimatorParamMaps may be lost after saving and reloading
    • [SPARK-33071][SPARK-33536][SQL] Avoid changing dataset_id of LogicalPlan in join() to not break DetectAmbiguousSelfJoin
  • December 8, 2020

    • [SPARK-33587][CORE] Kill the executor on nested fatal errors
    • [SPARK-27421][SQL] Fix filter for int column and value class java.lang.String when pruning partition column
    • [SPARK-33316][SQL] Support user provided nullable Avro schema for non-nullable catalyst schema in Avro writing
    • Fixed an issue where Spark jobs launched using Databricks Connect could hang indefinitely with Executor$TaskRunner.$anonfun$copySessionState in the executor stack trace.
    • Operating system security updates.
  • December 1, 2020

    • [SPARK-33404][SQL][3.0] Fix incorrect results in date_trunc expression
    • [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error
    • [SPARK-33183][SQL][HOTFIX] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts
    • [SPARK-33371][PYTHON][3.0] Update setup.py and tests for Python 3.9
    • [SPARK-33391][SQL] element_at with CreateArray not respect one based index.
    • [SPARK-33306][SQL] Timezone is needed when cast date to string
    • [SPARK-33260][SQL] Fix incorrect results from SortExec when sortOrder is Stream
  • November 5, 2020

    • Fix ABFS and WASB locking with regard to UserGroupInformation.getCurrentUser().
    • Fix an infinite loop bug when Avro reader reads the MAGIC bytes.
    • Add support for the USAGE privilege.
    • Performance improvements for privilege checking in table access control.
  • October 13, 2020

    • Operating system security updates.
    • You can read and write from DBFS using the FUSE mount at /dbfs/ when on a high concurrency credential passthrough enabled cluster. Regular mounts are supported but mounts that need passthrough credentials are not supported yet.
    • [SPARK-32999][SQL] Use Utils.getSimpleName to avoid hitting Malformed class name in TreeNode
    • [SPARK-32585][SQL] Support scala enumeration in ScalaReflection
    • Fixed listing directories in FUSE mount that contain file names with invalid XML characters
    • FUSE mount no longer uses ListMultipartUploads
  • September 29, 2020

    • [SPARK-32718][SQL] Remove unnecessary keywords for interval units
    • [SPARK-32635][SQL] Fix foldable propagation
    • Add a new config spark.shuffle.io.decoder.consolidateThreshold. Set the config value to Long.MAX_VALUE to skip the consolidation of netty FrameBuffers, which prevents java.lang.IndexOutOfBoundsException in corner cases.
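
As a minimal sketch of the spark.shuffle.io.decoder.consolidateThreshold entry above, the following Python snippet builds a Spark configuration line that sets the threshold to Long.MAX_VALUE. The helper names and the "key value" line layout are assumptions made for illustration, as is passing the value as a plain decimal string. Only the config key and the Long.MAX_VALUE setting come from the release note.

```python
# Java's Long.MAX_VALUE is 2**63 - 1. Spark config values are passed as strings.
JAVA_LONG_MAX_VALUE = 2**63 - 1

# Config key taken from the release note.
CONSOLIDATE_THRESHOLD_KEY = "spark.shuffle.io.decoder.consolidateThreshold"


def consolidate_threshold_conf() -> dict:
    """Return a conf mapping that sets the threshold to Long.MAX_VALUE.

    Hypothetical helper: the name and the return shape are illustrative only.
    """
    return {CONSOLIDATE_THRESHOLD_KEY: str(JAVA_LONG_MAX_VALUE)}


def to_spark_config_lines(conf: dict) -> str:
    """Render a conf mapping as one 'key value' pair per line (assumed layout)."""
    return "\n".join(f"{key} {value}" for key, value in sorted(conf.items()))


if __name__ == "__main__":
    print(to_spark_config_lines(consolidate_threshold_conf()))
```

The rendered line could then be pasted into a cluster's Spark configuration, assuming the cluster runs a maintenance release that includes this config.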

Databricks Runtime 6.4 Extended Support (Unsupported)

See Databricks Runtime 6.4 (Unsupported) and Databricks Runtime 6.4 Extended Support (Unsupported).

  • July 5, 2022

    • Operating system security updates.
    • Miscellaneous bug fixes.
  • June 2, 2022

    • Operating system security updates.
  • May 18, 2022

    • Operating system security updates.
  • April 19, 2022

    • Operating system security updates.
    • Miscellaneous bug fixes.
  • April 6, 2022

    • Operating system security updates.
    • Miscellaneous bug fixes.
  • March 14, 2022

    • Remove vulnerable classes from log4j 1.2.17 jar
    • Miscellaneous bug fixes.
  • February 23, 2022

    • Miscellaneous bug fixes.
  • February 8, 2022

    • Upgrade Ubuntu JDK to 1.8.0.312.
    • Operating system security updates.
  • February 1, 2022

    • Operating system security updates.
  • January 26, 2022

    • Fixed a bug where the OPTIMIZE command could fail when the ANSI SQL dialect was enabled.
  • January 19, 2022

    • Operating system security updates.
  • December 8, 2021

    • Operating system security updates.
  • September 22, 2021

    • Operating system security updates.
  • June 15, 2021

    • [SPARK-35576][SQL] Redact the sensitive info in the result of Set command
  • June 7, 2021

    • Add a new config called spark.sql.maven.additionalRemoteRepositories, a comma-delimited string config of the optional additional remote maven mirror. The value defaults to https://maven-central.storage-download.googleapis.com/maven2/.
  • April 30, 2021

    • Operating system security updates.
    • [SPARK-35227][BUILD] Update the resolver for spark-packages in SparkSubmit
  • March 9, 2021

    • Port HADOOP-17215 to the Azure Blob File System driver (Support for conditional overwrite).
    • Fix path separator on Windows for databricks-connect get-jar-dir
    • Added support for Hive metastore versions 2.3.5, 2.3.6, and 2.3.7
    • Arrow "totalResultsCollected" reported incorrectly after spill
  • February 24, 2021

    • Introduced a new configuration spark.databricks.hive.metastore.init.reloadFunctions.enabled. This configuration controls the built in Hive initialization. When set to true, Azure Databricks reloads all functions from all databases that users have into FunctionRegistry. This is the default behavior in Hive Metastore. When set to false, Azure Databricks disables this process for optimization.
  • February 4, 2021

    • Fixed a regression that prevents the incremental execution of a query that sets a global limit such as SELECT * FROM table LIMIT nrows. The regression was experienced by users running queries via ODBC/JDBC with Arrow serialization enabled.
    • Fixed a regression that caused DBFS FUSE to fail to start when cluster environment variable configurations contain invalid bash syntax.
  • January 12, 2021

    • Operating system security updates.
  • December 8, 2020

    • [SPARK-27421][SQL] Fix filter for int column and value class java.lang.String when pruning partition column
    • [SPARK-33183][SQL] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts
    • [Runtime 6.4 ML GPU] We previously installed an incorrect version (2.7.8-1+cuda11.1) of NCCL. This release corrects it to 2.4.8-1+cuda10.0 that is compatible with CUDA 10.0.
    • Operating system security updates.
  • December 1, 2020

  • November 3, 2020

    • Upgraded Java version from 1.8.0_252 to 1.8.0_265.
    • Fix ABFS and WASB locking with regard to UserGroupInformation.getCurrentUser()
    • Fix an infinite loop bug of Avro reader when reading the MAGIC bytes.
  • October 13, 2020

    • Operating system security updates.
    • [SPARK-32999][SQL][2.4] Use Utils.getSimpleName to avoid hitting Malformed class name in TreeNode
    • Fixed listing directories in FUSE mount that contain file names with invalid XML characters
    • FUSE mount no longer uses ListMultipartUploads
  • September 24, 2020

    • Fixed a limitation where passthrough on a standard cluster still restricted which filesystem implementation a user could use. Users can now access local filesystems without restrictions.
    • Operating system security updates.
  • September 8, 2020

    • A new parameter was created for Azure Synapse Analytics, maxbinlength. This parameter is used to control the column length of BinaryType columns, and is translated as VARBINARY(maxbinlength). It can be set using .option("maxbinlength", n), where 0 < n <= 8000.
    • Update Microsoft Azure Storage SDK to 8.6.4 and enable TCP keep alive on connections made by the WASB driver
  • August 25, 2020

    • Fixed ambiguous attribute resolution in self-merge
  • August 18, 2020

    • [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources
    • Fixed a race condition in the AQS connector when using Trigger.Once.
  • August 11, 2020

    • [SPARK-28676][CORE] Avoid Excessive logging from ContextCleaner
  • August 3, 2020

    • You can now use the LDA transform function on a passthrough-enabled cluster.
    • Operating system security updates.
  • July 7, 2020

    • Upgraded Java version from 1.8.0_232 to 1.8.0_252.
  • April 21, 2020

    • [SPARK-31312][SQL] Cache Class instance for the UDF instance in HiveFunctionWrapper
  • April 7, 2020

    • To resolve an issue with pandas udf not working with PyArrow 0.15.0 and above, we added an environment variable (ARROW_PRE_0_15_IPC_FORMAT=1) to enable support for those versions of PyArrow. See the instructions in [SPARK-29367].
  • March 10, 2020

    • Optimized autoscaling is now used by default on all-purpose clusters on the Premium Plan.
    • The Snowflake connector (spark-snowflake_2.11) included in Databricks Runtime is updated to version 2.5.9. snowflake-jdbc is updated to version 3.12.0.
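
The maxbinlength entry for September 8, 2020 above states a valid range of 0 < n <= 8000 and a translation to VARBINARY(maxbinlength). The following Python sketch checks that range before building the option. The function names are hypothetical and the validation is an illustration of the documented constraint, not the connector's own code.

```python
def maxbinlength_option(n: int) -> dict:
    """Validate n against the documented range and return the option mapping.

    Hypothetical helper: the connector performs its own handling of the option.
    """
    if isinstance(n, bool) or not isinstance(n, int):
        raise TypeError("maxbinlength must be an integer")
    if not 0 < n <= 8000:
        raise ValueError("maxbinlength must satisfy 0 < n <= 8000")
    return {"maxbinlength": str(n)}


def binary_column_type(n: int) -> str:
    """Return the column type that BinaryType columns are translated to."""
    maxbinlength_option(n)  # reuse the range check
    return f"VARBINARY({n})"


if __name__ == "__main__":
    print(maxbinlength_option(4000))
    print(binary_column_type(4000))
```

In a Spark job, the validated value would be supplied through .option("maxbinlength", n) on the Azure Synapse Analytics reader or writer, as the release note describes.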

Databricks Runtime 5.5 LTS (Unsupported)

See Databricks Runtime 5.5 LTS (Unsupported) and Databricks Runtime 5.5 Extended Support (Unsupported).

  • December 8, 2021

    • Operating system security updates.
  • September 22, 2021

    • Operating system security updates.
  • August 25, 2021

    • Downgraded some previously upgraded Python packages in the 5.5 ML Extended Support release to maintain better parity with 5.5 ML LTS (now deprecated). See the release notes for 5.5 ML Extended Support for the updated differences between the two versions.
  • June 15, 2021

    • [SPARK-35576][SQL] Redact the sensitive info in the result of Set command
  • June 7, 2021

    • Add a new config called spark.sql.maven.additionalRemoteRepositories, a comma-delimited string config of the optional additional remote maven mirror. The value defaults to https://maven-central.storage-download.googleapis.com/maven2/.
  • April 30, 2021

    • Operating system security updates.
    • [SPARK-35227][BUILD] Update the resolver for spark-packages in SparkSubmit
  • March 9, 2021

    • Port HADOOP-17215 to the Azure Blob File System driver (Support for conditional overwrite).
  • February 24, 2021

    • Introduced a new configuration spark.databricks.hive.metastore.init.reloadFunctions.enabled. This configuration controls the built in Hive initialization. When set to true, Azure Databricks reloads all functions from all databases that users have into FunctionRegistry. This is the default behavior in Hive Metastore. When set to false, Azure Databricks disables this process for optimization.
  • January 12, 2021

  • December 8, 2020

    • [SPARK-27421][SQL] Fix filter for int column and value class java.lang.String when pruning partition column
    • Operating system security updates.
  • December 1, 2020

  • October 29, 2020

    • Upgraded Java version from 1.8.0_252 to 1.8.0_265.
    • Fix ABFS and WASB locking with regard to UserGroupInformation.getCurrentUser()
    • Fix an infinite loop bug of Avro reader when reading the MAGIC bytes.
  • October 13, 2020

    • Operating system security updates.
    • [SPARK-32999][SQL][2.4] Use Utils.getSimpleName to avoid hitting Malformed class name in TreeNode
  • September 24, 2020

    • Operating system security updates.
  • September 8, 2020

    • A new parameter was created for Azure Synapse Analytics, maxbinlength. This parameter is used to control the column length of BinaryType columns, and is translated as VARBINARY(maxbinlength). It can be set using .option("maxbinlength", n), where 0 < n <= 8000.
  • August 18, 2020

    • [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources
    • Fixed a race condition in the AQS connector when using Trigger.Once.
  • August 11, 2020

    • [SPARK-28676][CORE] Avoid Excessive logging from ContextCleaner
  • August 3, 2020

    • Operating system security updates.
  • July 7, 2020

    • Upgraded Java version from 1.8.0_232 to 1.8.0_252.
  • April 21, 2020

    • [SPARK-31312][SQL] Cache Class instance for the UDF instance in HiveFunctionWrapper
  • April 7, 2020

    • To resolve an issue with pandas udf not working with PyArrow 0.15.0 and above, we added an environment variable (ARROW_PRE_0_15_IPC_FORMAT=1) to enable support for those versions of PyArrow. See the instructions in [SPARK-29367].
  • March 25, 2020

    • The Snowflake connector (spark-snowflake_2.11) included in Databricks Runtime is updated to version 2.5.9. snowflake-jdbc is updated to version 3.12.0.
  • March 10, 2020

    • Job output, such as log output emitted to stdout, is subject to a 20 MB size limit. If the total output is larger, the run is canceled and marked as failed. To avoid this limit, you can prevent stdout from being returned from the driver by setting the spark.databricks.driver.disableScalaOutput Spark configuration to true. By default, the flag value is false. The flag controls cell output for Scala JAR jobs and Scala notebooks. If the flag is enabled, Spark does not return job execution results to the client. The flag does not affect the data that is written in the cluster's log files. Setting this flag is recommended only for automated clusters for JAR jobs, because it disables notebook results.
  • February 18, 2020

    • [SPARK-24783][SQL] spark.sql.shuffle.partitions=0 should throw exception
    • Credential passthrough with ADLS Gen2 has a performance degradation due to incorrect thread local handling when ADLS client prefetching is enabled. This release disables ADLS Gen2 prefetching when credential passthrough is enabled until we have a proper fix.
  • January 28, 2020

  • January 14, 2020

    • Upgraded Java version from 1.8.0_222 to 1.8.0_232.
  • November 19, 2019

    • [SPARK-29743][SQL] sample should set needCopyResult to true if its child's needCopyResult is true
    • R version was unintentionally upgraded from 3.6.0 to 3.6.1. We downgraded it back to 3.6.0.
  • November 5, 2019

    • Upgraded Java version from 1.8.0_212 to 1.8.0_222.
  • October 23, 2019

    • [SPARK-29244][CORE] Prevent freed page in BytesToBytesMap free again
  • October 8, 2019

    • Server side changes to allow Simba Apache Spark ODBC driver to reconnect and continue after a connection failure during fetching results (requires Simba Apache Spark ODBC driver version 2.6.10).
    • Fixed an issue that affected using the OPTIMIZE command on table ACL enabled clusters.
    • Fixed an issue where pyspark.ml libraries would fail due to Scala UDF forbidden error on table ACL and credential passthrough enabled clusters.
    • Allowlisted SerDe and SerDeUtil methods for credential passthrough.
    • Fixed NullPointerException when checking error code in the WASB client.
  • September 24, 2019

    • Improved stability of Parquet writer.
    • Fixed an issue where a Thrift query cancelled before it starts executing could get stuck in the STARTED state.
  • September 10, 2019

    • Add thread safe iterator to BytesToBytesMap
    • [SPARK-27992][SPARK-28881] Allow Python to join with connection thread to propagate errors
    • Fixed a bug affecting certain global aggregation queries.
    • Improved credential redaction.
    • [SPARK-27330][SS] support task abort in foreach writer
    • [SPARK-28642] Hide credentials in SHOW CREATE TABLE
    • [SPARK-28699][SQL] Disable using radix sort for ShuffleExchangeExec in repartition case
  • August 27, 2019

    • [SPARK-20906][SQL] Allow user-specified schema in the API to_avro with schema registry
    • [SPARK-27838][SQL] Support user provided non-nullable avro schema for nullable catalyst schema without any null record
    • Improved Delta Lake time travel.
    • Fixed an issue affecting certain transform expressions.
    • Added support for broadcast variables when Process Isolation is enabled.
  • August 13, 2019

    • Delta streaming source should check the latest protocol of a table
    • [SPARK-28260] Add CLOSED state to ExecutionState
    • [SPARK-28489][SS] Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets
  • July 30, 2019

    • [SPARK-28015][SQL] Check stringToDate() consumes entire input for the yyyy and yyyy-[m]m formats
    • [SPARK-28308][CORE] CalendarInterval sub-second part should be padded before parsing
    • [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
    • [SPARK-28355][CORE][PYTHON] Use Spark conf for threshold at which UDF is compressed by broadcast
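
The June 7, 2021 entry above describes spark.sql.maven.additionalRemoteRepositories as a comma-delimited string of optional Maven mirrors. The following Python sketch assembles such a value. The helper name and the example.com URLs are hypothetical, and falling back to the documented default when no mirrors are given is an assumption of this sketch, not confirmed behavior.

```python
# Config key and default value taken from the release note.
CONF_KEY = "spark.sql.maven.additionalRemoteRepositories"
DEFAULT_MIRROR = "https://maven-central.storage-download.googleapis.com/maven2/"


def additional_repositories_conf(mirrors=None) -> dict:
    """Build the comma-delimited config value from a list of mirror URLs.

    Hypothetical helper: blank entries are dropped, and the documented
    default is used when no mirrors remain (an assumption of this sketch).
    """
    urls = [m.strip() for m in (mirrors or []) if m.strip()]
    if not urls:
        urls = [DEFAULT_MIRROR]
    return {CONF_KEY: ",".join(urls)}


if __name__ == "__main__":
    # The example.com URLs are placeholders, not real repositories.
    print(additional_repositories_conf(
        ["https://repo.example.com/a/", "https://repo.example.com/b/"]
    ))
```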

Databricks Light 2.4 Extended Support

See Databricks Light 2.4 (Unsupported) and Databricks Light 2.4 Extended Support.

  • August 24, 2022

    • Operating system security updates.
  • August 9, 2022

    • Operating system security updates.
  • July 27, 2022

    • Operating system security updates.
  • July 5, 2022

    • Operating system security updates.
  • June 2, 2022

    • Operating system security updates.
  • May 18, 2022

    • Operating system security updates.
  • April 19, 2022

    • Operating system security updates.
    • Miscellaneous bug fixes.
  • April 6, 2022

    • Operating system security updates.
    • Miscellaneous bug fixes.
  • March 14, 2022

    • Miscellaneous bug fixes.
  • February 23, 2022

    • Miscellaneous bug fixes.
  • February 8, 2022

    • Upgrade Ubuntu JDK to 1.8.0.312.
    • Operating system security updates.
  • February 1, 2022

    • Operating system security updates.
  • January 19, 2022

    • Operating system security updates.
  • September 22, 2021

    • Operating system security updates.
  • April 30, 2021

    • Operating system security updates.
    • [SPARK-35227][BUILD] Update the resolver for spark-packages in SparkSubmit
  • January 12, 2021

    • Operating system security updates.
  • December 8, 2020

    • [SPARK-27421][SQL] Fix filter for int column and value class java.lang.String when pruning partition column
    • Operating system security updates.
  • December 1, 2020

    • [SPARK-33260][SQL] Fix incorrect results from SortExec when sortOrder is Stream
  • November 3, 2020

    • Upgraded Java version from 1.8.0_252 to 1.8.0_265.
    • Fix ABFS and WASB locking with regard to UserGroupInformation.getCurrentUser()
  • October 13, 2020

    • Operating system security updates.

Databricks Runtime 7.4 (Unsupported)

See Databricks Runtime 7.4 (Unsupported).

  • April 30, 2021

    • Operating system security updates.
    • [SPARK-35227][BUILD] Update the resolver for spark-packages in SparkSubmit
    • [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state
    • [SPARK-35045][SQL] Add an internal option to control input buffer in univocity and a configuration for CSV input buffer size
  • March 24, 2021

    • [SPARK-34768][SQL] Respect the default input buffer size in Univocity
    • [SPARK-34534] Fix blockIds order when use FetchShuffleBlocks to fetch blocks
  • March 9, 2021

    • The updated Azure Blob File System driver for Azure Data Lake Storage Gen2 is now enabled by default. It brings multiple stability improvements.
    • [ES-67926][UI] Fix the href link of Spark DAG Visualization
    • [ES-65064] Restore the output schema of SHOW DATABASES
    • [SC-70522][SQL] Use correct dynamic pruning build key when range join hint is present
    • [SC-35081] Disable staleness check for Delta table files in disk cache
    • [SC-70640] Fix NPE when EventGridClient response has no entity
    • [SC-70220][SQL] Do not generate shuffle partition number advice when AOS is enabled
  • February 24, 2021

    • Upgraded the Spark BigQuery connector to v0.18, which introduces various bug fixes and support for Arrow and Avro iterators.
    • Fixed a correctness issue that caused Spark to return incorrect results when the Parquet file's decimal precision and scale are different from the Spark schema.
    • Fixed reading failure issue on Microsoft SQL Server tables that contain spatial data types, by adding geometry and geography JDBC types support for Spark SQL.
    • Introduced a new configuration spark.databricks.hive.metastore.init.reloadFunctions.enabled. This configuration controls the built in Hive initialization. When set to true, Azure Databricks reloads all functions from all databases that users have into FunctionRegistry. This is the default behavior in Hive Metastore. When set to false, Azure Databricks disables this process for optimization.
    • [SPARK-34212] Fixed issues related to reading decimal data from Parquet files.
    • [SPARK-33579][UI] Fix executor blank page behind proxy.
    • [SPARK-20044][UI] Support Spark UI behind front-end reverse proxy using a path prefix.
    • [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
  • February 4, 2021

    • Fixed a regression that prevents the incremental execution of a query that sets a global limit such as SELECT * FROM table LIMIT nrows. The regression was experienced by users running queries via ODBC/JDBC with Arrow serialization enabled.
    • Fixed a regression that caused DBFS FUSE to fail to start when cluster environment variable configurations contain invalid bash syntax.
  • January 20, 2021

    • Fixed a regression in the January 12, 2021 maintenance release that can cause an incorrect AnalysisException reporting that a column is ambiguous in a self join. This regression happens when a user joins a DataFrame with a DataFrame derived from it (a so-called self-join) under the following conditions:
      • The two DataFrames have common columns, but the output of the self join does not have common columns. For example, df.join(df.select($"col" as "new_col"), cond)
      • The derived DataFrame excludes some columns via select, groupBy, or window.
      • The join condition or a transformation applied after the join refers to the non-common columns. For example, df.join(df.drop("a"), df("a") === 1)
  • January 12, 2021

    • Operating system security updates.
    • [SPARK-33593][SQL] Vector reader got incorrect data with binary partition value
    • [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar
    • [SPARK-33071][SPARK-33536][SQL] Avoid changing dataset_id of LogicalPlan in join() to not break DetectAmbiguousSelfJoin
  • December 8, 2020

    • [SPARK-33587][CORE] Kill the executor on nested fatal errors
    • [SPARK-27421][SQL] Fix filter for int column and value class java.lang.String when pruning partition column
    • [SPARK-33316][SQL] Support user provided nullable Avro schema for non-nullable catalyst schema in Avro writing
    • Operating system security updates.
  • December 1, 2020

    • [SPARK-33404][SQL][3.0] Fix incorrect results in date_trunc expression
    • [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error
    • [SPARK-33183][SQL][HOTFIX] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts
    • [SPARK-33371][PYTHON][3.0] Update setup.py and tests for Python 3.9
    • [SPARK-33391][SQL] element_at with CreateArray not respect one based index.
    • [SPARK-33306][SQL] Timezone is needed when cast date to string
    • [SPARK-33260][SQL] Fix incorrect results from SortExec when sortOrder is Stream
    • [SPARK-33272][SQL] prune the attributes mapping in QueryPlan.transformUpWithNewOutput

Databricks Runtime 7.2 (Unsupported)

See Databricks Runtime 7.2 (Unsupported).

  • February 4, 2021

    • Fixed a regression that prevents the incremental execution of a query that sets a global limit such as SELECT * FROM table LIMIT nrows. The regression was experienced by users running queries via ODBC/JDBC with Arrow serialization enabled.
    • Fixed a regression that caused DBFS FUSE to fail to start when cluster environment variable configurations contain invalid bash syntax.
  • January 20, 2021

    • Fixed a regression in the January 12, 2021 maintenance release that can cause an incorrect AnalysisException reporting that a column is ambiguous in a self join. This regression happens when a user joins a DataFrame with a DataFrame derived from it (a so-called self-join) under the following conditions:
      • The two DataFrames have common columns, but the output of the self join does not have common columns. For example, df.join(df.select($"col" as "new_col"), cond)
      • The derived DataFrame excludes some columns via select, groupBy, or window.
      • The join condition or a transformation applied after the join refers to the non-common columns. For example, df.join(df.drop("a"), df("a") === 1)
  • January 12, 2021

    • Operating system security updates.
    • [SPARK-33593][SQL] Vector reader got incorrect data with binary partition value
    • [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar
    • [SPARK-33071][SPARK-33536][SQL] Avoid changing dataset_id of LogicalPlan in join() to not break DetectAmbiguousSelfJoin
  • December 8, 2020

    • [SPARK-27421][SQL] Fix filter for int column and value class java.lang.String when pruning partition column
    • [SPARK-33404][SQL] Fix incorrect results in date_trunc expression
    • [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error
    • [SPARK-33183][SQL] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts
    • [SPARK-33391][SQL] element_at with CreateArray not respect one based index.
    • Operating system security updates.
  • December 1, 2020

    • [SPARK-33306][SQL] Timezone is needed when cast date to string
    • [SPARK-33260][SQL] Fix incorrect results from SortExec when sortOrder is Stream
  • November 3, 2020

    • Upgraded Java version from 1.8.0_252 to 1.8.0_265.
    • Fix ABFS and WASB locking with regard to UserGroupInformation.getCurrentUser()
    • Fix an infinite loop bug of Avro reader when reading the MAGIC bytes.
  • October 13, 2020

    • Operating system security updates.
    • [SPARK-32999][SQL] Use Utils.getSimpleName to avoid hitting Malformed class name in TreeNode
    • Fixed listing directories in FUSE mount that contain file names with invalid XML characters
    • FUSE mount no longer uses ListMultipartUploads
  • September 29, 2020

    • [SPARK-28863][SQL][WARMFIX] Introduce AlreadyOptimized to prevent reanalysis of V1FallbackWriters
    • [SPARK-32635][SQL] Fix foldable propagation
    • Add a new config spark.shuffle.io.decoder.consolidateThreshold. Set the config value to Long.MAX_VALUE to skip the consolidation of netty FrameBuffers, which prevents java.lang.IndexOutOfBoundsException in corner cases.
  • September 24, 2020

    • [SPARK-32764][SQL] -0.0 should be equal to 0.0
    • [SPARK-32753][SQL] Only copy tags to node with no tags when transforming plans
    • [SPARK-32659][SQL] Fix the data issue of inserted Dynamic Partition Pruning on non-atomic type
    • Operating system security updates.
  • September 8, 2020

    • A new parameter was created for Azure Synapse Analytics, maxbinlength. This parameter is used to control the column length of BinaryType columns, and is translated as VARBINARY(maxbinlength). It can be set using .option("maxbinlength", n), where 0 < n <= 8000.
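
The September 24, 2020 entry for SPARK-32764 above concerns -0.0 and 0.0 comparing as equal. The following Python sketch is not Spark code and does not describe Spark's fix. It only illustrates the underlying IEEE 754 behavior: the two values are numerically equal but have different bit patterns, so any comparison based on the raw bits treats them as different. All function names are hypothetical.

```python
import struct


def double_bits(x: float) -> int:
    """Return the 64-bit IEEE 754 representation of x as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def numerically_equal(a: float, b: float) -> bool:
    """Compare by value, which is how -0.0 and 0.0 should compare."""
    return a == b


def bitwise_equal(a: float, b: float) -> bool:
    """Compare by raw bit pattern, which distinguishes -0.0 from 0.0."""
    return double_bits(a) == double_bits(b)


if __name__ == "__main__":
    print(numerically_equal(-0.0, 0.0))
    print(bitwise_equal(-0.0, 0.0))
    print(hex(double_bits(-0.0)))
```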

Databricks Runtime 7.1 (Unsupported)

See Databricks Runtime 7.1 (Unsupported).

  • February 4, 2021

    • Fixed a regression that caused DBFS FUSE to fail to start when cluster environment variable configurations contain invalid bash syntax.
  • January 20, 2021

    • Fixed a regression in the January 12, 2021 maintenance release that can cause an incorrect AnalysisException and say the column is ambiguous in a self join. This regression happens when a user joins a DataFrame with its derived DataFrame (a so-called self-join) with the following conditions:
      • These two DataFrames have common columns, but the output of the self join does not have common columns. For example, df.join(df.select($"col" as "new_col"), cond)
      • The derived DataFrame excludes some columns via select, groupBy, or window.
      • The join condition, or a subsequent transformation on the joined DataFrame, refers to the non-common columns. For example, df.join(df.drop("a"), df("a") === 1)
  • January 12, 2021

    • Operating system security updates.
    • [SPARK-33593][SQL] Vector reader got incorrect data with binary partition value
    • [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar
    • [SPARK-33071][SPARK-33536][SQL] Avoid changing dataset_id of LogicalPlan in join() to not break DetectAmbiguousSelfJoin
  • December 8, 2020

    • [SPARK-27421][SQL] Fix filter for int column and value class java.lang.String when pruning partition column
    • Fixed an issue where Spark jobs launched using Databricks Connect could hang indefinitely with Executor$TaskRunner.$anonfun$copySessionState in the executor stack trace
    • Operating system security updates.
  • December 1, 2020

    • [SPARK-33404][SQL][3.0] Fix incorrect results in date_trunc expression
    • [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error
    • [SPARK-33183][SQL][HOTFIX] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts
    • [SPARK-33371][PYTHON][3.0] Update setup.py and tests for Python 3.9
    • [SPARK-33391][SQL] element_at with CreateArray not respect one based index.
    • [SPARK-33306][SQL] Timezone is needed when cast date to string
  • November 3, 2020

    • Upgraded Java version from 1.8.0_252 to 1.8.0_265.
    • Fix ABFS and WASB locking with regard to UserGroupInformation.getCurrentUser()
    • Fix an infinite loop bug of Avro reader when reading the MAGIC bytes.
  • October 13, 2020

    • Operating system security updates.
    • [SPARK-32999][SQL] Use Utils.getSimpleName to avoid hitting Malformed class name in TreeNode
    • Fixed listing directories in FUSE mount that contain file names with invalid XML characters
    • FUSE mount no longer uses ListMultipartUploads
  • September 29, 2020

    • [SPARK-28863][SQL][WARMFIX] Introduce AlreadyOptimized to prevent reanalysis of V1FallbackWriters
    • [SPARK-32635][SQL] Fix foldable propagation
    • Add a new config spark.shuffle.io.decoder.consolidateThreshold. Set the config value to Long.MAX_VALUE to skip the consolidation of netty FrameBuffers, which prevents java.lang.IndexOutOfBoundsException in corner cases.
  • September 24, 2020

    • [SPARK-32764][SQL] -0.0 should be equal to 0.0
    • [SPARK-32753][SQL] Only copy tags to node with no tags when transforming plans
    • [SPARK-32659][SQL] Fix the data issue of inserted Dynamic Partition Pruning on non-atomic type
    • Operating system security updates.
  • September 8, 2020

    • A new parameter was created for Azure Synapse Analytics, maxbinlength. This parameter is used to control the column length of BinaryType columns, and is translated as VARBINARY(maxbinlength). It can be set using .option("maxbinlength", n), where 0 < n <= 8000.
  • August 25, 2020

    • [SPARK-32159][SQL] Fix integration between Aggregator[Array[_], _, _] and UnresolvedMapObjects
    • [SPARK-32559][SQL] Fix the trim logic in UTF8String.toInt/toLong, which didn't handle non-ASCII characters correctly
    • [SPARK-32543][R] Remove arrow::as_tibble usage in SparkR
    • [SPARK-32091][CORE] Ignore timeout error when removing blocks on the lost executor
    • Fixed an issue affecting Azure Synapse connector with MSI credentials
    • Fixed ambiguous attribute resolution in self-merge
  • August 18, 2020

    • [SPARK-32594][SQL] Fix serialization of dates inserted to Hive tables
    • [SPARK-32237][SQL] Resolve hint in CTE
    • [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources
    • [SPARK-32467][UI] Avoid encoding URL twice on https redirect
    • Fixed a race condition in the AQS connector when using Trigger.Once.
  • August 11, 2020

    • [SPARK-32280][SPARK-32372][SQL] ResolveReferences.dedupRight should only rewrite attributes for ancestor nodes of the conflict plan
    • [SPARK-32234][SQL] Spark SQL commands are failing on selecting the ORC tables
  • August 3, 2020

    • You can now use the LDA transform function on a passthrough-enabled cluster.
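
The spark.shuffle.io.decoder.consolidateThreshold setting from the September 29, 2020 update above can be written out as follows. This is a sketch under stated assumptions: the value is Java's Long.MAX_VALUE (2^63 - 1), and the "key value" text layout at the end is assumed to match how Spark configuration is entered for a cluster.

```python
# Java's Long.MAX_VALUE is 2**63 - 1.
JAVA_LONG_MAX_VALUE = 2**63 - 1

# Spark configuration values are strings, so the number is rendered as text.
spark_conf = {
    "spark.shuffle.io.decoder.consolidateThreshold": str(JAVA_LONG_MAX_VALUE),
}

# Assumed layout: one "key value" pair per line.
spark_conf_text = "\n".join(f"{key} {value}" for key, value in spark_conf.items())
print(spark_conf_text)
```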

Databricks Runtime 7.0 (Unsupported)

See Databricks Runtime 7.0 (Unsupported).

  • February 4, 2021

    • Fixed a regression that caused DBFS FUSE to fail to start when cluster environment variable configurations contain invalid bash syntax.
  • January 20, 2021

    • Fixed a regression in the January 12, 2021 maintenance release that could raise an incorrect AnalysisException reporting that a column is ambiguous in a self-join. The regression occurs when a user joins a DataFrame with a DataFrame derived from it (a so-called self-join) under the following conditions:
      • These two DataFrames have common columns, but the output of the self join does not have common columns. For example, df.join(df.select($"col" as "new_col"), cond)
      • The derived DataFrame excludes some columns via select, groupBy, or window.
      • The join condition, or a subsequent transformation on the joined DataFrame, refers to the non-common columns. For example, df.join(df.drop("a"), df("a") === 1)
  • January 12, 2021

    • Operating system security updates.
    • [SPARK-33593][SQL] Vector reader got incorrect data with binary partition value
    • [SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains any escapeChar
    • [SPARK-33071][SPARK-33536][SQL] Avoid changing dataset_id of LogicalPlan in join() to not break DetectAmbiguousSelfJoin
  • December 8, 2020

    • [SPARK-27421][SQL] Fix filter for int column and value class java.lang.String when pruning partition column
    • [SPARK-33404][SQL] Fix incorrect results in date_trunc expression
    • [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error
    • [SPARK-33183][SQL] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts
    • [SPARK-33391][SQL] element_at with CreateArray not respect one based index.
    • Operating system security updates.
  • December 1, 2020

  • November 3, 2020

    • Upgraded Java version from 1.8.0_252 to 1.8.0_265.
    • Fix ABFS and WASB locking with regard to UserGroupInformation.getCurrentUser()
    • Fix an infinite loop bug of Avro reader when reading the MAGIC bytes.
  • October 13, 2020

    • Operating system security updates.
    • [SPARK-32999][SQL] Use Utils.getSimpleName to avoid hitting Malformed class name in TreeNode
    • Fixed listing directories in FUSE mount that contain file names with invalid XML characters
    • FUSE mount no longer uses ListMultipartUploads
  • September 29, 2020

    • [SPARK-28863][SQL][WARMFIX] Introduce AlreadyOptimized to prevent reanalysis of V1FallbackWriters
    • [SPARK-32635][SQL] Fix foldable propagation
    • Add a new config spark.shuffle.io.decoder.consolidateThreshold. Set the config value to Long.MAX_VALUE to skip the consolidation of netty FrameBuffers, which prevents java.lang.IndexOutOfBoundsException in corner cases.
  • September 24, 2020

    • [SPARK-32764][SQL] -0.0 should be equal to 0.0
    • [SPARK-32753][SQL] Only copy tags to node with no tags when transforming plans
    • [SPARK-32659][SQL] Fix the data issue of inserted Dynamic Partition Pruning on non-atomic type
    • Operating system security updates.
  • September 8, 2020

    • A new parameter was created for Azure Synapse Analytics, maxbinlength. This parameter is used to control the column length of BinaryType columns, and is translated as VARBINARY(maxbinlength). It can be set using .option("maxbinlength", n), where 0 < n <= 8000.
  • August 25, 2020

    • [SPARK-32159][SQL] Fix integration between Aggregator[Array[_], _, _] and UnresolvedMapObjects
    • [SPARK-32559][SQL] Fix the trim logic in UTF8String.toInt/toLong, which didn't handle non-ASCII characters correctly
    • [SPARK-32543][R] Remove arrow::as_tibble usage in SparkR
    • [SPARK-32091][CORE] Ignore timeout error when removing blocks on the lost executor
    • Fixed an issue affecting Azure Synapse connector with MSI credentials
    • Fixed ambiguous attribute resolution in self-merge
  • August 18, 2020

    • [SPARK-32594][SQL] Fix serialization of dates inserted to Hive tables
    • [SPARK-32237][SQL] Resolve hint in CTE
    • [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources
    • [SPARK-32467][UI] Avoid encoding URL twice on https redirect
    • Fixed a race condition in the AQS connector when using Trigger.Once.
  • August 11, 2020

    • [SPARK-32280][SPARK-32372][SQL] ResolveReferences.dedupRight should only rewrite attributes for ancestor nodes of the conflict plan
    • [SPARK-32234][SQL] Spark SQL commands are failing on selecting the ORC tables
    • You can now use the LDA transform function on a passthrough-enabled cluster.
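
The September 24, 2020 entry [SPARK-32764] above concerns signed zero. As general IEEE 754 background, and not a description of the Spark fix itself, the following sketch shows why -0.0 and 0.0 can be treated as different values when they are compared by encoding rather than numerically.

```python
import math
import struct

neg_zero, pos_zero = -0.0, 0.0

# IEEE 754 numeric comparison treats the two zeros as equal.
numerically_equal = neg_zero == pos_zero

# Their 64-bit encodings differ in the sign bit, so a bytewise
# comparison sees two different values.
same_bits = struct.pack(">d", neg_zero) == struct.pack(">d", pos_zero)

# The sign of negative zero is still observable through copysign.
sign_of_neg_zero = math.copysign(1.0, neg_zero)

print(numerically_equal, same_bits, sign_of_neg_zero)  # True False -1.0
```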

Databricks Runtime 6.6 (Unsupported)

See Databricks Runtime 6.6 (Unsupported).

  • December 1, 2020

  • November 3, 2020

    • Upgraded Java version from 1.8.0_252 to 1.8.0_265.
    • Fix ABFS and WASB locking with regard to UserGroupInformation.getCurrentUser()
    • Fix an infinite loop bug of Avro reader when reading the MAGIC bytes.
  • October 13, 2020

    • Operating system security updates.
    • [SPARK-32999][SQL][2.4] Use Utils.getSimpleName to avoid hitting Malformed class name in TreeNode
    • Fixed listing directories in FUSE mount that contain file names with invalid XML characters
    • FUSE mount no longer uses ListMultipartUploads
  • September 24, 2020

    • Operating system security updates.
  • September 8, 2020

    • A new parameter was created for Azure Synapse Analytics, maxbinlength. This parameter is used to control the column length of BinaryType columns, and is translated as VARBINARY(maxbinlength). It can be set using .option("maxbinlength", n), where 0 < n <= 8000.
    • Update Azure Storage SDK to 8.6.4 and enable TCP keep alive on connections made by the WASB driver
  • August 25, 2020

    • Fixed ambiguous attribute resolution in self-merge
  • August 18, 2020

    • [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources
    • Fixed a race condition in the AQS connector when using Trigger.Once.
  • August 11, 2020

    • [SPARK-28676][CORE] Avoid Excessive logging from ContextCleaner
    • [SPARK-31967][UI] Downgrade to vis.js 4.21.0 to fix Jobs UI loading time regression
  • August 3, 2020

    • You can now use the LDA transform function on a passthrough-enabled cluster.
    • Operating system security updates.

Databricks Runtime 6.5 (Unsupported)

See Databricks Runtime 6.5 (Unsupported).

  • September 24, 2020
    • Fixed a limitation where passthrough on standard clusters still restricted which filesystem implementation a user could use. Users can now access local filesystems without restrictions.
    • Operating system security updates.
  • September 8, 2020
    • A new parameter was created for Azure Synapse Analytics, maxbinlength. This parameter is used to control the column length of BinaryType columns, and is translated as VARBINARY(maxbinlength). It can be set using .option("maxbinlength", n), where 0 < n <= 8000.
    • Update Azure Storage SDK to 8.6.4 and enable TCP keep alive on connections made by the WASB driver
  • August 25, 2020
    • Fixed ambiguous attribute resolution in self-merge
  • August 18, 2020
    • [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources
    • Fixed a race condition in the AQS connector when using Trigger.Once.
  • August 11, 2020
    • [SPARK-28676][CORE] Avoid Excessive logging from ContextCleaner
  • August 3, 2020
    • You can now use the LDA transform function on a passthrough-enabled cluster.
    • Operating system security updates.
  • July 7, 2020
    • Upgraded Java version from 1.8.0_242 to 1.8.0_252.
  • April 21, 2020
    • [SPARK-31312][SQL] Cache Class instance for the UDF instance in HiveFunctionWrapper

Databricks Runtime 6.3 (Unsupported)

See Databricks Runtime 6.3 (Unsupported).

  • July 7, 2020
    • Upgraded Java version from 1.8.0_232 to 1.8.0_252.
  • April 21, 2020
    • [SPARK-31312][SQL] Cache Class instance for the UDF instance in HiveFunctionWrapper
  • April 7, 2020
    • To resolve an issue with pandas udf not working with PyArrow 0.15.0 and above, we added an environment variable (ARROW_PRE_0_15_IPC_FORMAT=1) to enable support for those versions of PyArrow. See the instructions in [SPARK-29367].
  • March 10, 2020
    • The Snowflake connector (spark-snowflake_2.11) included in Databricks Runtime is updated to version 2.5.9. snowflake-jdbc is updated to version 3.12.0.
  • February 18, 2020
    • Credential passthrough with ADLS Gen2 has a performance degradation due to incorrect thread local handling when ADLS client prefetching is enabled. This release disables ADLS Gen2 prefetching when credential passthrough is enabled until we have a proper fix.
  • February 11, 2020
    • [SPARK-24783][SQL] spark.sql.shuffle.partitions=0 should throw exception
    • [SPARK-30447][SQL] Constant propagation nullability issue
    • [SPARK-28152][SQL] Add a legacy conf for old MsSqlServerDialect numeric mapping
    • Allowlisted the overwrite function so that MLModels that extend MLWriter can call it.
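
The April 7, 2020 update above adds the ARROW_PRE_0_15_IPC_FORMAT environment variable. A minimal sketch for checking whether the variable is visible to a Python process follows. The helper name legacy_arrow_ipc_enabled is hypothetical, and the check assumes the value "1" shown in the note.

```python
import os

ARROW_COMPAT_VAR = "ARROW_PRE_0_15_IPC_FORMAT"


def legacy_arrow_ipc_enabled(env=None) -> bool:
    """Return True when the compatibility variable is set to "1".

    `env` defaults to the current process environment; pass a dict to
    check a different mapping.
    """
    env = os.environ if env is None else env
    return env.get(ARROW_COMPAT_VAR) == "1"


print(legacy_arrow_ipc_enabled())
```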

Databricks Runtime 6.2 (Unsupported)

See Databricks Runtime 6.2 (Unsupported).

  • April 21, 2020
    • [SPARK-31312][SQL] Cache Class instance for the UDF instance in HiveFunctionWrapper
  • April 7, 2020
    • To resolve an issue with pandas udf not working with PyArrow 0.15.0 and above, we added an environment variable (ARROW_PRE_0_15_IPC_FORMAT=1) to enable support for those versions of PyArrow. See the instructions in [SPARK-29367].
  • March 25, 2020
    • Job output, such as log output emitted to stdout, is subject to a 20MB size limit. If the total output has a larger size, the run will be canceled and marked as failed. To avoid encountering this limit, you can prevent stdout from being returned from the driver by setting the spark.databricks.driver.disableScalaOutput Spark configuration to true. By default, the flag value is false. The flag controls cell output for Scala JAR jobs and Scala notebooks. If the flag is enabled, Spark does not return job execution results to the client. The flag does not affect the data that is written in the cluster's log files. Setting this flag is recommended only for automated clusters for JAR jobs, because it will disable notebook results.
  • March 10, 2020
    • The Snowflake connector (spark-snowflake_2.11) included in Databricks Runtime is updated to version 2.5.9. snowflake-jdbc is updated to version 3.12.0.
  • February 18, 2020
    • [SPARK-24783][SQL] spark.sql.shuffle.partitions=0 should throw exception
    • Credential passthrough with ADLS Gen2 has a performance degradation due to incorrect thread local handling when ADLS client prefetching is enabled. This release disables ADLS Gen2 prefetching when credential passthrough is enabled until we have a proper fix.
  • January 28, 2020
    • Allowlisted ML Model Writers' overwrite function for clusters enabled for credential passthrough, so that model save can use overwrite mode on credential passthrough clusters.
    • [SPARK-30447][SQL] Constant propagation nullability issue.
    • [SPARK-28152][SQL] Add a legacy conf for old MsSqlServerDialect numeric mapping.
  • January 14, 2020
    • Upgraded Java version from 1.8.0_222 to 1.8.0_232.
  • December 10, 2019
    • [SPARK-29904][SQL] Parse timestamps in microsecond precision by JSON/CSV data sources.
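
The spark.databricks.driver.disableScalaOutput flag from the March 25, 2020 update above could be set in a cluster definition along these lines. This is a sketch, not a complete cluster specification: the variable name new_cluster_fragment is illustrative, and the spark_conf key is assumed to be the field that carries Spark configuration for an automated (job) cluster.

```python
import json

# Fragment of a job cluster definition; only the Spark configuration is shown.
new_cluster_fragment = {
    "spark_conf": {
        # Spark configuration values are passed as strings.
        "spark.databricks.driver.disableScalaOutput": "true",
    }
}

print(json.dumps(new_cluster_fragment, indent=2))
```

As the note says, the flag disables notebook results, so it is recommended only on automated clusters for JAR jobs.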

Databricks Runtime 6.1 (Unsupported)

See Databricks Runtime 6.1 (Unsupported).

  • April 7, 2020
    • To resolve an issue with pandas udf not working with PyArrow 0.15.0 and above, we added an environment variable (ARROW_PRE_0_15_IPC_FORMAT=1) to enable support for those versions of PyArrow. See the instructions in [SPARK-29367].
  • March 25, 2020
    • Job output, such as log output emitted to stdout, is subject to a 20MB size limit. If the total output has a larger size, the run will be canceled and marked as failed. To avoid encountering this limit, you can prevent stdout from being returned from the driver by setting the spark.databricks.driver.disableScalaOutput Spark configuration to true. By default, the flag value is false. The flag controls cell output for Scala JAR jobs and Scala notebooks. If the flag is enabled, Spark does not return job execution results to the client. The flag does not affect the data that is written in the cluster's log files. Setting this flag is recommended only for automated clusters for JAR jobs, because it will disable notebook results.
  • March 10, 2020
    • The Snowflake connector (spark-snowflake_2.11) included in Databricks Runtime is updated to version 2.5.9. snowflake-jdbc is updated to version 3.12.0.
  • February 18, 2020
    • [SPARK-24783][SQL] spark.sql.shuffle.partitions=0 should throw exception
    • Credential passthrough with ADLS Gen2 has a performance degradation due to incorrect thread local handling when ADLS client prefetching is enabled. This release disables ADLS Gen2 prefetching when credential passthrough is enabled until we have a proper fix.
  • January 28, 2020
    • [SPARK-30447][SQL] Constant propagation nullability issue.
    • [SPARK-28152][SQL] Add a legacy conf for old MsSqlServerDialect numeric mapping.
  • January 14, 2020
    • Upgraded Java version from 1.8.0_222 to 1.8.0_232.
  • November 7, 2019
  • November 5, 2019
    • Fixed a bug in DBFS FUSE to handle mount points having // in its path.
    • [SPARK-29081] Replace calls to SerializationUtils.clone on properties with a faster implementation
    • [SPARK-29244][CORE] Prevent freed page in BytesToBytesMap free again
    • (6.1 ML) Library mkl version 2019.4 was installed unintentionally. We downgraded it to mkl version 2019.3 to match Anaconda Distribution 2019.03.

Databricks Runtime 6.0 (Unsupported)

See Databricks Runtime 6.0 (Unsupported).

  • March 25, 2020
    • Job output, such as log output emitted to stdout, is subject to a 20MB size limit. If the total output has a larger size, the run will be canceled and marked as failed. To avoid encountering this limit, you can prevent stdout from being returned from the driver by setting the spark.databricks.driver.disableScalaOutput Spark configuration to true. By default, the flag value is false. The flag controls cell output for Scala JAR jobs and Scala notebooks. If the flag is enabled, Spark does not return job execution results to the client. The flag does not affect the data that is written in the cluster's log files. Setting this flag is recommended only for automated clusters for JAR jobs, because it will disable notebook results.
  • February 18, 2020
    • Credential passthrough with ADLS Gen2 has a performance degradation due to incorrect thread local handling when ADLS client prefetching is enabled. This release disables ADLS Gen2 prefetching when credential passthrough is enabled until we have a proper fix.
  • February 11, 2020
    • [SPARK-24783][SQL] spark.sql.shuffle.partitions=0 should throw exception
  • January 28, 2020
    • [SPARK-30447][SQL] Constant propagation nullability issue.
    • [SPARK-28152][SQL] Add a legacy conf for old MsSqlServerDialect numeric mapping.
  • January 14, 2020
    • Upgraded Java version from 1.8.0_222 to 1.8.0_232.
  • November 19, 2019
    • [SPARK-29743][SQL] sample should set needCopyResult to true if its child's needCopyResult is true
  • November 5, 2019
    • dbutils.tensorboard.start() now supports TensorBoard 2.0 (if installed manually).
    • Fixed a bug in DBFS FUSE to handle mount points having // in its path.
    • [SPARK-29081] Replace calls to SerializationUtils.clone on properties with a faster implementation
  • October 23, 2019
    • [SPARK-29244][CORE] Prevent freed page in BytesToBytesMap free again
  • October 8, 2019
    • Server side changes to allow Simba Apache Spark ODBC driver to reconnect and continue after a connection failure during fetching results (requires Simba Apache Spark ODBC driver version 2.6.10).
    • Fixed an issue affecting use of the Optimize command on clusters with table ACLs enabled.
    • Fixed an issue where pyspark.ml libraries would fail due to Scala UDF forbidden error on table ACL and credential passthrough enabled clusters.
    • Allowlisted SerDe/SerDeUtil methods for credential passthrough.
    • Fixed NullPointerException when checking error code in the WASB client.
    • Fixed the issue where user credentials were not forwarded to jobs created by dbutils.notebook.run().

Databricks Runtime 5.4 ML (Unsupported)

See Databricks Runtime 5.4 for Machine Learning (Unsupported).

  • June 18, 2019
    • Improved handling of MLflow active runs in Hyperopt integration
    • Improved messages in Hyperopt
    • Updated package Markdown from 3.1 to 3.1.1

Databricks Runtime 5.4 (Unsupported)

See Databricks Runtime 5.4 (Unsupported).

  • November 19, 2019
    • [SPARK-29743][SQL] sample should set needCopyResult to true if its child's needCopyResult is true
  • October 8, 2019
    • Server side changes to allow Simba Apache Spark ODBC driver to reconnect and continue after a connection failure during fetching results (requires Simba Apache Spark ODBC driver update to version 2.6.10).
    • Fixed NullPointerException when checking error code in the WASB client.
  • September 10, 2019
    • Add thread safe iterator to BytesToBytesMap
    • Fixed a bug affecting certain global aggregation queries.
    • [SPARK-27330][SS] support task abort in foreach writer
    • [SPARK-28642] Hide credentials in SHOW CREATE TABLE
    • [SPARK-28699][SQL] Disable using radix sort for ShuffleExchangeExec in repartition case
    • [SPARK-28699][CORE] Fix a corner case for aborting indeterminate stage
  • August 27, 2019
    • Fixed an issue affecting certain transform expressions
  • August 13, 2019
    • Delta streaming source should check the latest protocol of a table
    • [SPARK-28489][SS] Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets
  • July 30, 2019
    • [SPARK-28015][SQL] Check stringToDate() consumes entire input for the yyyy and yyyy-[m]m formats
    • [SPARK-28308][CORE] CalendarInterval sub-second part should be padded before parsing
    • [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
  • July 2, 2019
    • Upgraded snappy-java from 1.1.7.1 to 1.1.7.3.
  • June 18, 2019
    • Improved handling of MLflow active runs in MLlib integration
    • Improved Databricks Advisor message related to using disk caching
    • Fixed a bug affecting using higher order functions
    • Fixed a bug affecting Delta metadata queries

Databricks Runtime 5.3 (Unsupported)

See Databricks Runtime 5.3 (Unsupported).

  • November 7, 2019
    • [SPARK-29743][SQL] sample should set needCopyResult to true if its child's needCopyResult is true
  • October 8, 2019
    • Server side changes to allow Simba Apache Spark ODBC driver to reconnect and continue after a connection failure during fetching results (requires Simba Apache Spark ODBC driver update to version 2.6.10).
    • Fixed NullPointerException when checking error code in the WASB client.
  • September 10, 2019
    • Add thread safe iterator to BytesToBytesMap
    • Fixed a bug affecting certain global aggregation queries.
    • [SPARK-27330][SS] support task abort in foreach writer
    • [SPARK-28642] Hide credentials in SHOW CREATE TABLE
    • [SPARK-28699][SQL] Disable using radix sort for ShuffleExchangeExec in repartition case
    • [SPARK-28699][CORE] Fix a corner case for aborting indeterminate stage
  • August 27, 2019
    • Fixed an issue affecting certain transform expressions
  • August 13, 2019
    • Delta streaming source should check the latest protocol of a table
    • [SPARK-28489][SS] Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets
  • July 30, 2019
    • [SPARK-28015][SQL] Check stringToDate() consumes entire input for the yyyy and yyyy-[m]m formats
    • [SPARK-28308][CORE] CalendarInterval sub-second part should be padded before parsing
    • [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
  • June 18, 2019
    • Improved Databricks Advisor message related to using disk caching
    • Fixed a bug affecting using higher order functions
    • Fixed a bug affecting Delta metadata queries
  • May 28, 2019
    • Improved the stability of Delta
    • Tolerate IOExceptions when reading Delta LAST_CHECKPOINT file
    • Added recovery to failed library installation
  • May 7, 2019
    • Port HADOOP-15778 (ABFS: Fix client side throttling for read) to Azure Data Lake Storage Gen2 connector
    • Port HADOOP-16040 (ABFS: Bug fix for tolerateOobAppends configuration) to Azure Data Lake Storage Gen2 connector
    • Fixed a bug affecting table ACLs
    • Fixed a race condition when loading a Delta log checksum file
    • Fixed Delta conflict detection logic to not identify "insert + overwrite" as pure "append" operation
    • Ensure that disk caching is not disabled when table ACLs are enabled
    • [SPARK-27494][SS] Null keys/values don't work in Kafka source v2
    • [SPARK-27446][R] Use existing spark conf if available.
    • [SPARK-27454][ML][SQL] Spark image datasource fail when encounter some illegal images
    • [SPARK-27160][SQL] Fix DecimalType when building orc filters
    • [SPARK-27338][CORE] Fix deadlock between UnsafeExternalSorter and TaskMemoryManager

Databricks Runtime 5.2 (Unsupported)

See Databricks Runtime 5.2 (Unsupported).

  • September 10, 2019
    • Add thread safe iterator to BytesToBytesMap
    • Fixed a bug affecting certain global aggregation queries.
    • [SPARK-27330][SS] support task abort in foreach writer
    • [SPARK-28642] Hide credentials in SHOW CREATE TABLE
    • [SPARK-28699][SQL] Disable using radix sort for ShuffleExchangeExec in repartition case
    • [SPARK-28699][CORE] Fix a corner case for aborting indeterminate stage
  • August 27, 2019
    • Fixed an issue affecting certain transform expressions
  • August 13, 2019
    • Delta streaming source should check the latest protocol of a table
    • [SPARK-28489][SS] Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets
  • July 30, 2019
    • [SPARK-28015][SQL] Check stringToDate() consumes entire input for the yyyy and yyyy-[m]m formats
    • [SPARK-28308][CORE] CalendarInterval sub-second part should be padded before parsing
    • [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
  • July 2, 2019
    • Tolerate IOExceptions when reading Delta LAST_CHECKPOINT file
  • June 18, 2019
    • Improved Databricks Advisor message related to using disk cache
    • Fixed a bug affecting using higher order functions
    • Fixed a bug affecting Delta metadata queries
  • May 28, 2019
    • Added recovery to failed library installation
  • May 7, 2019
    • Port HADOOP-15778 (ABFS: Fix client side throttling for read) to Azure Data Lake Storage Gen2 connector
    • Port HADOOP-16040 (ABFS: Bug fix for tolerateOobAppends configuration) to Azure Data Lake Storage Gen2 connector
    • Fixed a race condition when loading a Delta log checksum file
    • Fixed Delta conflict detection logic to not identify "insert + overwrite" as pure "append" operation
    • Ensure that disk caching is not disabled when table ACLs are enabled
    • [SPARK-27494][SS] Null keys/values don't work in Kafka source v2
    • [SPARK-27454][ML][SQL] Spark image datasource fail when encounter some illegal images
    • [SPARK-27160][SQL] Fix DecimalType when building orc filters
    • [SPARK-27338][CORE] Fix deadlock between UnsafeExternalSorter and TaskMemoryManager
  • March 26, 2019
    • Avoid embedding platform-dependent offsets literally in whole-stage generated code
    • [SPARK-26665][CORE] Fix a bug that BlockTransferService.fetchBlockSync may hang forever.
    • [SPARK-27134][SQL] array_distinct function does not work correctly with columns containing array of array.
    • [SPARK-24669][SQL] Invalidate tables in case of DROP DATABASE CASCADE.
    • [SPARK-26572][SQL] fix aggregate codegen result evaluation.
    • Fixed a bug affecting certain PythonUDFs.
  • February 26, 2019
    • [SPARK-26864][SQL] Query may return incorrect result when python udf is used as a left-semi join condition.
    • [SPARK-26887][PYTHON] Create datetime.date directly instead of creating datetime64 as intermediate data.
    • Fixed a bug affecting JDBC/ODBC server.
    • Fixed a bug affecting PySpark.
    • Exclude the hidden files when building HadoopRDD.
    • Fixed a bug in Delta that caused serialization issues.
  • February 12, 2019
    • Fixed an issue affecting using Delta with Azure ADLS Gen2 mount points.
    • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (when spark.network.crypto.enabled is set to true).
  • January 30, 2019
    • Fixed the StackOverflowError when putting skew join hint on cached relation.
    • Fixed the inconsistency between a SQL cache's cached RDD and its physical plan, which caused incorrect results.
    • [SPARK-26706][SQL] Fix illegalNumericPrecedence for ByteType.
    • [SPARK-26709][SQL] OptimizeMetadataOnlyQuery does not handle empty records correctly.
    • CSV/JSON data sources should avoid globbing paths when inferring schema.
    • Fixed constraint inference on Window operator.
    • Fixed an issue affecting the installation of egg libraries on clusters with table ACLs enabled.

Databricks Runtime 5.1 (Unsupported)

See Databricks Runtime 5.1 (Unsupported).

  • August 13, 2019
    • Delta streaming source should check the latest protocol of a table
    • [SPARK-28489][SS] Fix a bug that KafkaOffsetRangeCalculator.getRanges may drop offsets
  • July 30, 2019
    • [SPARK-28015][SQL] Check stringToDate() consumes entire input for the yyyy and yyyy-[m]m formats
    • [SPARK-28308][CORE] CalendarInterval sub-second part should be padded before parsing
    • [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
  • July 2, 2019
    • Tolerate IOExceptions when reading Delta LAST_CHECKPOINT file
  • June 18, 2019
    • Fixed a bug affecting using higher order functions
    • Fixed a bug affecting Delta metadata queries
  • May 28, 2019
    • Added recovery to failed library installation
  • May 7, 2019
    • Port HADOOP-15778 (ABFS: Fix client side throttling for read) to Azure Data Lake Storage Gen2 connector
    • Port HADOOP-16040 (ABFS: Bug fix for tolerateOobAppends configuration) to Azure Data Lake Storage Gen2 connector
    • Fixed a race condition when loading a Delta log checksum file
    • Fixed Delta conflict detection logic to not identify "insert + overwrite" as pure "append" operation
    • [SPARK-27494][SS] Null keys/values don't work in Kafka source v2
    • [SPARK-27454][ML][SQL] Spark image datasource fail when encounter some illegal images
    • [SPARK-27160][SQL] Fix DecimalType when building orc filters
    • [SPARK-27338][CORE] Fix deadlock between UnsafeExternalSorter and TaskMemoryManager
  • March 26, 2019
    • Avoid embedding platform-dependent offsets literally in whole-stage generated code
    • Fixed a bug affecting certain PythonUDFs.
  • February 26, 2019
    • [SPARK-26864][SQL] Query may return incorrect result when python udf is used as a left-semi join condition.
    • Fixed a bug affecting JDBC/ODBC server.
    • Exclude the hidden files when building HadoopRDD.
  • February 12, 2019
    • Fixed an issue affecting the installation of egg libraries on clusters with table ACLs enabled.
    • Fixed the inconsistency between a SQL cache's cached RDD and its physical plan, which caused incorrect results.
    • [SPARK-26706][SQL] Fix illegalNumericPrecedence for ByteType.
    • [SPARK-26709][SQL] OptimizeMetadataOnlyQuery does not handle empty records correctly.
    • Fixed constraint inference on Window operator.
    • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (when spark.network.crypto.enabled is set to true).
  • January 30, 2019
    • Fixed an issue that could cause df.rdd.count() with UDT to return an incorrect answer in certain cases.
    • Fixed an issue affecting the installation of wheelhouses.
    • [SPARK-26267] Retry when detecting incorrect offsets from Kafka.
    • Fixed a bug that affects multiple file stream sources in a streaming query.
    • Fixed the StackOverflowError when putting skew join hint on cached relation.
    • Fixed the inconsistency between a SQL cache's cached RDD and its physical plan, which caused incorrect results.
  • January 8, 2019
    • Fixed an issue that caused the error org.apache.spark.sql.expressions.Window.rangeBetween(long,long) is not whitelisted.
    • [SPARK-26352] Join reordering should not change the order of output attributes.
    • [SPARK-26366] ReplaceExceptWithFilter should consider NULL as False.
    • Stability improvement for Delta Lake.
    • Delta Lake is enabled.
    • Fixed the issue that caused failed Azure Data Lake Storage Gen2 access when Azure AD Credential Passthrough is enabled for Azure Data Lake Storage Gen1.
    • Databricks IO Cache is now enabled for Ls series worker instance types for all pricing tiers.

Databricks Runtime 5.0 (Unsupported)

See Databricks Runtime 5.0 (Unsupported).

  • June 18, 2019
    • Fixed a bug affecting the use of higher-order functions
  • May 7, 2019
    • Fixed a race condition when loading a Delta log checksum file
    • Fixed Delta conflict detection logic to not identify "insert + overwrite" as a pure "append" operation
    • [SPARK-27494][SS] Null keys/values don't work in Kafka source v2
    • [SPARK-27454][ML][SQL] Spark image datasource fails when it encounters some illegal images
    • [SPARK-27160][SQL] Fix DecimalType when building orc filters
    • [SPARK-27338][CORE] Fix deadlock between UnsafeExternalSorter and TaskMemoryManager
  • March 26, 2019
    • Avoid embedding platform-dependent offsets literally in whole-stage generated code
    • Fixed a bug affecting certain PythonUDFs.
  • March 12, 2019
    • [SPARK-26864][SQL] Query may return incorrect results when a Python UDF is used as a left-semi join condition.
  • February 26, 2019
    • Fixed a bug affecting JDBC/ODBC server.
    • Exclude the hidden files when building HadoopRDD.
  • February 12, 2019
    • Fixed the inconsistency between a SQL cache's cached RDD and its physical plan, which caused incorrect results.
    • [SPARK-26706][SQL] Fix illegalNumericPrecedence for ByteType.
    • [SPARK-26709][SQL] OptimizeMetadataOnlyQuery does not handle empty records correctly.
    • Fixed constraint inference on Window operator.
    • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (when spark.network.crypto.enabled is set to true).
  • January 30, 2019
    • Fixed an issue that could cause df.rdd.count() with UDT to return an incorrect answer in certain cases.
    • [SPARK-26267] Retry when detecting incorrect offsets from Kafka.
    • Fixed a bug that affects multiple file stream sources in a streaming query.
    • Fixed the StackOverflowError when putting skew join hint on cached relation.
    • Fixed the inconsistency between a SQL cache's cached RDD and its physical plan, which caused incorrect results.
  • January 8, 2019
    • Fixed an issue that caused the error org.apache.spark.sql.expressions.Window.rangeBetween(long,long) is not whitelisted.
    • [SPARK-26352] Join reordering should not change the order of output attributes.
    • [SPARK-26366] ReplaceExceptWithFilter should consider NULL as False.
    • Stability improvement for Delta Lake.
    • Delta Lake is enabled.
    • Databricks IO Cache is now enabled for Ls series worker instance types for all pricing tiers.
  • December 18, 2018
    • [SPARK-26293] Cast exception when having Python UDF in subquery
    • Fixed an issue affecting certain queries using Join and Limit.
    • Redacted credentials from RDD names in Spark UI
  • December 6, 2018
    • Fixed an issue that caused incorrect query result when using orderBy followed immediately by groupBy with group-by key as the leading part of the sort-by key.
    • Upgraded Snowflake Connector for Spark from 2.4.9.2-spark_2.4_pre_release to 2.4.10.
    • Only ignore corrupt files after one or more retries when spark.sql.files.ignoreCorruptFiles or spark.sql.files.ignoreMissingFiles flag is enabled.
    • Fixed an issue affecting certain self union queries.
    • Fixed a bug with the thrift server where sessions are sometimes leaked when cancelled.
    • [SPARK-26307] Fixed CTAS when inserting into a partitioned table using Hive SerDe.
    • [SPARK-26147] Python UDFs in join condition fail even when using columns from only one side of join
    • [SPARK-26211] Fix InSet for binary, and struct and array with null.
    • [SPARK-26181] The hasMinMaxStats method of ColumnStatsMap is not correct.
    • Fixed an issue affecting installing Python Wheels in environments without Internet access.
  • November 20, 2018
    • Fixed an issue that made a notebook unusable after cancelling a streaming query.
    • Fixed an issue affecting certain queries using window functions.
    • Fixed an issue affecting a stream from Delta with multiple schema changes.
    • Fixed an issue affecting certain aggregation queries with Left Semi/Anti joins.

Databricks Runtime 4.3 (Unsupported)

See Databricks Runtime 4.3 (Unsupported).

  • April 9, 2019

    • [SPARK-26665][CORE] Fix a bug that can cause BlockTransferService.fetchBlockSync to hang forever.
    • [SPARK-24669][SQL] Invalidate tables in case of DROP DATABASE CASCADE.
  • March 12, 2019

    • Fixed a bug affecting code generation.
    • Fixed a bug affecting Delta.
  • February 26, 2019

    • Fixed a bug affecting JDBC/ODBC server.
  • February 12, 2019

    • [SPARK-26709][SQL] OptimizeMetadataOnlyQuery does not handle empty records correctly.
    • Exclude the hidden files when building HadoopRDD.
    • Fixed Parquet Filter Conversion for IN predicate when its value is empty.
    • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (when spark.network.crypto.enabled is set to true).
  • January 30, 2019

    • Fixed an issue that could cause df.rdd.count() with UDT to return an incorrect answer in certain cases.
    • Fixed the inconsistency between a SQL cache's cached RDD and its physical plan, which caused incorrect results.
  • January 8, 2019

    • Fixed the issue that caused the error org.apache.spark.sql.expressions.Window.rangeBetween(long,long) is not whitelisted.
    • Redacted credentials from RDD names in Spark UI
    • [SPARK-26352] Join reordering should not change the order of output attributes.
    • [SPARK-26366] ReplaceExceptWithFilter should consider NULL as False.
    • Delta Lake is enabled.
    • Databricks IO Cache is now enabled for Ls series worker instance types for all pricing tiers.
  • December 18, 2018

    • [SPARK-25002] Avro: revise the output record namespace.
    • Fixed an issue affecting certain queries using Join and Limit.
    • [SPARK-26307] Fixed CTAS when inserting into a partitioned table using Hive SerDe.
    • Only ignore corrupt files after one or more retries when spark.sql.files.ignoreCorruptFiles or spark.sql.files.ignoreMissingFiles flag is enabled.
    • [SPARK-26181] The hasMinMaxStats method of ColumnStatsMap is not correct.
    • Fixed an issue affecting installing Python Wheels in environments without Internet access.
    • Fixed a performance issue in query analyzer.
    • Fixed an issue in PySpark that caused DataFrame actions to fail with a "connection refused" error.
    • Fixed an issue affecting certain self union queries.
  • November 20, 2018

    • [SPARK-17916][SPARK-25241] Fix empty string being parsed as null when nullValue is set.
    • [SPARK-25387] Fix for NPE caused by bad CSV input.
    • Fixed an issue affecting certain aggregation queries with Left Semi/Anti joins.
  • November 6, 2018

    • [SPARK-25741] Long URLs are not rendered properly in web UI.
    • [SPARK-25714] Fix Null Handling in the Optimizer rule BooleanSimplification.
    • Fixed an issue affecting temporary objects cleanup in Synapse Analytics connector.
    • [SPARK-25816] Fix attribute resolution in nested extractors.
  • October 16, 2018

    • Fixed a bug affecting the output of running SHOW CREATE TABLE on Delta tables.
    • Fixed a bug affecting Union operation.
  • September 25, 2018

    • [SPARK-25368][SQL] Incorrect constraint inference returns wrong result.
    • [SPARK-25402][SQL] Null handling in BooleanSimplification.
    • Fixed NotSerializableException in Avro data source.
  • September 11, 2018

    • [SPARK-25214][SS] Fix the issue that Kafka v2 source may return duplicated records when failOnDataLoss=false.
    • [SPARK-24987][SS] Fix Kafka consumer leak when no new offsets for TopicPartition.
    • Filter reduction should handle null value correctly.
    • Improved stability of execution engine.
  • August 28, 2018

    • Fixed a bug in Delta Lake Delete command that would incorrectly delete the rows where the condition evaluates to null.
    • [SPARK-25142] Add error messages when Python worker could not open socket in _load_from_socket.
  • August 23, 2018

    • [SPARK-23935] mapEntry throws org.codehaus.commons.compiler.CompileException.
    • Fixed nullable map issue in Parquet reader.
    • [SPARK-25051][SQL] FixNullability should not stop on AnalysisBarrier.
    • [SPARK-25081] Fixed a bug where ShuffleExternalSorter may access a released memory page when spilling fails to allocate memory.
    • Fixed an interaction between Databricks Delta and PySpark which could cause transient read failures.
    • [SPARK-25084] "distribute by" on multiple columns (wrap in brackets) may lead to codegen issue.
    • [SPARK-25096] Loosen nullability if the cast is force-nullable.
    • Lowered the default number of threads used by the Delta Lake Optimize command, reducing memory overhead and committing data faster.
    • [SPARK-25114] Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE.
    • Fixed secret manager redaction when a command partially succeeds.
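
The Delta Lake Delete fix in the August 28, 2018 entry above concerns SQL three-valued logic: a delete should remove only the rows where the condition evaluates to true, and rows where it evaluates to null must be kept. The Python sketch below models that rule, with None standing in for NULL. It is not Delta Lake code. The sample rows, the `flag_is_true` condition, and the faulty "keep rows where NOT(condition) is true" variant are hypothetical, chosen only to show one way null rows can be lost.

```python
from typing import Optional

# None stands in for SQL NULL throughout this sketch.
ROWS = [
    {"id": 1, "flag": True},
    {"id": 2, "flag": False},
    {"id": 3, "flag": None},
]


def flag_is_true(row: dict) -> Optional[bool]:
    """Hypothetical DELETE condition `flag = true` under three-valued logic."""
    if row["flag"] is None:
        return None  # comparing NULL with anything yields NULL
    return row["flag"] is True


def sql_not(value: Optional[bool]) -> Optional[bool]:
    """SQL NOT: NOT NULL is still NULL."""
    return None if value is None else not value


def delete_keeping_not_condition(rows: list) -> list:
    """Hypothetical faulty variant: keep rows where NOT(condition) is true.

    NOT(NULL) is NULL, not true, so rows with a null condition are dropped.
    """
    return [r for r in rows if sql_not(flag_is_true(r)) is True]


def delete_where_true(rows: list) -> list:
    """Intended semantics: remove only rows where the condition is true."""
    return [r for r in rows if flag_is_true(r) is not True]


print([r["id"] for r in delete_keeping_not_condition(ROWS)])  # [2]
print([r["id"] for r in delete_where_true(ROWS)])  # [2, 3]
```

In the faulty variant, row 3 disappears even though its condition never evaluated to true. The second function keeps it.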

Databricks Runtime 4.2 (Unsupported)

See Databricks Runtime 4.2 (Unsupported).

  • February 26, 2019

    • Fixed a bug affecting JDBC/ODBC server.
  • February 12, 2019

    • [SPARK-26709][SQL] OptimizeMetadataOnlyQuery does not handle empty records correctly.
    • Exclude the hidden files when building HadoopRDD.
    • Fixed Parquet Filter Conversion for IN predicate when its value is empty.
    • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (when spark.network.crypto.enabled is set to true).
  • January 30, 2019

    • Fixed an issue that could cause df.rdd.count() with UDT to return an incorrect answer in certain cases.
  • January 8, 2019

    • Fixed an issue that caused the error org.apache.spark.sql.expressions.Window.rangeBetween(long,long) is not whitelisted.
    • Redacted credentials from RDD names in Spark UI
    • [SPARK-26352] Join reordering should not change the order of output attributes.
    • [SPARK-26366] ReplaceExceptWithFilter should consider NULL as False.
    • Delta Lake is enabled.
    • Databricks IO Cache is now enabled for Ls series worker instance types for all pricing tiers.
  • December 18, 2018

    • [SPARK-25002] Avro: revise the output record namespace.
    • Fixed an issue affecting certain queries using Join and Limit.
    • [SPARK-26307] Fixed CTAS when inserting into a partitioned table using Hive SerDe.
    • Only ignore corrupt files after one or more retries when spark.sql.files.ignoreCorruptFiles or spark.sql.files.ignoreMissingFiles flag is enabled.
    • [SPARK-26181] The hasMinMaxStats method of ColumnStatsMap is not correct.
    • Fixed an issue affecting installing Python Wheels in environments without Internet access.
    • Fixed a performance issue in query analyzer.
    • Fixed an issue in PySpark that caused DataFrame actions to fail with a "connection refused" error.
    • Fixed an issue affecting certain self union queries.
  • November 20, 2018

    • [SPARK-17916][SPARK-25241] Fix empty string being parsed as null when nullValue is set.
    • Fixed an issue affecting certain aggregation queries with Left Semi/Anti joins.
  • November 6, 2018

    • [SPARK-25741] Long URLs are not rendered properly in web UI.
    • [SPARK-25714] Fix Null Handling in the Optimizer rule BooleanSimplification.
  • October 16, 2018

    • Fixed a bug affecting the output of running SHOW CREATE TABLE on Delta tables.
    • Fixed a bug affecting Union operation.
  • September 25, 2018

    • [SPARK-25368][SQL] Incorrect constraint inference returns wrong result.
    • [SPARK-25402][SQL] Null handling in BooleanSimplification.
    • Fixed NotSerializableException in Avro data source.
  • September 11, 2018

    • [SPARK-25214][SS] Fix the issue that Kafka v2 source may return duplicated records when failOnDataLoss=false.
    • [SPARK-24987][SS] Fix Kafka consumer leak when no new offsets for TopicPartition.
    • Filter reduction should handle null value correctly.
  • August 28, 2018

    • Fixed a bug in Delta Lake Delete command that would incorrectly delete the rows where the condition evaluates to null.
  • August 23, 2018

    • Fixed NoClassDefFoundError for Delta Snapshot
    • [SPARK-23935] mapEntry throws org.codehaus.commons.compiler.CompileException.
    • [SPARK-24957][SQL] Average with decimal followed by aggregation returns wrong result. AVERAGE might return incorrect results. The CAST added in the Average operator is bypassed if the result of Divide already has the type that it is cast to.
    • [SPARK-25081] Fixed a bug where ShuffleExternalSorter may access a released memory page when spilling fails to allocate memory.
    • Fixed an interaction between Databricks Delta and PySpark which could cause transient read failures.
    • [SPARK-25114] Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE.
    • [SPARK-25084] "distribute by" on multiple columns (wrap in brackets) may lead to codegen issue.
    • [SPARK-24934][SQL] Explicitly allowlist supported types in upper/lower bounds for in-memory partition pruning. When complex data types are used in query filters against cached data, Spark always returns an empty result set. The in-memory stats-based pruning generates incorrect results, because null is set for upper/lower bounds for complex types. The fix is to not use in-memory stats-based pruning for complex types.
    • Fixed secret manager redaction when a command partially succeeds.
    • Fixed nullable map issue in Parquet reader.
  • August 2, 2018

    • Added writeStream.table API in Python.
    • Fixed an issue affecting Delta checkpointing.
    • [SPARK-24867][SQL] Add AnalysisBarrier to DataFrameWriter. SQL cache is not being used when using DataFrameWriter to write a DataFrame with UDF. This is a regression caused by the changes we made in AnalysisBarrier, since not all the Analyzer rules are idempotent.
    • Fixed an issue that could cause mergeInto command to produce incorrect results.
    • Improved stability on accessing Azure Data Lake Storage Gen1.
    • [SPARK-24809] Serializing LongHashedRelation in executor may result in data error.
    • [SPARK-24878][SQL] Fix reverse function for array type of primitive type containing null.
  • July 11, 2018

    • Fixed a bug in query execution that would cause aggregations on decimal columns with different precisions to return incorrect results in some cases.
    • Fixed a NullPointerException bug that was thrown during advanced aggregation operations like grouping sets.
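
SPARK-25114 in the August 23, 2018 entry above describes a comparator that misbehaves when the difference between two words is divisible by Integer.MAX_VALUE. The Python sketch below is a simplified model of that failure mode, not Spark's implementation. The `compare_by_remainder` function is an illustrative assumption about how a remainder-based comparison can go wrong, and `compare_explicit` shows the general remedy of comparing the operands directly. Python integers do not overflow, so this sketch ignores Java's 64-bit wraparound.

```python
INT_MAX = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def compare_by_remainder(left: int, right: int) -> int:
    """Hypothetical comparator that reduces the difference modulo INT_MAX.

    The remainder keeps the sign of the difference (as Java's % does), so it
    works as an ordering signal until the difference is a multiple of INT_MAX.
    """
    diff = left - right
    remainder = abs(diff) % INT_MAX
    return remainder if diff >= 0 else -remainder


def compare_explicit(left: int, right: int) -> int:
    """Comparator that tests the ordering directly instead of subtracting."""
    if left == right:
        return 0
    return 1 if left > right else -1


print(compare_by_remainder(5, 3))        # 2
print(compare_by_remainder(INT_MAX, 0))  # 0
print(compare_explicit(INT_MAX, 0))      # 1
```

The second call reports two different values as equal, which is the kind of result that breaks a sort. The explicit comparison has no such blind spot.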

Databricks Runtime 4.1 ML (Unsupported)

See Databricks Runtime 4.1 ML (Unsupported).

  • July 31, 2018
    • Added Azure Synapse Analytics to ML Runtime 4.1
    • Fixed a bug that could cause incorrect query results when the name of a partition column used in a predicate differs from the case of that column in the schema of the table.
    • Fixed a bug affecting Spark SQL execution engine.
    • Fixed a bug affecting code generation.
    • Fixed a bug (java.lang.NoClassDefFoundError) affecting Delta Lake.
    • Improved error handling in Delta Lake.
    • Fixed a bug that caused incorrect data skipping statistics to be collected for string columns 32 characters or greater.

Databricks Runtime 4.1 (Unsupported)

See Databricks Runtime 4.1 (Unsupported).

  • January 8, 2019

    • [SPARK-26366] ReplaceExceptWithFilter should consider NULL as False.
    • Delta Lake is enabled.
  • December 18, 2018

    • [SPARK-25002] Avro: revise the output record namespace.
    • Fixed an issue affecting certain queries using Join and Limit.
    • [SPARK-26307] Fixed CTAS when inserting into a partitioned table using Hive SerDe.
    • Only ignore corrupt files after one or more retries when spark.sql.files.ignoreCorruptFiles or spark.sql.files.ignoreMissingFiles flag is enabled.
    • Fixed an issue affecting installing Python Wheels in environments without Internet access.
    • Fixed an issue in PySpark that caused DataFrame actions to fail with a "connection refused" error.
    • Fixed an issue affecting certain self union queries.
  • November 20, 2018

    • [SPARK-17916][SPARK-25241] Fix empty string being parsed as null when nullValue is set.
    • Fixed an issue affecting certain aggregation queries with Left Semi/Anti joins.
  • November 6, 2018

    • [SPARK-25741] Long URLs are not rendered properly in web UI.
    • [SPARK-25714] Fix Null Handling in the Optimizer rule BooleanSimplification.
  • October 16, 2018

    • Fixed a bug affecting the output of running SHOW CREATE TABLE on Delta tables.
    • Fixed a bug affecting Union operation.
  • September 25, 2018

    • [SPARK-25368][SQL] Incorrect constraint inference returns wrong result.
    • [SPARK-25402][SQL] Null handling in BooleanSimplification.
    • Fixed NotSerializableException in Avro data source.
  • September 11, 2018

    • [SPARK-25214][SS] Fix the issue that Kafka v2 source may return duplicated records when failOnDataLoss=false.
    • [SPARK-24987][SS] Fix Kafka consumer leak when no new offsets for TopicPartition.
    • Filter reduction should handle null value correctly.
  • August 28, 2018

    • Fixed a bug in Delta Lake Delete command that would incorrectly delete the rows where the condition evaluates to null.
    • [SPARK-25084] "distribute by" on multiple columns (wrap in brackets) may lead to codegen issue.
    • [SPARK-25114] Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE.
  • August 23, 2018

    • Fixed NoClassDefFoundError for Delta Snapshot.
    • [SPARK-24957][SQL] Average with decimal followed by aggregation returns wrong result. AVERAGE might return incorrect results. The CAST added in the Average operator is bypassed if the result of Divide already has the type that it is cast to.
    • Fixed nullable map issue in Parquet reader.
    • [SPARK-24934][SQL] Explicitly allowlist supported types in upper/lower bounds for in-memory partition pruning. When complex data types are used in query filters against cached data, Spark always returns an empty result set. The in-memory stats-based pruning generates incorrect results, because null is set for upper/lower bounds for complex types. The fix is to not use in-memory stats-based pruning for complex types.
    • [SPARK-25081] Fixed a bug where ShuffleExternalSorter may access a released memory page when spilling fails to allocate memory.
    • Fixed an interaction between Databricks Delta and PySpark which could cause transient read failures.
    • Fixed secret manager redaction when a command partially succeeds.
  • August 2, 2018

    • [SPARK-24613][SQL] Cache with UDF could not be matched with subsequent dependent caches. Wraps the logical plan with an AnalysisBarrier for execution plan compilation in CacheManager, in order to avoid the plan being analyzed again. This is also a regression in Spark 2.3.
    • Fixed a Synapse Analytics connector issue affecting timezone conversion for writing DateType data.
    • Fixed an issue affecting Delta checkpointing.
    • Fixed an issue that could cause mergeInto command to produce incorrect results.
    • [SPARK-24867][SQL] Add AnalysisBarrier to DataFrameWriter. SQL cache is not being used when using DataFrameWriter to write a DataFrame with UDF. This is a regression caused by the changes we made in AnalysisBarrier, since not all the Analyzer rules are idempotent.
    • [SPARK-24809] Serializing LongHashedRelation in executor may result in data error.
  • July 11, 2018

    • Fixed a bug in query execution that would cause aggregations on decimal columns with different precisions to return incorrect results in some cases.
    • Fixed a NullPointerException bug that was thrown during advanced aggregation operations like grouping sets.
  • June 28, 2018

    • Fixed a bug that could cause incorrect query results when the name of a partition column used in a predicate differs from the case of that column in the schema of the table.
  • June 7, 2018

    • Fixed a bug affecting Spark SQL execution engine.
    • Fixed a bug affecting code generation.
    • Fixed a bug (java.lang.NoClassDefFoundError) affecting Delta Lake.
    • Improved error handling in Delta Lake.
  • May 17, 2018

    • Fixed a bug that caused incorrect data skipping statistics to be collected for string columns 32 characters or greater.
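
SPARK-24934 in the August 23, 2018 entry above explains that stats-based pruning of cached data returned empty results for complex types because their upper and lower bounds were null, and that the fix was to skip pruning for those types. The Python sketch below illustrates that reasoning only. It is not Spark's implementation, and the function names, the `SUPPORTED_TYPES` allowlist, and the use of None for a missing bound are assumptions made for the example.

```python
from typing import Any, Optional

# Types for which this sketch trusts min/max statistics (an assumed allowlist).
SUPPORTED_TYPES = (int, float, str)


def sql_le(a: Any, b: Any) -> Optional[bool]:
    """SQL-style <=: the result is NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a <= b


def keep_batch_naive(lower: Any, upper: Any, value: Any) -> bool:
    """Keep a cached batch only if lower <= value <= upper is true."""
    return sql_le(lower, value) is True and sql_le(value, upper) is True


def keep_batch_allowlisted(lower: Any, upper: Any, value: Any) -> bool:
    """Skip pruning for types outside the allowlist, so the batch is scanned."""
    if not isinstance(value, SUPPORTED_TYPES):
        return True
    return keep_batch_naive(lower, upper, value)


# A batch holding an array column has no usable bounds, modeled here as None.
print(keep_batch_naive(None, None, [1, 2]))        # False
print(keep_batch_allowlisted(None, None, [1, 2]))  # True
print(keep_batch_allowlisted(1, 5, 9))             # False
```

With null bounds, the naive check discards every batch, which matches the empty result set the release note describes. The allowlisted check still prunes batches for simple types.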

Databricks Runtime 4.0 (Unsupported)

See Databricks Runtime 4.0 (Unsupported).

  • November 6, 2018

    • [SPARK-25714] Fix Null Handling in the Optimizer rule BooleanSimplification.
  • October 16, 2018

    • Fixed a bug affecting Union operation.
  • September 25, 2018

    • [SPARK-25368][SQL] Incorrect constraint inference returns wrong result.
    • [SPARK-25402][SQL] Null handling in BooleanSimplification.
    • Fixed NotSerializableException in Avro data source.
  • September 11, 2018

    • Filter reduction should handle null value correctly.
  • August 28, 2018

    • Fixed a bug in Delta Lake Delete command that would incorrectly delete the rows where the condition evaluates to null.
  • August 23, 2018

    • Fixed nullable map issue in Parquet reader.
    • Fixed secret manager redaction when a command partially succeeds.
    • Fixed an interaction between Databricks Delta and PySpark which could cause transient read failures.
    • [SPARK-25081] Fixed a bug where ShuffleExternalSorter may access a released memory page when spilling fails to allocate memory.
    • [SPARK-25114] Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE.
  • August 2, 2018

    • [SPARK-24452] Avoid possible overflow in int add or multiply.
    • [SPARK-24588] Streaming join should require HashClusteredPartitioning from children.
    • Fixed an issue that could cause mergeInto command to produce incorrect results.
    • [SPARK-24867][SQL] Add AnalysisBarrier to DataFrameWriter. SQL cache is not being used when using DataFrameWriter to write a DataFrame with UDF. This is a regression caused by the changes we made in AnalysisBarrier, since not all the Analyzer rules are idempotent.
    • [SPARK-24809] Serializing LongHashedRelation in executor may result in data error.
  • June 28, 2018

    • Fixed a bug that could cause incorrect query results when the name of a partition column used in a predicate differs from the case of that column in the schema of the table.
  • June 7, 2018

    • Fixed a bug affecting Spark SQL execution engine.
    • Improved error handling in Delta Lake.
  • May 17, 2018

    • Bug fixes for Databricks secret management.
    • Improved stability on reading data stored in Azure Data Lake Store.
    • Fixed a bug affecting RDD caching.
    • Fixed a bug affecting Null-safe Equal in Spark SQL.
  • April 24, 2018

    • Upgraded Azure Data Lake Store SDK from 2.0.11 to 2.2.8 to improve the stability of access to Azure Data Lake Store.
    • Fixed a bug affecting the insertion of overwrites to partitioned Hive tables when spark.databricks.io.hive.fastwriter.enabled is false.
    • Fixed an issue that caused task serialization to fail.
    • Improved Delta Lake stability.
  • March 14, 2018

    • Prevent unnecessary metadata updates when writing into Delta Lake.
    • Fixed an issue caused by a race condition that could, in rare circumstances, lead to loss of some output files.

Databricks Runtime 3.5 LTS (Unsupported)

See Databricks Runtime 3.5 LTS (Unsupported).

  • November 7, 2019

    • [SPARK-29743][SQL] sample should set needCopyResult to true if its child's needCopyResult is true
  • October 8, 2019

    • Server side changes to allow Simba Apache Spark ODBC driver to reconnect and continue after a connection failure during fetching results (requires Simba Apache Spark ODBC driver update to version 2.6.10).
  • September 10, 2019

    • [SPARK-28699][SQL] Disable using radix sort for ShuffleExchangeExec in repartition case
  • April 9, 2019

    • [SPARK-26665][CORE] Fix a bug that can cause BlockTransferService.fetchBlockSync to hang forever.
  • February 12, 2019

    • Fixed an issue where the Spark low-level network protocol could break when sending large RPC error messages with encryption enabled (when spark.network.crypto.enabled is set to true).
  • January 30, 2019

    • Fixed an issue that could cause df.rdd.count() with UDT to return an incorrect answer in certain cases.
  • December 18, 2018

    • Only ignore corrupt files after one or more retries when spark.sql.files.ignoreCorruptFiles or spark.sql.files.ignoreMissingFiles flag is enabled.
    • Fixed an issue affecting certain self union queries.
  • November 20, 2018

  • November 6, 2018

    • [SPARK-25714] Fix Null Handling in the Optimizer rule BooleanSimplification.
  • October 16, 2018

    • Fixed a bug affecting Union operation.
  • September 25, 2018

    • [SPARK-25402][SQL] Null handling in BooleanSimplification.
    • Fixed NotSerializableException in Avro data source.
  • September 11, 2018

    • Filter reduction should handle null value correctly.
  • August 28, 2018

    • Fixed a bug in Delta Lake Delete command that would incorrectly delete the rows where the condition evaluates to null.
    • [SPARK-25114] Fix RecordBinaryComparator when subtraction between two words is divisible by Integer.MAX_VALUE.
  • August 23, 2018

    • [SPARK-24809] Serializing LongHashedRelation in executor may result in data error.
    • Fixed nullable map issue in Parquet reader.
    • [SPARK-25081] Fixed a bug where ShuffleExternalSorter may access a released memory page when spilling fails to allocate memory.
    • Fixed an interaction between Databricks Delta and PySpark which could cause transient read failures.
  • June 28, 2018

    • Fixed a bug that could cause incorrect query results when the name of a partition column used in a predicate differs from the case of that column in the schema of the table.
  • June 7, 2018

    • Fixed a bug affecting Spark SQL execution engine.
    • Improved error handling in Delta Lake.
  • May 17, 2018

    • Improved stability on reading data stored in Azure Data Lake Store.
    • Fixed a bug affecting RDD caching.
    • Fixed a bug affecting Null-safe Equal in Spark SQL.
    • Fixed a bug affecting certain aggregations in streaming queries.
  • April 24, 2018

    • Upgraded Azure Data Lake Store SDK from 2.0.11 to 2.2.8 to improve the stability of access to Azure Data Lake Store.
    • Fixed a bug affecting the insertion of overwrites to partitioned Hive tables when spark.databricks.io.hive.fastwriter.enabled is false.
    • Fixed an issue that caused task serialization to fail.
  • March 9, 2018

    • Fixed an issue caused by a race condition that could, in rare circumstances, lead to loss of some output files.
  • March 1, 2018

    • Improved the efficiency of handling streams that can take a long time to stop.
    • Fixed an issue affecting Python autocomplete.
    • Applied Ubuntu security patches.
    • Fixed an issue affecting certain queries using Python UDFs and window functions.
    • Fixed an issue affecting the use of UDFs on a cluster with table access control enabled.
  • January 29, 2018

    • Fixed an issue affecting the manipulation of tables stored in Azure Blob storage.
    • Fixed aggregation after dropDuplicates on empty DataFrame.

Databricks Runtime 3.4 (Unsupported)

See Databricks Runtime 3.4 (Unsupported).

  • June 7, 2018

    • Fixed a bug affecting Spark SQL execution engine.
    • Improved error handling in Delta Lake.
  • May 17, 2018

    • Improved stability on reading data stored in Azure Data Lake Store.
    • Fixed a bug affecting RDD caching.
    • Fixed a bug affecting Null-safe Equal in Spark SQL.
  • April 24, 2018

    • Fixed a bug affecting the insertion of overwrites to partitioned Hive tables when spark.databricks.io.hive.fastwriter.enabled is false.
  • March 9, 2018

    • Fixed an issue caused by a race condition that could, in rare circumstances, lead to loss of some output files.
  • December 13, 2017

    • Fixed an issue affecting UDFs in Scala.
    • Fixed an issue affecting the use of Data Skipping Index on data source tables stored in non-DBFS paths.
  • December 7, 2017

    • Improved shuffle stability.

Unsupported Databricks Runtime releases

For the original release notes, follow the link below the subheading.