Databricks Runtime 12.1 (EoS)
Note
Support for this Databricks Runtime version has ended. For the end-of-support date, see End-of-support history. For all supported Databricks Runtime versions, see Databricks Runtime release notes versions and compatibility.
The following release notes provide information about Databricks Runtime 12.1, powered by Apache Spark 3.3.1.
Databricks released this version in January 2023.
New features and improvements
- Delta Lake table features supported for protocol management
- Predictive I/O for updates is in Public Preview
- Catalog Explorer is now available to all personas
- Support for multiple stateful operators in a single streaming query
- Support for protocol buffers is in Public Preview
- Support for Confluent Schema Registry authentication
- Support for sharing table history with Delta Sharing shares
- Support for streaming with Delta Sharing shares
- Table version using timestamp now supported for Delta Sharing tables in catalogs
- Support for WHEN NOT MATCHED BY SOURCE for MERGE INTO
- Optimized statistics collection for CONVERT TO DELTA
- Unity Catalog support for undropping tables
Delta Lake table features supported for protocol management
Azure Databricks now supports Delta Lake table features, which add granular flags that specify which features a given table supports. See How does Azure Databricks manage Delta Lake feature compatibility?
Predictive I/O for updates is in Public Preview
Predictive I/O now accelerates DELETE, MERGE, and UPDATE operations for Delta tables with deletion vectors enabled on Photon-enabled compute. See What is predictive I/O?
Catalog Explorer is now available to all personas
Catalog Explorer is now available to all Azure Databricks personas when using Databricks Runtime 7.3 LTS and above.
Support for multiple stateful operators in a single streaming query
You can now chain stateful operators in a single streaming query that uses append mode. Not all operators are fully supported: stream-stream time interval join and flatMapGroupsWithState don't allow other stateful operators to be chained.
Support for protocol buffers is in Public Preview
You can use the from_protobuf and to_protobuf functions to exchange data between binary and struct types. See Read and write protocol buffers.
Support for Confluent Schema Registry authentication
Azure Databricks integration with Confluent Schema Registry now supports external schema registry addresses with authentication. This feature is available for the from_avro, to_avro, from_protobuf, and to_protobuf functions. See Protobuf or Avro.
Support for sharing table history with Delta Sharing shares
You can now share a table with full history using Delta Sharing, allowing recipients to perform time travel queries and query the table using Spark Structured Streaming. WITH HISTORY is recommended instead of CHANGE DATA FEED, although the latter continues to be supported. See ALTER SHARE and Add tables to a share.
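As a minimal sketch of the provider-side statement, the helper below builds the ALTER SHARE text for adding a table with history. The share and table names (my_share, main.sales.orders) are hypothetical, and the helper only produces SQL text; on a workspace you would run the result in a SQL editor or pass it to spark.sql.

```python
def add_table_to_share(share: str, table: str, with_history: bool = True) -> str:
    """Build an ALTER SHARE statement that adds a table to a Delta Sharing share.

    `share` and `table` are hypothetical names supplied by the caller.
    """
    statement = f"ALTER SHARE {share} ADD TABLE {table}"
    if with_history:
        # WITH HISTORY is what enables time travel and streaming reads for recipients.
        statement += " WITH HISTORY"
    return statement


statement = add_table_to_share("my_share", "main.sales.orders")
print(statement)
```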
Support for streaming with Delta Sharing shares
Spark Structured Streaming now works with the format deltasharing on a source Delta Sharing table that has been shared using WITH HISTORY.
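A recipient-side read might look like the sketch below. It assumes an existing SparkSession passed in as spark, and it assumes the table is addressed with the profile-file URL form used by the open source Delta Sharing connector (profile path, then #, then share.schema.table). The function name and all argument values are hypothetical.

```python
def read_shared_stream(spark, profile_path: str, share: str, schema: str, table: str):
    """Return a streaming DataFrame over a table shared WITH HISTORY.

    `spark` is an existing SparkSession; the URL layout is an assumption
    based on the open source Delta Sharing connector.
    """
    table_url = f"{profile_path}#{share}.{schema}.{table}"
    # format("deltasharing") selects the Delta Sharing streaming source.
    return spark.readStream.format("deltasharing").load(table_url)
```

The returned DataFrame can then be written with writeStream like any other Structured Streaming source.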
Table version using timestamp now supported for Delta Sharing tables in catalogs
You can now use the SQL syntax TIMESTAMP AS OF in SELECT statements to specify the version of a Delta Sharing table that's mounted in a catalog. Tables must be shared using WITH HISTORY.
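For illustration, the helper below formats such a query for a given point in time. The three-level table name is hypothetical, and the helper only builds the statement text rather than running it.

```python
from datetime import datetime


def time_travel_query(table: str, as_of: datetime) -> str:
    """Build a SELECT that reads `table` as it was at the timestamp `as_of`."""
    timestamp = as_of.strftime("%Y-%m-%d %H:%M:%S")
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'"


# Hypothetical shared table mounted in a catalog named shared_catalog.
query = time_travel_query("shared_catalog.sales.orders", datetime(2023, 1, 15))
print(query)
```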
Support for WHEN NOT MATCHED BY SOURCE for MERGE INTO
You can now add WHEN NOT MATCHED BY SOURCE clauses to MERGE INTO to update or delete rows in the target table that have no match in the source table based on the merge condition. The new clause is available in SQL, Python, Scala, and Java. See MERGE INTO.
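The effect of the new clause can be sketched with a plain-Python model. The SQL text shows one possible statement using hypothetical table and column names (target, source, key, value); merge_model is not a Databricks API, only an in-memory illustration of what the three clauses do to the target rows.

```python
# One possible statement; table and column names are hypothetical.
MERGE_SQL = """
MERGE INTO target AS t
USING source AS s
ON t.key = s.key
WHEN MATCHED THEN UPDATE SET t.value = s.value
WHEN NOT MATCHED THEN INSERT (key, value) VALUES (s.key, s.value)
WHEN NOT MATCHED BY SOURCE THEN DELETE
"""


def merge_model(target: dict, source: dict) -> dict:
    """Model the statement above on dicts that map key -> value."""
    result = {}
    for key in target:
        if key in source:
            result[key] = source[key]  # WHEN MATCHED THEN UPDATE
        # Otherwise: WHEN NOT MATCHED BY SOURCE THEN DELETE (row is dropped).
    for key, value in source.items():
        if key not in target:
            result[key] = value  # WHEN NOT MATCHED THEN INSERT
    return result


merged = merge_model({1: "a", 2: "b"}, {2: "B", 3: "c"})
print(merged)
```

In this model, key 1 exists only in the target and is deleted, key 2 is updated, and key 3 is inserted, so the target ends up mirroring the source.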
Optimized statistics collection for CONVERT TO DELTA
Statistics collection for the CONVERT TO DELTA operation is now much faster. As a result, fewer workloads need to specify NO STATISTICS for efficiency.
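The two variants of the statement can be sketched as follows. The Parquet directory path is hypothetical, and the helper only builds SQL text.

```python
def convert_to_delta(parquet_path: str, collect_statistics: bool = True) -> str:
    """Build a CONVERT TO DELTA statement for a Parquet directory."""
    statement = f"CONVERT TO DELTA parquet.`{parquet_path}`"
    if not collect_statistics:
        # Skips statistics collection during conversion.
        statement += " NO STATISTICS"
    return statement


with_stats = convert_to_delta("/mnt/raw/events")
without_stats = convert_to_delta("/mnt/raw/events", collect_statistics=False)
print(with_stats)
print(without_stats)
```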
Unity Catalog support for undropping tables
This feature was initially released in Public Preview. It is GA as of October 25, 2023.
You can now undrop a dropped managed or external table in an existing schema within seven days of dropping. See UNDROP TABLE and SHOW TABLES DROPPED.
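A small sketch of the workflow: list dropped tables in a schema, then build the UNDROP statement only if the drop is still inside the seven-day window. The schema and table names are hypothetical, and treating exactly seven days as still eligible is an assumption of this sketch, not something the note above specifies.

```python
from datetime import datetime, timedelta

UNDROP_WINDOW = timedelta(days=7)  # "within seven days of dropping"

# Lists dropped tables in a hypothetical schema.
LIST_DROPPED_SQL = "SHOW TABLES DROPPED IN main.sales"


def undrop_statement(table: str, dropped_at: datetime, now: datetime) -> str:
    """Build UNDROP TABLE for `table`, or raise if the window has passed."""
    if now - dropped_at > UNDROP_WINDOW:
        raise ValueError(f"{table} was dropped more than seven days ago")
    return f"UNDROP TABLE {table}"


statement = undrop_statement(
    "main.sales.orders",
    dropped_at=datetime(2023, 1, 1),
    now=datetime(2023, 1, 5),
)
print(statement)
```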
Library upgrades
- Upgraded Python libraries:
- filelock from 3.8.0 to 3.8.2
- platformdirs from 2.5.4 to 2.6.0
- setuptools from 58.0.4 to 61.2.0
- Upgraded R libraries:
- Upgraded Java libraries:
- io.delta.delta-sharing-spark_2.12 from 0.5.2 to 0.6.2
- org.apache.hive.hive-storage-api from 2.7.2 to 2.8.1
- org.apache.parquet.parquet-column from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.apache.parquet.parquet-common from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.apache.parquet.parquet-encoding from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.apache.parquet.parquet-format-structures from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.apache.parquet.parquet-hadoop from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.apache.parquet.parquet-jackson from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.tukaani.xz from 1.8 to 1.9
Apache Spark
Databricks Runtime 12.1 includes Apache Spark 3.3.1. This release includes all Spark fixes and improvements included in Databricks Runtime 12.0 (EoS), as well as the following additional bug fixes and improvements made to Spark:
- [SPARK-41405] [SC-119769][12.1.0] Revert "[SC-119411][SQL] Centralize the column resolution logic" and "[SC-117170][SPARK-41338][SQL] Resolve outer references and normal columns in the same analyzer batch"
- [SPARK-41405] [SC-119411][SQL] Centralize the column resolution logic
- [SPARK-41859] [SC-119514][SQL] CreateHiveTableAsSelectCommand should set the overwrite flag correctly
- [SPARK-41659] [SC-119526][CONNECT][12.X] Enable doctests in pyspark.sql.connect.readwriter
- [SPARK-41858] [SC-119427][SQL] Fix ORC reader perf regression due to DEFAULT value feature
- [SPARK-41807] [SC-119399][CORE] Remove non-existent error class: UNSUPPORTED_FEATURE.DISTRIBUTE_BY
- [SPARK-41578] [12.x][SC-119273][SQL] Assign name to _LEGACY_ERROR_TEMP_2141
- [SPARK-41571] [SC-119362][SQL] Assign name to _LEGACY_ERROR_TEMP_2310
- [SPARK-41810] [SC-119373][CONNECT] Infer names from a list of dictionaries in SparkSession.createDataFrame
- [SPARK-40993] [SC-119504][SPARK-41705][CONNECT][12.X] Move Spark Connect documentation and script to dev/ and Python documentation
- [SPARK-41534] [SC-119456][CONNECT][SQL][12.x] Setup initial client module for Spark Connect
- [SPARK-41365] [SC-118498][UI][3.3] Stages UI page fails to load for proxy in specific yarn environment
- [SPARK-41481] [SC-118150][CORE][SQL] Reuse INVALID_TYPED_LITERAL instead of _LEGACY_ERROR_TEMP_0020
- [SPARK-41049] [SC-119305][SQL] Revisit stateful expression handling
- [SPARK-41726] [SC-119248][SQL] Remove OptimizedCreateHiveTableAsSelectCommand
- [SPARK-41271] [SC-118648][SC-118348][SQL] Support parameterized SQL queries by sql()
- [SPARK-41066] [SC-119344][CONNECT][PYTHON] Implement DataFrame.sampleBy and DataFrame.stat.sampleBy
- [SPARK-41407] [SC-119402][SC-119012][SQL][ALL TESTS] Pull out v1 write to WriteFiles
- [SPARK-41565] [SC-118868][SQL] Add the error class UNRESOLVED_ROUTINE
- [SPARK-41668] [SC-118925][SQL] DECODE function returns wrong results when passed NULL
- [SPARK-41554] [SC-119274] fix changing of Decimal scale when scale decreased by m…
- [SPARK-41065] [SC-119324][CONNECT][PYTHON] Implement DataFrame.freqItems and DataFrame.stat.freqItems
- [SPARK-41742] [SC-119404][SPARK-41745][CONNECT][12.X] Reenable doc tests and add missing column alias to count()
- [SPARK-41069] [SC-119310][CONNECT][PYTHON] Implement DataFrame.approxQuantile and DataFrame.stat.approxQuantile
- [SPARK-41809] [SC-119367][CONNECT][PYTHON] Make function from_json support DataType Schema
- [SPARK-41804] [SC-119382][SQL] Choose correct element size in InterpretedUnsafeProjection for array of UDTs
- [SPARK-41786] [SC-119308][CONNECT][PYTHON] Deduplicate helper functions
- [SPARK-41745] [SC-119378][SPARK-41789][12.X] Make createDataFrame support list of Rows
- [SPARK-41344] [SC-119217][SQL] Make error clearer when table not found in SupportsCatalogOptions catalog
- [SPARK-41803] [SC-119380][CONNECT][PYTHON] Add missing function log(arg1, arg2)
- [SPARK-41808] [SC-119356][CONNECT][PYTHON] Make JSON functions support options
- [SPARK-41779] [SC-119275][SPARK-41771][CONNECT][PYTHON] Make __getitem__ support filter and select
- [SPARK-41783] [SC-119288][SPARK-41770][CONNECT][PYTHON] Make column op support None
- [SPARK-41440] [SC-119279][CONNECT][PYTHON] Avoid the cache operator for general Sample.
- [SPARK-41785] [SC-119290][CONNECT][PYTHON] Implement GroupedData.mean
- [SPARK-41629] [SC-119276][CONNECT] Support for Protocol Extensions in Relation and Expression
- [SPARK-41417] [SC-118000][CORE][SQL] Rename _LEGACY_ERROR_TEMP_0019 to INVALID_TYPED_LITERAL
- [SPARK-41533] [SC-119342][CONNECT][12.X] Proper Error Handling for Spark Connect Server / Client
- [SPARK-41292] [SC-119357][CONNECT][12.X] Support Window in pyspark.sql.window namespace
- [SPARK-41493] [SC-119339][CONNECT][PYTHON] Make csv functions support options
- [SPARK-39591] [SC-118675][SS] Async Progress Tracking
- [SPARK-41767] [SC-119337][CONNECT][PYTHON][12.X] Implement Column.{withField, dropFields}
- [SPARK-41068] [SC-119268][CONNECT][PYTHON] Implement DataFrame.stat.corr
- [SPARK-41655] [SC-119323][CONNECT][12.X] Enable doctests in pyspark.sql.connect.column
- [SPARK-41738] [SC-119170][CONNECT] Mix ClientId in SparkSession cache
- [SPARK-41354] [SC-119194][CONNECT] Add RepartitionByExpression to proto
- [SPARK-41784] [SC-119289][CONNECT][PYTHON] Add missing __rmod__ in Column
- [SPARK-41778] [SC-119262][SQL] Add an alias "reduce" to ArrayAggregate
- [SPARK-41067] [SC-119171][CONNECT][PYTHON] Implement DataFrame.stat.cov
- [SPARK-41764] [SC-119216][CONNECT][PYTHON] Make the internal string op name consistent with FunctionRegistry
- [SPARK-41734] [SC-119160][CONNECT] Add a parent message for Catalog
- [SPARK-41742] [SC-119263] Support df.groupBy().agg({"*":"count"})
- [SPARK-41761] [SC-119213][CONNECT][PYTHON] Fix arithmetic ops: __neg__, __pow__, __rpow__
- [SPARK-41062] [SC-118182][SQL] Rename UNSUPPORTED_CORRELATED_REFERENCE to CORRELATED_REFERENCE
- [SPARK-41751] [SC-119211][CONNECT][PYTHON] Fix Column.{isNull, isNotNull, eqNullSafe}
- [SPARK-41728] [SC-119164][CONNECT][PYTHON][12.X] Implement unwrap_udt function
- [SPARK-41333] [SC-119195][SPARK-41737] Implement GroupedData.{min, max, avg, sum}
- [SPARK-41751] [SC-119206][CONNECT][PYTHON] Fix Column.{bitwiseAND, bitwiseOR, bitwiseXOR}
- [SPARK-41631] [SC-101081][SQL] Support implicit lateral column alias resolution on Aggregate
- [SPARK-41529] [SC-119207][CONNECT][12.X] Implement SparkSession.stop
- [SPARK-41729] [SC-119205][CORE][SQL][12.X] Rename _LEGACY_ERROR_TEMP_0011 to UNSUPPORTED_FEATURE.COMBINATION_QUERY_RESULT_CLAUSES
- [SPARK-41717] [SC-119078][CONNECT][12.X] Deduplicate print and repr_html at LogicalPlan
- [SPARK-41740] [SC-119169][CONNECT][PYTHON] Implement Column.name
- [SPARK-41733] [SC-119163][SQL][SS] Apply tree-pattern based pruning for the rule ResolveWindowTime
- [SPARK-41732] [SC-119157][SQL][SS] Apply tree-pattern based pruning for the rule SessionWindowing
- [SPARK-41498] [SC-119018] Propagate metadata through Union
- [SPARK-41731] [SC-119166][CONNECT][PYTHON][12.X] Implement the column accessor
- [SPARK-41736] [SC-119161][CONNECT][PYTHON] pyspark_types_to_proto_types should support ArrayType
- [SPARK-41473] [SC-119092][CONNECT][PYTHON] Implement format_number function
- [SPARK-41707] [SC-119141][CONNECT][12.X] Implement Catalog API in Spark Connect
- [SPARK-41710] [SC-119062][CONNECT][PYTHON] Implement Column.between
- [SPARK-41235] [SC-119088][SQL][PYTHON] High-order function: array_compact implementation
- [SPARK-41518] [SC-118453][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2422
- [SPARK-41723] [SC-119091][CONNECT][PYTHON] Implement sequence function
- [SPARK-41703] [SC-119060][CONNECT][PYTHON] Combine NullType and typed_null in Literal
- [SPARK-41722] [SC-119090][CONNECT][PYTHON] Implement 3 missing time window functions
- [SPARK-41503] [SC-119043][CONNECT][PYTHON] Implement Partition Transformation Functions
- [SPARK-41413] [SC-118968][SQL] Avoid shuffle in Storage-Partitioned Join when partition keys mismatch, but join expressions are compatible
- [SPARK-41700] [SC-119046][CONNECT][PYTHON] Remove FunctionBuilder
- [SPARK-41706] [SC-119094][CONNECT][PYTHON] pyspark_types_to_proto_types should support MapType
- [SPARK-41702] [SC-119049][CONNECT][PYTHON] Add invalid column ops
- [SPARK-41660] [SC-118866][SQL] Only propagate metadata columns if they are used
- [SPARK-41637] [SC-119003][SQL] ORDER BY ALL
- [SPARK-41513] [SC-118945][SQL] Implement an accumulator to collect per mapper row count metrics
- [SPARK-41647] [SC-119064][CONNECT][12.X] Deduplicate docstrings in pyspark.sql.connect.functions
- [SPARK-41701] [SC-119048][CONNECT][PYTHON] Make column op support decimal
- [SPARK-41383] [SC-119015][SPARK-41692][SPARK-41693] Implement rollup, cube and pivot
- [SPARK-41635] [SC-118944][SQL] GROUP BY ALL
- [SPARK-41645] [SC-119057][CONNECT][12.X] Deduplicate docstrings in pyspark.sql.connect.dataframe
- [SPARK-41688] [SC-118951][CONNECT][PYTHON] Move Expressions to expressions.py
- [SPARK-41687] [SC-118949][CONNECT] Deduplicate docstrings in pyspark.sql.connect.group
- [SPARK-41649] [SC-118950][CONNECT] Deduplicate docstrings in pyspark.sql.connect.window
- [SPARK-41681] [SC-118939][CONNECT] Factor GroupedData out to group.py
- [SPARK-41292] [SC-119038][SPARK-41640][SPARK-41641][CONNECT][PYTHON][12.X] Implement Window functions
- [SPARK-41675] [SC-119031][SC-118934][CONNECT][PYTHON][12.X] Make Column op support datetime
- [SPARK-41672] [SC-118929][CONNECT][PYTHON] Enable the deprecated functions
- [SPARK-41673] [SC-118932][CONNECT][PYTHON] Implement Column.astype
- [SPARK-41364] [SC-118865][CONNECT][PYTHON] Implement broadcast function
- [SPARK-41648] [SC-118914][CONNECT][12.X] Deduplicate docstrings in pyspark.sql.connect.readwriter
- [SPARK-41646] [SC-118915][CONNECT][12.X] Deduplicate docstrings in pyspark.sql.connect.session
- [SPARK-41643] [SC-118862][CONNECT][12.X] Deduplicate docstrings in pyspark.sql.connect.column
- [SPARK-41663] [SC-118936][CONNECT][PYTHON][12.X] Implement the rest of Lambda functions
- [SPARK-41441] [SC-118557][SQL] Support Generate with no required child output to host outer references
- [SPARK-41669] [SC-118923][SQL] Early pruning in canCollapseExpressions
- [SPARK-41639] [SC-118927][SQL][PROTOBUF] Remove ScalaReflectionLock from SchemaConverters
- [SPARK-41464] [SC-118861][CONNECT][PYTHON] Implement DataFrame.to
- [SPARK-41434] [SC-118857][CONNECT][PYTHON] Initial LambdaFunction implementation
- [SPARK-41539] [SC-118802][SQL] Remap stats and constraints against output in logical plan for LogicalRDD
- [SPARK-41396] [SC-118786][SQL][PROTOBUF] OneOf field support and recursion checks
- [SPARK-41528] [SC-118769][CONNECT][12.X] Merge namespace of Spark Connect and PySpark API
- [SPARK-41568] [SC-118715][SQL] Assign name to _LEGACY_ERROR_TEMP_1236
- [SPARK-41440] [SC-118788][CONNECT][PYTHON] Implement DataFrame.randomSplit
- [SPARK-41583] [SC-118718][SC-118642][CONNECT][PROTOBUF] Add Spark Connect and protobuf into setup.py with specifying dependencies
- [SPARK-27561] [SC-101081][12.x][SQL] Support implicit lateral column alias resolution on Project
- [SPARK-41535] [SC-118645][SQL] Set null correctly for calendar interval fields in InterpretedUnsafeProjection and InterpretedMutableProjection
- [SPARK-40687] [SC-118439][SQL] Support data masking built-in function 'mask'
- [SPARK-41520] [SC-118440][SQL] Split AND_OR TreePattern to separate AND and OR TreePatterns
- [SPARK-41349] [SC-118668][CONNECT][PYTHON] Implement DataFrame.hint
- [SPARK-41546] [SC-118541][CONNECT][PYTHON] pyspark_types_to_proto_types should support StructType.
- [SPARK-41334] [SC-118549][CONNECT][PYTHON] Move SortOrder proto from relations to expressions
- [SPARK-41387] [SC-118450][SS] Assert current end offset from Kafka data source for Trigger.AvailableNow
- [SPARK-41508] [SC-118445][CORE][SQL] Rename _LEGACY_ERROR_TEMP_1180 to UNEXPECTED_INPUT_TYPE and remove _LEGACY_ERROR_TEMP_1179
- [SPARK-41319] [SC-118441][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function when with UnresolvedFunction
- [SPARK-41541] [SC-118460][SQL] Fix call to wrong child method in SQLShuffleWriteMetricsReporter.decRecordsWritten()
- [SPARK-41453] [SC-118458][CONNECT][PYTHON] Implement DataFrame.subtract
- [SPARK-41248] [SC-118436][SC-118303][SQL] Add "spark.sql.json.enablePartialResults" to enable/disable JSON partial results
- [SPARK-41437] Revert "[SC-117601][SQL] Do not optimize the input query twice for v1 write fallback"
- [SPARK-41472] [SC-118352][CONNECT][PYTHON] Implement the rest of string/binary functions
- [SPARK-41526] [SC-118355][CONNECT][PYTHON] Implement Column.isin
- [SPARK-32170] [SC-118384] [CORE] Improve the speculation through the stage task metrics.
- [SPARK-41524] [SC-118399][SS] Differentiate SQLConf and extraOptions in StateStoreConf for its usage in RocksDBConf
- [SPARK-41465] [SC-118381][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1235
- [SPARK-41511] [SC-118365][SQL] LongToUnsafeRowMap support ignoresDuplicatedKey
- [SPARK-41409] [SC-118302][CORE][SQL] Rename _LEGACY_ERROR_TEMP_1043 to WRONG_NUM_ARGS.WITHOUT_SUGGESTION
- [SPARK-41438] [SC-118344][CONNECT][PYTHON] Implement DataFrame.colRegex
- [SPARK-41437] [SC-117601][SQL] Do not optimize the input query twice for v1 write fallback
- [SPARK-41314] [SC-117172][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1094
- [SPARK-41443] [SC-118004][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1061
- [SPARK-41506] [SC-118241][CONNECT][PYTHON] Refactor LiteralExpression to support DataType
- [SPARK-41448] [SC-118046] Make consistent MR job IDs in FileBatchWriter and FileFormatWriter
- [SPARK-41456] [SC-117970][SQL] Improve the performance of try_cast
- [SPARK-41495] [SC-118125][CONNECT][PYTHON] Implement collection functions: P~Z
- [SPARK-41478] [SC-118167][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1234
- [SPARK-41406] [SC-118161][SQL] Refactor error message for NUM_COLUMNS_MISMATCH to make it more generic
- [SPARK-41404] [SC-118016][SQL] Refactor ColumnVectorUtils#toBatch to make ColumnarBatchSuite#testRandomRows test more primitive dataType
- [SPARK-41468] [SC-118044][SQL] Fix PlanExpression handling in EquivalentExpressions
- [SPARK-40775] [SC-118045][SQL] Fix duplicate description entries for V2 file scans
- [SPARK-41492] [SC-118042][CONNECT][PYTHON] Implement MISC functions
- [SPARK-41459] [SC-118005][SQL] fix thrift server operation log output is empty
- [SPARK-41395] [SC-117899][SQL] InterpretedMutableProjection should use setDecimal to set null values for decimals in an unsafe row
- [SPARK-41376] [SC-117840][CORE][3.3] Correct the Netty preferDirectBufs check logic on executor start
- [SPARK-41484] [SC-118159][SC-118036][CONNECT][PYTHON][12.x] Implement collection functions: E~M
- [SPARK-41389] [SC-117426][CORE][SQL] Reuse WRONG_NUM_ARGS instead of _LEGACY_ERROR_TEMP_1044
- [SPARK-41462] [SC-117920][SQL] Date and timestamp type can up cast to TimestampNTZ
- [SPARK-41435] [SC-117810][SQL] Change to call invalidFunctionArgumentsError for curdate() when expressions is not empty
- [SPARK-41187] [SC-118030][CORE] LiveExecutor MemoryLeak in AppStatusListener when ExecutorLost happen
- [SPARK-41360] [SC-118083][CORE] Avoid BlockManager re-registration if the executor has been lost
- [SPARK-41378] [SC-117686][SQL] Support Column Stats in DS v2
- [SPARK-41402] [SC-117910][SQL][CONNECT][12.X] Override prettyName of StringDecode
- [SPARK-41414] [SC-118041][CONNECT][PYTHON][12.x] Implement date/timestamp functions
- [SPARK-41329] [SC-117975][CONNECT] Resolve circular imports in Spark Connect
- [SPARK-41477] [SC-118025][CONNECT][PYTHON] Correctly infer the datatype of literal integers
- [SPARK-41446] [SC-118024][CONNECT][PYTHON][12.x] Make createDataFrame support schema and more input dataset types
- [SPARK-41475] [SC-117997][CONNECT] Fix lint-scala command error and typo
- [SPARK-38277] [SC-117799][SS] Clear write batch after RocksDB state store's commit
- [SPARK-41375] [SC-117801][SS] Avoid empty latest KafkaSourceOffset
- [SPARK-41412] [SC-118015][CONNECT] Implement Column.cast
- [SPARK-41439] [SC-117893][CONNECT][PYTHON] Implement DataFrame.melt and DataFrame.unpivot
- [SPARK-41399] [SC-118007][SC-117474][CONNECT] Refactor column related tests to test_connect_column
- [SPARK-41351] [SC-117957][SC-117412][CONNECT][12.x] Column should support != operator
- [SPARK-40697] [SC-117806][SC-112787][SQL] Add read-side char padding to cover external data files
- [SPARK-41349] [SC-117594][CONNECT][12.X] Implement DataFrame.hint
- [SPARK-41338] [SC-117170][SQL] Resolve outer references and normal columns in the same analyzer batch
- [SPARK-41436] [SC-117805][CONNECT][PYTHON] Implement collection functions: A~C
- [SPARK-41445] [SC-117802][CONNECT] Implement DataFrameReader.parquet
- [SPARK-41452] [SC-117865][SQL] to_char should return null when format is null
- [SPARK-41444] [SC-117796][CONNECT] Support read.json()
- [SPARK-41398] [SC-117508][SQL] Relax constraints on Storage-Partitioned Join when partition keys after runtime filtering do not match
- [SPARK-41228] [SC-117169][SQL] Rename & Improve error message for COLUMN_NOT_IN_GROUP_BY_CLAUSE.
- [SPARK-41381] [SC-117593][CONNECT][PYTHON] Implement count_distinct and sum_distinct functions
- [SPARK-41433] [SC-117596][CONNECT] Make Max Arrow BatchSize configurable
- [SPARK-41397] [SC-117590][CONNECT][PYTHON] Implement part of string/binary functions
- [SPARK-41382] [SC-117588][CONNECT][PYTHON] Implement product function
- [SPARK-41403] [SC-117595][CONNECT][PYTHON] Implement DataFrame.describe
- [SPARK-41366] [SC-117580][CONNECT] DF.groupby.agg() should be compatible
- [SPARK-41369] [SC-117584][CONNECT] Add connect common to servers' shaded jar
- [SPARK-41411] [SC-117562][SS] Multi-Stateful Operator watermark support bug fix
- [SPARK-41176] [SC-116630][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1042
- [SPARK-41380] [SC-117476][CONNECT][PYTHON][12.X] Implement aggregation functions
- [SPARK-41363] [SC-117470][CONNECT][PYTHON][12.X] Implement normal functions
- [SPARK-41305] [SC-117411][CONNECT] Improve Documentation for Command proto
- [SPARK-41372] [SC-117427][CONNECT][PYTHON] Implement DataFrame TempView
- [SPARK-41379] [SC-117420][SS][PYTHON] Provide cloned spark session in DataFrame in user function for foreachBatch sink in PySpark
- [SPARK-41373] [SC-117405][SQL][ERROR] Rename CAST_WITH_FUN_SUGGESTION to CAST_WITH_FUNC_SUGGESTION
- [SPARK-41358] [SC-117417][SQL] Refactor ColumnVectorUtils#populate method to use PhysicalDataType instead of DataType
- [SPARK-41355] [SC-117423][SQL] Workaround hive table name validation issue
- [SPARK-41390] [SC-117429][SQL] Update the script used to generate register function in UDFRegistration
- [SPARK-41206] [SC-117233][SC-116381][SQL] Rename the error class _LEGACY_ERROR_TEMP_1233 to COLUMN_ALREADY_EXISTS
- [SPARK-41357] [SC-117310][CONNECT][PYTHON][12.X] Implement math functions
- [SPARK-40970] [SC-117308][CONNECT][PYTHON] Support List[Column] for Join's on argument
- [SPARK-41345] [SC-117178][CONNECT] Add Hint to Connect Proto
- [SPARK-41226] [SC-117194][SQL][12.x] Refactor Spark types by introducing physical types
- [SPARK-41317] [SC-116902][CONNECT][PYTHON][12.X] Add basic support for DataFrameWriter
- [SPARK-41347] [SC-117173][CONNECT] Add Cast to Expression proto
- [SPARK-41323] [SC-117128][SQL] Support current_schema
- [SPARK-41339] [SC-117171][SQL] Close and recreate RocksDB write batch instead of just clearing
- [SPARK-41227] [SC-117165][CONNECT][PYTHON] Implement DataFrame cross join
- [SPARK-41346] [SC-117176][CONNECT][PYTHON] Implement asc and desc functions
- [SPARK-41343] [SC-117166][CONNECT] Move FunctionName parsing to server side
- [SPARK-41321] [SC-117163][CONNECT] Support target field for UnresolvedStar
- [SPARK-41237] [SC-117167][SQL] Reuse the error class UNSUPPORTED_DATATYPE for _LEGACY_ERROR_TEMP_0030
- [SPARK-41309] [SC-116916][SQL] Reuse INVALID_SCHEMA.NON_STRING_LITERAL instead of _LEGACY_ERROR_TEMP_1093
- [SPARK-41276] [SC-117136][SQL][ML][MLLIB][PROTOBUF][PYTHON][R][SS][AVRO] Optimize constructor use of StructType
- [SPARK-41335] [SC-117135][CONNECT][PYTHON] Support IsNull and IsNotNull in Column
- [SPARK-41332] [SC-117131][CONNECT][PYTHON] Fix nullOrdering in SortOrder
- [SPARK-41325] [SC-117132][CONNECT][12.X] Fix missing avg() for GroupBy on DF
- [SPARK-41327] [SC-117137][CORE] Fix SparkStatusTracker.getExecutorInfos by switch On/OffHeapStorageMemory info
- [SPARK-41315] [SC-117129][CONNECT][PYTHON] Implement DataFrame.replace and DataFrame.na.replace
- [SPARK-41328] [SC-117125][CONNECT][PYTHON] Add logical and string API to Column
- [SPARK-41331] [SC-117127][CONNECT][PYTHON] Add orderBy and drop_duplicates
- [SPARK-40987] [SC-117124][CORE] BlockManager#removeBlockInternal should ensure the lock is unlocked gracefully
- [SPARK-41268] [SC-117102][SC-116970][CONNECT][PYTHON] Refactor "Column" for API Compatibility
- [SPARK-41312] [SC-116881][CONNECT][PYTHON][12.X] Implement DataFrame.withColumnRenamed
- [SPARK-41221] [SC-116607][SQL] Add the error class INVALID_FORMAT
- [SPARK-41272] [SC-116742][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2019
- [SPARK-41180] [SC-116760][SQL] Reuse INVALID_SCHEMA instead of _LEGACY_ERROR_TEMP_1227
- [SPARK-41260] [SC-116880][PYTHON][SS][12.X] Cast NumPy instances to Python primitive types in GroupState update
- [SPARK-41174] [SC-116609][CORE][SQL] Propagate an error class to users for invalid format of to_binary()
- [SPARK-41264] [SC-116971][CONNECT][PYTHON] Make Literal support more datatypes
- [SPARK-41326] [SC-116972] [CONNECT] Fix deduplicate is missing input
- [SPARK-41316] [SC-116900][SQL] Enable tail-recursion wherever possible
- [SPARK-41297] [SC-116931] [CONNECT] [PYTHON] Support String Expressions in filter.
- [SPARK-41256] [SC-116932][SC-116883][CONNECT] Implement DataFrame.withColumn(s)
- [SPARK-41182] [SC-116632][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1102
- [SPARK-41181] [SC-116680][SQL] Migrate the map options errors onto error classes
- [SPARK-40940] [SC-115993][12.x] Remove Multi-stateful operator checkers for streaming queries.
- [SPARK-41310] [SC-116885][CONNECT][PYTHON] Implement DataFrame.toDF
- [SPARK-41179] [SC-116631][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1092
- [SPARK-41003] [SC-116741][SQL] BHJ LeftAnti does not update numOutputRows when codegen is disabled
- [SPARK-41148] [SC-116878][CONNECT][PYTHON] Implement DataFrame.dropna and DataFrame.na.drop
- [SPARK-41217] [SC-116380][SQL] Add the error class FAILED_FUNCTION_CALL
- [SPARK-41308] [SC-116875][CONNECT][PYTHON] Improve DataFrame.count()
- [SPARK-41301] [SC-116786] [CONNECT] Homogenize Behavior for SparkSession.range()
- [SPARK-41306] [SC-116860][CONNECT] Improve Connect Expression proto documentation
- [SPARK-41280] [SC-116733][CONNECT] Implement DataFrame.tail
- [SPARK-41300] [SC-116751] [CONNECT] Unset schema is interpreted as Schema
- [SPARK-41255] [SC-116730][SC-116695] [CONNECT] Rename RemoteSparkSession
- [SPARK-41250] [SC-116788][SC-116633][CONNECT][PYTHON] DataFrame.toPandas should not return optional pandas dataframe
- [SPARK-41291] [SC-116738][CONNECT][PYTHON] DataFrame.explain should print and return None
- [SPARK-41278] [SC-116732][CONNECT] Clean up unused QualifiedAttribute in Expression.proto
- [SPARK-41097] [SC-116653][CORE][SQL][SS][PROTOBUF] Remove redundant collection conversion base on Scala 2.13 code
- [SPARK-41261] [SC-116718][PYTHON][SS] Fix issue for applyInPandasWithState when the columns of grouping keys are not placed in order from earliest
- [SPARK-40872] [SC-116717][3.3] Fallback to original shuffle block when a push-merged shuffle chunk is zero-size
- [SPARK-41114] [SC-116628][CONNECT] Support local data for LocalRelation
- [SPARK-41216] [SC-116678][CONNECT][PYTHON] Implement DataFrame.{isLocal, isStreaming, printSchema, inputFiles}
- [SPARK-41238] [SC-116670][CONNECT][PYTHON] Support more built-in datatypes
- [SPARK-41230] [SC-116674][CONNECT][PYTHON] Remove str from Aggregate expression type
- [SPARK-41224] [SC-116652][SPARK-41165][SPARK-41184][CONNECT] Optimized Arrow-based collect implementation to stream from server to client
- [SPARK-41222] [SC-116625][CONNECT][PYTHON] Unify the typing definitions
- [SPARK-41225] [SC-116623] [CONNECT] [PYTHON] Disable unsupported functions.
- [SPARK-41201] [SC-116526][CONNECT][PYTHON] Implement DataFrame.SelectExpr in Python client
- [SPARK-41203] [SC-116258] [CONNECT] Support Dataframe.transform in Python client.
- [SPARK-41213] [SC-116375][CONNECT][PYTHON] Implement DataFrame.__repr__ and DataFrame.dtypes
- [SPARK-41169] [SC-116378][CONNECT][PYTHON] Implement DataFrame.drop
- [SPARK-41172] [SC-116245][SQL] Migrate the ambiguous ref error to an error class
- [SPARK-41122] [SC-116141][CONNECT] Explain API can support different modes
- [SPARK-41209] [SC-116584][SC-116376][PYTHON] Improve PySpark type inference in _merge_type method
- [SPARK-41196] [SC-116555][SC-116179] [CONNECT] Homogenize the protobuf version across the Spark connect server to use the same major version.
- [SPARK-35531] [SC-116409][SQL] Update hive table stats without unnecessary convert
- [SPARK-41154] [SC-116289][SQL] Incorrect relation caching for queries with time travel spec
- [SPARK-41212] [SC-116554][SC-116389][CONNECT][PYTHON] Implement DataFrame.isEmpty
- [SPARK-41135] [SC-116400][SQL] Rename UNSUPPORTED_EMPTY_LOCATION to INVALID_EMPTY_LOCATION
- [SPARK-41183] [SC-116265][SQL] Add an extension API to do plan normalization for caching
- [SPARK-41054] [SC-116447][UI][CORE] Support RocksDB as KVStore in live UI
- [SPARK-38550] [SC-115223] Revert "[SQL][CORE] Use a disk-based store to save more debug information for live UI"
- [SPARK-41173] [SC-116185][SQL] Move require() out from the constructors of string expressions
- [SPARK-41188] [SC-116242][CORE][ML] Set executorEnv OMP_NUM_THREADS to be spark.task.cpus by default for spark executor JVM processes
- [SPARK-41130] [SC-116155][SQL] Rename OUT_OF_DECIMAL_TYPE_RANGE to NUMERIC_OUT_OF_SUPPORTED_RANGE
- [SPARK-41175] [SC-116238][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1078
- [SPARK-41106] [SC-116073][SQL] Reduce collection conversion when create AttributeMap
- [SPARK-41139] [SC-115983][SQL] Improve error class: PYTHON_UDF_IN_ON_CLAUSE
- [SPARK-40657] [SC-115997][PROTOBUF] Require shading for Java class jar, improve error handling
- [SPARK-40999] [SC-116168] Hint propagation to subqueries
- [SPARK-41017] [SC-116054][SQL] Support column pruning with multiple nondeterministic Filters
- [SPARK-40834] [SC-114773][SQL] Use SparkListenerSQLExecutionEnd to track final SQL status in UI
- [SPARK-41118] [SC-116027][SQL] to_number/try_to_number should return null when format is null
- [SPARK-39799] [SC-115984][SQL] DataSourceV2: View catalog interface
- [SPARK-40665] [SC-116210][SC-112300][CONNECT] Avoid embedding Spark Connect in the Apache Spark binary release
- [SPARK-41048] [SC-116043][SQL] Improve output partitioning and ordering with AQE cache
- [SPARK-41198] [SC-116256][SS] Fix metrics in streaming query having CTE and DSv1 streaming source
- [SPARK-41199] [SC-116244][SS] Fix metrics issue when DSv1 streaming source and DSv2 streaming source are co-used
- [SPARK-40957] [SC-116261][SC-114706] Add in memory cache in HDFSMetadataLog
- [SPARK-40940] Revert "[SC-115993] Remove Multi-stateful operator checkers for streaming queries."
- [SPARK-41090] [SC-116040][SQL] Throw Exception for db_name.view_name when creating temp view by Dataset API
- [SPARK-41133] [SC-116085][SQL] Integrate UNSCALED_VALUE_TOO_LARGE_FOR_PRECISION into NUMERIC_VALUE_OUT_OF_RANGE
- [SPARK-40557] [SC-116182][SC-111442][CONNECT] Code Dump 9 Commits
- [SPARK-40448] [SC-114447][SC-111314][CONNECT] Spark Connect build as Driver Plugin with Shaded Dependencies
- [SPARK-41096] [SC-115812][SQL] Support reading parquet FIXED_LEN_BYTE_ARRAY type
- [SPARK-41140] [SC-115879][SQL] Rename the error class `_LEGACY_ERROR_TEMP_2440` to `INVALID_WHERE_CONDITION`
- [SPARK-40918] [SC-114438][SQL] Mismatch between FileSourceScanExec and Orc and ParquetFileFormat on producing columnar output
- [SPARK-41155] [SC-115991][SQL] Add error message to SchemaColumnConvertNotSupportedException
- [SPARK-40940] [SC-115993] Remove Multi-stateful operator checkers for streaming queries.
- [SPARK-41098] [SC-115790][SQL] Rename `GROUP_BY_POS_REFERS_AGG_EXPR` to `GROUP_BY_POS_AGGREGATE`
- [SPARK-40755] [SC-115912][SQL] Migrate type check failures of number formatting onto error classes
- [SPARK-41059] [SC-115658][SQL] Rename `_LEGACY_ERROR_TEMP_2420` to `NESTED_AGGREGATE_FUNCTION`
- [SPARK-41044] [SC-115662][SQL] Convert DATATYPE_MISMATCH.UNSPECIFIED_FRAME to INTERNAL_ERROR
- [SPARK-40973] [SC-115132][SQL] Rename `_LEGACY_ERROR_TEMP_0055` to `UNCLOSED_BRACKETED_COMMENT`
Maintenance updates
See Databricks Runtime 12.1 maintenance updates.
System environment
- Operating System: Ubuntu 20.04.5 LTS
- Java: Zulu 8.64.0.19-CA-linux64
- Scala: 2.12.14
- Python: 3.9.5
- R: 4.2.2
- Delta Lake: 2.2.0
Installed Python libraries
Library | Version | Library | Version | Library | Version |
---|---|---|---|---|---|
argon2-cffi | 21.3.0 | argon2-cffi-bindings | 21.2.0 | asttokens | 2.0.5 |
attrs | 21.4.0 | backcall | 0.2.0 | backports.entry-points-selectable | 1.2.0 |
beautifulsoup4 | 4.11.1 | black | 22.3.0 | bleach | 4.1.0 |
boto3 | 1.21.32 | botocore | 1.24.32 | certifi | 2021.10.8 |
cffi | 1.15.0 | chardet | 4.0.0 | charset-normalizer | 2.0.4 |
click | 8.0.4 | cryptography | 3.4.8 | cycler | 0.11.0 |
Cython | 0.29.28 | dbus-python | 1.2.16 | debugpy | 1.5.1 |
decorator | 5.1.1 | defusedxml | 0.7.1 | distlib | 0.3.6 |
docstring-to-markdown | 0.11 | entrypoints | 0.4 | executing | 0.8.3 |
facets-overview | 1.0.0 | fastjsonschema | 2.16.2 | filelock | 3.8.2 |
fonttools | 4.25.0 | idna | 3.3 | ipykernel | 6.15.3 |
ipython | 8.5.0 | ipython-genutils | 0.2.0 | ipywidgets | 7.7.2 |
jedi | 0.18.1 | Jinja2 | 2.11.3 | jmespath | 0.10.0 |
joblib | 1.1.0 | jsonschema | 4.4.0 | jupyter-client | 6.1.12 |
jupyter_core | 4.11.2 | jupyterlab-pygments | 0.1.2 | jupyterlab-widgets | 1.0.0 |
kiwisolver | 1.3.2 | MarkupSafe | 2.0.1 | matplotlib | 3.5.1 |
matplotlib-inline | 0.1.2 | mccabe | 0.7.0 | mistune | 0.8.4 |
mypy-extensions | 0.4.3 | nbclient | 0.5.13 | nbconvert | 6.4.4 |
nbformat | 5.3.0 | nest-asyncio | 1.5.5 | nodeenv | 1.7.0 |
notebook | 6.4.8 | numpy | 1.21.5 | packaging | 21.3 |
pandas | 1.4.2 | pandocfilters | 1.5.0 | parso | 0.8.3 |
pathspec | 0.9.0 | patsy | 0.5.2 | pexpect | 4.8.0 |
pickleshare | 0.7.5 | Pillow | 9.0.1 | pip | 21.2.4 |
platformdirs | 2.6.0 | plotly | 5.6.0 | pluggy | 1.0.0 |
prometheus-client | 0.13.1 | prompt-toolkit | 3.0.20 | protobuf | 3.19.4 |
psutil | 5.8.0 | psycopg2 | 2.9.3 | ptyprocess | 0.7.0 |
pure-eval | 0.2.2 | pyarrow | 7.0.0 | pycparser | 2.21 |
pyflakes | 2.5.0 | Pygments | 2.11.2 | PyGObject | 3.36.0 |
pyodbc | 4.0.32 | pyparsing | 3.0.4 | pyright | 1.1.283 |
pyrsistent | 0.18.0 | python-dateutil | 2.8.2 | python-lsp-jsonrpc | 1.0.0 |
python-lsp-server | 1.6.0 | pytz | 2021.3 | pyzmq | 22.3.0 |
requests | 2.27.1 | requests-unixsocket | 0.2.0 | rope | 0.22.0 |
s3transfer | 0.5.0 | scikit-learn | 1.0.2 | scipy | 1.7.3 |
seaborn | 0.11.2 | Send2Trash | 1.8.0 | setuptools | 61.2.0 |
six | 1.16.0 | soupsieve | 2.3.1 | ssh-import-id | 5.10 |
stack-data | 0.2.0 | statsmodels | 0.13.2 | tenacity | 8.0.1 |
terminado | 0.13.1 | testpath | 0.5.0 | threadpoolctl | 2.2.0 |
tokenize-rt | 4.2.1 | tomli | 1.2.2 | tornado | 6.1 |
traitlets | 5.1.1 | typing_extensions | 4.1.1 | ujson | 5.1.0 |
unattended-upgrades | 0.1 | urllib3 | 1.26.9 | virtualenv | 20.8.0 |
wcwidth | 0.2.5 | webencodings | 0.5.1 | whatthepatch | 1.0.3 |
wheel | 0.37.0 | widgetsnbextension | 3.6.1 | yapf | 0.31.0 |
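
To check whether a Python environment matches the versions listed above, a comparison along the following lines may help. This is a minimal sketch, not part of Databricks tooling: the `EXPECTED` dictionary and the `find_mismatches` helper are hypothetical names introduced for illustration, and only three rows from the table are copied in. It relies on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# A few rows copied from the table above; extend as needed (illustrative subset).
EXPECTED = {"numpy": "1.21.5", "pandas": "1.4.2", "pyarrow": "7.0.0"}


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (expected_version, installed_version_or_None)}
    for every package whose installed version differs from the expected one.

    `lookup` is a hypothetical injection point so the function can be
    exercised without the packages actually being installed.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            have = lookup(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed at all
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {want}, found {have}")
```

An empty result means every listed package matched. A `None` in the second position means the package was not found in the current environment.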
Installed R libraries
R libraries are installed from the Microsoft CRAN snapshot on 2022-11-11.
Library | Version | Library | Version | Library | Version |
---|---|---|---|---|---|
arrow | 10.0.0 | askpass | 1.1 | assertthat | 0.2.1 |
backports | 1.4.1 | base | 4.2.2 | base64enc | 0.1-3 |
bit | 4.0.4 | bit64 | 4.0.5 | blob | 1.2.3 |
boot | 1.3-28 | brew | 1.0-8 | brio | 1.1.3 |
broom | 1.0.1 | bslib | 0.4.1 | cachem | 1.0.6 |
callr | 3.7.3 | caret | 6.0-93 | cellranger | 1.1.0 |
chron | 2.3-58 | class | 7.3-20 | cli | 3.4.1 |
clipr | 0.8.0 | clock | 0.6.1 | cluster | 2.1.4 |
codetools | 0.2-18 | colorspace | 2.0-3 | commonmark | 1.8.1 |
compiler | 4.2.2 | config | 0.3.1 | cpp11 | 0.4.3 |
crayon | 1.5.2 | credentials | 1.3.2 | curl | 4.3.3 |
data.table | 1.14.4 | datasets | 4.2.2 | DBI | 1.1.3 |
dbplyr | 2.2.1 | desc | 1.4.2 | devtools | 2.4.5 |
diffobj | 0.3.5 | digest | 0.6.30 | downlit | 0.4.2 |
dplyr | 1.0.10 | dtplyr | 1.2.2 | e1071 | 1.7-12 |
ellipsis | 0.3.2 | evaluate | 0.18 | fansi | 1.0.3 |
farver | 2.1.1 | fastmap | 1.1.0 | fontawesome | 0.4.0 |
forcats | 0.5.2 | foreach | 1.5.2 | foreign | 0.8-82 |
forge | 0.2.0 | fs | 1.5.2 | future | 1.29.0 |
future.apply | 1.10.0 | gargle | 1.2.1 | generics | 0.1.3 |
gert | 1.9.1 | ggplot2 | 3.4.0 | gh | 1.3.1 |
gitcreds | 0.1.2 | glmnet | 4.1-4 | globals | 0.16.1 |
glue | 1.6.2 | googledrive | 2.0.0 | googlesheets4 | 1.0.1 |
gower | 1.0.0 | graphics | 4.2.2 | grDevices | 4.2.2 |
grid | 4.2.2 | gridExtra | 2.3 | gsubfn | 0.7 |
gtable | 0.3.1 | hardhat | 1.2.0 | haven | 2.5.1 |
highr | 0.9 | hms | 1.1.2 | htmltools | 0.5.3 |
htmlwidgets | 1.5.4 | httpuv | 1.6.6 | httr | 1.4.4 |
ids | 1.0.1 | ini | 0.3.1 | ipred | 0.9-13 |
isoband | 0.2.6 | iterators | 1.0.14 | jquerylib | 0.1.4 |
jsonlite | 1.8.3 | KernSmooth | 2.23-20 | knitr | 1.40 |
labeling | 0.4.2 | later | 1.3.0 | lattice | 0.20-45 |
lava | 1.7.0 | lifecycle | 1.0.3 | listenv | 0.8.0 |
lubridate | 1.9.0 | magrittr | 2.0.3 | markdown | 1.3 |
MASS | 7.3-58 | Matrix | 1.5-1 | memoise | 2.0.1 |
methods | 4.2.2 | mgcv | 1.8-41 | mime | 0.12 |
miniUI | 0.1.1.1 | ModelMetrics | 1.2.2.2 | modelr | 0.1.9 |
munsell | 0.5.0 | nlme | 3.1-160 | nnet | 7.3-18 |
numDeriv | 2016.8-1.1 | openssl | 2.0.4 | parallel | 4.2.2 |
parallelly | 1.32.1 | pillar | 1.8.1 | pkgbuild | 1.3.1 |
pkgconfig | 2.0.3 | pkgdown | 2.0.6 | pkgload | 1.3.1 |
plogr | 0.2.0 | plyr | 1.8.7 | praise | 1.0.0 |
prettyunits | 1.1.1 | pROC | 1.18.0 | processx | 3.8.0 |
prodlim | 2019.11.13 | profvis | 0.3.7 | progress | 1.2.2 |
progressr | 0.11.0 | promises | 1.2.0.1 | proto | 1.0.0 |
proxy | 0.4-27 | ps | 1.7.2 | purrr | 0.3.5 |
r2d3 | 0.2.6 | R6 | 2.5.1 | ragg | 1.2.4 |
randomForest | 4.7-1.1 | rappdirs | 0.3.3 | rcmdcheck | 1.4.0 |
RColorBrewer | 1.1-3 | Rcpp | 1.0.9 | RcppEigen | 0.3.3.9.3 |
readr | 2.1.3 | readxl | 1.4.1 | recipes | 1.0.3 |
rematch | 1.0.1 | rematch2 | 2.1.2 | remotes | 2.4.2 |
reprex | 2.0.2 | reshape2 | 1.4.4 | rlang | 1.0.6 |
rmarkdown | 2.18 | RODBC | 1.3-19 | roxygen2 | 7.2.1 |
rpart | 4.1.19 | rprojroot | 2.0.3 | Rserve | 1.8-11 |
RSQLite | 2.2.18 | rstudioapi | 0.14 | rversions | 2.1.2 |
rvest | 1.0.3 | sass | 0.4.2 | scales | 1.2.1 |
selectr | 0.4-2 | sessioninfo | 1.2.2 | shape | 1.4.6 |
shiny | 1.7.3 | sourcetools | 0.1.7 | sparklyr | 1.7.8 |
SparkR | 3.3.1 | spatial | 7.3-11 | splines | 4.2.2 |
sqldf | 0.4-11 | SQUAREM | 2021.1 | stats | 4.2.2 |
stats4 | 4.2.2 | stringi | 1.7.8 | stringr | 1.4.1 |
survival | 3.4-0 | sys | 3.4.1 | systemfonts | 1.0.4 |
tcltk | 4.2.2 | testthat | 3.1.5 | textshaping | 0.3.6 |
tibble | 3.1.8 | tidyr | 1.2.1 | tidyselect | 1.2.0 |
tidyverse | 1.3.2 | timechange | 0.1.1 | timeDate | 4021.106 |
tinytex | 0.42 | tools | 4.2.2 | tzdb | 0.3.0 |
urlchecker | 1.0.1 | usethis | 2.1.6 | utf8 | 1.2.2 |
utils | 4.2.2 | uuid | 1.1-0 | vctrs | 0.5.0 |
viridisLite | 0.4.1 | vroom | 1.6.0 | waldo | 0.4.0 |
whisker | 0.4 | withr | 2.5.0 | xfun | 0.34 |
xml2 | 1.3.3 | xopen | 1.0.0 | xtable | 1.8-4 |
yaml | 2.3.6 | zip | 2.2.2 |
Installed Java and Scala libraries (Scala 2.12 cluster version)
Group ID | Artifact ID | Version |
---|---|---|
antlr | antlr | 2.7.7 |
com.amazonaws | amazon-kinesis-client | 1.12.0 |
com.amazonaws | aws-java-sdk-autoscaling | 1.12.189 |
com.amazonaws | aws-java-sdk-cloudformation | 1.12.189 |
com.amazonaws | aws-java-sdk-cloudfront | 1.12.189 |
com.amazonaws | aws-java-sdk-cloudhsm | 1.12.189 |
com.amazonaws | aws-java-sdk-cloudsearch | 1.12.189 |
com.amazonaws | aws-java-sdk-cloudtrail | 1.12.189 |
com.amazonaws | aws-java-sdk-cloudwatch | 1.12.189 |
com.amazonaws | aws-java-sdk-cloudwatchmetrics | 1.12.189 |
com.amazonaws | aws-java-sdk-codedeploy | 1.12.189 |
com.amazonaws | aws-java-sdk-cognitoidentity | 1.12.189 |
com.amazonaws | aws-java-sdk-cognitosync | 1.12.189 |
com.amazonaws | aws-java-sdk-config | 1.12.189 |
com.amazonaws | aws-java-sdk-core | 1.12.189 |
com.amazonaws | aws-java-sdk-datapipeline | 1.12.189 |
com.amazonaws | aws-java-sdk-directconnect | 1.12.189 |
com.amazonaws | aws-java-sdk-directory | 1.12.189 |
com.amazonaws | aws-java-sdk-dynamodb | 1.12.189 |
com.amazonaws | aws-java-sdk-ec2 | 1.12.189 |
com.amazonaws | aws-java-sdk-ecs | 1.12.189 |
com.amazonaws | aws-java-sdk-efs | 1.12.189 |
com.amazonaws | aws-java-sdk-elasticache | 1.12.189 |
com.amazonaws | aws-java-sdk-elasticbeanstalk | 1.12.189 |
com.amazonaws | aws-java-sdk-elasticloadbalancing | 1.12.189 |
com.amazonaws | aws-java-sdk-elastictranscoder | 1.12.189 |
com.amazonaws | aws-java-sdk-emr | 1.12.189 |
com.amazonaws | aws-java-sdk-glacier | 1.12.189 |
com.amazonaws | aws-java-sdk-glue | 1.12.189 |
com.amazonaws | aws-java-sdk-iam | 1.12.189 |
com.amazonaws | aws-java-sdk-importexport | 1.12.189 |
com.amazonaws | aws-java-sdk-kinesis | 1.12.189 |
com.amazonaws | aws-java-sdk-kms | 1.12.189 |
com.amazonaws | aws-java-sdk-lambda | 1.12.189 |
com.amazonaws | aws-java-sdk-logs | 1.12.189 |
com.amazonaws | aws-java-sdk-machinelearning | 1.12.189 |
com.amazonaws | aws-java-sdk-opsworks | 1.12.189 |
com.amazonaws | aws-java-sdk-rds | 1.12.189 |
com.amazonaws | aws-java-sdk-redshift | 1.12.189 |
com.amazonaws | aws-java-sdk-route53 | 1.12.189 |
com.amazonaws | aws-java-sdk-s3 | 1.12.189 |
com.amazonaws | aws-java-sdk-ses | 1.12.189 |
com.amazonaws | aws-java-sdk-simpledb | 1.12.189 |
com.amazonaws | aws-java-sdk-simpleworkflow | 1.12.189 |
com.amazonaws | aws-java-sdk-sns | 1.12.189 |
com.amazonaws | aws-java-sdk-sqs | 1.12.189 |
com.amazonaws | aws-java-sdk-ssm | 1.12.189 |
com.amazonaws | aws-java-sdk-storagegateway | 1.12.189 |
com.amazonaws | aws-java-sdk-sts | 1.12.189 |
com.amazonaws | aws-java-sdk-support | 1.12.189 |
com.amazonaws | aws-java-sdk-swf-libraries | 1.11.22 |
com.amazonaws | aws-java-sdk-workspaces | 1.12.189 |
com.amazonaws | jmespath-java | 1.12.189 |
com.chuusai | shapeless_2.12 | 2.3.3 |
com.clearspring.analytics | stream | 2.9.6 |
com.databricks | Rserve | 1.8-3 |
com.databricks | jets3t | 0.7.1-0 |
com.databricks.scalapb | compilerplugin_2.12 | 0.4.15-10 |
com.databricks.scalapb | scalapb-runtime_2.12 | 0.4.15-10 |
com.esotericsoftware | kryo-shaded | 4.0.2 |
com.esotericsoftware | minlog | 1.3.0 |
com.fasterxml | classmate | 1.3.4 |
com.fasterxml.jackson.core | jackson-annotations | 2.13.4 |
com.fasterxml.jackson.core | jackson-core | 2.13.4 |
com.fasterxml.jackson.core | jackson-databind | 2.13.4.2 |
com.fasterxml.jackson.dataformat | jackson-dataformat-cbor | 2.13.4 |
com.fasterxml.jackson.datatype | jackson-datatype-joda | 2.13.4 |
com.fasterxml.jackson.datatype | jackson-datatype-jsr310 | 2.13.4 |
com.fasterxml.jackson.module | jackson-module-paranamer | 2.13.4 |
com.fasterxml.jackson.module | jackson-module-scala_2.12 | 2.13.4 |
com.github.ben-manes.caffeine | caffeine | 2.3.4 |
com.github.fommil | jniloader | 1.1 |
com.github.fommil.netlib | core | 1.1.2 |
com.github.fommil.netlib | native_ref-java | 1.1 |
com.github.fommil.netlib | native_ref-java-natives | 1.1 |
com.github.fommil.netlib | native_system-java | 1.1 |
com.github.fommil.netlib | native_system-java-natives | 1.1 |
com.github.fommil.netlib | netlib-native_ref-linux-x86_64-natives | 1.1 |
com.github.fommil.netlib | netlib-native_system-linux-x86_64-natives | 1.1 |
com.github.luben | zstd-jni | 1.5.2-1 |
com.github.wendykierp | JTransforms | 3.1 |
com.google.code.findbugs | jsr305 | 3.0.0 |
com.google.code.gson | gson | 2.8.6 |
com.google.crypto.tink | tink | 1.6.1 |
com.google.flatbuffers | flatbuffers-java | 1.12.0 |
com.google.guava | guava | 15.0 |
com.google.protobuf | protobuf-java | 2.6.1 |
com.h2database | h2 | 2.0.204 |
com.helger | profiler | 1.1.1 |
com.jcraft | jsch | 0.1.50 |
com.jolbox | bonecp | 0.8.0.RELEASE |
com.lihaoyi | sourcecode_2.12 | 0.1.9 |
com.microsoft.azure | azure-data-lake-store-sdk | 2.3.9 |
com.ning | compress-lzf | 1.1 |
com.sun.mail | javax.mail | 1.5.2 |
com.tdunning | json | 1.8 |
com.thoughtworks.paranamer | paranamer | 2.8 |
com.trueaccord.lenses | lenses_2.12 | 0.4.12 |
com.twitter | chill-java | 0.10.0 |
com.twitter | chill_2.12 | 0.10.0 |
com.twitter | util-app_2.12 | 7.1.0 |
com.twitter | util-core_2.12 | 7.1.0 |
com.twitter | util-function_2.12 | 7.1.0 |
com.twitter | util-jvm_2.12 | 7.1.0 |
com.twitter | util-lint_2.12 | 7.1.0 |
com.twitter | util-registry_2.12 | 7.1.0 |
com.twitter | util-stats_2.12 | 7.1.0 |
com.typesafe | config | 1.2.1 |
com.typesafe.scala-logging | scala-logging_2.12 | 3.7.2 |
com.uber | h3 | 3.7.0 |
com.univocity | univocity-parsers | 2.9.1 |
com.zaxxer | HikariCP | 4.0.3 |
commons-cli | commons-cli | 1.5.0 |
commons-codec | commons-codec | 1.15 |
commons-collections | commons-collections | 3.2.2 |
commons-dbcp | commons-dbcp | 1.4 |
commons-fileupload | commons-fileupload | 1.3.3 |
commons-httpclient | commons-httpclient | 3.1 |
commons-io | commons-io | 2.11.0 |
commons-lang | commons-lang | 2.6 |
commons-logging | commons-logging | 1.1.3 |
commons-pool | commons-pool | 1.5.4 |
dev.ludovic.netlib | arpack | 2.2.1 |
dev.ludovic.netlib | blas | 2.2.1 |
dev.ludovic.netlib | lapack | 2.2.1 |
info.ganglia.gmetric4j | gmetric4j | 1.0.10 |
io.airlift | aircompressor | 0.21 |
io.delta | delta-sharing-spark_2.12 | 0.6.2 |
io.dropwizard.metrics | metrics-core | 4.1.1 |
io.dropwizard.metrics | metrics-graphite | 4.1.1 |
io.dropwizard.metrics | metrics-healthchecks | 4.1.1 |
io.dropwizard.metrics | metrics-jetty9 | 4.1.1 |
io.dropwizard.metrics | metrics-jmx | 4.1.1 |
io.dropwizard.metrics | metrics-json | 4.1.1 |
io.dropwizard.metrics | metrics-jvm | 4.1.1 |
io.dropwizard.metrics | metrics-servlets | 4.1.1 |
io.netty | netty-all | 4.1.74.Final |
io.netty | netty-buffer | 4.1.74.Final |
io.netty | netty-codec | 4.1.74.Final |
io.netty | netty-common | 4.1.74.Final |
io.netty | netty-handler | 4.1.74.Final |
io.netty | netty-resolver | 4.1.74.Final |
io.netty | netty-tcnative-classes | 2.0.48.Final |
io.netty | netty-transport | 4.1.74.Final |
io.netty | netty-transport-classes-epoll | 4.1.74.Final |
io.netty | netty-transport-classes-kqueue | 4.1.74.Final |
io.netty | netty-transport-native-epoll-linux-aarch_64 | 4.1.74.Final |
io.netty | netty-transport-native-epoll-linux-x86_64 | 4.1.74.Final |
io.netty | netty-transport-native-kqueue-osx-aarch_64 | 4.1.74.Final |
io.netty | netty-transport-native-kqueue-osx-x86_64 | 4.1.74.Final |
io.netty | netty-transport-native-unix-common | 4.1.74.Final |
io.prometheus | simpleclient | 0.7.0 |
io.prometheus | simpleclient_common | 0.7.0 |
io.prometheus | simpleclient_dropwizard | 0.7.0 |
io.prometheus | simpleclient_pushgateway | 0.7.0 |
io.prometheus | simpleclient_servlet | 0.7.0 |
io.prometheus.jmx | collector | 0.12.0 |
jakarta.annotation | jakarta.annotation-api | 1.3.5 |
jakarta.servlet | jakarta.servlet-api | 4.0.3 |
jakarta.validation | jakarta.validation-api | 2.0.2 |
jakarta.ws.rs | jakarta.ws.rs-api | 2.1.6 |
javax.activation | activation | 1.1.1 |
javax.el | javax.el-api | 2.2.4 |
javax.jdo | jdo-api | 3.0.1 |
javax.transaction | jta | 1.1 |
javax.transaction | transaction-api | 1.1 |
javax.xml.bind | jaxb-api | 2.2.11 |
javolution | javolution | 5.5.1 |
jline | jline | 2.14.6 |
joda-time | joda-time | 2.10.13 |
net.java.dev.jna | jna | 5.8.0 |
net.razorvine | pickle | 1.2 |
net.sf.jpam | jpam | 1.1 |
net.sf.opencsv | opencsv | 2.3 |
net.sf.supercsv | super-csv | 2.2.0 |
net.snowflake | snowflake-ingest-sdk | 0.9.6 |
net.snowflake | snowflake-jdbc | 3.13.22 |
net.sourceforge.f2j | arpack_combined_all | 0.1 |
org.acplt.remotetea | remotetea-oncrpc | 1.1.2 |
org.antlr | ST4 | 4.0.4 |
org.antlr | antlr-runtime | 3.5.2 |
org.antlr | antlr4-runtime | 4.8 |
org.antlr | stringtemplate | 3.2.1 |
org.apache.ant | ant | 1.9.2 |
org.apache.ant | ant-jsch | 1.9.2 |
org.apache.ant | ant-launcher | 1.9.2 |
org.apache.arrow | arrow-format | 7.0.0 |
org.apache.arrow | arrow-memory-core | 7.0.0 |
org.apache.arrow | arrow-memory-netty | 7.0.0 |
org.apache.arrow | arrow-vector | 7.0.0 |
org.apache.avro | avro | 1.11.0 |
org.apache.avro | avro-ipc | 1.11.0 |
org.apache.avro | avro-mapred | 1.11.0 |
org.apache.commons | commons-collections4 | 4.4 |
org.apache.commons | commons-compress | 1.21 |
org.apache.commons | commons-crypto | 1.1.0 |
org.apache.commons | commons-lang3 | 3.12.0 |
org.apache.commons | commons-math3 | 3.6.1 |
org.apache.commons | commons-text | 1.10.0 |
org.apache.curator | curator-client | 2.13.0 |
org.apache.curator | curator-framework | 2.13.0 |
org.apache.curator | curator-recipes | 2.13.0 |
org.apache.derby | derby | 10.14.2.0 |
org.apache.hadoop | hadoop-client-api | 3.3.4-databricks |
org.apache.hadoop | hadoop-client-runtime | 3.3.4 |
org.apache.hive | hive-beeline | 2.3.9 |
org.apache.hive | hive-cli | 2.3.9 |
org.apache.hive | hive-jdbc | 2.3.9 |
org.apache.hive | hive-llap-client | 2.3.9 |
org.apache.hive | hive-llap-common | 2.3.9 |
org.apache.hive | hive-serde | 2.3.9 |
org.apache.hive | hive-shims | 2.3.9 |
org.apache.hive | hive-storage-api | 2.8.1 |
org.apache.hive.shims | hive-shims-0.23 | 2.3.9 |
org.apache.hive.shims | hive-shims-common | 2.3.9 |
org.apache.hive.shims | hive-shims-scheduler | 2.3.9 |
org.apache.httpcomponents | httpclient | 4.5.13 |
org.apache.httpcomponents | httpcore | 4.4.14 |
org.apache.ivy | ivy | 2.5.0 |
org.apache.logging.log4j | log4j-1.2-api | 2.18.0 |
org.apache.logging.log4j | log4j-api | 2.18.0 |
org.apache.logging.log4j | log4j-core | 2.18.0 |
org.apache.logging.log4j | log4j-slf4j-impl | 2.18.0 |
org.apache.mesos | mesos-shaded-protobuf | 1.4.0 |
org.apache.orc | orc-core | 1.7.6 |
org.apache.orc | orc-mapreduce | 1.7.6 |
org.apache.orc | orc-shims | 1.7.6 |
org.apache.parquet | parquet-column | 1.12.3-databricks-0002 |
org.apache.parquet | parquet-common | 1.12.3-databricks-0002 |
org.apache.parquet | parquet-encoding | 1.12.3-databricks-0002 |
org.apache.parquet | parquet-format-structures | 1.12.3-databricks-0002 |
org.apache.parquet | parquet-hadoop | 1.12.3-databricks-0002 |
org.apache.parquet | parquet-jackson | 1.12.3-databricks-0002 |
org.apache.thrift | libfb303 | 0.9.3 |
org.apache.thrift | libthrift | 0.12.0 |
org.apache.xbean | xbean-asm9-shaded | 4.20 |
org.apache.yetus | audience-annotations | 0.13.0 |
org.apache.zookeeper | zookeeper | 3.6.2 |
org.apache.zookeeper | zookeeper-jute | 3.6.2 |
org.checkerframework | checker-qual | 3.5.0 |
org.codehaus.jackson | jackson-core-asl | 1.9.13 |
org.codehaus.jackson | jackson-mapper-asl | 1.9.13 |
org.codehaus.janino | commons-compiler | 3.0.16 |
org.codehaus.janino | janino | 3.0.16 |
org.datanucleus | datanucleus-api-jdo | 4.2.4 |
org.datanucleus | datanucleus-core | 4.1.17 |
org.datanucleus | datanucleus-rdbms | 4.1.19 |
org.datanucleus | javax.jdo | 3.2.0-m3 |
org.eclipse.jetty | jetty-client | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-continuation | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-http | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-io | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-jndi | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-plus | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-proxy | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-security | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-server | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-servlet | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-servlets | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-util | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-util-ajax | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-webapp | 9.4.46.v20220331 |
org.eclipse.jetty | jetty-xml | 9.4.46.v20220331 |
org.eclipse.jetty.websocket | websocket-api | 9.4.46.v20220331 |
org.eclipse.jetty.websocket | websocket-client | 9.4.46.v20220331 |
org.eclipse.jetty.websocket | websocket-common | 9.4.46.v20220331 |
org.eclipse.jetty.websocket | websocket-server | 9.4.46.v20220331 |
org.eclipse.jetty.websocket | websocket-servlet | 9.4.46.v20220331 |
org.fusesource.leveldbjni | leveldbjni-all | 1.8 |
org.glassfish.hk2 | hk2-api | 2.6.1 |
org.glassfish.hk2 | hk2-locator | 2.6.1 |
org.glassfish.hk2 | hk2-utils | 2.6.1 |
org.glassfish.hk2 | osgi-resource-locator | 1.0.3 |
org.glassfish.hk2.external | aopalliance-repackaged | 2.6.1 |
org.glassfish.hk2.external | jakarta.inject | 2.6.1 |
org.glassfish.jersey.containers | jersey-container-servlet | 2.36 |
org.glassfish.jersey.containers | jersey-container-servlet-core | 2.36 |
org.glassfish.jersey.core | jersey-client | 2.36 |
org.glassfish.jersey.core | jersey-common | 2.36 |
org.glassfish.jersey.core | jersey-server | 2.36 |
org.glassfish.jersey.inject | jersey-hk2 | 2.36 |
org.hibernate.validator | hibernate-validator | 6.1.0.Final |
org.javassist | javassist | 3.25.0-GA |
org.jboss.logging | jboss-logging | 3.3.2.Final |
org.jdbi | jdbi | 2.63.1 |
org.jetbrains | annotations | 17.0.0 |
org.joda | joda-convert | 1.7 |
org.jodd | jodd-core | 3.5.2 |
org.json4s | json4s-ast_2.12 | 3.7.0-M11 |
org.json4s | json4s-core_2.12 | 3.7.0-M11 |
org.json4s | json4s-jackson_2.12 | 3.7.0-M11 |
org.json4s | json4s-scalap_2.12 | 3.7.0-M11 |
org.lz4 | lz4-java | 1.8.0 |
org.mariadb.jdbc | mariadb-java-client | 2.7.4 |
org.mlflow | mlflow-spark | 1.27.0 |
org.objenesis | objenesis | 2.5.1 |
org.postgresql | postgresql | 42.3.3 |
org.roaringbitmap | RoaringBitmap | 0.9.25 |
org.roaringbitmap | shims | 0.9.25 |
org.rocksdb | rocksdbjni | 6.24.2 |
org.rosuda.REngine | REngine | 2.1.0 |
org.scala-lang | scala-compiler_2.12 | 2.12.14 |
org.scala-lang | scala-library_2.12 | 2.12.14 |
org.scala-lang | scala-reflect_2.12 | 2.12.14 |
org.scala-lang.modules | scala-collection-compat_2.12 | 2.4.3 |
org.scala-lang.modules | scala-parser-combinators_2.12 | 1.1.2 |
org.scala-lang.modules | scala-xml_2.12 | 1.2.0 |
org.scala-sbt | test-interface | 1.0 |
org.scalacheck | scalacheck_2.12 | 1.14.2 |
org.scalactic | scalactic_2.12 | 3.0.8 |
org.scalanlp | breeze-macros_2.12 | 1.2 |
org.scalanlp | breeze_2.12 | 1.2 |
org.scalatest | scalatest_2.12 | 3.0.8 |
org.slf4j | jcl-over-slf4j | 1.7.36 |
org.slf4j | jul-to-slf4j | 1.7.36 |
org.slf4j | slf4j-api | 1.7.36 |
org.spark-project.spark | unused | 1.0.0 |
org.threeten | threeten-extra | 1.5.0 |
org.tukaani | xz | 1.9 |
org.typelevel | algebra_2.12 | 2.0.1 |
org.typelevel | cats-kernel_2.12 | 2.1.1 |
org.typelevel | macro-compat_2.12 | 1.1.1 |
org.typelevel | spire-macros_2.12 | 0.17.0 |
org.typelevel | spire-platform_2.12 | 0.17.0 |
org.typelevel | spire-util_2.12 | 0.17.0 |
org.typelevel | spire_2.12 | 0.17.0 |
org.wildfly.openssl | wildfly-openssl | 1.0.7.Final |
org.xerial | sqlite-jdbc | 3.8.11.2 |
org.xerial.snappy | snappy-java | 1.1.8.4 |
org.yaml | snakeyaml | 1.24 |
oro | oro | 2.0.8 |
pl.edu.icm | JLargeArrays | 1.5 |
software.amazon.ion | ion-java | 1.0.2 |
stax | stax-api | 1.0.1 |