AI and machine learning integrations
Azure Databricks has validated integrations with various third-party solutions that enable common machine learning scenarios.
Ray integration
Ray is an open source framework for scaling Python applications. It includes libraries specific to AI workloads, making it especially suited for developing AI applications. Running Ray on Azure Databricks allows you to leverage the breadth of the Azure Databricks ecosystem, enhancing data processing and machine learning workflows with services and integrations unavailable in open source Ray.
See What is Ray on Azure Databricks? for more information.
GraphFrames integration
GraphFrames is a package for Apache Spark that provides DataFrame-based graphs, with high-level APIs in Java, Python, and Scala. It aims to provide the functionality of GraphX plus extended functionality that takes advantage of Spark DataFrames. This extended functionality includes motif finding, DataFrame-based serialization, and highly expressive graph queries.
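To make motif finding more concrete, the following is a minimal plain-Python sketch of what a structural pattern query matches. It does not use GraphFrames or Spark; the function name `find_mutual_edges` and the sample edge list are hypothetical and exist only for illustration. The pattern it searches for, a pair of vertices connected in both directions, is the kind of pattern that GraphFrames expresses as the motif string "(a)-[e]->(b); (b)-[e2]->(a)" passed to `GraphFrame.find`.

```python
from typing import List, Tuple

Edge = Tuple[str, str]  # (source vertex, destination vertex)


def find_mutual_edges(edges: List[Edge]) -> List[Tuple[str, str]]:
    """Return every ordered pair (a, b) for which both a->b and b->a exist.

    Illustrative stand-in for a motif query; not the GraphFrames API.
    """
    edge_set = set(edges)
    # Keep an edge only if its reverse is also present in the graph.
    return sorted((a, b) for (a, b) in edge_set if (b, a) in edge_set)


# Hypothetical sample graph: only alice and bob point at each other.
edges = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("bob", "carol"),
    ("carol", "dave"),
]

matches = find_mutual_edges(edges)
print(matches)
```

In GraphFrames, the equivalent query runs over DataFrames of vertices and edges and returns a DataFrame with one column per named element of the motif, so the result can be filtered and joined like any other DataFrame.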
Data labeling
Labeling additional training data is an important step in many machine learning workflows, such as classification or computer vision applications. Azure Databricks does not directly support data labeling; however, Databricks partners with Labelbox to simplify the process.