Get started with Apache Spark MLlib for machine learning

Note

Databricks Runtime ML is a comprehensive tool for developing and deploying machine learning models with Azure Databricks. It includes the most popular machine learning and deep learning libraries, as well as MLflow, a machine learning platform API for tracking and managing the end-to-end machine learning lifecycle. For details, see Machine learning and deep learning.

The Apache Spark machine learning library (MLlib) lets data scientists focus on their data problems and models instead of on the complexities of distributed data, such as infrastructure and configuration. The tutorial notebook walks you through loading data, visualizing it and preparing it for ML algorithms, running and evaluating a simple linear regression model, and visualizing the results.

Notebook

To access all of the tutorial's code examples, import the following notebook. For more machine learning examples, see Machine learning and deep learning.

Apache Spark machine learning notebook

Get notebook