Graph analysis tutorial with GraphFrames

This tutorial notebook shows you how to use GraphFrames to perform graph analysis. Databricks recommends using a cluster running Databricks Runtime for Machine Learning, because it includes an optimized installation of GraphFrames.

To run the notebook:

  1. If you are not using a cluster running Databricks Runtime ML, use one of these methods to install the GraphFrames library.

  2. Download the SF Bay Area Bike Share data from Kaggle and unzip it. You must sign in to Kaggle using third-party authentication, or create a Kaggle account and sign in.

  3. Upload station.csv and trip.csv using the Create table UI.

    The tables are named station_csv and trip_csv.
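To make the shape of the data concrete, the following plain-Python sketch shows one way the two tables could map onto a graph: each station becomes a vertex and each trip becomes a directed edge from its start station to its end station. It does not use Spark or GraphFrames, so it runs anywhere. The rows are made-up illustrative values, not data from the Kaggle files, and the column names (`id`, `name`, `start_station_id`, `end_station_id`) are assumptions about the CSV headers that you should check against your uploaded tables.

```python
from collections import Counter

# Hypothetical sample rows, NOT taken from the Kaggle dataset.
# The column names used here are assumptions about the CSV headers.
stations = [
    {"id": 1, "name": "Station A"},
    {"id": 2, "name": "Station B"},
    {"id": 3, "name": "Station C"},
]
trips = [
    {"start_station_id": 1, "end_station_id": 2},
    {"start_station_id": 1, "end_station_id": 3},
    {"start_station_id": 2, "end_station_id": 3},
    {"start_station_id": 3, "end_station_id": 3},  # round trip to the same station
]


def to_graph(stations, trips):
    """Map stations to vertices and trips to directed (src, dst) edges."""
    vertices = {s["id"]: s["name"] for s in stations}
    edges = [(t["start_station_id"], t["end_station_id"]) for t in trips]
    return vertices, edges


def in_degrees(vertices, edges):
    """Count how many trips end at each station, including stations with none."""
    counts = Counter(dst for _, dst in edges)
    return {v: counts.get(v, 0) for v in vertices}


vertices, edges = to_graph(stations, trips)
arrivals = in_degrees(vertices, edges)
for station_id, count in arrivals.items():
    print(vertices[station_id], count)
```

In GraphFrames itself, the same mapping is expressed with two DataFrames: a vertex DataFrame with an `id` column and an edge DataFrame with `src` and `dst` columns. The sketch above only mirrors that structure on a small scale.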

Graph Analysis with GraphFrames notebook

Get notebook