Graph analysis tutorial with GraphFrames
This tutorial notebook shows you how to use GraphFrames to perform graph analysis. Databricks recommends using a cluster running Databricks Runtime for Machine Learning (Databricks Runtime ML), because it includes an optimized installation of GraphFrames.
To run the notebook:
If you are not using a cluster running Databricks Runtime ML, use one of these methods to install the GraphFrames library.
Download the SF Bay Area Bike Share data from Kaggle and unzip it. To download it, you must sign in to Kaggle, either with third-party authentication or with a Kaggle account that you create.
Upload station.csv and trip.csv using the Create table UI. The tables are named station_csv and trip_csv.
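To show how the two uploaded tables relate to a graph, here is a minimal sketch in plain Python (standard library only, no Spark needed). GraphFrames expects a vertex table with an `id` column and an edge table with `src` and `dst` columns, so stations map naturally to vertices and trips to edges. The inline sample rows, the helper names, and the column names `start_station_id` and `end_station_id` are assumptions made for illustration; check them against your copy of the Kaggle files.

```python
import csv
import io
from collections import Counter

# Tiny inline samples standing in for station.csv and trip.csv.
# Column names are assumptions about the Kaggle files; the rows are made up.
STATION_CSV = """id,name
2,Station A
3,Station B
4,Station C
"""

TRIP_CSV = """id,start_station_id,end_station_id
100,2,3
101,2,4
102,3,4
103,4,2
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def to_vertices(station_rows):
    """One vertex per station; GraphFrames requires the key column to be named 'id'."""
    return [{"id": row["id"], "name": row["name"]} for row in station_rows]


def to_edges(trip_rows):
    """One directed edge per trip, from start station ('src') to end station ('dst')."""
    return [
        {"src": row["start_station_id"], "dst": row["end_station_id"]}
        for row in trip_rows
    ]


def in_degrees(edges):
    """Count how many trips end at each station."""
    return Counter(edge["dst"] for edge in edges)


vertices = to_vertices(read_rows(STATION_CSV))
edges = to_edges(read_rows(TRIP_CSV))
print(in_degrees(edges).most_common())
```

On a cluster, the analogous step would read the tables with `spark.table("station_csv")` and `spark.table("trip_csv")`, rename the trip columns to `src` and `dst`, and pass the two DataFrames to `GraphFrame(vertices, edges)` from the `graphframes` package.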