Introduction to Apache Spark

This self-paced guide is the “Hello World” tutorial for Apache Spark using Azure Databricks. In the following tutorial modules, you will learn the basics of creating Spark jobs, loading data, and working with data. You will also get an introduction to running machine learning algorithms and working with streaming data. Azure Databricks lets you start writing Spark queries right away, so you can focus on your data problems.

In the sidebar and on this page you can see five tutorial modules, each representing a stage in the process of getting started with Apache Spark on Azure Databricks. Each module is a standalone usage scenario with ready-to-run notebooks and preloaded datasets, so you can jump ahead if you are comfortable with the basics.