Snowflake is a cloud-based SQL data warehouse that focuses on great performance, zero-tuning, diversity of data sources, and security. This article explains how to read data from and write data to Snowflake using the Databricks Snowflake connector.

Azure Databricks and Snowflake have partnered to bring a first-class connector experience to customers of both products, saving you from having to import and load libraries into your clusters, and therefore preventing version conflicts and misconfiguration.

Snowflake Connector for Spark notebooks

The following notebooks provide simple examples of how to write data to and read data from Snowflake. See Using the Spark Connector for more details. In particular, see Setting Configuration Options for the Connector for all configuration options.


Avoid exposing your Snowflake username and password in notebooks by using Secrets, which are demonstrated in the notebooks.
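As a minimal sketch of that pattern (the secret scope and key names `snowflake-scope`, `sf-user`, `sf-password`, and all connection values below are hypothetical), credentials come from `dbutils.secrets` rather than appearing in the notebook source. The cluster-only calls are shown in comments because they resolve only inside a Databricks notebook:

```python
# Sketch: connecting to Snowflake with credentials kept out of the notebook.
# In a Databricks notebook you would fetch them from a secret scope:
#   user = dbutils.secrets.get(scope="snowflake-scope", key="sf-user")
#   password = dbutils.secrets.get(scope="snowflake-scope", key="sf-password")
user, password = "example_user", "example_password"  # stand-ins for this sketch

# Hypothetical connection options; replace with your account's values.
sf_options = {
    "sfUrl": "myaccount.snowflakecomputing.com",
    "sfUser": user,
    "sfPassword": password,
    "sfDatabase": "MY_DB",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "MY_WH",
}

# On a cluster with the built-in connector you would then read a table:
#   df = (spark.read.format("snowflake")
#           .options(**sf_options)
#           .option("dbtable", "my_table")
#           .load())
```

Storing the username and password in a secret scope means the notebook can be shared or exported without leaking credentials.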

In this section:

Snowflake Scala notebook

Get notebook

Snowflake Python notebook

Get notebook

Snowflake R notebook

Get notebook

Train a machine learning model and save results to Snowflake

The following notebook walks through best practices for using the Snowflake Connector for Spark. It writes data to Snowflake, uses Snowflake for some basic data manipulation, trains a machine learning model in Azure Databricks, and writes the results back to Snowflake.

Store ML training results in Snowflake notebook

Get notebook

Frequently asked questions (FAQ)

Why don't my Spark DataFrame columns appear in the same order in Snowflake?

The Snowflake Connector for Spark doesn't respect the order of the columns in the table being written to; you must explicitly specify the mapping between DataFrame and Snowflake columns. To specify this mapping, use the columnmap parameter.
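A hedged sketch of such a mapping (the table and column names here are made up; check the connector's Setting Configuration Options page for the exact value format your version expects). The connector takes the mapping as a string in a `Map(...)` literal form:

```python
# Hypothetical: DataFrame columns "one" and "two" should land in the
# Snowflake columns "ONE" and "TWO" regardless of table column order.
columnmap = "Map(one -> ONE, two -> TWO)"

# On a cluster (sketch; df and sf_options would be defined elsewhere):
#   (df.write.format("snowflake")
#      .options(**sf_options)
#      .option("dbtable", "my_table")
#      .option("columnmap", columnmap)
#      .mode("append")
#      .save())
```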

Why is INTEGER data written to Snowflake always read back as DECIMAL?

Snowflake represents all INTEGER types as NUMBER, which can cause a change in data type when you write data to and read data from Snowflake. For example, INTEGER data can be converted to DECIMAL when writing to Snowflake, because INTEGER and DECIMAL are semantically equivalent in Snowflake (see Snowflake Numeric Data Types).
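The round trip can be illustrated with a small sketch: the value survives unchanged, only its type differs, and a Spark cast (shown in the comment; the column name `id` is hypothetical) is the usual workaround when downstream code needs an integer type:

```python
from decimal import Decimal

# Snowflake stores integer types as NUMBER with scale 0, so a value written
# as INTEGER reads back with a decimal type; the numeric value is unchanged.
written = 42
read_back = Decimal("42")    # what the read path hands back
assert read_back == written  # equal in value, different in type

# In Spark, cast back explicitly if you need IntegerType (sketch):
#   df = df.withColumn("id", df["id"].cast("integer"))
```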

Why are the fields in my Snowflake table schema always uppercase?

Snowflake uses uppercase fields by default, which means that the table schema is converted to uppercase.
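A small illustration of the effect (the column names are hypothetical). If you need to preserve case, one option is quoting the identifiers in your Snowflake DDL, at the cost of having to quote them in every query; check your connector version's configuration options for any case-preserving setting before relying on one:

```python
# Unquoted identifiers are upcased by Snowflake, so these DataFrame column
# names come back uppercase in the table schema.
df_columns = ["userId", "eventTime"]
snowflake_columns = [c.upper() for c in df_columns]
assert snowflake_columns == ["USERID", "EVENTTIME"]

# To keep the original case, quote identifiers in Snowflake DDL, e.g.:
#   CREATE TABLE events ("userId" INTEGER, "eventTime" TIMESTAMP);
# Quoted identifiers must then be quoted in every query that touches them.
```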