JDBC and ODBC drivers and configuration parameters

You can connect business intelligence (BI) tools to Azure Databricks clusters to query data in tables. Every Azure Databricks cluster runs JDBC and ODBC servers on the driver node. This article describes how to get JDBC and ODBC drivers and configuration parameters to connect to Azure Databricks clusters. For tool-specific connection instructions, see Business intelligence tools.

Cluster requirements

To access a cluster using JDBC or ODBC, you must have Can Attach To permission.

If you connect to a terminated cluster using JDBC or ODBC and have Can Restart permission, the cluster is restarted.

Step 1: Download and install a JDBC or ODBC driver

For all BI tools, you need a JDBC or ODBC driver to connect to Azure Databricks clusters.

  1. Go to the Databricks JDBC / ODBC Driver Download page.
  2. Fill out the form and submit it. The page updates with links to multiple download options.
  3. Select a driver and download it.
  4. Install the driver. For JDBC, a JAR is provided that does not require installation. For ODBC, an installation package is provided for your chosen platform and must be installed on your system.

Step 2: Gather JDBC or ODBC connection parameters

To configure a JDBC or ODBC driver, you must collect connection parameters from Azure Databricks. Here are some of the parameters a JDBC or ODBC driver might require:

Parameter                          Value
Username and password              See Username and password.
Host, port, HTTP path, JDBC URL    See Server hostname, port, HTTP path, and JDBC URL.

The following are usually specified in the httpPath for JDBC and in the DSN configuration for ODBC:

Parameter                              Value
Spark Server Type                      Spark Thrift Server
Schema/Database                        default
Authentication Mechanism (AuthMech)    Username and password authentication
Thrift Transport                       http
SSL                                    1

Username and password

No JDBC/ODBC driver supports credential passthrough or Azure Active Directory username-and-password authentication to Azure Databricks.

Server hostname, port, HTTP path, and JDBC URL

  1. On the cluster configuration page, click the Advanced Options toggle.

  2. Click the JDBC/ODBC tab. It contains the hostname, port, protocol, HTTP path, and JDBC URL.

    (Screenshot: JDBC/ODBC tab)

  3. Copy the parameters required by your BI tool.

  4. If your tool requires the JDBC URL, replace <personal-access-token> with the token you created in Username and password. For example:

    jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;UID=token;PWD=dapideadbeefdeadbeefdeadbeefdeadbeef
    
  5. JDBC and ODBC drivers accept SQL queries in the ANSI SQL-92 dialect and translate them to Spark SQL. If your application generates Spark SQL directly, or uses non-ANSI SQL-92 syntax specific to Databricks Runtime, Databricks recommends that you add ;UseNativeQuery=1 to the connection configuration. With that setting, drivers pass SQL queries verbatim to Databricks Runtime.
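To make the URL format in step 4 concrete, the following sketch assembles a JDBC URL from its parts; the function name build_jdbc_url is illustrative, and all argument values are placeholders for your cluster's settings:

```python
# Sketch: assemble a Databricks JDBC URL from its parts.
# The helper name and all example values are illustrative, not an official API.
def build_jdbc_url(server_hostname, http_path, token, use_native_query=False):
    url = (
        f"jdbc:spark://{server_hostname}:443/default"
        f";transportMode=http;ssl=1"
        f";httpPath={http_path}"
        f";AuthMech=3;UID=token;PWD={token}"
    )
    if use_native_query:
        # Pass queries verbatim to Databricks Runtime instead of
        # translating them from ANSI SQL-92 (see step 5).
        url += ";UseNativeQuery=1"
    return url
```

The UID is the literal string token and the PWD carries the personal access token, matching the example URL above.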

ODBC Data Source Name configuration for the Simba ODBC driver

The Data Source Name (DSN) configuration contains the parameters for communicating with a specific database. BI tools like Tableau usually provide a user interface for entering these parameters. If you have to install and manage the Simba ODBC driver yourself, you might need to create the configuration files and also allow your driver manager (ODBC Data Source Administrator on Windows, unixODBC/iODBC on Unix) to access them. Create two files: /etc/odbc.ini and /etc/odbcinst.ini.

/etc/odbc.ini

The content of /etc/odbc.ini can be:

[Databricks-Spark]
Driver=Simba
Server=<server-hostname>
HOST=<server-hostname>
PORT=<port>
SparkServerType=3
Schema=default
ThriftTransport=2
SSL=1
AuthMech=3
UID=token
PWD=<personal-access-token>
HTTPPath=<http-path>

Set <personal-access-token> to the token you retrieved in Username and password, and set the other parameters to the values you retrieved in Server hostname, port, HTTP path, and JDBC URL.
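If you generate the DSN entry as part of provisioning, the file above can be rendered from the same parameters; a minimal sketch, where the helper name render_odbc_ini and the argument values are illustrative:

```python
# Sketch: render the /etc/odbc.ini DSN entry shown above from parameters.
# The helper name is illustrative; substitute your cluster's real values.
def render_odbc_ini(dsn_name, server_hostname, port, http_path, token):
    entries = [
        ("Driver", "Simba"),
        ("Server", server_hostname),
        ("HOST", server_hostname),
        ("PORT", str(port)),
        ("SparkServerType", "3"),   # Spark Thrift Server
        ("Schema", "default"),
        ("ThriftTransport", "2"),   # HTTP
        ("SSL", "1"),
        ("AuthMech", "3"),          # username/password authentication
        ("UID", "token"),
        ("PWD", token),
        ("HTTPPath", http_path),
    ]
    lines = [f"[{dsn_name}]"] + [f"{key}={value}" for key, value in entries]
    return "\n".join(lines) + "\n"
```

Writing the result to /etc/odbc.ini reproduces the example above with your values filled in.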

/etc/odbcinst.ini

The content of /etc/odbcinst.ini can be:

[ODBC Drivers]
Simba = Installed
[Simba Spark ODBC Driver 64-bit]
Driver = <driver-path>

Set <driver-path> according to the operating system you chose when you downloaded the driver in Step 1:

  • macOS: /Library/simba/spark/lib/libsparkodbc_sbu.dylib
  • Linux (64-bit): /opt/simba/spark/lib/64/libsparkodbc_sb64.so
  • Linux (32-bit): /opt/simba/spark/lib/32/libsparkodbc_sb32.so
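When scripting the odbcinst.ini setup, the platform-to-path mapping above can be expressed as a small lookup; a sketch, where the helper name driver_path is illustrative ("Darwin" is what Python's platform.system() reports on macOS):

```python
# Sketch: look up the default Simba ODBC driver library path for a
# platform, using the install locations listed above. Helper name is
# illustrative, not part of any official tooling.
_DRIVER_PATHS = {
    ("Darwin", 64): "/Library/simba/spark/lib/libsparkodbc_sbu.dylib",
    ("Linux", 64): "/opt/simba/spark/lib/64/libsparkodbc_sb64.so",
    ("Linux", 32): "/opt/simba/spark/lib/32/libsparkodbc_sb32.so",
}

def driver_path(system, bits=64):
    try:
        return _DRIVER_PATHS[(system, bits)]
    except KeyError:
        raise RuntimeError(f"No default driver path for {system} ({bits}-bit)")
```

The returned path is what you would substitute for <driver-path> in /etc/odbcinst.ini.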

Configure paths

Specify the paths of the two files in environment variables so that the driver manager can use them:

export ODBCINI=/etc/odbc.ini
export ODBCSYSINI=/etc/odbcinst.ini
export SIMBASPARKINI=<simba-ini-path>/simba.sparkodbc.ini # (Contains the configuration for debugging the Simba driver)

where <simba-ini-path> is:

  • macOS: /Library/simba/spark/lib
  • Linux (64-bit): /opt/simba/sparkodbc/lib/64
  • Linux (32-bit): /opt/simba/sparkodbc/lib/32
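A provisioning script can render the exports above with the platform-specific <simba-ini-path> substituted; a sketch under the same "Darwin means macOS" convention, with an illustrative helper name:

```python
# Sketch: render the environment-variable exports above for a platform,
# substituting <simba-ini-path> with the default install locations listed.
_SIMBA_INI_DIRS = {
    ("Darwin", 64): "/Library/simba/spark/lib",
    ("Linux", 64): "/opt/simba/sparkodbc/lib/64",
    ("Linux", 32): "/opt/simba/sparkodbc/lib/32",
}

def render_env_exports(system="Linux", bits=64):
    simba_ini_path = _SIMBA_INI_DIRS[(system, bits)]
    return "\n".join([
        "export ODBCINI=/etc/odbc.ini",
        "export ODBCSYSINI=/etc/odbcinst.ini",
        f"export SIMBASPARKINI={simba_ini_path}/simba.sparkodbc.ini",
    ])
```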

Troubleshooting

See Troubleshooting JDBC and ODBC connections.