Quickstart: Query Apache HBase in Azure HDInsight with Apache Phoenix

In this quickstart, you learn how to use Apache Phoenix to run HBase queries in Azure HDInsight. Apache Phoenix is a SQL query engine for Apache HBase. It is accessed as a JDBC driver, and it lets you query and manage HBase tables by using SQL. SQLLine is a command-line utility for executing SQL.

If you don't have an Azure subscription, create a trial account before you begin.

Prerequisites

Identify a ZooKeeper node

When you connect to an HBase cluster, you need to connect to one of the Apache ZooKeeper nodes. Each HDInsight cluster has three ZooKeeper nodes. You can use curl to quickly identify a ZooKeeper node. Edit the following curl command by replacing PASSWORD and CLUSTERNAME with the relevant values, and then enter the command at a command prompt:

    curl -u admin:PASSWORD -sS -G https://CLUSTERNAME.azurehdinsight.cn/api/v1/clusters/CLUSTERNAME/services/ZOOKEEPER/components/ZOOKEEPER_SERVER

A portion of the output looks similar to the following:

    {
      "href" : "http://hn1-brim.432dc3rlshou3ocf251eycoapa.bx.internal.chinacloudapp.cn:8080/api/v1/clusters/myCluster/hosts/zk0-brim.432dc3rlshou3ocf251eycoapa.bx.internal.chinacloudapp.cn/host_components/ZOOKEEPER_SERVER",
      "HostRoles" : {
        "cluster_name" : "myCluster",
        "component_name" : "ZOOKEEPER_SERVER",
        "host_name" : "zk0-brim.432dc3rlshou3ocf251eycoapa.bx.internal.chinacloudapp.cn"
      }

Note the value of host_name for later use.
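If you prefer not to read the JSON by eye, the host_name values can be filtered out of the response. The following is a minimal sketch, not part of the original quickstart: the function name extract_zk_hosts is hypothetical, and it assumes the response keeps each "host_name" entry on its own line, as in the sample output above.

```shell
# Hypothetical helper: print every "host_name" value found in the
# Ambari JSON read from standard input, one value per line.
# Assumes each "host_name" : "..." pair sits on its own line.
extract_zk_hosts() {
  sed -n 's/.*"host_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (replace PASSWORD and CLUSTERNAME as in the curl command above):
# curl -u admin:PASSWORD -sS -G "https://CLUSTERNAME.azurehdinsight.cn/api/v1/clusters/CLUSTERNAME/services/ZOOKEEPER/components/ZOOKEEPER_SERVER" | extract_zk_hosts
```

Any one of the printed host names can be used wherever a ZooKeeper node is required. A JSON-aware tool would be more robust than a line-based filter if one is available on your machine.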

Create a table and manipulate data

You can use SSH to connect to an HBase cluster, and then use Apache Phoenix to create HBase tables, insert data, and query data.

  1. Use the ssh command to connect to your HBase cluster. Edit the following command by replacing CLUSTERNAME with the name of your cluster, and then enter the command:

    ssh sshuser@CLUSTERNAME-ssh.azurehdinsight.cn
    
  2. Change directory to the Phoenix client. Enter the following command:

    cd /usr/hdp/current/phoenix-client/bin
    
  3. Launch SQLLine. Edit the following command by replacing ZOOKEEPER with the host_name value of the ZooKeeper node identified earlier, and then enter the command:

    ./sqlline.py ZOOKEEPER:2181:/hbase-unsecure
    
  4. Create an HBase table. Enter the following command:

    CREATE TABLE Company (company_id INTEGER PRIMARY KEY, name VARCHAR(225));
    
  5. Use the SQLLine !tables command to list all tables in HBase. Enter the following command:

    !tables
    
  6. Insert values into the table. Enter the following commands:

    UPSERT INTO Company VALUES(1, 'Microsoft');
    UPSERT INTO Company VALUES(2, 'Apache');
    
  7. Query the table. Enter the following command:

    SELECT * FROM Company;
    
  8. Delete a record. Enter the following command:

    DELETE FROM Company WHERE COMPANY_ID=1;
    
  9. Drop the table. Enter the following command:

    DROP TABLE Company;
    
  10. Use the SQLLine !quit command to exit SQLLine. Enter the following command:

    !quit
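The same statements can also be collected in a script file instead of being typed one at a time. The sketch below is an illustration under assumptions, not a step from the quickstart: the file name company_quickstart.sql is arbitrary, and it relies on Phoenix's sqlline.py accepting a SQL file as an optional second argument, which you should verify on your cluster's Phoenix version.

```shell
# Write the quickstart statements to a script file.
# The file name is an arbitrary choice; override it with SQL_FILE if needed.
SQL_FILE="${SQL_FILE:-company_quickstart.sql}"
cat > "$SQL_FILE" <<'EOF'
CREATE TABLE Company (company_id INTEGER PRIMARY KEY, name VARCHAR(225));
UPSERT INTO Company VALUES(1, 'Microsoft');
UPSERT INTO Company VALUES(2, 'Apache');
SELECT * FROM Company;
DELETE FROM Company WHERE COMPANY_ID=1;
DROP TABLE Company;
EOF

# On the cluster, pass the file after the connection string.
# ZOOKEEPER is the host_name value noted earlier (assumed invocation):
# /usr/hdp/current/phoenix-client/bin/sqlline.py ZOOKEEPER:2181:/hbase-unsecure "$SQL_FILE"
```

Because the script drops the table at the end, it leaves the cluster in the same state as the interactive walkthrough.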
    

Clean up resources

After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it is not in use. You are also charged for an HDInsight cluster even when it is not in use. Because the charges for the cluster are many times higher than the charges for storage, it makes economic sense to delete clusters when they are not in use.

To delete a cluster, see Delete an HDInsight cluster using your browser, PowerShell, or the Azure CLI.

Next steps

In this quickstart, you learned how to use Apache Phoenix to run HBase queries in Azure HDInsight. To learn more about Apache Phoenix, see the next article, which examines it in more depth.