Build Java applications for Apache HBase

Learn how to create an Apache HBase application in Java. Then use the application with HBase on Azure HDInsight.

The steps in this document use Apache Maven to create and build the project. Maven is a software project management and comprehension tool that allows you to build software, documentation, and reports for Java projects.

Prerequisites

Test environment

The environment used for this article was a computer running Windows 10. The commands were executed in a command prompt, and the various files were edited with Notepad. Modify accordingly for your environment.

From a command prompt, enter the commands below to create a working environment:

IF NOT EXIST C:\HDI MKDIR C:\HDI
cd C:\HDI

Create a Maven project

  1. Enter the following commands to create a Maven project named hbaseapp:

    mvn archetype:generate -DgroupId=com.microsoft.examples -DartifactId=hbaseapp -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
    
    cd hbaseapp
    mkdir conf
    

    The first command creates a directory named hbaseapp at the current location, which contains a basic Maven project. The second command changes the working directory to hbaseapp. The third command creates a new directory, conf, which will be used later. The hbaseapp directory contains the following items:

    • pom.xml: The Project Object Model (POM) contains information and configuration details used to build the project.
    • src\main\java\com\microsoft\examples: Contains your application code.
    • src\test\java\com\microsoft\examples: Contains tests for your application.
  2. Remove the generated example code. Delete the generated test and application files, AppTest.java and App.java, by entering the commands below:

    DEL src\main\java\com\microsoft\examples\App.java
    DEL src\test\java\com\microsoft\examples\AppTest.java
    

Update the Project Object Model

For a full reference of the pom.xml file, see https://maven.apache.org/pom.html. Open pom.xml by entering the command below:

notepad pom.xml

Add dependencies

In pom.xml, add the following text in the <dependencies> section:

<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.1.2</version>
</dependency>
<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-core</artifactId>
    <version>4.14.1-HBase-1.1</version>
</dependency>

This section indicates that the project needs the hbase-client and phoenix-core components. At compile time, these dependencies are downloaded from the default Maven repository. You can use the Maven Central Repository Search to learn more about these dependencies.

Important

The version number of the hbase-client must match the version of Apache HBase that is provided with your HDInsight cluster. Use the following table to find the correct version number.

HDInsight cluster version | Apache HBase version to use
3.6 | 1.1.2
4.0 | 2.0.0
For more information on HDInsight versions and components, see [What are the different Apache Hadoop components available with HDInsight](../hdinsight-component-versioning.md).

Build configuration

Maven plug-ins allow you to customize the build stages of the project. This section is used to add plug-ins, resources, and other build configuration options.

Add the following code to the pom.xml file, and then save and close the file. This text must be inside the <project>...</project> tags in the file, for example, between </dependencies> and </project>.

<build>
    <sourceDirectory>src</sourceDirectory>
    <resources>
    <resource>
        <directory>${basedir}/conf</directory>
        <filtering>false</filtering>
        <includes>
        <include>hbase-site.xml</include>
        </includes>
    </resource>
    </resources>
    <plugins>
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.8.0</version>
        <configuration>
            <source>1.8</source>
            <target>1.8</target>
        </configuration>
    </plugin>
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.2.1</version>
        <configuration>
        <transformers>
            <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer">
            </transformer>
        </transformers>
        </configuration>
        <executions>
        <execution>
            <phase>package</phase>
            <goals>
            <goal>shade</goal>
            </goals>
        </execution>
        </executions>
    </plugin>
    </plugins>
</build>

This section configures a resource (conf/hbase-site.xml) that contains configuration information for HBase.

Note

You can also set configuration values via code. See the comments in the CreateTable example.

This section also configures the Apache Maven Compiler Plugin and the Apache Maven Shade Plugin. The compiler plug-in compiles the application. The shade plug-in prevents license duplication in the JAR package that Maven builds, which would otherwise cause a "duplicate license files" error at run time on the HDInsight cluster. Using maven-shade-plugin with the ApacheLicenseResourceTransformer implementation prevents the error.

The maven-shade-plugin also produces an uber jar that contains all the dependencies required by the application.

Download the hbase-site.xml

Use the following command to copy the HBase configuration from the HBase cluster to the conf directory. Replace CLUSTERNAME with your HDInsight cluster name and then enter the command:

scp sshuser@CLUSTERNAME-ssh.azurehdinsight.cn:/etc/hbase/conf/hbase-site.xml ./conf/hbase-site.xml
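
After the copy, it can be useful to confirm what the file contains before packaging it. The following is a minimal sketch, not part of the sample application, that reads name/value pairs from XML in the Hadoop configuration layout (`<configuration>` with `<property>` elements) using only the JDK. The class name `HBaseSiteReader` and the host names in the sample are hypothetical; pass the contents of your own conf/hbase-site.xml to `parse` to inspect the real values.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class HBaseSiteReader {
    // Hypothetical sample data; the host names are made up for illustration.
    static final String SAMPLE =
        "<configuration>"
        + "<property><name>hbase.zookeeper.quorum</name>"
        + "<value>zk0-example,zk1-example,zk2-example</value></property>"
        + "<property><name>hbase.zookeeper.property.clientPort</name>"
        + "<value>2181</value></property>"
        + "<property><name>zookeeper.znode.parent</name>"
        + "<value>/hbase-unsecure</value></property>"
        + "</configuration>";

    // Parses the XML text and returns the properties in document order.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                String name = prop.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = prop.getElementsByTagName("value").item(0).getTextContent().trim();
                result.put(name, value);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Print every property found in the sample configuration.
        parse(SAMPLE).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The ZooKeeper quorum, client port, and znode parent are the same settings that the comments in the CreateTable example show being set in code.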

Create the application

Implement a CreateTable class

Enter the command below to create and open a new file CreateTable.java. Select Yes at the prompt to create a new file.

notepad src\main\java\com\microsoft\examples\CreateTable.java

Copy and paste the Java code below into the new file. Then save and close the file.

 package com.microsoft.examples;
 import java.io.IOException;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.client.HBaseAdmin;
 import org.apache.hadoop.hbase.HTableDescriptor;
 import org.apache.hadoop.hbase.TableName;
 import org.apache.hadoop.hbase.HColumnDescriptor;
 import org.apache.hadoop.hbase.client.HTable;
 import org.apache.hadoop.hbase.client.Put;
 import org.apache.hadoop.hbase.util.Bytes;

 public class CreateTable {
     public static void main(String[] args) throws IOException {
     Configuration config = HBaseConfiguration.create();

     // Example of setting zookeeper values for HDInsight
     // in code instead of an hbase-site.xml file
     //
     // config.set("hbase.zookeeper.quorum",
     //            "zookeepernode0,zookeepernode1,zookeepernode2");
     //config.set("hbase.zookeeper.property.clientPort", "2181");
     //config.set("hbase.cluster.distributed", "true");
     //
     //NOTE: Actual zookeeper host names can be found using Ambari:
     //curl -u admin:PASSWORD -G "https://CLUSTERNAME.azurehdinsight.cn/api/v1/clusters/CLUSTERNAME/hosts"

     //Linux-based HDInsight clusters use /hbase-unsecure as the znode parent
     config.set("zookeeper.znode.parent","/hbase-unsecure");

     // create an admin object using the config
     HBaseAdmin admin = new HBaseAdmin(config);

     // create the table...
     HTableDescriptor tableDescriptor = new HTableDescriptor(TableName.valueOf("people"));
     // ... with two column families
     tableDescriptor.addFamily(new HColumnDescriptor("name"));
     tableDescriptor.addFamily(new HColumnDescriptor("contactinfo"));
     admin.createTable(tableDescriptor);

     // define some people
     String[][] people = {
         { "1", "Marcel", "Haddad", "marcel@fabrikam.com"},
         { "2", "Franklin", "Holtz", "franklin@contoso.com" },
         { "3", "Dwayne", "McKee", "dwayne@fabrikam.com" },
         { "4", "Rae", "Schroeder", "rae@contoso.com" },
         { "5", "Rosalie", "Burton", "rosalie@fabrikam.com"},
         { "6", "Gabriela", "Ingram", "gabriela@contoso.com"} };

     HTable table = new HTable(config, "people");

     // Add each person to the table
     //   Use the `name` column family for the name
     //   Use the `contactinfo` column family for the email
     for (int i = 0; i< people.length; i++) {
         Put person = new Put(Bytes.toBytes(people[i][0]));
         person.add(Bytes.toBytes("name"), Bytes.toBytes("first"), Bytes.toBytes(people[i][1]));
         person.add(Bytes.toBytes("name"), Bytes.toBytes("last"), Bytes.toBytes(people[i][2]));
         person.add(Bytes.toBytes("contactinfo"), Bytes.toBytes("email"), Bytes.toBytes(people[i][3]));
         table.put(person);
     }
     // flush commits and close the table
     table.flushCommits();
     table.close();
     }
 }

This code is the CreateTable class, which creates a table named people and populates it with some predefined users.
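
To make the resulting table layout easier to picture, the following sketch lists the cells that one entry of the people array turns into. It is a plain-Java illustration that does not use the HBase client. The class name `PeopleCells` is hypothetical, and encoding the values as UTF-8 is an assumption about what `Bytes.toBytes(String)` does.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class PeopleCells {
    // Column family and qualifier for each value, in the order used by CreateTable.
    static final String[][] COLUMNS = {
        {"name", "first"}, {"name", "last"}, {"contactinfo", "email"}
    };

    // Describes the cells written for one person: element 0 is the row key,
    // elements 1 to 3 are the values for the columns above.
    static List<String> cellsFor(String[] person) {
        List<String> cells = new ArrayList<>();
        for (int i = 0; i < COLUMNS.length; i++) {
            // Assumption: values are stored as UTF-8 encoded bytes.
            byte[] value = person[i + 1].getBytes(StandardCharsets.UTF_8);
            cells.add(person[0] + " " + COLUMNS[i][0] + ":" + COLUMNS[i][1]
                + " = " + new String(value, StandardCharsets.UTF_8)
                + " (" + value.length + " bytes)");
        }
        return cells;
    }

    public static void main(String[] args) {
        String[] marcel = {"1", "Marcel", "Haddad", "marcel@fabrikam.com"};
        cellsFor(marcel).forEach(System.out::println);
    }
}
```

Each person becomes one row keyed by the ID, with two cells in the name column family and one in the contactinfo column family.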

Implement a SearchByEmail class

Enter the command below to create and open a new file SearchByEmail.java. Select Yes at the prompt to create a new file.

notepad src\main\java\com\microsoft\examples\SearchByEmail.java

Copy and paste the Java code below into the new file. Then save and close the file.

 package com.microsoft.examples;
 import java.io.IOException;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.client.HTable;
 import org.apache.hadoop.hbase.client.Scan;
 import org.apache.hadoop.hbase.client.ResultScanner;
 import org.apache.hadoop.hbase.client.Result;
 import org.apache.hadoop.hbase.filter.RegexStringComparator;
 import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
 import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
 import org.apache.hadoop.hbase.util.Bytes;
 import org.apache.hadoop.util.GenericOptionsParser;

 public class SearchByEmail {
     public static void main(String[] args) throws IOException {
     Configuration config = HBaseConfiguration.create();

     // Use GenericOptionsParser to get only the parameters to the class
     // and not all the parameters passed (when using WebHCat for example)
     String[] otherArgs = new GenericOptionsParser(config, args).getRemainingArgs();
     if (otherArgs.length != 1) {
         System.out.println("usage: [regular expression]");
         System.exit(-1);
     }

     // Open the table
     HTable table = new HTable(config, "people");

     // Define the family and qualifiers to be used
     byte[] contactFamily = Bytes.toBytes("contactinfo");
     byte[] emailQualifier = Bytes.toBytes("email");
     byte[] nameFamily = Bytes.toBytes("name");
     byte[] firstNameQualifier = Bytes.toBytes("first");
     byte[] lastNameQualifier = Bytes.toBytes("last");

     // Create a regex filter
     RegexStringComparator emailFilter = new RegexStringComparator(otherArgs[0]);
     // Attach the regex filter to a filter
     //   for the email column
     SingleColumnValueFilter filter = new SingleColumnValueFilter(
         contactFamily,
         emailQualifier,
         CompareOp.EQUAL,
         emailFilter
     );

     // Create a scan and set the filter
     Scan scan = new Scan();
     scan.setFilter(filter);

     // Get the results
     ResultScanner results = table.getScanner(scan);
     // Iterate over results and print  values
     for (Result result : results ) {
         String id = new String(result.getRow());
         byte[] firstNameObj = result.getValue(nameFamily, firstNameQualifier);
         String firstName = new String(firstNameObj);
         byte[] lastNameObj = result.getValue(nameFamily, lastNameQualifier);
         String lastName = new String(lastNameObj);
         System.out.println(firstName + " " + lastName + " - ID: " + id);
         byte[] emailObj = result.getValue(contactFamily, emailQualifier);
         String email = new String(emailObj);
         System.out.println(firstName + " " + lastName + " - " + email + " - ID: " + id);
     }
     results.close();
     table.close();
     }
 }

The SearchByEmail class can be used to query for rows by email address. Because it uses a regular expression filter, you can provide either a string or a regular expression when using the class.
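
The effect of passing a plain string versus a regular expression can be tried locally without a cluster. The sketch below applies `java.util.regex` to the sample email addresses from CreateTable. It assumes that the filter matches when the pattern is found anywhere in the value, which is how the default Java regex engine behind `RegexStringComparator` is understood to work. The class name `EmailRegexDemo` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class EmailRegexDemo {
    // Email values copied from the CreateTable sample data.
    static final String[] EMAILS = {
        "marcel@fabrikam.com", "franklin@contoso.com", "dwayne@fabrikam.com",
        "rae@contoso.com", "rosalie@fabrikam.com", "gabriela@contoso.com"
    };

    // Returns the emails in which the pattern is found anywhere in the value
    // (assumed to mirror the filter's matching behavior).
    static List<String> search(String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String email : EMAILS) {
            if (pattern.matcher(email).find()) {
                hits.add(email);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // A plain string behaves like a substring search.
        System.out.println(search("contoso.com"));
        // An anchored pattern selects addresses that start with "r".
        System.out.println(search("^r"));
    }
}
```

Note that the dot in contoso.com is a regex wildcard that matches any character. To match a literal dot, escape it as `contoso\.com`.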

Implement a DeleteTable class

Enter the command below to create and open a new file DeleteTable.java. Select Yes at the prompt to create a new file.

notepad src\main\java\com\microsoft\examples\DeleteTable.java

Copy and paste the Java code below into the new file. Then save and close the file.

 package com.microsoft.examples;
 import java.io.IOException;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.client.HBaseAdmin;

 public class DeleteTable {
     public static void main(String[] args) throws IOException {
     Configuration config = HBaseConfiguration.create();

     // Create an admin object using the config
     HBaseAdmin admin = new HBaseAdmin(config);

     // Disable, and then delete the table
     admin.disableTable("people");
     admin.deleteTable("people");
     }
 }

The DeleteTable class cleans up after this example by disabling and then deleting the people table that the CreateTable class created.

Build and package the application

  1. From the hbaseapp directory, use the following command to build a JAR file that contains the application:

    mvn clean package
    

    This command builds and packages the application into a .jar file.

  2. When the command completes, the hbaseapp/target directory contains a file named hbaseapp-1.0-SNAPSHOT.jar.

    Note

    The hbaseapp-1.0-SNAPSHOT.jar file is an uber jar. It contains all the dependencies required to run the application.

Upload the JAR and run jobs (SSH)

The following steps use scp to copy the JAR to the primary head node of your Apache HBase on HDInsight cluster. The ssh command is then used to connect to the cluster and run the example directly on the head node.

  1. Upload the jar to the cluster. Replace CLUSTERNAME with your HDInsight cluster name and then enter the following command:

    scp ./target/hbaseapp-1.0-SNAPSHOT.jar sshuser@CLUSTERNAME-ssh.azurehdinsight.cn:hbaseapp-1.0-SNAPSHOT.jar
    
  2. Connect to the HBase cluster. Replace CLUSTERNAME with your HDInsight cluster name and then enter the following command:

    ssh sshuser@CLUSTERNAME-ssh.azurehdinsight.cn
    
  3. To create an HBase table using the Java application, use the following command in your open ssh connection:

    yarn jar hbaseapp-1.0-SNAPSHOT.jar com.microsoft.examples.CreateTable
    

    This command creates an HBase table named people and populates it with data.

  4. To search for email addresses stored in the table, use the following command:

    yarn jar hbaseapp-1.0-SNAPSHOT.jar com.microsoft.examples.SearchByEmail contoso.com
    

    You receive the following results:

     Franklin Holtz - ID: 2
     Franklin Holtz - franklin@contoso.com - ID: 2
     Rae Schroeder - ID: 4
     Rae Schroeder - rae@contoso.com - ID: 4
     Gabriela Ingram - ID: 6
     Gabriela Ingram - gabriela@contoso.com - ID: 6
    
  5. To delete the table, use the following command:

    yarn jar hbaseapp-1.0-SNAPSHOT.jar com.microsoft.examples.DeleteTable
    

Upload the JAR and run jobs (PowerShell)

The following steps use the Azure PowerShell Az module to upload the JAR to the default storage for your Apache HBase cluster. HDInsight cmdlets are then used to run the examples remotely.

  1. After installing and configuring the Az module, create a file named hbase-runner.psm1. Use the following text as the contents of this file:

     <#
     .SYNOPSIS
     Runs one of the example HBase classes on an HDInsight cluster.
     .DESCRIPTION
     Submits a class from the uploaded hbaseapp JAR as a job on the
     HDInsight cluster and displays the job output.
     .EXAMPLE
     Start-HBaseExample -className "com.microsoft.examples.CreateTable"
     -clusterName "MyHDInsightCluster"
    
     .EXAMPLE
     Start-HBaseExample -className "com.microsoft.examples.SearchByEmail"
     -clusterName "MyHDInsightCluster"
     -emailRegex "contoso.com"
    
     .EXAMPLE
     Start-HBaseExample -className "com.microsoft.examples.SearchByEmail"
     -clusterName "MyHDInsightCluster"
     -emailRegex "^r" -showErr
     #>
    
     function Start-HBaseExample {
     [CmdletBinding(SupportsShouldProcess = $true)]
     param(
     #The class to run
     [Parameter(Mandatory = $true)]
     [String]$className,
    
     #The name of the HDInsight cluster
     [Parameter(Mandatory = $true)]
     [String]$clusterName,
    
     #Only used when using SearchByEmail
     [Parameter(Mandatory = $false)]
     [String]$emailRegex,
    
     #Use if you want to see stderr output
     [Parameter(Mandatory = $false)]
     [Switch]$showErr
     )
    
     Set-StrictMode -Version 3
    
     # Is the Azure module installed?
     FindAzure
    
     # Get the login for the HDInsight cluster
     $creds=Get-Credential -Message "Enter the login for the cluster" -UserName "admin"
    
     # The JAR
     $jarFile = "wasb:///example/jars/hbaseapp-1.0-SNAPSHOT.jar"
    
     # The job definition
     $jobDefinition = New-AzHDInsightMapReduceJobDefinition `
         -JarFile $jarFile `
         -ClassName $className `
         -Arguments $emailRegex
    
     # Get the job output
     $job = Start-AzHDInsightJob `
         -ClusterName $clusterName `
         -JobDefinition $jobDefinition `
         -HttpCredential $creds
     Write-Host "Wait for the job to complete ..." -ForegroundColor Green
     Wait-AzHDInsightJob `
         -ClusterName $clusterName `
         -JobId $job.JobId `
         -HttpCredential $creds
     if($showErr)
     {
     Write-Host "STDERR"
     Get-AzHDInsightJobOutput `
                 -Clustername $clusterName `
                 -JobId $job.JobId `
                 -HttpCredential $creds `
                 -DisplayOutputType StandardError
     }
     Write-Host "Display the standard output ..." -ForegroundColor Green
     Get-AzHDInsightJobOutput `
                 -Clustername $clusterName `
                 -JobId $job.JobId `
                 -HttpCredential $creds
     }
    
     <#
     .SYNOPSIS
     Copies a file to the primary storage of an HDInsight cluster.
     .DESCRIPTION
     Copies a file from a local directory to the blob container for
     the HDInsight cluster.
     .EXAMPLE
     Add-HDInsightFile -localPath "C:\temp\data.txt"
     -destinationPath "example/data/data.txt"
     -ClusterName "MyHDInsightCluster"
     .EXAMPLE
     Add-HDInsightFile -localPath "C:\temp\data.txt"
     -destinationPath "example/data/data.txt"
     -ClusterName "MyHDInsightCluster"
     -Container "MyContainer"
     #>
    
     function Add-HDInsightFile {
         [CmdletBinding(SupportsShouldProcess = $true)]
         param(
             #The path to the local file.
             [Parameter(Mandatory = $true)]
             [String]$localPath,
    
             #The destination path and file name, relative to the root of the container.
             [Parameter(Mandatory = $true)]
             [String]$destinationPath,
    
             #The name of the HDInsight cluster
             [Parameter(Mandatory = $true)]
             [String]$clusterName,
    
             #If specified, overwrites existing files without prompting
             [Parameter(Mandatory = $false)]
             [Switch]$force
         )
    
         Set-StrictMode -Version 3
    
         # Is the Azure module installed?
         FindAzure
    
         # Get authentication for the cluster
         $creds=Get-Credential
    
         # Does the local path exist?
         if (-not (Test-Path $localPath))
         {
             throw "Source path '$localPath' does not exist."
         }
    
         # Get the primary storage container
         $storage = GetStorage -clusterName $clusterName
    
         # Upload file to storage, overwriting existing files if -force was used.
         Set-AzStorageBlobContent -File $localPath `
             -Blob $destinationPath `
             -force:$force `
             -Container $storage.container `
             -Context $storage.context
     }
    
     function FindAzure {
         # Is there an active Azure subscription?
         $sub = Get-AzSubscription -ErrorAction SilentlyContinue
         if(-not($sub))
         {
             Connect-AzAccount -EnvironmentName AzureChinaCloud
         }
     }
    
     function GetStorage {
         param(
             [Parameter(Mandatory = $true)]
             [String]$clusterName
         )
         $hdi = Get-AzHDInsightCluster -ClusterName $clusterName
         # Does the cluster exist?
         if (!$hdi)
         {
             throw "HDInsight cluster '$clusterName' does not exist."
         }
         # Create a return object for context & container
         $return = @{}
         $storageAccounts = @{}
    
         # Get storage information
         $resourceGroup = $hdi.ResourceGroup
         $storageAccountName=$hdi.DefaultStorageAccount.split('.')[0]
         $container=$hdi.DefaultStorageContainer
         $storageAccountKey=(Get-AzStorageAccountKey `
             -Name $storageAccountName `
         -ResourceGroupName $resourceGroup)[0].Value
         # Get the resource group, in case we need that
         $return.resourceGroup = $resourceGroup
         # Get the storage context, as we can't depend
         # on using the default storage context
         $return.context = New-AzStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey
         # Get the container, so we know where to
         # find/store blobs
         $return.container = $container
         # Return storage accounts to support finding all accounts for
         # a cluster
         $return.storageAccount = $storageAccountName
         $return.storageAccountKey = $storageAccountKey
    
         return $return
     }
     # Only export the verb-phrase things
     export-modulemember *-*
    

    This module exports two functions:

    • Add-HDInsightFile - used to upload files to the cluster
    • Start-HBaseExample - used to run the classes created earlier
  2. Save the hbase-runner.psm1 file in the hbaseapp directory.

  3. Register the module with Azure PowerShell. Open a new Azure PowerShell window, replace CLUSTERNAME in the commands below with the name of your cluster, and then enter the commands:

    cd C:\HDI\hbaseapp
    $myCluster = "CLUSTERNAME"
    Import-Module .\hbase-runner.psm1
    
  4. Use the following command to upload the hbaseapp-1.0-SNAPSHOT.jar to your cluster.

    Add-HDInsightFile -localPath target\hbaseapp-1.0-SNAPSHOT.jar -destinationPath example/jars/hbaseapp-1.0-SNAPSHOT.jar -clusterName $myCluster
    

    When prompted, enter the cluster login (admin) name and password. The command uploads the hbaseapp-1.0-SNAPSHOT.jar to the example/jars location in the primary storage for your cluster.

  5. To create a table using the hbaseapp, use the following command:

    Start-HBaseExample -className com.microsoft.examples.CreateTable -clusterName $myCluster
    

    When prompted, enter the cluster login (admin) name and password.

    This command creates a table named people in HBase on your HDInsight cluster. This command doesn't show any output in the console window.

  6. To search for entries in the table, use the following command:

    Start-HBaseExample -className com.microsoft.examples.SearchByEmail -clusterName $myCluster -emailRegex contoso.com
    

    When prompted, enter the cluster login (admin) name and password.

    This command uses the SearchByEmail class to search for any rows where the email column in the contactinfo column family contains the string contoso.com. You should receive the following results:

       Franklin Holtz - ID: 2
       Franklin Holtz - franklin@contoso.com - ID: 2
       Rae Schroeder - ID: 4
       Rae Schroeder - rae@contoso.com - ID: 4
       Gabriela Ingram - ID: 6
       Gabriela Ingram - gabriela@contoso.com - ID: 6
    

    Using fabrikam.com for the -emailRegex value returns the users that have fabrikam.com in the email field. You can also use regular expressions as the search term. For example, ^r returns email addresses that begin with the letter 'r'.

  7. To delete the table, use the following command:

    Start-HBaseExample -className com.microsoft.examples.DeleteTable -clusterName $myCluster
    

No results or unexpected results when using Start-HBaseExample

Use the -showErr parameter to view the standard error (STDERR) that is produced while running the job.

Next steps

Learn how to use SQuirreL SQL with Apache HBase