Run Apache Hive queries using the HDInsight .NET SDK

Learn how to submit Apache Hive queries using the HDInsight .NET SDK. You write a C# program that submits a Hive query to list the Hive tables and then displays the results.

Note

The steps in this article must be performed from a Windows client. For information on using a Linux, OS X, or Unix client to work with Hive, use the tab selector shown at the top of the article.

Prerequisites

Before you begin this article, you must have the following items:

  • An Apache Hadoop cluster in HDInsight. See Get started using Linux-based Hadoop in HDInsight.

    Warning

    As of September 15, 2017, the HDInsight .NET SDK only supports returning Hive query results from Azure Storage accounts. If you use this example with an HDInsight cluster that uses Azure Data Lake Storage as primary storage, you cannot retrieve the query results using the .NET SDK.

  • Visual Studio 2013/2015/2017.

Run a Hive query

The HDInsight .NET SDK provides .NET client libraries, which make it easier to work with HDInsight clusters from .NET.

To submit jobs

  1. Create a C# console application in Visual Studio.

  2. From the NuGet Package Manager Console, run the following command:

     Install-Package Microsoft.Azure.Management.HDInsight.Job
    
  3. Use the following code:

        using System.Collections.Generic;
        using System.IO;
        using System.Text;
        using System.Threading;
        using Microsoft.Azure.Management.HDInsight.Job;
        using Microsoft.Azure.Management.HDInsight.Job.Models;
        using Hyak.Common;
    
        namespace SubmitHDInsightJobDotNet
        {
            class Program
            {
                private static HDInsightJobManagementClient _hdiJobManagementClient;
    
                private const string ExistingClusterName = "<Your HDInsight Cluster Name>";
                private const string ExistingClusterUri = ExistingClusterName + ".azurehdinsight.cn";
                private const string ExistingClusterUsername = "<Cluster Username>";
                private const string ExistingClusterPassword = "<Cluster User Password>";
    
                // Only Azure Storage accounts are supported by the SDK
                private const string DefaultStorageAccountName = "<Default Storage Account Name>";
                private const string DefaultStorageAccountKey = "<Default Storage Account Key>";
                private const string DefaultStorageContainerName = "<Default Blob Container Name>";
    
                static void Main(string[] args)
                {
                    System.Console.WriteLine("The application is running ...");
    
                    var clusterCredentials = new BasicAuthenticationCloudCredentials { Username = ExistingClusterUsername, Password = ExistingClusterPassword };
                    _hdiJobManagementClient = new HDInsightJobManagementClient(ExistingClusterUri, clusterCredentials);
    
                    SubmitHiveJob();
    
                    System.Console.WriteLine("Press ENTER to continue ...");
                    System.Console.ReadLine();
                }
    
                private static void SubmitHiveJob()
                {
                    Dictionary<string, string> defines = new Dictionary<string, string> { { "hive.execution.engine", "tez" }, { "hive.exec.reducers.max", "1" } };
                    List<string> args = new List<string> { "argA", "argB" };
                    var parameters = new HiveJobSubmissionParameters
                    {
                        Query = "SHOW TABLES",
                        Defines = defines,
                        Arguments = args
                    };
    
                    System.Console.WriteLine("Submitting the Hive job to the cluster...");
                    var jobResponse = _hdiJobManagementClient.JobManagement.SubmitHiveJob(parameters);
                    var jobId = jobResponse.JobSubmissionJsonResponse.Id;
                    System.Console.WriteLine("Response status code is " + jobResponse.StatusCode);
                    System.Console.WriteLine("JobId is " + jobId);
    
                    System.Console.WriteLine("Waiting for the job completion ...");
    
                    // Wait for job completion
                    var jobDetail = _hdiJobManagementClient.JobManagement.GetJob(jobId).JobDetail;
                    while (!jobDetail.Status.JobComplete)
                    {
                        Thread.Sleep(1000);
                        jobDetail = _hdiJobManagementClient.JobManagement.GetJob(jobId).JobDetail;
                    }
    
                    // Get job output
                    var storageAccess = new AzureStorageAccess(DefaultStorageAccountName, DefaultStorageAccountKey,
                        DefaultStorageContainerName);
                    var output = (jobDetail.ExitValue == 0)
                        ? _hdiJobManagementClient.JobManagement.GetJobOutput(jobId, storageAccess) // fetch stdout output in case of success
                        : _hdiJobManagementClient.JobManagement.GetJobErrorLogs(jobId, storageAccess); // fetch stderr output in case of failure
    
                    System.Console.WriteLine("Job output is: ");
    
                    using (var reader = new StreamReader(output, Encoding.UTF8))
                    {
                        string value = reader.ReadToEnd();
                        System.Console.WriteLine(value);
                    }
                }
            }
        }
    
  4. Press F5 to run the application.

The output of the application should be similar to:

[Screenshot: HDInsight Hadoop Hive job output]
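
The sample code waits for the job by polling once per second with no upper bound, so a job that never finishes keeps the program waiting indefinitely. The following is a minimal sketch of one way to bound that wait. It is not part of the HDInsight .NET SDK: `JobPolling`, `WaitUntil`, and `Demo` are hypothetical names introduced here for illustration, and the completion check is simulated so the sketch runs without a cluster.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper (not an SDK type): polls a condition until it is true
// or a timeout elapses.
public static class JobPolling
{
    // Calls isComplete repeatedly, sleeping pollInterval between calls.
    // Returns true if isComplete returned true before the timeout,
    // or false if the timeout elapsed first.
    public static bool WaitUntil(Func<bool> isComplete, TimeSpan pollInterval, TimeSpan timeout)
    {
        var stopwatch = Stopwatch.StartNew();
        while (!isComplete())
        {
            if (stopwatch.Elapsed >= timeout)
            {
                return false;
            }
            Thread.Sleep(pollInterval);
        }
        return true;
    }
}

public static class Demo
{
    public static void Main()
    {
        // Simulated completion check standing in for the job status lookup:
        // it reports completion on the third poll.
        int polls = 0;
        bool finished = JobPolling.WaitUntil(
            () => ++polls >= 3,
            TimeSpan.FromMilliseconds(10),
            TimeSpan.FromSeconds(5));

        Console.WriteLine(finished ? "Job completed" : "Timed out");
    }
}
```

In the sample above, the simulated check could be replaced with a lambda that evaluates `_hdiJobManagementClient.JobManagement.GetJob(jobId).JobDetail.Status.JobComplete`, with `jobDetail` fetched once more after the wait returns `true`. The poll interval and timeout to use depend on the workload and are left to the reader.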

Next steps

In this article, you learned how to submit Apache Hive queries using the HDInsight .NET SDK. To learn more, see the following articles: