Download large amounts of random data from Azure storage

This tutorial is part three of a series. It shows you how to download large amounts of data from Azure storage.

In part three of the series, you learn how to:

  • Update the application
  • Run the application
  • Validate the number of connections

Prerequisites

To complete this tutorial, you must have completed the previous Storage tutorial: Upload large amounts of random data in parallel to Azure storage.

Remote into your virtual machine

To create a remote desktop session with the virtual machine, use the following command on your local machine. Replace the IP address with the publicIPAddress of your virtual machine. When prompted, enter the credentials you used when creating the virtual machine.

mstsc /v:<publicIpAddress>

Update the application

In the previous tutorial, you only uploaded files to the storage account. Open D:\git\storage-dotnet-perf-scale-app\Program.cs in a text editor and replace the Main method with the following sample. This sample comments out the upload task and uncomments the download task. The call that deletes the containers in the storage account on completion is left commented out; uncomment it if you want the application to clean up after the download.

// Main is declared async Task (C# 7.1 or later) so that the await below compiles.
public static async Task Main(string[] args)
{
    Console.WriteLine("Azure Blob storage performance and scalability sample");
    // Set threading and default connection limit to 100 to 
    // ensure multiple threads and connections can be opened.
    // This is in addition to parallelism with the storage 
    // client library that is defined in the functions below.
    ThreadPool.SetMinThreads(100, 4);
    ServicePointManager.DefaultConnectionLimit = 100; // (Or More)

    bool exception = false;
    try
    {
        // Call the UploadFilesAsync function.
        // await UploadFilesAsync();

        // Download the files from the storage account.
        // This call was commented out in the previous (upload) tutorial; see
        // https://docs.azure.cn/storage/blobs/storage-blob-scalable-app-download-files
        await DownloadFilesAsync();
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
        exception = true;
    }
    finally
    {
        // The following function deletes the containers and all files contained in them.
        // This is commented out initially as the tutorial at 
        // https://docs.azure.cn/storage/blobs/storage-blob-scalable-app-download-files
        // has you upload only for one tutorial and download for the other.
        if (!exception)
        {
            // await DeleteExistingContainersAsync();
        }
        Console.WriteLine("Press any key to exit the application");
        Console.ReadKey();
    }
}

After the application has been updated, you need to build it again. Open a Command Prompt and navigate to D:\git\storage-dotnet-perf-scale-app. Rebuild the application by running dotnet build, as shown in the following example:

dotnet build

Run the application

Now that the application has been rebuilt, run it with the updated code. If a Command Prompt is not already open, open one and navigate to D:\git\storage-dotnet-perf-scale-app.

Type dotnet run to run the application.

dotnet run

The application reads the containers located in the storage account specified in storageconnectionstring. It iterates through the blobs in each container using the GetBlobs method and downloads them to the local machine using the DownloadToAsync method.

The DownloadFilesAsync task is shown in the following example:

private static async Task DownloadFilesAsync()
{
    BlobServiceClient blobServiceClient = GetBlobServiceClient();

    // Path to the directory to download to
    string downloadPath = Directory.GetCurrentDirectory() + "\\download\\";
    Directory.CreateDirectory(downloadPath);
    Console.WriteLine($"Created directory {downloadPath}");

    // Specify the StorageTransferOptions
    var options = new StorageTransferOptions
    {
        // Set the maximum number of workers that 
        // may be used in a parallel transfer.
        MaximumConcurrency = 8,

        // Set the maximum length of a transfer to 50MB.
        MaximumTransferSize = 50 * 1024 * 1024
    };

    List<BlobContainerClient> containers = new List<BlobContainerClient>();

    foreach (BlobContainerItem container in blobServiceClient.GetBlobContainers())
    {
        containers.Add(blobServiceClient.GetBlobContainerClient(container.Name));
    }

    // Start a timer to measure how long it takes to download all the files.
    Stopwatch timer = Stopwatch.StartNew();

    // Download the blobs
    try
    {
        int count = 0;

        // Create a queue of tasks that will each download one file.
        var tasks = new Queue<Task<Response>>();

        foreach (BlobContainerClient container in containers)
        {                     
            // Iterate through the files
            foreach (BlobItem blobItem in container.GetBlobs())
            {
                string fileName = downloadPath + blobItem.Name;
                Console.WriteLine($"Downloading {blobItem.Name} to {downloadPath}");

                BlobClient blob = container.GetBlobClient(blobItem.Name);

                // Add the download task to the queue
                tasks.Enqueue(blob.DownloadToAsync(fileName, default, options));
                count++;
            }
        }

        // Run all the tasks asynchronously.
        await Task.WhenAll(tasks);

        // Report the elapsed time.
        timer.Stop();
        Console.WriteLine($"Downloaded {count} files in {timer.Elapsed.TotalSeconds} seconds");
    }
    catch (RequestFailedException ex)
    {
        Console.WriteLine($"Azure request failed: {ex.Message}");
    }
    catch (DirectoryNotFoundException ex)
    {
        Console.WriteLine($"Error parsing files in the directory: {ex.Message}");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception: {ex.Message}");
    }
}
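
The sample above starts every download before awaiting any of them, so the number of transfers in flight grows with the number of blobs. If that proves too aggressive for a given machine, one option (not part of the tutorial's sample) is to cap the number of in-flight operations with a SemaphoreSlim. The following is a minimal sketch that uses only the .NET base class library. ThrottledRunner, RunAsync, ThrottleDemo, and MeasurePeakAsync are hypothetical names introduced for illustration, and Task.Delay stands in for a real download call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the tutorial's sample application.
public static class ThrottledRunner
{
    // Runs one async operation per item, with at most maxInFlight running at once.
    public static async Task RunAsync<T>(
        IEnumerable<T> items, int maxInFlight, Func<T, Task> operation)
    {
        using (var gate = new SemaphoreSlim(maxInFlight))
        {
            var tasks = new List<Task>();

            foreach (T item in items)
            {
                // Wait for a free slot before starting the next operation.
                await gate.WaitAsync();
                tasks.Add(RunOneAsync(item, operation, gate));
            }

            // All started operations finish (and release their slot) before the gate is disposed.
            await Task.WhenAll(tasks);
        }
    }

    private static async Task RunOneAsync<T>(T item, Func<T, Task> operation, SemaphoreSlim gate)
    {
        try
        {
            await operation(item);
        }
        finally
        {
            // Free the slot even if the operation failed.
            gate.Release();
        }
    }
}

public static class ThrottleDemo
{
    // Runs itemCount placeholder operations and returns the highest number seen running at once.
    public static async Task<int> MeasurePeakAsync(int itemCount, int maxInFlight)
    {
        int running = 0, peak = 0;
        object sync = new object();

        await ThrottledRunner.RunAsync(Enumerable.Range(0, itemCount), maxInFlight, async i =>
        {
            lock (sync) { running++; peak = Math.Max(peak, running); }
            await Task.Delay(50); // Placeholder for real work such as a blob download.
            lock (sync) { running--; }
        });

        return peak;
    }

    public static async Task Main()
    {
        int peak = await MeasurePeakAsync(20, 4);
        Console.WriteLine($"Peak concurrency: {peak} (limit was 4)");
    }
}
```

In the tutorial's code, the same pattern would wrap the DownloadToAsync call for each blob, with the limit chosen by experiment rather than fixed in advance.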

Validate the connections

While the files are being downloaded, you can verify the number of concurrent connections to your storage account. Open a console window and type netstat -a | find /c "blob:https". This command shows the number of connections that are currently open. As the following example shows, over 280 connections were open when downloading files from the storage account.

C:\>netstat -a | find /c "blob:https"
289

C:\>
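
A similar count can be taken from code rather than from netstat. The sketch below is an approximation under stated assumptions, not an equivalent of the command above: it counts established TCP connections to remote port 443 on any host, whereas the netstat filter matches lines containing "blob:https" in any connection state. ConnectionCounter and CountEstablished are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Net.NetworkInformation;

// Hypothetical helper, not part of the tutorial's sample application.
public static class ConnectionCounter
{
    // Counts established TCP connections whose remote port equals remotePort.
    // Assumption: port 443 traffic is used as a stand-in for Blob storage traffic.
    public static int CountEstablished(int remotePort)
    {
        TcpConnectionInformation[] connections =
            IPGlobalProperties.GetIPGlobalProperties().GetActiveTcpConnections();

        return connections.Count(c =>
            c.State == TcpState.Established && c.RemoteEndPoint.Port == remotePort);
    }

    public static void Main()
    {
        Console.WriteLine($"Established HTTPS connections: {CountEstablished(443)}");
    }
}
```

Because the count includes every HTTPS connection on the machine, it is most useful as a before-and-after comparison while the download is running.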

Next steps

In part three of the series, you learned about downloading large amounts of data from a storage account, including how to:

  • Run the application
  • Validate the number of connections

Go to part four of the series to verify throughput and latency metrics in the portal.