Use Azure Storage Shared Access Signatures to restrict access to data in HDInsight

HDInsight has full access to data in the Azure Storage accounts associated with the cluster. You can use Shared Access Signatures on the blob container to restrict access to the data. Shared Access Signatures (SAS) are a feature of Azure storage accounts that lets you limit access to data, for example by granting read-only access.

Warning

HDInsight must have full access to the default storage for the cluster.

Prerequisites

  • An Azure subscription.

  • An SSH client. For more information, see Connect to HDInsight (Apache Hadoop) using SSH.

  • An existing storage container.

  • If using PowerShell, you need the Az Module.

  • If you want to use the Azure CLI and have not yet installed it, see Install the Azure CLI.

  • If using Python, version 2.7 or higher.

  • If using C#, Visual Studio 2013 or higher.

  • The URI scheme for your storage account. This is wasb:// for Azure Storage and abfs:// for Azure Data Lake Storage Gen2. If secure transfer is enabled for Azure Storage or Data Lake Storage Gen2, the URI is wasbs:// or abfss://, respectively. See also secure transfer.

  • An existing HDInsight cluster to add a Shared Access Signature to. If you don't have one, you can use Azure PowerShell to create a cluster and add a Shared Access Signature during cluster creation.

  • The example files from https://github.com/Azure-Samples/hdinsight-dotnet-python-azure-storage-shared-access-signature. This repository contains the following items:

    • A Visual Studio project that can create a storage container, stored policy, and SAS for use with HDInsight
    • A Python script that can create a storage container, stored policy, and SAS for use with HDInsight
    • A PowerShell script that can create an HDInsight cluster and configure it to use the SAS. An updated version is used further below.
    • A sample file: hdinsight-dotnet-python-azure-storage-shared-access-signature-master\sampledata\sample.log
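The URI scheme rule from the prerequisites can be sketched as a small lookup. This is only an illustration: the function name and the "blob" and "adls2" labels are hypothetical and do not come from any Azure SDK.

```python
def storage_uri_scheme(storage_kind: str, secure_transfer: bool) -> str:
    """Return the URI scheme prefix for a storage account.

    storage_kind is a hypothetical label used only in this sketch:
    "blob" for Azure Storage, "adls2" for Azure Data Lake Storage Gen2.
    """
    base = {"blob": "wasb", "adls2": "abfs"}
    if storage_kind not in base:
        raise ValueError(f"unknown storage kind: {storage_kind!r}")
    # Enabling secure transfer appends an "s" to the scheme name.
    suffix = "s" if secure_transfer else ""
    return f"{base[storage_kind]}{suffix}://"


print(storage_uri_scheme("blob", secure_transfer=True))
print(storage_uri_scheme("adls2", secure_transfer=False))
```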

Shared Access Signatures

There are two forms of Shared Access Signatures:

  • Ad hoc: The start time, expiry time, and permissions for the SAS are all specified on the SAS URI.

  • Stored access policy: A stored access policy is defined on a resource container, such as a blob container. A policy can be used to manage constraints for one or more shared access signatures. When you associate a SAS with a stored access policy, the SAS inherits the constraints (the start time, expiry time, and permissions) defined for the stored access policy.

The difference between the two forms is important for one key scenario: revocation. A SAS is a URL, so anyone who obtains the SAS can use it, regardless of who requested it to begin with. If a SAS is published publicly, it can be used by anyone in the world. A SAS that is distributed is valid until one of four things happens:

  1. The expiry time specified on the SAS is reached.

  2. The expiry time specified on the stored access policy referenced by the SAS is reached. The following scenarios cause the expiry time to be reached:

    • The time interval has elapsed.
    • The stored access policy is modified to have an expiry time in the past. Changing the expiry time is one way to revoke the SAS.
  3. The stored access policy referenced by the SAS is deleted, which is another way to revoke the SAS. If you recreate the stored access policy with the same name, all SAS tokens for the previous policy become valid again (as long as the expiry time on the SAS has not passed). If you intend to revoke the SAS, be sure to use a different name if you recreate the access policy with an expiry time in the future.

  4. The account key that was used to create the SAS is regenerated. Regenerating the key causes all applications that use the previous key to fail authentication. Update all components to use the new key.
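The four conditions above can be modeled with a toy validity check. This is a simplified sketch for reasoning about revocation, not how Azure Storage evaluates a SAS; the Sas and StoredPolicy classes, the sas_is_valid function, and the idea of tracking keys as plain strings are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional, Set


@dataclass
class StoredPolicy:
    expiry: datetime


@dataclass
class Sas:
    signing_key: str                    # account key value that signed the SAS
    expiry: Optional[datetime] = None   # set on the SAS itself (ad hoc form)
    policy_name: Optional[str] = None   # set when the SAS references a stored policy


def sas_is_valid(sas: Sas, now: datetime,
                 policies: Dict[str, StoredPolicy],
                 current_keys: Set[str]) -> bool:
    # Condition 4: the signing key was regenerated and no longer exists.
    if sas.signing_key not in current_keys:
        return False
    # Condition 1: the expiry time specified on the SAS is reached.
    if sas.expiry is not None and now >= sas.expiry:
        return False
    if sas.policy_name is not None:
        policy = policies.get(sas.policy_name)
        # Condition 3: the referenced stored access policy was deleted.
        if policy is None:
            return False
        # Condition 2: the expiry time on the stored access policy is reached.
        if now >= policy.expiry:
            return False
    return True


now = datetime(2025, 1, 1)
policies = {"myPolicy": StoredPolicy(expiry=datetime(2026, 1, 1))}
sas = Sas(signing_key="key-v1", policy_name="myPolicy")
print(sas_is_valid(sas, now, policies, {"key-v1"}))
print(sas_is_valid(sas, now, {}, {"key-v1"}))
```

In this model, the lookup is by policy name only, which is why recreating a policy under the same name makes earlier tokens usable again.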

Important

A shared access signature URI is associated with the account key used to create the signature, and the associated stored access policy (if any). If no stored access policy is specified, the only way to revoke a shared access signature is to change the account key.

We recommend that you always use stored access policies. With stored policies, you can either revoke signatures or extend the expiry date as needed. The steps in this document use stored access policies to generate SAS tokens.

For more information on Shared Access Signatures, see Understanding the SAS model.

Create a stored policy and SAS

Save the SAS token that is produced at the end of each method. The token looks similar to the following:

?sv=2018-03-28&sr=c&si=myPolicyPS&sig=NAxefF%2BrR2ubjZtyUtuAvLQgt%2FJIN5aHJMj6OsDwyy4%3D
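To see what such a token carries, it can be split into its query parameters with the Python standard library. The sketch below uses the sample token from this section; the helper name parse_sas_token is hypothetical. In a SAS query string, sv is the storage service version, sr is the signed resource (c for container), si is the signed identifier that names the stored access policy, and sig is the signature.

```python
from urllib.parse import parse_qs


def parse_sas_token(token: str) -> dict:
    """Split a SAS token into a dict of percent-decoded query parameters."""
    query = token[1:] if token.startswith("?") else token
    # parse_qs maps each name to a list of values; a SAS has one value per name.
    return {name: values[0]
            for name, values in parse_qs(query, strict_parsing=True).items()}


sample = ("?sv=2018-03-28&sr=c&si=myPolicyPS"
          "&sig=NAxefF%2BrR2ubjZtyUtuAvLQgt%2FJIN5aHJMj6OsDwyy4%3D")
fields = parse_sas_token(sample)

# A token generated from a stored access policy carries "si"; its expiry
# and permissions live in the policy rather than in the token itself.
print("stored policy:", fields.get("si"))
print("uses stored policy:", "si" in fields)
```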

Using PowerShell

Replace RESOURCEGROUP, STORAGEACCOUNT, and STORAGECONTAINER with the appropriate values for your existing storage container. Change directory to hdinsight-dotnet-python-azure-storage-shared-access-signature-master or revise the -File parameter to contain the absolute path for Set-AzStorageBlobContent. Enter the following PowerShell commands:

$resourceGroupName = "RESOURCEGROUP"
$storageAccountName = "STORAGEACCOUNT"
$containerName = "STORAGECONTAINER"
$policy = "myPolicyPS"

# Login to your Azure subscription
$sub = Get-AzSubscription -ErrorAction SilentlyContinue
if(-not($sub))
{
    Connect-AzAccount
}

# If you have multiple subscriptions, set the one to use
# Select-AzSubscription -SubscriptionId "<SUBSCRIPTIONID>"

# Get the access key for the Azure Storage account
$storageAccountKey = (Get-AzStorageAccountKey `
                                -ResourceGroupName $resourceGroupName `
                                -Name $storageAccountName)[0].Value

# Create an Azure Storage context
$storageContext = New-AzStorageContext `
                                -StorageAccountName $storageAccountName `
                                -StorageAccountKey $storageAccountKey

# Create a stored access policy for the Azure storage container
New-AzStorageContainerStoredAccessPolicy `
   -Container $containerName `
   -Policy $policy `
   -Permission "rl" `
   -ExpiryTime "12/31/2025 08:00:00" `
   -Context $storageContext

# Get the stored access policy or policies for the Azure storage container
Get-AzStorageContainerStoredAccessPolicy `
    -Container $containerName `
    -Context $storageContext

# Generates an SAS token for the Azure storage container
New-AzStorageContainerSASToken `
    -Name $containerName `
    -Policy $policy `
    -Context $storageContext

<# Removes a stored access policy from the Azure storage container
Remove-AzStorageContainerStoredAccessPolicy `
    -Container $containerName `
    -Policy $policy `
    -Context $storageContext
#>

# Upload a file for a later example
Set-AzStorageBlobContent `
    -File "./sampledata/sample.log" `
    -Container $containerName `
    -Blob "samplePS.log" `
    -Context $storageContext

Using Azure CLI

The use of variables in this section is based on a Windows environment. Slight variations are needed for bash or other environments.

  1. Replace STORAGEACCOUNT and STORAGECONTAINER with the appropriate values for your existing storage container.

    # set variables
    set AZURE_STORAGE_ACCOUNT=STORAGEACCOUNT
    set AZURE_STORAGE_CONTAINER=STORAGECONTAINER
    
    #Login
    az login
    
    # If you have multiple subscriptions, set the one to use
    # az account set --subscription SUBSCRIPTION
    
    # Retrieve the primary key for the storage account
    az storage account keys list --account-name %AZURE_STORAGE_ACCOUNT% --query "[0].{PrimaryKey:value}" --output table
    
  2. Set the retrieved primary key to a variable for later use. Replace PRIMARYKEY with the value retrieved in the prior step, and then enter the following command:

    #set variable for primary key
    set AZURE_STORAGE_KEY=PRIMARYKEY
    
  3. Change directory to hdinsight-dotnet-python-azure-storage-shared-access-signature-master or revise the --file parameter to contain the absolute path for az storage blob upload. Execute the remaining commands:

    # Create stored access policy on the containing object
    az storage container policy create --container-name %AZURE_STORAGE_CONTAINER% --name myPolicyCLI --account-key %AZURE_STORAGE_KEY% --account-name %AZURE_STORAGE_ACCOUNT% --expiry 2025-12-31 --permissions rl
    
    # List stored access policies on a containing object
    az storage container policy list --container-name %AZURE_STORAGE_CONTAINER% --account-key %AZURE_STORAGE_KEY% --account-name %AZURE_STORAGE_ACCOUNT%
    
    # Generate a shared access signature for the container
    az storage container generate-sas --name %AZURE_STORAGE_CONTAINER% --policy-name myPolicyCLI --account-key %AZURE_STORAGE_KEY% --account-name %AZURE_STORAGE_ACCOUNT%
    
    # Reversal
    # az storage container policy delete --container-name %AZURE_STORAGE_CONTAINER% --name myPolicyCLI --account-key %AZURE_STORAGE_KEY% --account-name %AZURE_STORAGE_ACCOUNT%
    
    # upload a file for a later example
    az storage blob upload --container-name %AZURE_STORAGE_CONTAINER% --account-key %AZURE_STORAGE_KEY% --account-name %AZURE_STORAGE_ACCOUNT% --name sampleCLI.log --file "./sampledata/sample.log"
    

Using Python

Open the SASToken.py file and replace storage_account_name, storage_account_key, and storage_container_name with the appropriate values for your existing storage container, and then run the script.

If you receive the error message ImportError: No module named azure.storage, you may need to execute pip install --upgrade azure-storage.

Using C#

  1. Open the solution in Visual Studio.

  2. In Solution Explorer, right-click the SASExample project and select Properties.

  3. Select Settings and add values for the following entries:

    • StorageConnectionString: The connection string for the storage account that you want to create a stored policy and SAS for. The format should be DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey where myaccount is the name of your storage account and mykey is the key for the storage account.

    • ContainerName: The container in the storage account that you want to restrict access to.

    • SASPolicyName: The name to use for the stored policy to create.

    • FileToUpload: The path to a file that is uploaded to the container.

  4. Run the project. Save the SAS policy token, storage account name, and container name. These values are used when associating the storage account with your HDInsight cluster.
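The StorageConnectionString format described in the settings above is a list of name=value pairs separated by semicolons. A minimal Python sketch for checking such a string before pasting it into the project settings might look like the following; parse_connection_string is a hypothetical helper, and the key shown is a placeholder, not a real account key.

```python
def parse_connection_string(conn: str) -> dict:
    """Split 'Name=value;Name=value' into a dict of settings."""
    settings = {}
    for part in conn.split(";"):
        if not part:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: account keys are base64 text
        # and usually end with "=" padding characters.
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed segment: {part!r}")
        settings[name] = value
    return settings


conn = "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=bXlrZXk="
settings = parse_connection_string(conn)
print(settings["AccountName"])
```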

Use the SAS with HDInsight

When creating an HDInsight cluster, you must specify a primary storage account, and you can optionally specify additional storage accounts. Both of these methods of adding storage require full access to the storage accounts and containers that are used.

To use a Shared Access Signature to limit access to a container, add a custom entry to the core-site configuration for the cluster. You can add the entry during cluster creation using PowerShell or after cluster creation using Ambari.
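The custom core-site entry is a single key and value pair whose key embeds the container and storage account names. As a rough sketch, the pair can be assembled like this; the function name is hypothetical, and the default endpoint suffix is the one that appears in this article's examples.

```python
def core_site_sas_entry(container: str, account: str, sas_token: str,
                        endpoint_suffix: str = "blob.core.chinacloudapi.cn") -> dict:
    """Build the core-site property that maps a container to its SAS token."""
    key = f"fs.azure.sas.{container}.{account}.{endpoint_suffix}"
    # The token is passed through unchanged, as produced by the earlier steps.
    return {key: sas_token}


entry = core_site_sas_entry("STORAGECONTAINER", "STORAGEACCOUNT", "TOKEN")
for name, value in entry.items():
    print(name, "=", value)
```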

Create a cluster that uses the SAS

Replace CLUSTERNAME, RESOURCEGROUP, DEFAULTSTORAGEACCOUNT, STORAGECONTAINER, STORAGEACCOUNT, and TOKEN with the appropriate values. Enter the following PowerShell commands:


$clusterName = 'CLUSTERNAME'
$resourceGroupName = 'RESOURCEGROUP'

# Replace with the Azure data center you want the cluster to live in
$location = 'chinaeast'

# Replace with the name of the default storage account TO BE CREATED
$defaultStorageAccountName = 'DEFAULTSTORAGEACCOUNT'

# Replace with the name of the SAS container CREATED EARLIER
$SASContainerName = 'STORAGECONTAINER'

# Replace with the name of the SAS storage account CREATED EARLIER
$SASStorageAccountName = 'STORAGEACCOUNT'

# Replace with the SAS token generated earlier
$SASToken = 'TOKEN'

# Default cluster size (# of worker nodes), version, and type
$clusterSizeInNodes = "4"
$clusterVersion = "3.6"
$clusterType = "Hadoop"

# Login to your Azure subscription
$sub = Get-AzSubscription -ErrorAction SilentlyContinue
if(-not($sub))
{
    Connect-AzAccount
}

# If you have multiple subscriptions, set the one to use
# Select-AzSubscription -SubscriptionId "<SUBSCRIPTIONID>"

# Create an Azure Storage account and container
New-AzStorageAccount `
    -ResourceGroupName $resourceGroupName `
    -Name $defaultStorageAccountName `
    -Location $location `
    -SkuName Standard_LRS `
    -Kind StorageV2 `
    -EnableHttpsTrafficOnly 1

$defaultStorageAccountKey = (Get-AzStorageAccountKey `
                                -ResourceGroupName $resourceGroupName `
                                -Name $defaultStorageAccountName)[0].Value

$defaultStorageContext = New-AzStorageContext `
                                -StorageAccountName $defaultStorageAccountName `
                                -StorageAccountKey $defaultStorageAccountKey


# Create a blob container. This holds the default data store for the cluster.
New-AzStorageContainer `
    -Name $clusterName `
    -Context $defaultStorageContext 

# Cluster login is used to secure HTTPS services hosted on the cluster
$httpCredential = Get-Credential `
    -Message "Enter Cluster login credentials" `
    -UserName "admin"

# SSH user is used to remotely connect to the cluster using SSH clients
$sshCredential = Get-Credential `
    -Message "Enter SSH user credentials" `
    -UserName "sshuser"

# Create the configuration for the cluster
$config = New-AzHDInsightClusterConfig 

$config = $config | Add-AzHDInsightConfigValues `
    -Spark2Defaults @{} `
    -Core @{"fs.azure.sas.$SASContainerName.$SASStorageAccountName.blob.core.chinacloudapi.cn"=$SASToken}

# Create the HDInsight cluster
New-AzHDInsightCluster `
    -Config $config `
    -ResourceGroupName $resourceGroupName `
    -ClusterName $clusterName `
    -Location $location `
    -ClusterSizeInNodes $clusterSizeInNodes `
    -ClusterType $clusterType `
    -OSType Linux `
    -Version $clusterVersion `
    -HttpCredential $httpCredential `
    -SshCredential $sshCredential `
    -DefaultStorageAccountName "$defaultStorageAccountName.blob.core.chinacloudapi.cn" `
    -DefaultStorageAccountKey $defaultStorageAccountKey `
    -DefaultStorageContainer $clusterName

<# REVERSAL
Remove-AzHDInsightCluster `
    -ResourceGroupName $resourceGroupName `
    -ClusterName $clusterName

Remove-AzStorageContainer `
    -Name $clusterName `
    -Context $defaultStorageContext

Remove-AzStorageAccount `
    -ResourceGroupName $resourceGroupName `
    -Name $defaultStorageAccountName

Remove-AzResourceGroup `
    -Name $resourceGroupName
#>

Important

When prompted for the HTTP/s or SSH user name and password, you must provide a password that meets the following criteria:

  • Must be at least 10 characters in length.
  • Must contain at least one digit.
  • Must contain at least one non-alphanumeric character.
  • Must contain at least one upper or lower case letter.
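The four criteria above can be checked locally before running the script. The sketch below is an approximation written for this article, assuming ASCII letters and digits; the validation that the service actually performs may differ, and password_problems is a hypothetical helper.

```python
import string


def password_problems(password: str) -> list:
    """Return the criteria from the list above that the password fails."""
    problems = []
    if len(password) < 10:
        problems.append("must be at least 10 characters in length")
    if not any(ch in string.digits for ch in password):
        problems.append("must contain at least one digit")
    if not any(not ch.isalnum() for ch in password):
        problems.append("must contain at least one non-alphanumeric character")
    if not any(ch in string.ascii_letters for ch in password):
        problems.append("must contain at least one upper or lower case letter")
    return problems


for problem in password_problems("short1"):
    print(problem)
```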

It takes a while for this script to complete, usually around 15 minutes. When the script completes without any errors, the cluster has been created.

Use the SAS with an existing cluster

If you have an existing cluster, you can add the SAS to the core-site configuration by using the following steps:

  1. Open the Ambari web UI for your cluster. The address for this page is https://YOURCLUSTERNAME.azurehdinsight.cn. When prompted, authenticate to the cluster using the admin name (admin) and password you used when creating the cluster.

  2. Navigate to HDFS > Configs > Advanced > Custom core-site.

  3. Expand the Custom core-site section, scroll to the end, and then select Add property.... Use the following values for Key and Value:

    • Key: fs.azure.sas.CONTAINERNAME.STORAGEACCOUNTNAME.blob.core.chinacloudapi.cn

    • Value: The SAS returned by one of the methods executed earlier.

      Replace CONTAINERNAME with the container name you used with the C# or SAS application. Replace STORAGEACCOUNTNAME with the storage account name you used.

    Select Add to save this key and value.

  4. Select the Save button to save the configuration changes. When prompted, add a description of the change ("adding SAS storage access", for example) and then select Save.

    Select OK when the changes have been completed.

    Important

    You must restart several services before the change takes effect.

  5. A Restart drop-down list appears. Select Restart All Affected from the drop-down list and then Confirm Restart All.

    Repeat this process for MapReduce2 and YARN.

  6. Once the services have restarted, select each one and disable maintenance mode from the Service Actions drop-down list.

Test restricted access

Use the following steps to verify that you can only read and list items on the SAS storage account.

  1. Connect to the cluster. Replace CLUSTERNAME with the name of your cluster and enter the following command:

    ssh sshuser@CLUSTERNAME-ssh.azurehdinsight.cn
    
  2. To list the contents of the container, use the following command from the prompt:

    hdfs dfs -ls wasbs://SASCONTAINER@SASACCOUNTNAME.blob.core.chinacloudapi.cn/
    

    Replace SASCONTAINER with the name of the container created for the SAS storage account. Replace SASACCOUNTNAME with the name of the storage account used for the SAS.

    The list includes the file uploaded when the container and SAS were created.

  3. Use the following command to verify that you can read the contents of the file. Replace SASCONTAINER and SASACCOUNTNAME as in the previous step. Replace sample.log with the name of the file displayed by the previous command:

    hdfs dfs -text wasbs://SASCONTAINER@SASACCOUNTNAME.blob.core.chinacloudapi.cn/sample.log
    

    This command prints the contents of the file.

  4. Use the following command to download the file to the local file system:

    hdfs dfs -get wasbs://SASCONTAINER@SASACCOUNTNAME.blob.core.chinacloudapi.cn/sample.log testfile.txt
    

    This command downloads the file to a local file named testfile.txt.

  5. Use the following command to upload the local file to a new file named testupload.txt on the SAS storage:

    hdfs dfs -put testfile.txt wasbs://SASCONTAINER@SASACCOUNTNAME.blob.core.chinacloudapi.cn/testupload.txt
    

    You receive a message similar to the following text:

     put: java.io.IOException
    

    This error occurs because the storage location allows only read and list operations. Use the following command to put the data on the default storage for the cluster, which is writable:

    hdfs dfs -put testfile.txt wasbs:///testupload.txt
    

    This time, the operation should complete successfully.

Next steps

Now that you have learned how to add limited-access storage to your HDInsight cluster, learn other ways to work with data on your cluster: