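Unable to access Data Lake storage files in Azure HDInsight

This article describes troubleshooting steps and possible resolutions for issues that occur when interacting with Azure HDInsight clusters.

Issue: ACL verification failed

You receive an error message similar to the following: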

LISTSTATUS failed with error 0x83090aa2 (Forbidden. ACL verification failed. Either the resource does not exist or the user is not authorized to perform the requested operation.).

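Cause

The user might have revoked the service principal's (SP) permissions on the files or folders.

Resolution

  1. Check that the SP has "x" permission to traverse along the path. For more information, see "Permissions". The following sample dfs command checks access to files and folders in the Data Lake storage account: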

    hdfs dfs -ls /<path to check access>
    
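  2. Set the permissions required to access the path, based on the read or write operation being performed. See the permissions required for the various file system operations.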
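
Because every directory along the path needs the "x" permission, it can help to test each ancestor in turn and see where access first fails. The following is a minimal sketch rather than part of the official guidance. The helper names `ancestors` and `first_denied` are hypothetical, and the sketch assumes that `hdfs dfs -ls -d` exits with a nonzero status when the SP cannot reach a path.

```shell
# Hypothetical helpers; they are not part of Hadoop or HDInsight.

# Print "/" followed by every ancestor of an absolute path, ending with the path itself.
ancestors() {
  local rest="${1#/}" current="" part
  echo "/"
  while [ -n "$rest" ]; do
    part="${rest%%/*}"
    if [ "$rest" = "$part" ]; then rest=""; else rest="${rest#*/}"; fi
    [ -n "$part" ] || continue          # skip empty segments such as "//"
    current="$current/$part"
    echo "$current"
  done
}

# Run `hdfs dfs -ls -d` on each level and print the first level that fails.
# Returns 0 when every level is accessible, 1 otherwise.
first_denied() {
  local p
  while IFS= read -r p; do
    if ! hdfs dfs -ls -d "$p" > /dev/null 2>&1 < /dev/null; then
      echo "$p"
      return 1
    fi
  done <<EOF
$(ancestors "$1")
EOF
  return 0
}

# Example (run on a cluster node; the path is a placeholder):
#   first_denied /clusters/mycluster/data || echo "access fails at the path printed above"
```

A failure at a given level suggests that a permission is missing at that level or on its parent, or that the path does not exist, which matches the two possibilities named in the error message.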


Issue: Service principal certificate expiry

You receive an error message similar to the following:

Token Refresh failed - Received invalid http response: 500

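Cause

The certificate provided for service principal access might have expired.

  1. SSH into the head node. Check access to the storage account by using the following dfs command: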

    hdfs dfs -ls /
    
  2. Confirm that the error message is similar to the following output:

    {"stderr": "-ls: Token Refresh failed - Received invalid http response: 500, text = Response{protocol=http/1.1, code=500, message=Internal Server Error, url=http://gw0-abccluster.24ajrd4341lebfgq5unsrzq0ue.fx.internal.chinacloudapp.cn:909/api/oauthtoken}}...
    
  3. Get one of the URLs from the core-site.xml property fs.azure.datalake.token.provider.service.urls.

  4. Run the following curl command to retrieve the OAuth token.

    curl gw0-abccluster.24ajrd4341lebfgq5unsrzq0ue.fx.internal.chinacloudapp.cn:909/api/oauthtoken
    
  5. If the service principal is valid, the output should look similar to the following:

    {"AccessToken":"MIIGHQYJKoZIhvcNAQcDoIIGDjCCBgoCAQA…….","ExpiresOn":1500447750098}
    
  6. If the service principal certificate has expired, the output looks similar to the following:

    Exception in OAuthTokenController.GetOAuthToken: 'System.InvalidOperationException: Error while getting the OAuth token from AAD for AppPrincipalId 23abe517-2ffd-4124-aa2d-7c224672cae2, ResourceUri https://management.core.chinacloudapi.cn/, AADTenantId https://login.chinacloudapi.cn/80abc8bf-86f1-41af-91ab-2d7cd011db47, ClientCertificateThumbprint C49C25705D60569884EDC91986CEF8A01A495783 ---> Microsoft.IdentityModel.Clients.ActiveDirectory.AdalServiceException: AADSTS70002: Error validating credentials. AADSTS50012: Client assertion contains an invalid signature. [Reason - The key used is expired., Thumbprint of key used by client: 'C49C25705D60569884EDC91986CEF8A01A495783', Found key 'Start=08/03/2016, End=08/03/2017, Thumbprint=C39C25705D60569884EDC91986CEF8A01A4956D1', Configured keys: [Key0:Start=08/03/2016, End=08/03/2017, Thumbprint=C39C25705D60569884EDC91986CEF8A01A4956D1;]]
    Trace ID: e4d34f1c-a584-47f5-884e-1235026d5000
    Correlation ID: a44d870e-6f23-405a-8b23-9b44aebfa4bb
    Timestamp: 2017-10-06 20:44:56Z ---> System.Net.WebException: The remote server returned an error: (401) Unauthorized.
    at System.Net.HttpWebRequest.GetResponse()
    at Microsoft.IdentityModel.Clients.ActiveDirectory.HttpWebRequestWrapper.<GetResponseSyncOrAsync>d__2.MoveNext()
    
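  7. You can also identify any other Microsoft Entra-related or certificate-related errors by requesting the OAuth token from the gateway URL, as in step 4.

  8. If you get the following error when you try to access ADLS from the HDI cluster, follow the preceding steps to check whether the certificate has expired: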

    Error: java.lang.IllegalArgumentException: Token Refresh failed - Received invalid http response: 500, text = Response{protocol=http/1.1, code=500, message=Internal Server Error, url=http://clustername.hmssomerandomstringc.cx.internal.chinacloudapp.cn:909/api/oauthtoken}
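
The checks in steps 3 through 6 can be combined into a small script that runs on the head node. This is a sketch under stated assumptions, not a supported tool. The function names are hypothetical. It assumes that `hdfs getconf -confKey` returns the value that core-site.xml holds for fs.azure.datalake.token.provider.service.urls, that the value is a full token URL or a comma-separated list of them, and that the response bodies resemble the samples shown above.

```shell
# Hypothetical helpers; they are not part of Hadoop or HDInsight.

# Classify the body returned by the gateway's /api/oauthtoken endpoint,
# based on the sample responses in steps 5 and 6.
classify_token_response() {
  case "$1" in
    *'"AccessToken"'*)           echo "valid" ;;
    *'The key used is expired'*) echo "expired" ;;
    *)                           echo "unknown" ;;
  esac
}

# Print the first entry of a comma-separated list of URLs.
first_url() {
  printf '%s\n' "${1%%,*}"
}

# Look up the token endpoint, request a token, and print the classification.
# Returns 2 when the lookup or the HTTP request itself fails.
check_sp_certificate() {
  local urls url body
  urls="$(hdfs getconf -confKey fs.azure.datalake.token.provider.service.urls)" || return 2
  url="$(first_url "$urls")"
  # curl exits with 0 on an HTTP 500 unless --fail is given, so the error body is kept.
  body="$(curl --silent --show-error --max-time 30 "$url")" || return 2
  classify_token_response "$body"
}

# Example (run on the head node):
#   check_sp_certificate
```

An "unknown" result means the response matched neither sample, in which case the raw curl output is worth reading for other Microsoft Entra or certificate errors.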
    

Resolution

Create a new certificate or assign an existing certificate by using the following PowerShell script:

$clusterName = 'CLUSTERNAME'
$resourceGroupName = 'RGNAME'
$subscriptionId = 'SUBSCRIPTIONID'
$appId = 'APPLICATIONID'
$generateSelfSignedCert = $false
$addNewCertKeyCredential = $true
$certFilePath = 'NEW_CERT_PFX_LOCAL_PATH'
$certPassword = Read-Host "Enter Certificate Password"

if($generateSelfSignedCert)
{
    Write-Host "Generating new SelfSigned certificate"

    $cert = New-SelfSignedCertificate -CertStoreLocation "cert:\CurrentUser\My" -Subject "CN=hdinsightAdlsCert" -KeySpec KeyExchange
    $certBytes = $cert.Export([System.Security.Cryptography.X509Certificates.X509ContentType]::Pkcs12, $certPassword);
    $certString = [System.Convert]::ToBase64String($certBytes)
}
else
{

    Write-Host "Reading the cert file from path $certFilePath"

    $cert = new-object System.Security.Cryptography.X509Certificates.X509Certificate2($certFilePath, $certPassword)
    $certString = [System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes($certFilePath))
}

# Use Az cmdlets throughout; Az and AzureRM modules should not be mixed in one session.
Connect-AzAccount -Environment AzureChinaCloud

if($addNewCertKeyCredential)
{
    Write-Host "Creating new KeyCredential for the app"

    $keyValue = [System.Convert]::ToBase64String($cert.GetRawCertData())

    New-AzADAppCredential -ApplicationId $appId -CertValue $keyValue -EndDate $cert.NotAfter -StartDate $cert.NotBefore

    Write-Host "Waiting for 30 seconds for the permissions to get propagated"

    Start-Sleep -s 30
}

Set-AzContext -Subscription $subscriptionId

Write-Host "Updating the certificate on HDInsight cluster."

Invoke-AzResourceAction `
    -ResourceGroupName $resourceGroupName `
    -ResourceType 'Microsoft.HDInsight/clusters' `
    -ResourceName $clusterName `
    -ApiVersion '2015-03-01-preview' `
    -Action 'updateclusteridentitycertificate' `
    -Parameters @{ ApplicationId = $appId.ToString(); Certificate = $certString; CertificatePassword = $certPassword.ToString() } `
    -Force

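To assign an existing certificate, create the certificate and have its .pfx file and password ready. Associate the certificate with the service principal that the cluster was created with, using that service principal's AppId.

Replace the parameters with actual values, and then run the PowerShell script.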
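
Before assigning an existing certificate, it may be worth confirming that the .pfx file itself has not already expired. The following is a sketch only. It assumes OpenSSL is installed on the machine that holds the .pfx file and that the password is supplied through an environment variable, here given the hypothetical name PFX_PASSWORD. The function names are illustrative, and older .pfx files might need OpenSSL's `-legacy` option on OpenSSL 3.x.

```shell
# Hypothetical helpers; PFX_PASSWORD is an assumed environment variable name.

# Print the notAfter date of the client certificate stored in a .pfx file.
pfx_not_after() {
  openssl pkcs12 -in "$1" -nokeys -clcerts -passin env:PFX_PASSWORD \
    | openssl x509 -noout -enddate \
    | sed 's/^notAfter=//'
}

# Succeed when the expiry time is not later than the reference time.
# Both arguments are in seconds since the epoch.
is_expired() {
  [ "$1" -le "$2" ]
}

# Example (the date conversion below relies on GNU date):
#   export PFX_PASSWORD='...'
#   expiry="$(date -d "$(pfx_not_after NEW_CERT_PFX_LOCAL_PATH)" +%s)"
#   if is_expired "$expiry" "$(date +%s)"; then echo "certificate has expired"; fi
```

Passing the password through the environment keeps it off the command line, where it would otherwise be visible in the process list.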

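Next steps

If your problem isn't listed in this article, or if you can't resolve it, visit one of the following channels for more support:

  • If you need more help, you can submit a support request from the Azure portal. Select "Support" from the menu bar, or open the "Help + support" hub.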