Unable to access Data Lake storage files in Azure HDInsight
This article describes troubleshooting steps and possible resolutions for issues when interacting with Azure HDInsight clusters.
You receive an error message similar to:
LISTSTATUS failed with error 0x83090aa2 (Forbidden. ACL verification failed. Either the resource does not exist or the user is not authorized to perform the requested operation.).
The user might have revoked the service principal's (SP) permissions on the files or folders.
Check that the SP has 'x' permissions to traverse along the path. For more information, see Permissions. Sample dfs command to check access to files/folders in the Data Lake storage account:
hdfs dfs -ls /<path to check access>
Set up required permissions to access the path based on the read/write operation being performed. See here for permissions required for various file system operations.
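Because traversal needs 'x' on every folder along the path, it can help to test each ancestor in turn and see where access first breaks. The following is a minimal sketch, not an official tool: the function names path_prefixes and check_path_access are hypothetical, and it assumes the hdfs client is on the PATH of the node where you run it. A FAILED line can also mean the path does not exist, as the error message itself notes.

```shell
# Sketch only: helper names are hypothetical, not part of HDInsight.

# Print every ancestor of an absolute path, from "/" down to the path itself.
path_prefixes() {
  rest="${1#/}"        # drop the leading slash
  rest="${rest%/}"     # drop a trailing slash, if any
  prefix=""
  printf '/\n'
  while [ -n "$rest" ]; do
    part="${rest%%/*}"             # next path component
    case "$rest" in
      */*) rest="${rest#*/}" ;;    # more components remain
      *)   rest="" ;;              # this was the last component
    esac
    [ -n "$part" ] || continue     # skip empty components such as "//"
    prefix="$prefix/$part"
    printf '%s\n' "$prefix"
  done
}

# Run "hdfs dfs -ls -d" on each ancestor and report whether the call succeeded.
check_path_access() {
  path_prefixes "$1" | while IFS= read -r p; do
    if hdfs dfs -ls -d "$p" >/dev/null 2>&1 </dev/null; then
      printf 'OK      %s\n' "$p"
    else
      printf 'FAILED  %s\n' "$p"
    fi
  done
}
```

For example, check_path_access /clusters/data/file.csv tests /, /clusters, /clusters/data, and /clusters/data/file.csv in that order. The first FAILED line is the place to start reviewing the SP's permissions.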
You receive an error message similar to:
Token Refresh failed - Received invalid http response: 500
The certificate provided for Service principal access might have expired.
SSH into the headnode. Check access to the storage account using the following dfs command:
hdfs dfs -ls /
Confirm that the error message is similar to the following output:
{"stderr": "-ls: Token Refresh failed - Received invalid http response: 500, text = Response{protocol=http/1.1, code=500, message=Internal Server Error, url=http://gw0-abccluster.24ajrd4341lebfgq5unsrzq0ue.fx.internal.chinacloudapp.cn:909/api/oauthtoken}}...
Get one of the URLs from the core-site.xml property fs.azure.datalake.token.provider.service.urls. Run the following curl command to retrieve the OAuth token.
curl gw0-abccluster.24ajrd4341lebfgq5unsrzq0ue.fx.internal.chinacloudapp.cn:909/api/oauthtoken
The output for a valid service principal should be something like:
{"AccessToken":"MIIGHQYJKoZIhvcNAQcDoIIGDjCCBgoCAQA…….","ExpiresOn":1500447750098}
If the service principal certificate has expired, the output will look something like this:
Exception in OAuthTokenController.GetOAuthToken: 'System.InvalidOperationException: Error while getting the OAuth token from AAD for AppPrincipalId 23abe517-2ffd-4124-aa2d-7c224672cae2, ResourceUri https://management.core.chinacloudapi.cn/, AADTenantId https://login.chinacloudapi.cn/80abc8bf-86f1-41af-91ab-2d7cd011db47, ClientCertificateThumbprint C49C25705D60569884EDC91986CEF8A01A495783 ---> Microsoft.IdentityModel.Clients.ActiveDirectory.AdalServiceException: AADSTS70002: Error validating credentials. AADSTS50012: Client assertion contains an invalid signature. **[Reason - The key used is expired.**, Thumbprint of key used by client: 'C49C25705D60569884EDC91986CEF8A01A495783', Found key 'Start=08/03/2016, End=08/03/2017, Thumbprint=C39C25705D60569884EDC91986CEF8A01A4956D1', Configured keys: [Key0:Start=08/03/2016, End=08/03/2017, Thumbprint=C39C25705D60569884EDC91986CEF8A01A4956D1;]] Trace ID: e4d34f1c-a584-47f5-884e-1235026d5000 Correlation ID: a44d870e-6f23-405a-8b23-9b44aebfa4bb Timestamp: 2017-10-06 20:44:56Z ---> System.Net.WebException: The remote server returned an error: (401) Unauthorized. at System.Net.HttpWebRequest.GetResponse() at Microsoft.IdentityModel.Clients.ActiveDirectory.HttpWebRequestWrapper.<GetResponseSyncOrAsync>d__2.MoveNext()
Any other Microsoft Entra-related or certificate-related errors can be recognized by querying the gateway URL for the OAuth token.
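The token check described above can be scripted as a rough sketch. The helper names first_token_url, classify_token_response, and check_token_endpoint are hypothetical. The sketch assumes that the property value is a comma-separated list of URLs, that hdfs getconf can read it on the headnode, and that an expired certificate produces the "The key used is expired" text shown in the sample output. Any other response, including an empty one, is reported as other-error and needs manual inspection.

```shell
# Sketch only: helper names are hypothetical, not part of HDInsight.

# Return the first token-service URL from the cluster configuration.
# Assumes the property value is a comma-separated list of URLs.
first_token_url() {
  hdfs getconf -confKey fs.azure.datalake.token.provider.service.urls | cut -d',' -f1
}

# Classify the body returned by the /api/oauthtoken endpoint.
classify_token_response() {
  case "$1" in
    *'"AccessToken"'*)           printf 'valid\n' ;;
    *'The key used is expired'*) printf 'expired-certificate\n' ;;
    *)                           printf 'other-error\n' ;;
  esac
}

# Fetch the token response from the first URL and print its category.
check_token_endpoint() {
  url="$(first_token_url)"
  classify_token_response "$(curl -s "$url")"
}
```

Running check_token_endpoint on the headnode prints one of valid, expired-certificate, or other-error. For other-error, rerun the curl command by hand and read the full response.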
If you get the following error when attempting to access ADLS from the HDI cluster, check whether the certificate has expired by following the steps mentioned above.
Error: java.lang.IllegalArgumentException: Token Refresh failed - Received invalid http response: 500, text = Response{protocol=http/1.1, code=500, message=Internal Server Error, url=http://clustername.hmssomerandomstringc.cx.internal.chinacloudapp.cn:909/api/oauthtoken}
Create a new certificate or assign an existing certificate by using the following PowerShell script:
# Requires the Az PowerShell module (Az.Accounts and Az.Resources).
$clusterName = 'CLUSTERNAME'
$resourceGroupName = 'RGNAME'
$subscriptionId = 'SUBSCRIPTIONID'
$appId = 'APPLICATIONID'
$generateSelfSignedCert = $false   # $true: create a new self-signed certificate
$addNewCertKeyCredential = $true   # $true: add the certificate as a new key credential on the app
$certFilePath = 'NEW_CERT_PFX_LOCAL_PATH'
$certPassword = Read-Host "Enter Certificate Password"

if ($generateSelfSignedCert)
{
    # Create a self-signed certificate and export it as a Base64-encoded PFX.
    Write-Host "Generating new SelfSigned certificate"
    $cert = New-SelfSignedCertificate -CertStoreLocation "cert:\CurrentUser\My" -Subject "CN=hdinsightAdlsCert" -KeySpec KeyExchange
    $certBytes = $cert.Export([System.Security.Cryptography.X509Certificates.X509ContentType]::Pkcs12, $certPassword)
    $certString = [System.Convert]::ToBase64String($certBytes)
}
else
{
    # Load the existing PFX file and Base64-encode its contents.
    Write-Host "Reading the cert file from path $certFilePath"
    $cert = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2($certFilePath, $certPassword)
    $certString = [System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes($certFilePath))
}

Connect-AzAccount -Environment AzureChinaCloud

if ($addNewCertKeyCredential)
{
    # Register the public part of the certificate on the application.
    Write-Host "Creating new KeyCredential for the app"
    $keyValue = [System.Convert]::ToBase64String($cert.GetRawCertData())
    New-AzADAppCredential -ApplicationId $appId -CertValue $keyValue -EndDate $cert.NotAfter -StartDate $cert.NotBefore
    Write-Host "Waiting for 30 seconds for the permissions to get propagated"
    Start-Sleep -s 30
}

Select-AzSubscription -SubscriptionId $subscriptionId

# Push the new certificate to the HDInsight cluster.
Write-Host "Updating the certificate on HDInsight cluster."
Invoke-AzResourceAction `
    -ResourceGroupName $resourceGroupName `
    -ResourceType 'Microsoft.HDInsight/clusters' `
    -ResourceName $clusterName `
    -ApiVersion '2015-03-01-preview' `
    -Action 'updateclusteridentitycertificate' `
    -Parameters @{ ApplicationId = $appId.ToString(); Certificate = $certString; CertificatePassword = $certPassword.ToString() } `
    -Force
To assign an existing certificate, create a certificate and have the .pfx file and password ready. Associate the certificate with the service principal that the cluster was created with, and have the AppId ready.
Execute the PowerShell command after you substitute the parameters with the actual values.
If you didn't see your problem or are unable to solve your issue, visit one of the following channels for more support:
- If you need more help, you can submit a support request from the Azure portal. Select Support from the menu bar or open the Help + support hub.