Mount a virtual file system on a Batch pool
Caution
This article references CentOS, a Linux distribution that is nearing End Of Life (EOL) status. Please consider your use and planning accordingly.
Azure Batch supports mounting cloud storage or an external file system on Windows or Linux compute nodes in Batch pools. When a compute node joins the pool, the virtual file system mounts and acts as a local drive on that node. This article shows you how to mount a virtual file system on a pool of compute nodes by using the Batch Management Library for .NET.
Mounting the file system to the pool makes accessing data easier and more efficient than requiring tasks to get their own data from a large shared data set. Consider a scenario where multiple tasks need access to a common set of data, like rendering a movie. Each task renders one or more frames at once from the scene files. By mounting a drive that contains the scene files, it's easier for each compute node to access the shared data.
Also, you can choose the underlying file system to meet performance, throughput, and input/output operations per second (IOPS) requirements. You can independently scale the file system based on the number of compute nodes that concurrently access the data.
Supported configurations
You can mount the following types of file systems:
- Azure Files
- Azure Blob storage
- Network File System (NFS)
- Common Internet File System (CIFS)
Batch supports the following virtual file system types for node agents that are produced for their respective publisher and offer.
OS Type | Azure Files share | Azure Blob container | NFS mount | CIFS mount |
---|---|---|---|---|
Linux | ✔️ | ✔️ | ✔️ | ✔️ |
Windows | ✔️ | ❌ | ❌ | ❌ |
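The support table can be encoded as a small lookup so that a deployment script rejects an unsupported combination before it creates a pool. The following is a minimal sketch in Python. The names SUPPORTED_MOUNTS and is_mount_supported are illustrative and aren't part of any Batch SDK, and the values only mirror the table above.

```python
# Illustrative lookup that mirrors the support table in this section.
# SUPPORTED_MOUNTS and is_mount_supported are hypothetical names, not Batch SDK APIs.
SUPPORTED_MOUNTS = {
    "linux": {"azure_files", "azure_blob", "nfs", "cifs"},
    "windows": {"azure_files"},
}


def is_mount_supported(os_type: str, mount_type: str) -> bool:
    """Return True if the table lists mount_type as supported on os_type."""
    return mount_type.lower() in SUPPORTED_MOUNTS.get(os_type.lower(), set())
```

For example, is_mount_supported("Windows", "nfs") returns False, which matches the Windows row of the table.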
Note
Mounting a virtual file system isn't supported on Batch pools created before August 8, 2019.
Networking requirements
When you use virtual file mounts with Batch pools in a virtual network, keep the following requirements in mind, and ensure that no required traffic is blocked. For more information, see Batch pools in a virtual network.
- Azure Files shares require TCP port 445 to be open for traffic to and from the storage service tag. For more information, see Use an Azure file share with Windows.
- Azure Blob containers require TCP port 443 to be open for traffic to and from the storage service tag. Virtual machines (VMs) must have access to https://packages.microsoft.com to download the blobfuse and gpg packages. Depending on your configuration, you might need access to other URLs.
- Network File System (NFS) requires access to port 2049 by default. Your configuration might have other requirements. VMs must have access to the appropriate package manager to download the nfs-common package (for Debian or Ubuntu). The URL might vary based on your OS version. Depending on your configuration, you might also need access to other URLs. Mounting Azure Blob or Azure Files through NFS might have more networking requirements. For example, your compute nodes might need to use the same virtual network subnet as the storage account.
- Common Internet File System (CIFS) requires access to TCP port 445. VMs must have access to the appropriate package manager to download the cifs-utils package. The URL might vary based on your OS version.
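One way to confirm that the required ports aren't blocked is to attempt a TCP connection from a compute node, for example in a start task. The following Python sketch uses only the standard library. The helper name can_connect and the REQUIRED_PORTS mapping are illustrative assumptions; the port numbers are the defaults listed above, and your configuration might use others.

```python
import socket

# Default ports named in this section; adjust if your configuration differs.
REQUIRED_PORTS = {"azure_files": 445, "azure_blob": 443, "nfs": 2049, "cifs": 445}


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A task could call can_connect("<storage-account-name>.file.core.chinacloudapi.cn", REQUIRED_PORTS["azure_files"]) and log the result. A successful connection shows only that the port is reachable, not that the mount credentials are valid.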
Mounting configuration and implementation
Mounting a virtual file system on a pool makes the file system available to every compute node in the pool. Configuration for the file system happens when a compute node joins a pool, restarts, or is reimaged.
To mount a file system on a pool, you create a MountConfiguration object that matches your virtual file system: AzureBlobFileSystemConfiguration, AzureFileShareConfiguration, NfsMountConfiguration, or CifsMountConfiguration.
All mount configuration objects need the following base parameters. Some mount configurations have specific parameters for the particular file system, which the code examples present in more detail.
- Account name or source of the storage account.
- Relative mount path or source, the location of the file system to mount on the compute node, relative to the standard fsmounts directory accessible via AZ_BATCH_NODE_MOUNTS_DIR. The exact fsmounts directory location varies depending on node OS. For example, the location on an Ubuntu node maps to /mnt/batch/tasks/fsmounts.
- Mount options or BlobFuse options that describe specific parameters for mounting a file system.
When you create the pool and the MountConfiguration object, you assign the object to the MountConfigurationList property. Mounting for the file system happens when a node joins the pool, restarts, or is reimaged.
The Batch agent implements mounting differently on Windows and Linux.
- On Linux, Batch installs the package cifs-utils. Then, Batch issues the mount command.
- On Windows, Batch uses cmdkey to add your Batch account credentials. Then, Batch issues the mount command through net use. For example:
net use S: \\<storage-account-name>.file.core.chinacloudapi.cn\<fileshare> /u:AZURE\<storage-account-name> <storage-account-key>
Mounting the file system creates an environment variable AZ_BATCH_NODE_MOUNTS_DIR, which points to the location of the mounted file system and log files. You can use the log files for troubleshooting and debugging.
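A task running on a Linux node can build the path to a mounted file system by joining AZ_BATCH_NODE_MOUNTS_DIR with the relative mount path from the pool's mount configuration. The following Python sketch assumes a Linux node. The helper name mounted_path and the example mount name data are hypothetical.

```python
import os
from pathlib import PurePosixPath


def mounted_path(relative_mount_path, env=os.environ):
    """Join AZ_BATCH_NODE_MOUNTS_DIR with a relative mount path (Linux nodes)."""
    root = env.get("AZ_BATCH_NODE_MOUNTS_DIR")
    if not root:
        # The variable is only expected inside a Batch task on a pool with mounts.
        raise RuntimeError("AZ_BATCH_NODE_MOUNTS_DIR is not set")
    return PurePosixPath(root) / relative_mount_path
```

With AZ_BATCH_NODE_MOUNTS_DIR set to /mnt/batch/tasks/fsmounts, calling mounted_path("data") yields /mnt/batch/tasks/fsmounts/data. On Windows, an Azure Files share is instead reached through its drive letter.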
Mount an Azure Files share with PowerShell
You can use Azure PowerShell to mount an Azure Files share on a Windows or Linux Batch pool. The following procedure walks you through configuring and mounting an Azure file share file system on a Batch pool.
Important
The maximum number of mounted file systems on a pool is 10. For details and other limits, see Batch service quotas and limits.
Prerequisites
- An Azure account with an active subscription.
- Azure PowerShell installed.
- An existing Batch account with a linked Azure Storage account that has a file share.
Sign in to your Azure subscription, replacing the placeholder with your subscription ID.
Connect-AzAccount -Environment AzureChinaCloud -Subscription "<subscription-ID>"
Get the context for your Batch account. Replace the <batch-account-name> placeholder with your Batch account name.
$context = Get-AzBatchAccount -AccountName <batch-account-name>
Create a Batch pool with the following settings. Replace the <storage-account-name>, <storage-account-key>, and <file-share-name> placeholders with the values from the storage account that's linked to your Batch account. Replace the <pool-name> placeholder with the name you want for the pool.
The following script creates a pool with one Windows Server 2016 Datacenter node of size Standard_D2_V2, and then mounts the Azure file share to the S drive of the node.
$fileShareConfig = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSAzureFileShareConfiguration" -ArgumentList @("<storage-account-name>", "https://<storage-account-name>.file.core.chinacloudapi.cn/<file-share-name>", "S", "<storage-account-key>")
$mountConfig = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSMountConfiguration" -ArgumentList @($fileShareConfig)
$imageReference = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSImageReference" -ArgumentList @("WindowsServer", "MicrosoftWindowsServer", "2016-Datacenter", "latest")
$configuration = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSVirtualMachineConfiguration" -ArgumentList @($imageReference, "batch.node.windows amd64")
New-AzBatchPool -Id "<pool-name>" -VirtualMachineSize "STANDARD_D2_V2" -VirtualMachineConfiguration $configuration -TargetDedicatedComputeNodes 1 -MountConfiguration @($mountConfig) -BatchContext $context
Connect to the node and check that the output file is correct.
Access the mounted files
Azure Batch tasks can access the mounted files by using the drive's direct path, for example:
cmd /c "more S:\folder1\out.txt & timeout /t 90 > NUL"
The Azure Batch agent grants access only for Azure Batch tasks. If you use Remote Desktop Protocol (RDP) to connect to the node, your user account doesn't have automatic access to the mounting drive. When you connect to the node over RDP, you must add credentials for the storage account to access the S drive directly.
Use cmdkey to add the credentials. Replace the <storage-account-name> and <storage-account-key> placeholders with your own information.
cmdkey /add:"<storage-account-name>.file.core.chinacloudapi.cn" /user:"Azure\<storage-account-name>" /pass:"<storage-account-key>"
Troubleshoot mount issues
If a mount configuration fails, the compute node fails and the node state is set to Unusable. To diagnose a mount configuration failure, inspect the ComputeNodeError property for details on the error.
To get log files for debugging, you can use the OutputFiles API to upload the *.log files. The *.log files contain information about the file system mount at the AZ_BATCH_NODE_MOUNTS_DIR location. Mount log files have the format: <type>-<mountDirOrDrive>.log for each mount. For example, a CIFS mount at a mount directory named test has a mount log file named: cifs-test.log.
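The log naming rule is easy to apply in a script that collects logs. The following Python sketch builds the expected file name from the mount type and the mount directory or drive. The helper name mount_log_name is hypothetical, and the only type prefixes used here are the two that appear in this article (cifs and fshare).

```python
def mount_log_name(mount_type: str, mount_dir_or_drive: str) -> str:
    """Build a mount log file name in the <type>-<mountDirOrDrive>.log format."""
    return f"{mount_type}-{mount_dir_or_drive}.log"


# Examples taken from this article:
# a CIFS mount at a directory named test, and an Azure file share on drive S.
cifs_log = mount_log_name("cifs", "test")
fshare_log = mount_log_name("fshare", "S")
```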
Investigate mounting errors
You can RDP or SSH to the node to check the log files pertaining to filesystem mounts. The following example error message is possible when you try to mount an Azure file share to a Batch node:
Mount Configuration Error | An error was encountered while configuring specified mount(s)
Message: System error (out of memory, cannot fork, no more loop devices)
MountConfigurationPath: S
If you receive this error, RDP or SSH to the node to check the related log files. The Batch agent implements mounting differently on Windows and Linux for Azure file shares. On Linux, Batch installs the package cifs-utils. Then, Batch issues the mount command. On Windows, Batch uses cmdkey to add your Batch account credentials. Then, Batch issues the mount command through net use. For example:
net use S: \\<storage-account-name>.file.core.chinacloudapi.cn\<fileshare> /u:AZURE\<storage-account-name> <storage-account-key>
Connect to the node over RDP.
Open the log file fshare-S.log, at D:\batch\tasks\fsmounts.
Review the error messages, for example:
CMDKEY: Credential added successfully. System error 86 has occurred. The specified network password is not correct.
Troubleshoot the problem by using the Azure file shares troubleshooter.
If you can't use RDP or SSH to check the log files on the node, you can upload the logs to your Azure storage account. You can use this method for both Windows and Linux logs.
In the Azure portal, search for and select the Batch account that has your pool.
On the Batch account page, select Pools from the left navigation.
On the Pools page, select the pool's name.
On the pool's page, select Nodes from the left navigation.
On the Nodes page, select the node's name.
On the node's page, select Upload batch logs.
On the Upload batch logs pane, select Pick storage container.
On the Storage accounts page, select a storage account.
On the Containers page, select or create a container to upload the files to, and select Select.
Select Start upload.
When the upload completes, download the files and open agent-debug.log.
Review the error messages, for example:
..20210322T113107.448Z.00000000-0000-0000-0000-000000000000.ERROR.agent.mount.filesystems.basefilesystem.basefilesystem.py.run_cmd_persist_output_async.59.2912.MainThread.3580.Mount command failed with exit code: 2, output: CMDKEY: Credential added successfully. System error 86 has occurred. The specified network password is not correct.
Troubleshoot the problem by using the Azure file shares troubleshooter.
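After you download agent-debug.log, a short script can pull out the mount failures instead of reading the whole file. The following Python sketch assumes that each failure is logged on a single line and follows the "Mount command failed with exit code" wording shown in the example entry above. Real logs might differ, and the helper name find_mount_failures is hypothetical.

```python
import re

# Pattern based on the sample log entry in this article; treat it as an assumption.
MOUNT_FAILURE = re.compile(r"Mount command failed with exit code: (\d+), output: (.*)")


def find_mount_failures(log_text: str):
    """Return (exit_code, output) tuples for each mount failure line in the log."""
    failures = []
    for line in log_text.splitlines():
        match = MOUNT_FAILURE.search(line)
        if match:
            failures.append((int(match.group(1)), match.group(2).strip()))
    return failures
```

To use it, read the downloaded file as text and pass the contents to find_mount_failures. Each tuple holds the exit code of the mount command and the captured command output.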
Manually mount a file share with PowerShell
If you can't diagnose or fix mounting errors, you can use PowerShell to mount the file share manually instead.
Create a pool without a mounting configuration. For example:
$imageReference = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSImageReference" -ArgumentList @("WindowsServer", "MicrosoftWindowsServer", "2016-Datacenter", "latest")
$configuration = New-Object -TypeName "Microsoft.Azure.Commands.Batch.Models.PSVirtualMachineConfiguration" -ArgumentList @($imageReference, "batch.node.windows amd64")
New-AzBatchPool -Id "<pool-name>" -VirtualMachineSize "STANDARD_D2_V2" -VirtualMachineConfiguration $configuration -TargetDedicatedComputeNodes 1 -BatchContext $context
Wait for the node to be in the Idle state.
In the Azure portal, search for and select the storage account that has your file share.
On the storage account page's menu, select File shares from the left navigation.
On the File shares page, select the file share you want to mount.
On the file share's page, select Connect.
In the Connect pane, select the Windows tab.
For Drive letter, enter the drive you want to use. The default is Z.
For Authentication method, select how you want to connect to the file share.
Select Show Script, and copy the PowerShell script for mounting the file share.
Connect to the node over RDP.
Run the command you copied to mount the file share.
Note any error messages in the output. Use this information to troubleshoot any networking-related issues.
Example mount configurations
The following code example configurations demonstrate mounting various file share systems to a pool of compute nodes.
Azure Files share
Azure Files is the standard Azure cloud file system offering. The following configuration mounts an Azure Files share named <file-share-name> to the S drive. For information about the parameters in the example, see Mount SMB Azure file share on Windows or Create an NFS Azure file share and mount it on a Linux VM using the Azure portal.
new PoolAddParameter
{
    Id = poolId,
    MountConfiguration = new[]
    {
        new MountConfiguration
        {
            AzureFileShareConfiguration = new AzureFileShareConfiguration
            {
                AccountName = "<storage-account-name>",
                AzureFileUrl = "https://<storage-account-name>.file.core.chinacloudapi.cn/<file-share-name>",
                AccountKey = "<storage-account-key>",
                RelativeMountPath = "S",
                MountOptions = "-o vers=3.0,dir_mode=0777,file_mode=0777,sec=ntlmssp"
            },
        }
    }
}
Azure Blob container
Another option is to use Azure Blob storage via BlobFuse. Mounting a blob file system requires either an account key, shared access signature (SAS) key, or managed identity with access to your storage account.
For information on getting these keys or identity, see the following articles:
- Grant limited access to Azure Storage resources using shared access signatures (SAS)
- Configure managed identities in Batch pools
Tip
If you use a managed identity, ensure that the identity has been assigned to the pool so that it's available on the VM doing the mounting. The identity must also have the Storage Blob Data Contributor role.
The following configuration mounts a blob file system with BlobFuse options. For illustration purposes, the example shows AccountKey, SasKey, and IdentityReference, but you can actually specify only one of these methods.
new PoolAddParameter
{
    Id = poolId,
    MountConfiguration = new[]
    {
        new MountConfiguration
        {
            AzureBlobFileSystemConfiguration = new AzureBlobFileSystemConfiguration
            {
                AccountName = "<storage-account-name>",
                ContainerName = "<container-name>",
                // Use only one of the following three lines:
                AccountKey = "<storage-account-key>",
                SasKey = "<sas-key>",
                IdentityReference = new ComputeNodeIdentityReference("/subscriptions/<subscription>/resourceGroups/<resource-group>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<identity-name>"),
                RelativeMountPath = "<relative-mount-path>",
                BlobfuseOptions = "-o attr_timeout=240 -o entry_timeout=240 -o negative_timeout=120 "
            },
        }
    }
}
To get default access to the BlobFuse mounted directory, run the task as an administrator. BlobFuse mounts the directory at the user space, and at pool creation mounts the directory as root. In Linux, all administrator tasks are root. The FUSE reference page describes all options for the FUSE module.
NFS
You can mount NFS shares to pool nodes to allow Batch to access traditional file systems. The setup can be a single NFS server deployed in the cloud or an on-premises NFS server accessed over a virtual network. NFS mounts also work with distributed in-memory caches for data-intensive high-performance computing (HPC) tasks, and with other standard NFS-compliant interfaces, such as NFS for Azure Blob and NFS for Azure Files.
The following example shows a configuration for an NFS file system mount:
new PoolAddParameter
{
    Id = poolId,
    MountConfiguration = new[]
    {
        new MountConfiguration
        {
            NfsMountConfiguration = new NFSMountConfiguration
            {
                Source = "<source>",
                RelativeMountPath = "<relative-mount-path>",
                MountOptions = "options ver=3.0"
            },
        }
    }
}
CIFS
Mounting CIFS to pool nodes is another way to provide access to traditional file systems. CIFS is a file-sharing protocol that provides an open and cross-platform mechanism for requesting network server files and services. CIFS is based on the enhanced version of the SMB protocol for internet and intranet file sharing.
The following example shows a configuration for a CIFS file mount.
new PoolAddParameter
{
    Id = poolId,
    MountConfiguration = new[]
    {
        new MountConfiguration
        {
            CifsMountConfiguration = new CIFSMountConfiguration
            {
                Username = "<storage-account-name>",
                RelativeMountPath = "<relative-mount-path>",
                Source = "<source>",
                Password = "<storage-account-key>",
                MountOptions = "-o vers=3.0,dir_mode=0777,file_mode=0777,serverino,domain=<domain-name>"
            },
        }
    }
}
Note
Looking for an example using PowerShell rather than C#? You can find another great example here: Mount Azure File to Azure Batch Pool.