Python package extensibility for prebuilt Docker images (preview)
APPLIES TO: Python SDK azureml v1
The prebuilt Docker images for model inference contain packages for popular machine learning frameworks. There are two methods that can be used to add Python packages without rebuilding the Docker image:
- Dynamic installation: This approach uses a requirements file to automatically restore Python packages when the Docker container boots.
  Consider this method for rapid prototyping. When the image starts, packages are restored using the requirements.txt file. This method increases the startup time of the image, and you must wait longer before the deployment can handle requests.
- Pre-installed Python packages: You provide a directory containing preinstalled Python packages. During deployment, this directory is mounted into the container for your entry script (score.py) to use.
  Use this approach for production deployments. Because the directory containing the packages is mounted to the image, it can be used even when your deployments don't have public internet access, for example, when deployed into a secured Azure Virtual Network.
Important
Using Python package extensibility for prebuilt Docker images with Azure Machine Learning is currently in preview. Preview functionality is provided "as-is", with no guarantee of support or service level agreement. For more information, see the Supplemental terms of use for Azure previews.
Prerequisites
- An Azure Machine Learning workspace. For a tutorial on creating a workspace, see Get started with Azure Machine Learning.
- Familiarity with using Azure Machine Learning environments.
- Familiarity with Where and how to deploy models with Azure Machine Learning.
Dynamic installation
This approach uses a requirements file to automatically restore Python packages when the image starts up.
To extend your prebuilt Docker container image through a requirements.txt file, follow these steps:
- Create a requirements.txt file alongside your score.py script.
- Add all of your required packages to the requirements.txt file.
- Set the AZUREML_EXTRA_REQUIREMENTS_TXT environment variable in your Azure Machine Learning environment to the location of the requirements.txt file.
Once deployed, the packages will automatically be restored for your score script.
Tip
Even while prototyping, we recommend that you pin each package version in requirements.txt. For example, use scipy == 1.2.3 instead of just scipy or even scipy > 1.2.3. If you don't pin an exact version and scipy releases a new version, this can break your scoring script and cause failures during deployment and scaling.
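As an illustration of the pinning advice, the following minimal sketch flags requirement lines that don't pin an exact version. The helper name find_unpinned and its simplified pattern are assumptions made for this example, not part of the Azure Machine Learning SDK. The pattern deliberately ignores environment markers, URLs, and pip options.

```python
import re

# Matches "name==version" with optional extras and whitespace, e.g. "scipy == 1.2.3".
# Simplified on purpose: no environment markers, URLs, or pip options.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s=*]+$")


def find_unpinned(requirements_text):
    """Return the requirement lines that do not pin an exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if line and not PINNED.match(line):
            unpinned.append(line)
    return unpinned


sample = """
scipy == 1.2.3
numpy
pandas>1.0
requests==2.25.1
"""
print(find_unpinned(sample))
```

For the sample above, the helper reports numpy and pandas>1.0, the two entries without an exact pin.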
The following example demonstrates setting the AZUREML_EXTRA_REQUIREMENTS_TXT environment variable:
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

myenv = Environment(name="my_azureml_env")
myenv.docker.enabled = True
# Replace the placeholder with the MCR path of the prebuilt inference image.
myenv.docker.base_image = "<MCR-path>"
myenv.python.user_managed_dependencies = True
# Path to the requirements file, relative to the directory of the score script.
myenv.environment_variables = {
    "AZUREML_EXTRA_REQUIREMENTS_TXT": "requirements.txt"
}
The following diagram is a visual representation of the dynamic installation process:
Pre-installed Python packages
This approach mounts a directory that you provide into the image. The Python packages from this directory can then be used by the entry script (score.py).
To extend your prebuilt Docker container image through pre-installed Python packages, follow these steps:
Important
You must use packages compatible with Python 3.7. All current images are pinned to Python 3.7.
- Create a virtual environment using virtualenv.
- Install your dependencies. If you have a list of dependencies in a requirements.txt file, for example, you can install them with pip install -r requirements.txt, or you can pip install individual dependencies.
- When you specify the AZUREML_EXTRA_PYTHON_LIB_PATH environment variable, make sure that you point to the correct site-packages directory, which varies depending on your environment name and Python version. The following code demonstrates setting the path for a virtual environment named myenv and Python 3.7:

from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

myenv = Environment(name='my_azureml_env')
myenv.docker.enabled = True
# Replace the placeholder with the MCR path of the prebuilt inference image.
myenv.docker.base_image = "<MCR-path>"
myenv.python.user_managed_dependencies = True
myenv.environment_variables = {
    "AZUREML_EXTRA_PYTHON_LIB_PATH": "myenv/lib/python3.7/site-packages"
}
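Because the site-packages path depends on both the environment name and the Python version, it can help to build the string rather than type it by hand. The sketch below assumes the Linux/macOS virtualenv layout (lib/pythonX.Y/site-packages). The helper name site_packages_path is hypothetical and not part of the SDK.

```python
from pathlib import PurePosixPath


def site_packages_path(env_name, python_version="3.7"):
    """Build the relative site-packages path for a virtualenv (Linux/macOS layout)."""
    return str(
        PurePosixPath(env_name) / "lib" / f"python{python_version}" / "site-packages"
    )


# Value to use for AZUREML_EXTRA_PYTHON_LIB_PATH with an environment named myenv.
print(site_packages_path("myenv"))  # myenv/lib/python3.7/site-packages
```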
The following diagram is a visual representation of the pre-installed packages process:
Common problems
The mounting solution only works when your myenv site-packages directory contains all of your dependencies. If your local environment uses dependencies installed in a different location, they won't be available in the image.
Here are some things that may cause this problem:
- virtualenv creates an isolated environment by default. Once you activate the virtual environment, global dependencies cannot be used.
- If you have a PYTHONPATH environment variable pointing to your global dependencies, it may interfere with your virtual environment. Run pip list and pip freeze after activating your environment to make sure no unwanted dependencies are in your environment.
- Conda and virtualenv environments can interfere with each other. Make sure not to use a Conda environment and virtualenv at the same time.
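One way to catch dependencies that resolve from the wrong location is to ask Python where each module would be imported from. The following is a minimal sketch, meant to be run with the virtual environment activated. The helper name modules_outside is an assumption made for this example. The demo uses the standard library module json only so that it runs without third-party packages. In practice you would pass the packages your score script imports.

```python
import importlib.util
import os


def modules_outside(module_names, site_packages_dir):
    """Return {module: origin} for modules that do not resolve inside site_packages_dir.

    A value of None means the module could not be found at all.
    """
    root = os.path.realpath(site_packages_dir)
    outside = {}
    for name in module_names:
        spec = importlib.util.find_spec(name)
        if spec is None:
            outside[name] = None
            continue
        # Built-in and frozen modules have no file location, so they count as outside.
        inside = spec.has_location and os.path.realpath(spec.origin).startswith(
            root + os.sep
        )
        if not inside:
            outside[name] = spec.origin
    return outside


# json ships with Python itself, so it is reported as outside the virtualenv directory.
print(modules_outside(["json"], "myenv/lib/python3.7/site-packages"))
```

Any package reported here would be missing from the mounted directory, even though it imports successfully on the local machine.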
Limitations
Model.package()
The Model.package() method lets you create a model package in the form of a Docker image or Dockerfile build context. Using Model.package() with prebuilt inference Docker images triggers an intermediate image build that changes the non-root user to root user.
We encourage you to use our Python package extensibility solutions. If other dependencies are required (such as apt packages), create your own Dockerfile extending from the inference image.
Frequently asked questions
In the requirements.txt extensibility approach, is it mandatory for the file name to be requirements.txt?
No. Set AZUREML_EXTRA_REQUIREMENTS_TXT to the name of your pip requirements file:
myenv.environment_variables = {
    "AZUREML_EXTRA_REQUIREMENTS_TXT": "name of your pip requirements file goes here"
}
Can you summarize the requirements.txt approach versus the mounting approach?
Start prototyping with the requirements.txt approach. After some iteration, when you're confident about which packages (and versions) you need for a successful model deployment, switch to the mounting solution.
Here's a detailed comparison.
| Compared item | Requirements.txt (dynamic installation) | Package Mount |
| --- | --- | --- |
| Solution | Create a requirements.txt that installs the specified packages when the container starts. | Create a local Python environment with all of the dependencies. Mount this directory into the container at runtime. |
| Package installation | No extra installation (assuming pip is already installed). | Virtual environment or conda environment installation. |
| Virtual environment setup | No extra setup of a virtual environment required, as users can pull the current local user environment with pip freeze as needed to create the requirements.txt. | Need to set up a clean virtual environment, which may take extra steps depending on the current user local environment. |
| Debugging | Easy to set up and debug the server, since dependencies are clearly listed. | An unclean virtual environment could cause problems when debugging the server. For example, it may not be clear if errors come from the environment or user code. |
| Consistency during scaling out | Not consistent, as it depends on external PyPI packages and on users pinning their dependencies. These external downloads could be flaky. | Relies solely on the user environment, so no consistency issues. |

Why are my requirements.txt and mounted dependencies directory not found in the container?
Locally, verify the environment variables are set properly. Next, verify the paths that are specified are spelled properly and exist. Check if you have set your source directory correctly in the inference config constructor.
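The local checks described in this answer can be scripted. The sketch below resolves each extensibility variable against the source directory and reports the ones that don't exist on disk. It assumes the score script sits at the root of the source directory. The helper name find_missing_paths is hypothetical and not part of the SDK.

```python
import os

# The two environment variables used by the extensibility solutions.
EXTENSIBILITY_VARS = (
    "AZUREML_EXTRA_REQUIREMENTS_TXT",
    "AZUREML_EXTRA_PYTHON_LIB_PATH",
)


def find_missing_paths(source_directory, environment_variables):
    """Return {variable: resolved_path} for configured paths that do not exist locally."""
    missing = {}
    for var in EXTENSIBILITY_VARS:
        value = environment_variables.get(var)
        if value is None:
            continue  # variable not set, nothing to verify
        resolved = os.path.join(source_directory, value)
        if not os.path.exists(resolved):
            missing[var] = resolved
    return missing


# Example: check the values you plan to assign to myenv.environment_variables.
print(find_missing_paths(".", {"AZUREML_EXTRA_REQUIREMENTS_TXT": "requirements.txt"}))
```

An empty dictionary means every configured path was found relative to the source directory.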
Can I override Python package dependencies in the prebuilt inference Docker image?
Yes. If you want to use a different version of a Python package that is already installed in an inference image, our extensibility solution will respect your version. Make sure there are no conflicts between the two versions.
Best Practices
Refer to the Load registered model docs. When you register a model directory, don't include your scoring script, your mounted dependencies directory, or requirements.txt within that directory.
For more information on how to load a registered or local model, see Where and how to deploy.
Bug Fixes
2021-07-26
AZUREML_EXTRA_REQUIREMENTS_TXT and AZUREML_EXTRA_PYTHON_LIB_PATH are now always relative to the directory of the score script. For example, if both requirements.txt and the score script are in my_folder, then AZUREML_EXTRA_REQUIREMENTS_TXT must be set to requirements.txt. AZUREML_EXTRA_REQUIREMENTS_TXT should no longer be set to my_folder/requirements.txt.
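To avoid computing the relative value by hand, you can derive it from the two file locations. This is a minimal sketch using only the standard library. The helper name relative_to_score_script is an assumption made for this example.

```python
import os


def relative_to_score_script(score_script, target):
    """Express target relative to the directory that contains the score script."""
    score_dir = os.path.dirname(os.path.abspath(score_script))
    return os.path.relpath(os.path.abspath(target), score_dir)


# Both files live in my_folder, so the variable should be set to the bare file name.
print(relative_to_score_script("my_folder/score.py", "my_folder/requirements.txt"))
# prints: requirements.txt
```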
Next steps
To learn more about deploying a model, see How to deploy a model.
To learn how to troubleshoot prebuilt Docker image deployments, see How to troubleshoot prebuilt Docker image deployments.