Azure Functions Flex Consumption plan hosting
Flex Consumption is a Linux-based Azure Functions hosting plan that builds on the Consumption plan's pay-for-what-you-use serverless billing model. It gives you more flexibility and customizability by introducing private networking, instance memory size selection, and fast, large scale-out features, all still based on a serverless model.
Important
The Flex Consumption plan is currently in preview. For a list of current limitations when using this hosting plan, see Considerations. For current information about billing during the preview, see Billing.
You can review end-to-end samples that feature the Flex Consumption plan in the Flex Consumption plan samples repository.
Benefits
The Flex Consumption plan builds on the strengths of the Consumption plan, which include dynamic scaling and execution-based billing. With Flex Consumption, you also get these extra features:
- Always-ready instances
- Virtual network integration
- Fast scaling based on concurrency for both HTTP and non-HTTP apps
- Multiple choices for instance memory sizes
This table helps you directly compare the features of Flex Consumption with the Consumption hosting plan:
Feature | Consumption | Flex Consumption |
---|---|---|
Scale to zero | ✅ Yes | ✅ Yes |
Scale behavior | Event driven | Event driven (fast) |
Virtual networks | ❌ Not supported | ✅ Supported |
Dedicated compute (mitigate cold starts) | ❌ None | ✅ Always ready instances (optional) |
Billing | Execution-time only | Execution-time + always-ready instances |
Scale-out instances (max) | 200 | 1000 |
For a complete comparison of the Flex Consumption plan against the Consumption plan and all other plan and hosting types, see function scale and hosting options.
Virtual network integration
Flex Consumption expands on the traditional benefits of the Consumption plan by adding support for virtual network integration. When your apps run in a Flex Consumption plan, they can connect to other Azure services secured inside a virtual network, while you still take advantage of serverless billing and scale, together with the scale and throughput benefits of the Flex Consumption plan. For more information, see Enable virtual network integration.
Instance memory
When you create your function app in a Flex Consumption plan, you can select the memory size of the instances on which your app runs. See Billing to learn how instance memory sizes affect the costs of your function app.
Currently, Flex Consumption offers instance memory size options of both 2,048 MB and 4,096 MB.
When deciding on which instance memory size to use with your apps, here are some things to consider:
- The 2,048-MB instance memory size is the default and should be used for most scenarios. Use the 4,096-MB instance memory size for scenarios where your app requires more concurrency or higher processing power. For more information, see Configure instance memory.
- You can change the instance memory size at any time. For more information, see Configure instance memory.
- Instance resources are shared between your function code and the Functions host.
- The larger the instance memory size, the more each instance can handle in terms of concurrent executions or CPU-intensive and memory-intensive workloads. Specific scale decisions are workload-specific.
- The default concurrency of HTTP triggers depends on the instance memory size. For more information, see HTTP trigger concurrency.
- Available CPUs and network bandwidth are provided proportional to a specific instance size.
Per-function scaling
Concurrency is a key factor that determines how Flex Consumption function apps scale. To improve the scale performance of apps with various trigger types, the Flex Consumption plan provides a more deterministic way of scaling your app on a per-function basis.
This per-function scaling behavior is a part of the hosting platform, so you don't need to configure your app or change the code. For more information, see Per-function scaling in the Event-driven scaling article.
In per-function scaling, decisions are made for certain function triggers based on group aggregations. This table shows the defined set of function scale groups:
Scale groups | Triggers in group | Settings value |
---|---|---|
HTTP triggers | HTTP trigger, SignalR trigger | http |
Blob storage triggers (Event Grid-based) | Blob storage trigger | blob |
Durable Functions | Orchestration trigger, Activity trigger, Entity trigger | durable |
All other functions in the app are scaled individually in their own set of instances, which are referenced using the convention function:<NAMED_FUNCTION>.
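As a minimal sketch of the grouping described in this section, the following Python helper maps a trigger to the settings value of its scale group, and falls back to the function:<NAMED_FUNCTION> convention for everything else. The trigger labels are copied from the table. The helper name `scale_key` and the example function names are hypothetical and aren't part of any Azure SDK.

```python
# Scale groups and their triggers, copied from the table in this section.
SCALE_GROUPS = {
    "http": {"HTTP trigger", "SignalR trigger"},
    "blob": {"Blob storage trigger"},  # Event Grid-based
    "durable": {"Orchestration trigger", "Activity trigger", "Entity trigger"},
}


def scale_key(trigger: str, function_name: str) -> str:
    """Return the settings value under which a function is scaled.

    Hypothetical helper for illustration only.
    """
    for group, triggers in SCALE_GROUPS.items():
        if trigger in triggers:
            return group
    # Functions outside the defined groups scale individually.
    return f"function:{function_name}"


if __name__ == "__main__":
    print(scale_key("HTTP trigger", "GetOrder"))
    print(scale_key("Activity trigger", "ChargeCard"))
    print(scale_key("Queue trigger", "ProcessQueue"))
```

In this sketch, two HTTP-triggered functions in the same app resolve to the same key, which mirrors how they share one set of instances, while a queue-triggered function gets a key of its own.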
Always ready instances
Flex Consumption includes an always ready feature that lets you choose instances that are always running and assigned to each of your per-function scale groups or functions. This is a great option for scenarios where you need to have a minimum number of instances always ready to handle requests, for example, to reduce your application's cold start latency. The default is 0 (zero).
For example, if you set always ready to 2 for your HTTP group of functions, the platform keeps two instances always running and assigned to your app for your HTTP functions in the app. Those instances are processing your function executions, but depending on concurrency settings, the platform scales beyond those two instances with on-demand instances.
To learn how to configure always ready instances, see Set always ready instance counts.
Concurrency
Concurrency refers to the number of parallel executions of a function on an instance of your app. You can set a maximum number of concurrent executions that each instance should handle at any given time. For more information, see HTTP trigger concurrency.
Concurrency has a direct effect on how your app scales because at lower concurrency levels, you need more instances to handle the event-driven demand for a function. While you can control and fine tune the concurrency, we provide defaults that work for most cases. To learn how to set concurrency limits for HTTP trigger functions, see Set HTTP concurrency limits.
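To make the relationship between concurrency and instance count concrete, here is a rough back-of-the-envelope model in Python. It assumes that demand is spread evenly across instances and that always ready instances are used before on-demand ones. This is an illustration only and not the platform's actual scaling algorithm. The function name and parameters are hypothetical.

```python
import math


def estimate_instances(concurrent_executions: int,
                       per_instance_concurrency: int,
                       always_ready: int = 0) -> tuple[int, int]:
    """Estimate (total_instances, on_demand_instances) for one scale group.

    Simplified model: each instance handles up to
    per_instance_concurrency executions in parallel.
    """
    if per_instance_concurrency < 1:
        raise ValueError("per_instance_concurrency must be at least 1")
    if concurrent_executions < 0 or always_ready < 0:
        raise ValueError("counts can't be negative")

    needed = math.ceil(concurrent_executions / per_instance_concurrency)
    # Always ready instances keep running even when demand is lower.
    total = max(needed, always_ready)
    on_demand = max(0, needed - always_ready)
    return total, on_demand


if __name__ == "__main__":
    # Same demand, two different concurrency settings, two always ready instances.
    print(estimate_instances(100, 16, always_ready=2))
    print(estimate_instances(100, 4, always_ready=2))
```

Under these assumptions, lowering the per-instance concurrency from 16 to 4 for 100 parallel executions raises the estimate from 7 instances to 25.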
Deployment
Deployments in the Flex Consumption plan follow a single path. After your project code is built and zipped into an application package, it's deployed to a blob storage container. On startup, your app gets the package and runs your function code from this package. By default, the same storage account used to store internal host metadata (AzureWebJobsStorage) is also used as the deployment container. However, you can use an alternative storage account or choose your preferred authentication method by configuring your app's deployment settings. Because the deployment path is streamlined, app settings are no longer needed to influence deployment behavior.
Billing
There are two modes by which your costs are determined when running your apps in the Flex Consumption plan. Each mode is determined on a per-instance basis.
Billing mode | Description |
---|---|
On Demand | When running in on demand mode, you are billed only for the amount of time your function code is executing on your available instances. In on demand mode, no minimum instance count is required. You're billed for: • The total amount of memory provisioned while each on demand instance is actively executing functions (in GB-seconds), minus a free grant of GB-s per month. • The total number of executions, minus a free grant (number) of executions per month. |
Always ready | You can configure one or more instances, assigned to specific trigger types (HTTP/Durable/Blob) and individual functions, that are always available to handle requests. When you have any always ready instances enabled, you're billed for: • The total amount of memory provisioned across all of your always ready instances, known as the baseline (in GB-seconds). • The total amount of memory provisioned during the time each always ready instance is actively executing functions (in GB-seconds). • The total number of executions. In always ready billing, there are no free grants. |
The minimum billable execution period for both billing modes is 1,000 ms. Past that, the billable activity period is rounded up to the nearest 100 ms. You can find details on the Flex Consumption plan billing meters in the Monitoring reference.
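The rounding rule can be sketched in a few lines of Python. This example assumes that an active period shorter than 1,000 ms is billed as 1,000 ms, that longer periods are rounded up to the next multiple of 100 ms, and that 1 GB equals 1,024 MB when converting to GB-seconds. The function names are hypothetical, and the result is not a price. For actual rates and free grants, refer to the pricing page.

```python
import math

MIN_BILLABLE_MS = 1_000  # minimum billable execution period
ROUNDING_STEP_MS = 100   # granularity past the minimum


def billable_ms(active_ms: float) -> int:
    """Round one active period to its billable duration in milliseconds."""
    if active_ms <= 0:
        raise ValueError("active_ms must be positive")
    if active_ms <= MIN_BILLABLE_MS:
        return MIN_BILLABLE_MS
    return math.ceil(active_ms / ROUNDING_STEP_MS) * ROUNDING_STEP_MS


def gb_seconds(instance_memory_mb: int, active_ms: float) -> float:
    """Memory provisioned over the billable period, in GB-seconds.

    Assumes 1 GB = 1,024 MB.
    """
    return (instance_memory_mb / 1024) * (billable_ms(active_ms) / 1000)


if __name__ == "__main__":
    print(billable_ms(250))         # short period, billed at the minimum
    print(billable_ms(1_234))       # rounded up to the next 100 ms
    print(gb_seconds(2_048, 1_234))
```

For example, under these assumptions a 250-ms activity period on a 2,048-MB instance is billed as 1,000 ms, and a 1,234-ms period is billed as 1,300 ms.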
For details about how costs are calculated when you run in a Flex Consumption plan, including examples, see Consumption-based costs.
For the most up-to-date information on execution pricing, always ready baseline costs, and free grants for on demand executions, see the Azure Functions pricing page.
Supported language stack versions
This table shows the language stack versions that are currently supported for Flex Consumption apps:
Language stack | Required version |
---|---|
C# (isolated process mode)¹ | .NET 8² |
Java | Java 11, Java 17 |
Node.js | Node 20 |
PowerShell | PowerShell 7.4 |
Python | Python 3.10, Python 3.11 |
¹ C# in-process mode isn't supported. You instead need to migrate your .NET code project to run in the isolated worker model.
² Requires version 1.20.0 or later of Microsoft.Azure.Functions.Worker and version 1.16.2 or later of Microsoft.Azure.Functions.Worker.Sdk.
Regional subscription memory quotas
Currently, in preview, each region in a given subscription has a memory limit of 512,000 MB for all instances of apps running on Flex Consumption plans. This means that, in a given subscription and region, you could have any combination of instance memory sizes and counts, as long as they stay under the quota limit. For example, each of the following scenarios would mean the quota has been reached and the apps would stop scaling:
- You have one 2,048-MB app scaled to 100 instances and a second 2,048-MB app scaled to 150 instances.
- You have one 2,048-MB app that scaled out to 250 instances.
- You have one 4,096-MB app that scaled out to 125 instances.
- You have one 4,096-MB app scaled to 100 instances and one 2,048-MB app scaled to 50 instances.
This quota can be increased to allow your Flex Consumption apps to scale further, depending on your requirements. If your apps require a larger quota, create a support ticket.
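The arithmetic behind these scenarios is the sum of instance memory size multiplied by instance count across all apps in the region. The following Python sketch checks that total against the 512,000-MB preview quota stated in this section. The function names and the list-of-tuples representation are hypothetical and chosen for illustration only.

```python
REGIONAL_QUOTA_MB = 512_000  # preview quota per subscription and region


def total_memory_mb(apps: list[tuple[int, int]]) -> int:
    """Sum memory across apps given (instance_memory_mb, instance_count) pairs."""
    return sum(memory_mb * count for memory_mb, count in apps)


def quota_reached(apps: list[tuple[int, int]],
                  quota_mb: int = REGIONAL_QUOTA_MB) -> bool:
    """Return True when the combined instance memory meets or exceeds the quota."""
    return total_memory_mb(apps) >= quota_mb


if __name__ == "__main__":
    scenarios = [
        [(2_048, 100), (2_048, 150)],
        [(2_048, 250)],
        [(4_096, 125)],
        [(4_096, 100), (2_048, 50)],
    ]
    for apps in scenarios:
        print(total_memory_mb(apps), quota_reached(apps))
```

Each of the four scenarios adds up to exactly 512,000 MB, which is why further scale-out would stop in every case.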
Deprecated properties and settings
In Flex Consumption, many of the standard application settings and site configuration properties used in Bicep, ARM templates, and overall control plane are deprecated or have moved and shouldn't be used when automating function app resource creation. For more information, see Flex Consumption plan deprecations.
Considerations
Keep these other considerations in mind when using the Flex Consumption plan during the current preview:
- VNet integration: Ensure that the Microsoft.App Azure resource provider is enabled for your subscription by following these instructions. The subnet delegation required by Flex Consumption apps is Microsoft.App/environments.
- Triggers: All triggers are fully supported except for Kafka and Azure SQL triggers. The Blob storage trigger only supports the Event Grid source. Non-C# function apps must use version [4.0.0, 5.0.0) of the extension bundle, or a later version.
- Regions:
  - Not all regions are currently supported. To learn more, see View currently supported regions.
  - There is a temporary limitation where App Service quota limits for creating new apps are also being applied to Flex Consumption apps. If you see the error "This region has quota of 0 instances for your subscription. Try selecting different region or SKU.", raise a support ticket so that your app creation can be unblocked.
- Deployments: These deployment-related features aren't currently supported:
  - Deployment slots
  - Continuous deployment using Azure DevOps Tasks (AzureFunctionApp@2)
  - Continuous deployment using GitHub Actions (functions-action@v1)
- Scale: The lowest maximum scale in preview is 40. The highest currently supported value is 1000.
- Managed dependencies: Managed dependencies in PowerShell aren't supported by Flex Consumption. You must instead define your own custom modules.
- Diagnostic settings: Diagnostic settings are not currently supported.
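The extension bundle version [4.0.0, 5.0.0) mentioned under Triggers is written in interval notation, where a square bracket includes the endpoint and a parenthesis excludes it. The following Python sketch shows how such a range can be read. It assumes plain three-part numeric versions, and the helper names are hypothetical. It isn't how the Functions host itself resolves bundles.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '4.12.0' into (4, 12, 0) so that versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def in_range(version: str, range_text: str) -> bool:
    """Check a version against interval notation such as '[4.0.0, 5.0.0)'."""
    range_text = range_text.strip()
    lower_inclusive = range_text.startswith("[")
    upper_inclusive = range_text.endswith("]")
    lower_text, upper_text = range_text[1:-1].split(",")
    lower, upper = parse_version(lower_text), parse_version(upper_text)
    candidate = parse_version(version)

    above_lower = candidate >= lower if lower_inclusive else candidate > lower
    below_upper = candidate <= upper if upper_inclusive else candidate < upper
    return above_lower and below_upper


if __name__ == "__main__":
    bundle_range = "[4.0.0, 5.0.0)"
    for version in ("3.9.9", "4.0.0", "4.12.0", "5.0.0"):
        print(version, in_range(version, bundle_range))
```

Read this way, the range accepts any 4.x release of the bundle, starting at 4.0.0, and excludes 5.0.0 itself.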
Related articles
- Azure Functions hosting options
- Create and manage function apps in the Flex Consumption plan