When to use low-priority nodes in Azure Batch
What is Azure Batch?
Azure Batch is designed to run parallel workloads in the cloud across many nodes, scaling the number of nodes with the workload being executed. It’s a good fit for parallelizing ETL or AI use cases.
What are low-priority nodes?
Low-priority nodes are VMs that run on capacity not currently in use in the Azure data center (i.e. surplus capacity in Azure). When setting up your Azure Batch nodes, you can configure the pool to use dedicated nodes, low-priority nodes, or both. Low-priority nodes are offered at a discount of up to 80%, but their availability is not guaranteed.
This configuration is done either in the portal when creating a cluster or in the configuration JSON file (when using the Azure Batch package in Python or ‘doAzureParallel’ in R):
Code configuration of low-priority nodes
Portal configuration of low-priority nodes
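As a rough illustration of the code-based configuration, the sketch below builds the part of a pool definition that controls the node mix. The property names `targetDedicatedNodes` and `targetLowPriorityNodes` follow the Azure Batch REST API as we understand it. The helper `build_pool_spec`, the pool id and the VM size are placeholders made up for this example, and a real pool definition needs further settings (such as the VM image) that are omitted here.

```python
import json


def build_pool_spec(pool_id, vm_size, dedicated, low_priority):
    """Return a partial pool definition with the requested node mix.

    Hypothetical helper: it only assembles a dict and does not call Azure.
    """
    if dedicated < 0 or low_priority < 0:
        raise ValueError("node counts must be non-negative")
    return {
        "id": pool_id,
        "vmSize": vm_size,
        # Nodes that are reserved for the pool.
        "targetDedicatedNodes": dedicated,
        # Nodes taken from surplus capacity; they may be unavailable or revoked.
        "targetLowPriorityNodes": low_priority,
    }


# Placeholder values: 2 dedicated nodes supported by 8 low-priority nodes.
spec = build_pool_spec("etl-pool", "STANDARD_D2_V2", dedicated=2, low_priority=8)
print(json.dumps(spec, indent=2))
```

Setting `low_priority` to 0 gives a fully dedicated pool, and setting `dedicated` to 0 gives a pool that runs only when surplus capacity is available.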
What are the (dis)advantages?
The advantage of low-priority nodes is cost reduction: they come with a discount of up to 80%. The disadvantage is that these nodes might not be available when your application needs them, and nodes that were already allocated might be revoked while your tasks are running.
If the nodes are not available when your application has to run, it will only truly start once resources become available. Similarly, if you have a mixed cluster of dedicated and low-priority nodes, your pool might start with only the dedicated nodes and thus be unable to run at full scale. This can delay the completion of your jobs.
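To make the discount concrete, here is a small back-of-the-envelope sketch. The price of 1.00 per node-hour is a made-up figure, and the 80% discount is the best case quoted above, so the result is an upper bound on the saving rather than a quote.

```python
def hourly_pool_cost(dedicated, low_priority, dedicated_price, discount=0.80):
    """Hourly cost of a pool with the given node mix.

    dedicated_price is a hypothetical price per dedicated node-hour;
    discount is the assumed low-priority discount (0.80 = 80%).
    """
    low_priority_price = dedicated_price * (1 - discount)
    return dedicated * dedicated_price + low_priority * low_priority_price


# Ten nodes in total, either all dedicated or 2 dedicated + 8 low-priority.
all_dedicated = hourly_pool_cost(10, 0, dedicated_price=1.00)
mixed = hourly_pool_cost(2, 8, dedicated_price=1.00)
print(f"all dedicated: {all_dedicated:.2f}/h, mixed: {mixed:.2f}/h")
```

Under these assumptions the mixed pool costs about 3.60 per hour instead of 10.00, provided the low-priority nodes are actually available for the whole hour.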
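The effect on completion time can be estimated with a simple model. The sketch below assumes equally long tasks and one task per node at a time, which real workloads rarely match exactly; the task count and duration are invented for the example.

```python
import math


def estimated_hours(total_tasks, task_hours, nodes):
    """Rough completion time when tasks run in waves of one task per node."""
    if nodes <= 0:
        raise ValueError("at least one node is needed to make progress")
    waves = math.ceil(total_tasks / nodes)
    return waves * task_hours


# 100 half-hour tasks on a pool sized for 2 dedicated + 8 low-priority nodes.
full_scale = estimated_hours(100, 0.5, nodes=10)
# The same job when no low-priority nodes could be allocated.
dedicated_only = estimated_hours(100, 0.5, nodes=2)
print(f"full scale: {full_scale} h, dedicated only: {dedicated_only} h")
```

In this example the job takes five times longer when only the dedicated nodes are available, which is the kind of delay a deadline-driven job cannot absorb.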
When to use them?
Suitable to use low-priority nodes:
- When your job completion time is more flexible
- When you have flexible long-running jobs with a built-in mechanism to save their progress as they execute
- When you use low-priority nodes as additional support to dedicated nodes
Not suitable to use low-priority nodes:
- When your use-case demands fixed completion time of the batch jobs
- When you have long-running jobs distributed on many VMs
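The "save the progress" mechanism mentioned in the first list could look roughly like the sketch below. It is a generic checkpointing pattern, not an Azure Batch API: `run_with_checkpoint` and the file layout are our own invention, and in a real pool the checkpoint would have to live on storage that survives the node (for example remote or mounted storage), whereas a local temporary directory is used here only to keep the example self-contained.

```python
import json
import os
import tempfile


def run_with_checkpoint(items, checkpoint_path, process):
    """Process items in order and record progress after every item.

    Returns the number of items processed in this call.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]
    for i in range(start, len(items)):
        process(items[i])
        # Write to a temporary file and rename it, so an interruption in the
        # middle of a write cannot leave a corrupt checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": i + 1}, f)
        os.replace(tmp_path, checkpoint_path)
    return len(items) - start


# Demo: simulate a node being revoked while item "c" is processed.
path = os.path.join(tempfile.mkdtemp(), "progress.json")
processed = []
preempt_once = {"armed": True}


def work(item):
    if item == "c" and preempt_once["armed"]:
        preempt_once["armed"] = False
        raise RuntimeError("simulated preemption")
    processed.append(item)


try:
    run_with_checkpoint(["a", "b", "c", "d"], path, work)
except RuntimeError:
    pass  # the task would be rescheduled on another node later

# The rerun skips "a" and "b" and continues from "c".
resumed = run_with_checkpoint(["a", "b", "c", "d"], path, work)
print(processed, resumed)
```

With a mechanism like this, losing a node costs at most the item that was in flight, which is what makes long-running jobs tolerable on low-priority nodes.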
Low-priority nodes can be leveraged to boost performance at a low cost when job completion time is flexible and periodic interruptions do not have a drastic effect on the final result of the job. If low-priority nodes are used only as a supplement to dedicated nodes, you limit the risk that certain jobs don’t run at all because of unavailability.
We recommend combining low-priority and dedicated nodes in a Batch pool. You then benefit from the low cost of the low-priority nodes while keeping control over jobs completing in due time.
Contact us for more information.