Friday, January 27, 2017

Understanding AzureML Web Services pricing

So, you have created a predictive experiment in Azure ML Studio, and it is time to go to production.
I am not going to explain how to deploy a web service - there are enough tutorials for that.
The question you need to ask yourself is: how much is it going to cost me? And here comes the confusion. Most people just click on deploy and go with the proposed default. I haven't found any article explaining the differences and the options available. So, this post is about that.

First, if we go to the Azure Pricing calculator and add Machine Learning, there is nothing there about the cost of web services deployed from AzureML Studio.

But click on the small information button and choose "Machine Learning pricing details", and the curtain lifts! It redirects us to a page explaining what is what and how much it costs. Let's focus on the part called "Production Web API pricing".

First of all, everybody understands that a web service requires some resources to run. Those resources need to be allocated, and that is what we pay for. For a standard web application, those resources are allocated by creating an App Service Plan. App Service Plans come in many cost options, defined by how much compute power and how much disk space they include. Behind an App Service Plan there is a virtual machine, so be aware: it's gonna cost no matter if you use it or not (unless the chosen plan is "Free"). Same as a monthly membership at a fitness club: you pay for it, and nobody cares if you actually visit.

When deploying a web service from AzureML Studio, there is no such thing as an "App Service Plan" to choose. Those are for standard web applications only. For AzureML web services there are 2 options:

  • "classic" Web Service 
  • Machine Learning Web Service

What's the difference?
Classic is pretty much "pay-as-you-go": you are charged for what you actually use.
And a Machine Learning Web Service Plan is basically the same as an "App Service Plan", but for AzureML: prepaid, pre-allocated resources (there is a Free option, with 2 compute hours and room for 2 web services).
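To make the difference between the two options concrete, here is a minimal Python sketch of the comparison. It assumes the classic option is billed per transaction and per compute hour, and that a plan is a flat monthly fee (overage charges are ignored). The function names are hypothetical, and every price in the example is a made-up placeholder, not an actual Azure rate; take the real numbers from the "Machine Learning pricing details" page.

```python
def classic_monthly_cost(transactions, compute_hours,
                         price_per_1000_tx, price_per_hour):
    """Pay-as-you-go: the bill grows with actual usage (assumed billing model)."""
    return transactions / 1000 * price_per_1000_tx + compute_hours * price_per_hour


def cheaper_option(transactions, compute_hours,
                   price_per_1000_tx, price_per_hour, plan_monthly_fee):
    """Return the cheaper option and its monthly cost.

    The plan is modelled as a flat prepaid fee, paid whether it is used or not.
    """
    classic = classic_monthly_cost(transactions, compute_hours,
                                   price_per_1000_tx, price_per_hour)
    if classic < plan_monthly_fee:
        return ("classic", classic)
    return ("plan", plan_monthly_fee)


# Placeholder prices, for illustration only (NOT real Azure rates).
PRICE_PER_1000_TX = 1.0
PRICE_PER_HOUR = 1.0
PLAN_MONTHLY_FEE = 100.0

# Light usage: 5,000 calls and 3 compute hours per month.
print(cheaper_option(5000, 3, PRICE_PER_1000_TX, PRICE_PER_HOUR, PLAN_MONTHLY_FEE))

# Heavy usage: 200,000 calls and 50 compute hours per month.
print(cheaper_option(200000, 50, PRICE_PER_1000_TX, PRICE_PER_HOUR, PLAN_MONTHLY_FEE))
```

With these placeholder numbers the light workload is cheaper on classic and the heavy one is cheaper on a plan, which is the same logic as the fitness club: a membership pays off only if you actually show up often enough.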




Also, when creating a new ML Workspace we must now create a "Machine Learning Web Service Plan" as well. And the Free option can be used only once per geo-location. So, when creating a new Workspace, pay attention to what you choose for the Web Service Plan: accepting the default suggestion will create a new Web Service Plan, and if the Free one has already been used, the new one is gonna cost.

A few words about scaling. As it says here, 20 concurrent requests are configured by default. If you need more, add more endpoints. The top limit is 200.
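The scaling arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 20 and 200 figures come from the text above, the 200 limit is assumed here to be the maximum concurrency setting of a single endpoint, and `endpoints_needed` is a hypothetical helper, not part of any Azure SDK.

```python
import math

DEFAULT_CONCURRENCY = 20  # concurrent requests configured by default
MAX_CONCURRENCY = 200     # top limit; assumed here to apply per endpoint


def endpoints_needed(target_concurrent_requests, per_endpoint=DEFAULT_CONCURRENCY):
    """Estimate how many endpoints are needed to serve the target concurrency."""
    if target_concurrent_requests < 1:
        raise ValueError("target_concurrent_requests must be at least 1")
    if not 1 <= per_endpoint <= MAX_CONCURRENCY:
        raise ValueError("per_endpoint must be between 1 and 200")
    # Round up: a partially used endpoint still has to exist.
    return math.ceil(target_concurrent_requests / per_endpoint)


print(endpoints_needed(50))                     # 3 endpoints at the default of 20
print(endpoints_needed(500, per_endpoint=200))  # 3 endpoints at the assumed maximum
```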

And yes, there is now a new portal just for managing ML web services, with a view for both deployment options.

I hope pricing options for AzureML web services are clear now :)