Working with Data Science and Machine Learning requires continuous education. Usually, when I start reading a new paper or just google a question, I stumble upon a good course, book or other resource, which I read quickly to grasp the concept and bookmark to come back to later.
*sometimes I even DO come back later :)
Today I happened to be organising my bookmarks, and then decided to share the resources I prefer for online education. Most of them offer free education, at least in audit mode.
NB! The order is arbitrary and shows no preference.
edX - https://courses.edx.org
Very popular, with a lot of content. Mostly free, and one can buy a certificate upon course completion.
Coursera - https://www.coursera.org/
Another very popular one. Unfortunately, some courses do not give access to the practical exercises unless you pay.
LinkedIn - https://www.linkedin.com/learning
Some interesting content there. Still kind of new, so I am trying it out now.
Microsoft Virtual Academy - https://mva.microsoft.com/
A lot of free content regarding Microsoft technologies. The quality is also excellent.
Microsoft AI School - https://aischool.microsoft.com/
Somewhat similar to the previous one, but filtered down to AI stuff.
Pluralsight - https://app.pluralsight.com/library/
The really good ones are not free. But maybe your work could pay for access?
Udacity - https://classroom.udacity.com/
Another famous resource. Lectures are interactive and playful.
DataCamp - https://www.datacamp.com/community/open-courses/
I find the courses here very simple and interactive. Sometimes it is useful to repeat the basics.
Yandex Data School - https://yandexdataschool.ru/edu-process/courses (this one is in Russian)
This one gives good insights, with more mathematics and underlying theory compared to other ML courses. If you speak Russian: absolutely recommended and truly enjoyable. Russians love math, don't we? :)
Youtube - https://www.youtube.com/
Lots of recordings of lectures and courses from the most famous teachers and universities. No homework, but excellent for boosting theoretical understanding.
iladan
The path of a star is loneliness in the dark
Sunday, August 12, 2018
Monday, June 11, 2018
NDC Oslo
This year is the 3rd time I have the honour of speaking at the NDC Conference. NDC Oslo is my favourite one: great content, cool venue and fantastic mood :)
When speaking about Machine Learning, I have often received comments that there is a lot of theoretical knowledge around about the concept, but nothing really about real-life experience.
So, this year we (me and @KatyaGeek) have decided to talk about how running a Machine Learning project is different from running a typical software project.
https://ndcoslo.com/talk/not-a-magic-what-to-expect-from-machine-learning-projects/
Come along!
Monday, December 11, 2017
AzureML supported R packages
Here is how to avoid the situation where you write some custom R code, execute it in AzureML... and suddenly get an error saying that a package is not supported. Bummer!
There is a list of supported R packages: https://msdn.microsoft.com/en-us/library/mt741980.aspx
Save yourself the trouble, have a look first!
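The "have a look first" step can also be scripted. Below is a minimal Python sketch that scans R code for the packages it loads and reports the ones missing from a supported list. Everything in it is an assumption made for illustration: the function names are made up, the `SUPPORTED_PACKAGES` entries are placeholders (not copied from the real list, which should be taken from the page linked above), and the comment stripping is naive (it ignores `#` characters inside strings).

```python
import re

# Placeholder entries only, NOT taken from the real supported-package list.
# Fill this set from the list linked above before relying on the result.
SUPPORTED_PACKAGES = {"ggplot2", "dplyr", "igraph"}

# library(pkg) / require(pkg), with or without quotes around the name.
_LOAD_CALL = re.compile(
    r"""\b(?:library|require)\s*\(\s*["']?([A-Za-z][A-Za-z0-9.]*)["']?"""
)
# Namespace-qualified calls such as pkg::fn or pkg:::fn.
_NAMESPACE_CALL = re.compile(r"\b([A-Za-z][A-Za-z0-9.]*):::?")


def packages_used(r_code):
    """Return the set of package names referenced in a piece of R code."""
    found = set()
    for line in r_code.splitlines():
        # Naive comment stripping: drop everything after the first '#'.
        code = line.split("#", 1)[0]
        found.update(_LOAD_CALL.findall(code))
        found.update(_NAMESPACE_CALL.findall(code))
    return found


def unsupported_packages(r_code, supported=SUPPORTED_PACKAGES):
    """Return a sorted list of referenced packages that are not in `supported`."""
    return sorted(packages_used(r_code) - set(supported))


if __name__ == "__main__":
    script = 'library(ggplot2)\nrequire("xgboost")\nx <- dplyr::filter(df, a > 1)\n'
    print(unsupported_packages(script))
```

Running the check before pasting the code into an experiment gives a list of packages to look up by hand.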
Monday, August 28, 2017
Getting started with Machine Learning?
I am often asked what to look at if somebody wants to get started with Machine Learning. Usually I send people to Coursera's "Machine Learning" class by Andrew Ng. It's like a litmus test: after taking that one, people usually get a feeling for whether Machine Learning is something they want to continue with or not.
However, if you decide to continue, what's next? What kind of knowledge/skills to look at? What are the buzzwords in all those learning materials?
And then I found this blog post which, IMHO, summarizes it all pretty well! It even addresses the necessity of understanding Linear Algebra (I usually take it for granted and never mention it, but in fact one has to know Linear Algebra. It's a cornerstone of any engineering skill). So here it is, enjoy:
http://abhijitannaldas.com/getting-started-with-machine-learning-in-one-hour/
Do not despair if it seems too much to deal with at once. The learning approach can be bottom-up, i.e. from theory to practice... but top-down works as well! I.e. get yourself a case and work your way down through the limited scope of theory necessary to understand and develop the solution.
Friday, January 27, 2017
Understanding AzureML Web Services pricing
So, you have created a predictive experiment in Azure ML Studio, and it is time to go to production.
I am not going to explain how to deploy a web service, there are enough tutorials for that.
The question you need to ask yourself is: how much is it going to cost me? And here comes the confusion. Most people just click on deploy and go with the proposed default. I haven't found any article explaining the differences and the options available. So, this post is about that.
First, if we go to the Azure Pricing calculator and add Machine Learning:
There is nothing here about the cost of web services deployed from AzureML Studio.
But click on the small information button and choose "Machine Learning pricing details", and the curtain lifts! It redirects us to a page explaining what is what and how much it costs. Let's focus on the "Production Web API pricing" part.
First of all, everybody understands that a web service requires some resources to run. Those resources need to be allocated, and that is what we pay for. For a standard web application those resources are allocated by creating an App Service Plan. App Service Plans come in many cost options, defined by how much compute power and how much disk space they include. Behind an App Service Plan there is a virtual machine, so be aware: it's gonna cost you whether you use it or not (unless the chosen plan is "Free"). Same as a monthly membership at a fitness club: you pay for it, and nobody cares if you actually visit.
When deploying a web service in AzureML Studio, there is no such thing as an "App Service Plan" to choose. Those are for standard web services only. For AzureML web services there are 2 options:
- "classic" Web Service
- Machine Learning Web Service
What's the difference?
Classic is pretty much "pay-as-you-go" with cost:
And the Machine Learning Web Service Plan is basically the same as an "App Service Plan", but for AzureML: prepaid, pre-allocated resources (there is a Free option, with 2 hours and room for 2 services).
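To make the difference between the two options concrete, here is a minimal Python sketch comparing a monthly bill under each. It rests on loud assumptions: that "classic" is billed per 1000 transactions plus per compute hour, that the plan is a flat monthly fee, and that usage fits inside what the plan includes. All prices in the example are hypothetical placeholders, not real Azure prices, and the function names are made up for illustration. Check the pricing details page for actual numbers.

```python
def classic_cost(transactions, compute_hours,
                 price_per_1000_transactions, price_per_compute_hour):
    """Pay-as-you-go: the bill grows with usage (assumed billing units)."""
    return (transactions / 1000 * price_per_1000_transactions
            + compute_hours * price_per_compute_hour)


def cheaper_option(transactions, compute_hours,
                   price_per_1000_transactions, price_per_compute_hour,
                   plan_monthly_fee):
    """Return "classic" or "plan", whichever is cheaper for this usage.

    Assumes the usage fits inside what the plan includes, so the plan
    costs exactly its flat monthly fee.
    """
    pay_as_you_go = classic_cost(transactions, compute_hours,
                                 price_per_1000_transactions,
                                 price_per_compute_hour)
    return "classic" if pay_as_you_go < plan_monthly_fee else "plan"


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    prices = dict(price_per_1000_transactions=0.5,
                  price_per_compute_hour=2.0,
                  plan_monthly_fee=100.0)
    print(cheaper_option(10_000, 5, **prices))    # light usage
    print(cheaper_option(200_000, 30, **prices))  # heavy usage
```

The fitness club logic applies here too: with light usage pay-as-you-go tends to win, while steady heavy usage favours the prepaid plan.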
Also, when creating a new ML Workspace we must now create a "Machine Learning Web Service Plan", and the Free option can be used once per geo-location. So, when creating a new Workspace, pay attention to what you choose for the Web Service Plan:
as choosing the default suggestion will create a new Web Service Plan, and if the Free one has already been used, the new one is gonna cost you.
A few words about scaling. As it says here, 20 concurrent requests are configured by default. If you need more, add more endpoints. The top limit is 200.
And yes, there is now a new portal just for managing ML web services, with a view for both deployment options:
I hope pricing options for AzureML web services are clear now :)
Tuesday, November 8, 2016
3 things I really miss in Azure Machine Learning
Azure Machine Learning is a handy tool, absolutely. If I need to run some model quickly to justify a gut feeling or to get a simple overview of the data, it fits really well. Or, for example, setting up a web service from a Machine Learning experiment is really easy, so kudos for that!
But there are some things which annoy me time after time, and which I really want to be implemented or done differently. Here is my top 3 "wish-list":
1. Navigation inside the experiment
I mean, honestly... I can kind of accept the zoom button, but for navigating inside the experiment window I really expect dragging the scene to work! As of now it requires moving the mouse to a side every time and scrolling up or down.
Update: yay, it is possible to drag the scene in AzureML! It just needs to be enabled by clicking this button
2. Delete several datasets together
I load and save a lot of datasets, and deleting them one by one as they become irrelevant is a time killer.
3. Ability to script the experiment
Yes, if only I could upload an experiment as a script... that would open up so many possibilities. Like an essential one: VERSION CONTROL of my experiments. Oh, don't even get me started...
Please, Microsoft. Christmas is coming, and I have been a nice girl :)
PS: do you have such things that annoy the hell out of you? Comment!
Thursday, October 27, 2016
Convert order lines to weighted graph
Let's say we have an order history with some products. We need to perform community detection as part of market basket analysis. Order lines look like OrderId - ProductId.
The first thing that needs to be done is to convert the order lines into a weighted graph, where nodes are products and edges connect nodes if the products were purchased in the same order.
Something like this:
Weighted means that the edge "weight" between two products is equal to the number of times those products were bought together.
After trying to find existing code for this task and not succeeding, I created a code snippet in R which does the conversion from order lines to a weighted graph using an adjacency matrix.
For the graph above, the adjacency matrix can look like this:
The idea is simple: convert the order lines into an N x N adjacency matrix, where N = number of products (all columns are products, all rows are products, and edge weight = number of times two products were bought in the same order). And an adjacency matrix is easily converted to a graph.
This approach happens to work relatively fast.
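To illustrate the idea, here is a minimal Python sketch of the same conversion. It is not the R snippet itself, just the approach described above under a few assumptions: order lines arrive as `(order_id, product_id)` pairs, a product repeated within one order counts once, and the function names are made up for the example.

```python
from collections import defaultdict
from itertools import combinations


def order_lines_to_adjacency(order_lines):
    """Build an N x N co-purchase matrix from (order_id, product_id) pairs.

    Returns (products, matrix), where matrix[i][j] is the number of orders
    containing both products[i] and products[j].
    """
    # Group products by order; a set makes repeated lines count once.
    baskets = defaultdict(set)
    for order_id, product_id in order_lines:
        baskets[order_id].add(product_id)

    products = sorted({p for basket in baskets.values() for p in basket})
    index = {p: i for i, p in enumerate(products)}
    n = len(products)
    matrix = [[0] * n for _ in range(n)]

    # Every pair of products in the same order adds 1 to their edge weight.
    for basket in baskets.values():
        for a, b in combinations(sorted(basket), 2):
            i, j = index[a], index[b]
            matrix[i][j] += 1
            matrix[j][i] += 1  # undirected graph, so keep it symmetric
    return products, matrix


def adjacency_to_edges(products, matrix):
    """List the weighted edges (product_a, product_b, weight) of the graph."""
    n = len(products)
    return [(products[i], products[j], matrix[i][j])
            for i in range(n) for j in range(i + 1, n)
            if matrix[i][j] > 0]


if __name__ == "__main__":
    lines = [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "B"), (3, "B"), (3, "C")]
    products, matrix = order_lines_to_adjacency(lines)
    print(adjacency_to_edges(products, matrix))
```

The edge list can then be handed to whatever graph library is used for the community detection step.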
Code is located here: https://github.com/iladan/R/blob/master/codeSnippets/OrdersToGraph.R
Hope that saves somebody's time :)