Tuesday, November 8, 2016

3 things I really miss in Azure Machine Learning

Azure Machine Learning is a handy tool, absolutely. If I need to run some model quickly to justify a gut feeling or to get a simple overview of the data, it fits really well. Or, for example, setting up a web service from a Machine Learning experiment is really easy, so kudos for that!

But there are some things that annoy me time after time, which I really want to be implemented or done differently. Here is my top 3 "wish-list":

1. Navigation inside the experiment

 

I mean, honestly... I can kind of accept the zoom button, but for navigating inside the experiment window I really expect dragging the scene to work! As of now it requires moving the mouse to a side every time and scrolling up or down.
Update: yay, it is possible to drag the scene in AzureML! It just needs to be enabled by clicking this button

2. Delete several datasets together
I load and save a lot of datasets, and deleting them one by one as they become irrelevant is a time killer.

3. Ability to script the experiment
Yes, if only I could upload an experiment as a script... that would open so many possibilities. Like an essential one - VERSION CONTROL for my experiments. Oh, don't even get me started...

Please, Microsoft. Christmas is coming, and I have been a nice girl :)

PS: do you have things like these that annoy the hell out of you? Comment!

Thursday, October 27, 2016

Convert order lines to weighted graph

Let's say we have an order history with some products. We need to perform community detection as a part of market basket analysis. Each order line is a pair: OrderId - ProductId.

The first thing to do is to convert the order lines into a weighted graph, where nodes are products and an edge connects two nodes if the products were purchased in the same order.
Something like this:
Weighted means that the weight of an edge between two products equals the number of times those products were bought together.

Having tried to find existing code for this task without success, I created a code snippet in R which does the conversion from order lines to a weighted graph using an adjacency matrix.
For the graph above, the adjacency matrix can look like this:

The idea is simple - convert the order lines into an N x N adjacency matrix, where N = number of products (all columns are products, all rows are products, and an edge weight = number of times two products were bought in the same order). An adjacency matrix is then easily converted to a graph.
This approach happens to work relatively fast.
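
As a rough illustration of the idea above (not the original R snippet), here is a minimal Python sketch. The function name order_lines_to_adjacency and the sample order lines are made up for the example; it assumes each order line is an (order id, product id) pair and that a product repeated within one order counts once.

```python
from collections import defaultdict
from itertools import combinations


def order_lines_to_adjacency(order_lines):
    """Build a symmetric co-purchase matrix from (order_id, product_id) pairs.

    Returns (products, matrix) where matrix[i][j] is the number of orders
    that contain both products[i] and products[j].
    """
    # Group products by order; a set ignores duplicates within one order.
    baskets = defaultdict(set)
    for order_id, product_id in order_lines:
        baskets[order_id].add(product_id)

    # Every product gets a row and a column, even if it was only bought alone.
    products = sorted({p for basket in baskets.values() for p in basket})
    index = {p: i for i, p in enumerate(products)}
    n = len(products)
    matrix = [[0] * n for _ in range(n)]

    # Each unordered pair of products in an order adds 1 to the edge weight.
    for basket in baskets.values():
        for a, b in combinations(sorted(basket), 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return products, matrix


# Hypothetical sample data: three orders over products A, B and C.
sample = [(1, "A"), (1, "B"), (2, "A"), (2, "B"), (2, "C"), (3, "C")]
products, matrix = order_lines_to_adjacency(sample)
```

Here A and B end up connected with weight 2, while A-C and B-C get weight 1. The resulting matrix can then be handed to whatever graph library you use.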


Hope that saves somebody's time :)

Wednesday, September 21, 2016

Error in library(igraph) : there is no package called ‘igraph’

An odd error occurred today. After
   install.packages("igraph") 
   library(igraph) 
R returned: Error in library(igraph) : there is no package called ‘igraph’
I checked the library location for my R (it is, by the way, C:\Program Files\R\R-3.2.5\library) and igraph was indeed not there! The package was downloaded though, as my log contained:
   The downloaded binary packages are in C:\Users\iladan\AppData\Local\Temp\Rtmp63c5us\downloaded_packages
And yes, indeed, igraph_1.0.1.zip was there.

Well, if it doesn't want to play by the book, simply extract the archive into the library folder and the problem is solved.
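
The manual fix above amounts to unpacking the downloaded zip into the library folder. Purely as an illustration, the same step can be sketched in Python. extract_package is a hypothetical helper, and the demo runs against a throwaway temp folder with a fake archive rather than the real igraph_1.0.1.zip and library folder.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_package(zip_path, library_dir):
    """Unpack a downloaded binary package archive into a library folder.

    Returns the names of the top-level folders now present in library_dir.
    """
    library_dir = Path(library_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(library_dir)
    return sorted(p.name for p in library_dir.iterdir() if p.is_dir())


# Demo with stand-ins: a fake archive and a temporary "library" folder.
with tempfile.TemporaryDirectory() as tmp:
    fake_zip = Path(tmp) / "igraph_1.0.1.zip"
    with zipfile.ZipFile(fake_zip, "w") as archive:
        archive.writestr("igraph/DESCRIPTION", "Package: igraph\n")
    fake_library = Path(tmp) / "library"
    fake_library.mkdir()
    installed = extract_package(fake_zip, fake_library)
```

For the real case you would pass the zip from the downloaded_packages folder and the R library path; writing under C:\Program Files may need an elevated prompt.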

Thursday, May 26, 2016

How to build .Net Core project with TFS vNext build

So, we created a wonderful .Net Core solution and are looking forward to building it with VSTS or TFS on-premises. And of course we want to use the vNext build system :)

First of all - the Hosted agent in Azure CANNOT build .Net Core projects yet. We will therefore need to use an on-premises agent. If you need to know how to configure a vNext build agent - check this manual. Also Martin has a very detailed step-by-step post here.

After the build agent is on, make sure you have Visual Studio 2015 Update 2 (or upgrade to it) and install the .Net Core SDK from here. Now we are all set!

If you just go the easiest way and do - get the code, build:

it will not work. The reason is that package management is all different in .Net Core, and nuget restore doesn't know how to deal with your .Net Core project packages. So you get:
The dependency Ix-Async >= 1.2.5 could not be resolved.
The dependency Microsoft.AspNetCore.Antiforgery >= 1.0.0-rc2-final could not be resolved.
The dependency Microsoft.AspNetCore.Authorization >= 1.0.0-rc2-final could not be resolved.
... 
and so on. The packages are listed in your project.json file. So unless you restore them with the .Net CLI tool prior to the build, it won't work.

Therefore, create a packagesRestore.ps1 PowerShell script like this:
Get-ChildItem -Path $PSScriptRoot\MyWebApp.NetCore -Filter project.json -Recurse | ForEach-Object { & dotnet restore $_.FullName 2>&1 }
and save it next to your .sln file. Check it in. 
Note that my project name is MyWebApp.NetCore, so the project.json file is under the corresponding directory. 

Now, in your vNext build definition, add a new PowerShell step before Build solution:

and that's it! Now we can build a .Net Core app with VSTS on our on-premises build agent :)

Monday, May 23, 2016

How to use Microsoft R Server (Revolution R) in Visual Studio

First, you will need to install R Tools for Visual Studio following the instructions here. Then install Microsoft R Server - the how-to is here.

After R Tools for Visual Studio are installed, open Visual Studio. You will see a new top menu "R Tools" available:

Click R Tools -> Options. Visual Studio options will open a tab for R settings:
Modify the R Engine path to point to your MRO-for-RRE installation (default is C:\Program Files\Microsoft\MRO-for-RRE\8.0\R-3.2.2)

Restart Visual Studio and enjoy using R Server from your favourite IDE :)

Friday, May 13, 2016

Installing Microsoft R Server (Revolution R) on Windows

As we know, Microsoft bought Revolution R, and now it is available as a part of your MSDN subscription. This is wonderful! The installation, though, is still not on a level with other Microsoft products, hence this short how-to-install post.

1. Download Microsoft R Server from the MSDN subscribers downloads page:
That's the one:


2. Download prerequisite - MRO 3.2.2 for RRE 8.0.0 from here:
https://mran.microsoft.com/install/mro4mrs/8.0.0/MRO-3.2.2-for-RRE-8.0.0-Windows.exe
as that one will not be installed automatically along with other prerequisites.

3. Install MRO for RRE 8.0.0 by running MRO-3.2.2-for-RRE-8.0.0-Windows.exe
4. Unpack en_rre_for_windows_x64_8183330.zip and run Revolution-R-Enterprise-8.0.0-Windows.exe

That's it. Some notes:

1. Yes, I had MRO 3.2.3 and MRO 3.2.4 installed, but Microsoft R Server requires exactly MRO-for-RRE.
2. It will install some old components dating back to Visual Studio 2008.

Thursday, April 28, 2016

TFS update error: VS402642 found a backup job running against database

Last week, I was performing an upgrade to TFS 2015 Update 2. It all looked well for a very long time, until one of the collection databases failed to upgrade on step 1163 of 1171.

Clicking on error gave me some more information:
VS402642 found a backup job running against tfs_xxx database. Wait for the backup job to complete and then rerun the failed collection job from the Status tab.

From the TFS Administration Console I could see that the collection was offline and the job status was Failed. Trying to rerun it failed immediately with the same error.
For the next 15 minutes we tried to find out whether anything was taking a backup of the database, but there was nothing active, only one backup that had failed earlier. I could only assume that a backup job on the SQL Server kicked in while the collection was being upgraded and did not manage to finish, but also blocked the TFS update from succeeding.
To make it short, we had to: quiesce to stop all TFS activities, detach the collection database in SQL Management Studio (do not detach in TFS!), killing all connections, attach the collection database back, and unquiesce.
After that the job could be rerun, and it finished the upgrade.

The lesson learned - make sure no scheduled backups can kick in while you perform your time-consuming upgrade to TFS 2015 Update 2.

Friday, April 15, 2016

Restore database to another name on the same SQL server

I often need to copy a database into another database on the same SQL server instance. So I take a backup and restore from it into another database. And here is the script:

RESTORE DATABASE [Tfs_tfs] FROM DISK=N'C:\Program Files\Microsoft SQL Server\MSSQL11.NATANSQL\MSSQL\Backup\BackupFile.bak'
WITH
   MOVE 'Tfs_MyCollection' TO 'C:\Program Files\Microsoft SQL Server\MSSQL11.NATANSQL\MSSQL\DATA\Tfs_tfs.mdf',
   MOVE 'Tfs_MyCollection_log' TO 'C:\Program Files\Microsoft SQL Server\MSSQL11.NATANSQL\MSSQL\DATA\Tfs_tfs_log.ldf'
GO

Where Tfs_MyCollection is the name of original database,
BackupFile.bak is the backup of it, and
Tfs_tfs is the new database name.

Thursday, April 14, 2016

Problem upgrading from TFS 2013 Update 4 to TFS 2015 Update 2

Microsoft announced the release of TFS 2015 Update 2 at the end of March, warning that it is a big one in terms of internal data changes. Having performed many upgrades between various versions over the last years, I can say that it usually goes smoothly. Some hiccups, of course, but no big ones.

This week I had an upgrade from TFS 2013 Update 4 to the latest (2015 Update 2), and ran into a rather interesting failure.

By the way, many people ask about timing - for a collection of 80 GB it takes around 1.5 hours (with the database on another machine with 2x4 cores and 64 GB RAM).
On my virtual machine with 2 GB memory and everything in one place, an empty collection took 2 minutes.

The upgrade process itself went totally smoothly. Then we started testing the upgraded TFS and found 2 collections totally fine and one inaccessible. While all 3 collections were shown as fine and healthy in the TFS admin console, the web page for one of them returned:

All the team projects in the "bad" collection were not available from either web or Visual Studio.

Event Log had this:
System.Web.HttpException (0x80004005): Page not found.
   at Microsoft.TeamFoundation.Server.WebAccess.Controllers.ErrorController.NotFound()
   at lambda_method(Closure , ControllerBase , Object[] )
   at System.Web.Mvc.ReflectedActionDescriptor.Execute(ControllerContext controllerContext, IDictionary`2 parameters)
   at System.Web.Mvc.ControllerActionInvoker.InvokeActionMethod(ControllerContext controllerContext, ActionDescriptor actionDescriptor, IDictionary`2 parameters)
...and lots of other Page Not Found.

At this point we decided to roll back to TFS 2013 Update 4, as googling didn't give us any answer about what was wrong and how to fix it.

The day after, exploring the error logs, I noticed that some URLs looked wrong, which gave me an idea... and I was able to reproduce the failure on a totally clean environment. 
The problem was in the collection name. From TFS 2012 on, it is not allowed to call a collection tfs, as it is a reserved word. My customers had had their TFS since version 2010, so they had managed to call the collection "tfs" and, strangely, the problem never appeared until they got all the way to 2015.

The solution is rather simple. Prior to upgrading to TFS 2015, rename the collection to something else. Then it all goes nicely.

I hope Microsoft will add some checks to the verification prior to upgrade, even though the case might not be so common.

PS: Microsoft provided us a solution in form of .dll to keep collection name unchanged and also announced that the problem will be fixed in 2015 Update 3. 

PPS: Follow up: we also hit an issue with build controllers and agents - they just stopped with a "Page not found" error. Copying the .dll under the /Application Tier/Message Queue/bin/Plugins folder on the app tier machine fixed it.

PS: reproducing error
Image 1: Windows 7 
  • Install TFS 2010 (SQL Server 2008 R2 Update 3). Create collection with name tfs
Image 2: Windows Server 2012
  • Migrate data and upgrade to TFS 2012 Update 4 (SQL Server 2012 Standard)
  • Upgrade to TFS 2013 Update 4
  • Upgrade to TFS 2015 Update 2
  • Open web and try to access any team project in the collection tfs - Page not found. Connecting from Visual Studio doesn't work either.
No reporting, no SharePoint integration.

Thursday, March 31, 2016

Pivot in U-SQL

In my case I needed to pivot a table with dates by day of the week. That is, given a table like:

Transno  Date
1223     11/01/2016
2795     12/01/2016
to get a result like:
Transno  1  2  3  4  5  6  7
1223     0  1  0  0  0  0  0
2795     0  0  1  0  0  0  0
where the column for the corresponding weekday number gets the value 1, and the others get 0... Outputting the result to a .csv, of course.

First I select transno, the day of the week as a number, and a constant for the later pivoting.
// DayOfWeek is 0-based (Sunday = 0), so add 1 to match the keys 1 to 7 used below
@res1 =
    SELECT Transno, Convert.ToInt32(Date.DayOfWeek) + 1 AS wd, 1 AS a1
    FROM dbo.LocalTransno;

Then use MAP_AGG, which is a base for pivot:
@res2 =
    SELECT Transno,           
           1 AS k1,
           MAP_AGG(wd, (int?) a1) AS mapwd
    FROM @res1
    GROUP BY Transno;

Here I create a dummy table with 1 row, just to select the values from 1 to 7 (my weekday numbers) as key values for the pivot.
@one = SELECT * FROM (VALUES(1)) AS T(a);
@keys = 
    SELECT 
        1 AS k1, 
        Inmeta.USQLScripts.Helper.InitList(1, 7) AS dkeys
    FROM @one;

Where InitList is a code-behind function returning a SqlArray<int>:

using System.Collections.Generic;
using Microsoft.Analytics.Types.Sql;

namespace Inmeta.USQLScripts
{
    public partial class Helper
    {
        // Returns the integers lower..upper (inclusive) as a SqlArray
        public static SqlArray<int> InitList(int lower, int upper)
        {
            var values = new List<int>();
            for (int i = lower; i <= upper; i++)
            {
                values.Add(i);
            }
            return new SqlArray<int>(values);
        }
    }
}

And finally, pivoting and unrolling to comma-separated format at once:
@res =
    SELECT 
    a.Transno.ToString() + "," +
    String.Join(",", b.dkeys.ToList().Select(k => a.mapwd.ContainsKey(k) ? 1 : 0)) 
    AS x
    FROM @res2 AS a
    JOIN @keys AS b ON a.k1 == b.k1;
Note the trick of the dummy join, used to connect the weekday keys table with my data :)

And finally the output - removing the quotes from the string output gives a clean csv:
OUTPUT @res
TO @out
USING Outputters.Csv(quoting : false);

It's a bit of running around, but that's the only way I managed to make it work.
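
For comparison, the same pivot logic can be sketched in a few lines of plain Python. This is only an illustration of what the U-SQL steps compute, not a replacement for them: pivot_by_weekday is a made-up name, the dates are assumed to be in dd/MM/yyyy format, and weekdays are numbered Sunday = 1 to Saturday = 7 so that the output matches the example table at the top of the post.

```python
from datetime import datetime


def pivot_by_weekday(rows, date_format="%d/%m/%Y"):
    """Turn (transno, date_text) rows into CSV lines 'transno,f1,...,f7'.

    f_k is 1 if the transaction has a date on weekday k, else 0,
    with weekdays numbered Sunday = 1 ... Saturday = 7.
    """
    # transno -> set of weekday numbers (plays the role of MAP_AGG)
    seen = {}
    for transno, date_text in rows:
        day = datetime.strptime(date_text, date_format)
        # isoweekday() is Monday = 1 ... Sunday = 7; shift to Sunday = 1.
        weekday = day.isoweekday() % 7 + 1
        seen.setdefault(transno, set()).add(weekday)

    # The keys 1..7 correspond to the dummy keys table in the U-SQL version.
    keys = range(1, 8)
    return [
        ",".join([str(transno)] + ["1" if k in weekdays else "0" for k in keys])
        for transno, weekdays in seen.items()
    ]


lines = pivot_by_weekday([(1223, "11/01/2016"), (2795, "12/01/2016")])
```

The dictionary of sets corresponds to the MAP_AGG step, and the loop over keys 1 to 7 corresponds to the dummy join with the keys table.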

Thursday, March 3, 2016

Accessing Azure Data Lake Storage from .Net SDK - FsOpenStream error 0x83090aa2

For this topic there is a very good basic example here:
https://azure.microsoft.com/en-us/documentation/articles/data-lake-store-get-started-net-sdk/
It uses application authentication, and it worked.

But once I tried to connect to an existing data lake store, it would not work, giving me a cryptic error:
Exception of type 'Microsoft.Rest.Azure.CloudException' was thrown
with even more cryptic insides like:
FsOpenStream failed with error 0x83090aa2

I tried fixing the file location, assuming that maybe /myfile.txt is wrong for a file in the very root. Nope. It is all simple, ...as usual.

Remember giving your application access to the resource group (or subscription, or whatever you chose)? Well, this is not good enough - you have to give access to the data lake store folder explicitly, as it does not inherit it (as I expected it would).

Browse the data lake store, and click on access - app is not there!



So, giving my app access directly to data lake store root folder solved the error.

Friday, January 22, 2016

The singularity is near!

So... yes, the singularity feels really near now. If you haven't read any of Ray Kurzweil's books, it's about time now. Or rather yesterday.

In fact, think about it - according to all the predictions in this field, we are going to experience the singularity within our lifetime. This is not science fiction any more, this is reality. It is coming at us like a huge locomotive, and it is your choice - get smashed by it or jump on and enjoy the ride.

50 years ago, such a thing as a cellphone was unheard of. Now nearly everybody uses a smartphone as a part of their daily routine, hardly imagining being without it. Cell phone technology got integrated into our lives. And that's exactly what is happening with Machine Learning nowadays.

Why now? Technology has finally developed to the point where we can build some kind of artificial intelligence on it. And, not least, in the programming world tools and languages have evolved, enabling us to create and apply Machine Learning solutions to a broad set of problems.

Isn't it exciting? Stay tuned! :)