Thursday, October 27, 2016

Convert order lines to weighted graph

Let's say we have orders history with some products. We need to perform community detection as a part of market basket analysis. Order lines are like OrderId - ProductId.

First thing need to be done is to convert order lines into weighted graph. Where Nodes are products and Edges connect nodes if products were purchased in the same order.
Something like this:
Weighted means that an Edge "weight" between two products is equal to amount of times those products were bought together.

Trying to find existing code for this task and not succeding, I have created a code snippet in R, which does the conversion from order lines to weigthed graph using adjacency matrix
For the graph above adjacency matrix can look like this:

The ides is simple - convert order lines into adjacency matrix N x N, where N = number of products (all columns are products, and all rows are products, edges weight = number of times two products bought in the same order). And adjacency matrix is easy convertable to graph.
This approach happens to work relatively fast.


Hope that saves somebody's time :)