Basket Analysis

Hi, I currently have a table with to columns: transaction_id and items_list. The items_list could have one to a maximum of 30 sorted items.

I need to group the different groups of two items then the different groups of three items then the different or four items, etc. For every group I need to know on how many transaction is present.

I try to do it splitting the items_list column into 30 columns and then iterate the grouping in a module whit a table of all the possible combinations of the 30 columns in groups of 2, 3, 4 etc.

30 columns in distinct sorted groups of 2 (870 combinations) takes me like 15 minutes to get the answer but 30 columns in groups of 3 (24360 combinations) takes hours and my pc is still working.

Is there any other way?

You can just count the number of commas (separators) and add 1 to understand how many items you have in each list. Then aggregate by the counts.

See the example below:

delimited-count.morph (3.5 KB)