Listing the topic category structure

Date: Fri Jul 20 2007
With the product data in the database, we want to make sure the categorization is balanced. It's my feeling that each category should have a relatively small number of items in it, so that the user has a short list to look at. The categorization should be used to create categories with small numbers of items.

Since the categorization is primarily being generated algorithmically, this leads us to an iterative process of generating the categorization, looking at the size of the resulting categories, and tweaking the categorization algorithms.

The following is a useful script to list the category/subcategory list along with a count of products in each one. You can readily see which categories have too many items in them, and it also shows you misplaced categories etc. The script also demonstrates how to use Groovy to make database queries and generate a report:


import groovy.sql.Sql;

def db = Sql.newInstance('jdbc:derby:shareasale', '', '', 'org.apache.derby.jdbc.EmbeddedDriver');

db.eachRow('SELECT DISTINCT Category, SubCategory  FROM products ORDER BY Category') { 
    def cat = it.Category;
    def subcat = it.SubCategory
    try {
    db.eachRow("SELECT COUNT(*) AS count FROM products WHERE Category = ${cat} AND SubCategory = ${subcat}") {
        println cat+": "+subcat + " (${it.count})"
    }
    } catch (Exception e) {
        println "SELECT ${cat}: ${subcat} produced ${e.toString()}"
    }
}

Unfortunately I was unable to make the two database queries into one.