
Supergroup.js

Supergroup brings extreme convenience and understandability to the manipulation of JavaScript data collections, especially in the context of D3.js visualization programming.

As if in submission to the great programmer's commandment--Don't Repeat Yourself--every time I find myself writing code that solves basically the same problem I've solved a dozen times before, a little piece of my soul dies.

Utilities for grouping record collections into maps or nests abound: d3.nest, d3.map, Underscore.groupBy, and Underscore.Nest, to name a few. But after these tools relieve us of a certain amount of repetitive stress, we're often left with a tangle of hairy details that fills us with a dreadful sense of deja vu. Supergroup may seem like the kind of tacky wonder gadget you'd find on a late-night Ronco ad, but, for the low, low price of free, it makes data-centric JavaScript programming fun again. And when you find yourself in a D3.js callback routine holding a datum object that might have come from anywhere--for instance, in a tooltip callback used on disparate object types--everything you want to know about your object and its associated metadata and records is right there at your fingertips.

Just to be clear about the problem--you start with tabular data from a CSV file, a SQL query, or some AJAX call:

Some very fake hospital data in a CSV file...

    tabulate(d3.select('pre#csv'), data, ['Patient','Patient Age','PatientVisit','Date','Time','Unit','Physician','Charge','Copay','Insurance','Inpatient']); // # run

...turned into a canonical array of objects (using d3.csv, for instance):

    data; // #   render result.replace(/{/g,'\n   {').replace(/]/,'\n]');

Without Supergroup, you'd group the records on the values of one or more fields with a standard grouping function, giving you data like:


    d3.nest().key(function(d) { return d.Physician; })
        .key(function(d) { return d.Unit; })
        .map(data);  // # show render indent2

or


    d3.nest().key(function(d) { return d.Physician; })
        .key(function(d) { return d.Unit; })
        .entries(data);  // # show render indent2 result.replace(/,\n/g, ", ").replace(/("key"., )/g,"$1\n").replace(/,   /g, ", ")

To my mind, these are awkward data structures (not to mention the awkwardness of the calling functions). The map version looks OK in the console, but D3 wants data in arrays, not as objects. The entries version gives us arrays of key/value pairs, but on upper levels values is another array of key/value pairs, while on the bottom level values is an array of records. In both entries and map, you can't tell from a node at any level what dimension was being grouped at that level.
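
To make the shape problem concrete, here is a minimal sketch in plain JavaScript that imitates the structure of the entries output without using D3. The nestEntries helper and the three sample records are invented for illustration; they are not part of D3 or Supergroup, and the real d3.nest may differ in details such as key ordering.

```javascript
// Hypothetical stand-in that imitates the nested `entries` shape.
// nestEntries is NOT a D3 function; it only mimics the output structure.
function nestEntries(records, dims) {
  if (dims.length === 0) return records;      // bottom level: raw records
  var dim = dims[0], rest = dims.slice(1);
  var order = [], buckets = Object.create(null);
  records.forEach(function (rec) {
    var key = String(rec[dim]);
    if (!(key in buckets)) { buckets[key] = []; order.push(key); }
    buckets[key].push(rec);
  });
  return order.map(function (key) {
    return { key: key, values: nestEntries(buckets[key], rest) };
  });
}

// Invented sample records, reusing field names from the hospital data.
var sample = [
  { Physician: 'Adams', Unit: 'ICU', Charge: 100 },
  { Physician: 'Adams', Unit: 'ER',  Charge: 50 },
  { Physician: 'Baker', Unit: 'ICU', Charge: 75 }
];

var nested = nestEntries(sample, ['Physician', 'Unit']);
var top = nested[0];          // the 'Adams' node
var bottom = top.values[0];   // the 'ICU' node under 'Adams'

// Upper level: values holds more key/values nodes.
console.log('values' in top.values[0]);     // true
// Bottom level: values holds raw records instead.
console.log('Charge' in bottom.values[0]);  // true
// Neither node says which dimension its key came from.
console.log(Object.keys(top));              // [ 'key', 'values' ]
```

Code that walks this structure has to know its depth in advance to tell whether values contains groups or records, and it has to carry the list of dimension names separately.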

Supergroup gives you almost everything you'd want for every item in your nest (or in your single array if you have a one-level grouping):

  • An array of the values grouped on (could be strings, numbers, or dates) (Basics)
  • The records associated with each group
  • Parents of nested groups (Dimension Names and Paths)
  • Immediate child groups if any
  • All descendant groups (Retrieving sets of values)
  • Only descendant groups at the leaf level
  • For a group at any level, the name of the dimension (attribute, column, property, etc.) grouped on
  • Path of group names from root to current group
  • Path of group dimension names from root to current group
  • Aggregate calculations on records for that group and its descendants (Aggregates)
  • Ability to look up specific values (Finding specific values)
  • Any of these in a format D3 or some other tool expects (Using Supergroup for D3 hierarchy layouts)
  • Ability to include records in multiple groups if appropriate (Multi-valued Groups)
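
As a rough picture of what a few of the items above mean in practice, the following plain-JavaScript sketch builds group nodes that carry their own dimension name, records, parent, and children. It is only an illustration under stated assumptions: groupTree, pathOf, sumOf, the property names, and the sample rows are all hypothetical, and none of this is Supergroup's actual implementation or API.

```javascript
// Illustrative only: NOT Supergroup's implementation or API.
// groupTree, pathOf, sumOf and every property name here are hypothetical.
function groupTree(records, dims, parent) {
  if (dims.length === 0) return [];
  var dim = dims[0], byKey = Object.create(null), nodes = [];
  records.forEach(function (rec) {
    var key = String(rec[dim]);
    if (!(key in byKey)) {
      byKey[key] = { value: rec[dim], dim: dim, parent: parent || null, records: [] };
      nodes.push(byKey[key]);
    }
    byKey[key].records.push(rec);
  });
  // Recurse so each node holds its immediate child groups.
  nodes.forEach(function (node) {
    node.children = groupTree(node.records, dims.slice(1), node);
  });
  return nodes;
}

// Path of group values from the root down to this node.
function pathOf(node) {
  var path = [];
  for (var n = node; n; n = n.parent) path.unshift(n.value);
  return path;
}

// Sum a numeric field over every record in the node.
function sumOf(node, field) {
  return node.records.reduce(function (total, rec) {
    return total + rec[field];
  }, 0);
}

// Invented sample rows, reusing field names from the hospital data.
var rows = [
  { Physician: 'Adams', Unit: 'ICU', Charge: 100 },
  { Physician: 'Adams', Unit: 'ER',  Charge: 50 },
  { Physician: 'Baker', Unit: 'ICU', Charge: 75 }
];

var groups = groupTree(rows, ['Physician', 'Unit']);
var adams = groups[0];
var adamsIcu = adams.children[0];

console.log(adams.dim);               // 'Physician'
console.log(pathOf(adamsIcu));        // [ 'Adams', 'ICU' ]
console.log(sumOf(adams, 'Charge'));  // 150
```

Because every node knows its dimension, parent, and records, a callback that receives a single node can answer these questions without any outside context.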