Supergroup.js

Supergroup brings extreme convenience and understandability to the manipulation of JavaScript data collections, especially in the context of D3.js visualization programming.

As if in submission to the great programmer’s commandment, Don’t Repeat Yourself, every time I find myself writing a piece of code that solves basically the same problem I’ve solved a dozen times before, a little piece of my soul dies.

Utilities for grouping record collections into maps or nests abound: d3.nest, d3.map, Underscore.groupBy, Underscore.Nest, to name a few. But after these tools relieve us of a certain amount of repetitive stress, we’re often left with a tangle of hairy details that fill us with a dreadful sense of déjà vu. Supergroup may seem like the kind of tacky wonder gadget you’d find on a late-night Ronco ad, but, for the low, low price of free, it makes data-centric JavaScript programming fun again. And when you find yourself in a D3.js callback routine holding a datum object that might have come from anywhere (for instance, in a tooltip callback used on disparate object types), everything you want to know about your object and its associated metadata and records is right there at your fingertips.

Just to be clear about the problem, you start with tabular data from a CSV file, a SQL query, or some AJAX call:

Some very fake hospital data in a CSV file…

...turned into a canonical array of Objects (using d3.csv, for instance)

Without Supergroup, you’d group the records on the values of one or more fields with a standard grouping function, giving you data like:

d3.nest().key(function(d) { return d.Physician; }).key(function(d) { return d.Unit; }).map(data)

d3.nest().key(function(d) { return d.Physician; }).key(function(d) { return d.Unit; }).entries(data)

To my mind, these are awkward data structures (not to mention the awkwardness of the calling functions). The map version looks OK in the console, but D3 wants data in arrays, not as objects. The entries version gives us arrays of key/value pairs, but on upper levels values is another array of key/value pairs, while on the bottom level values is an array of records. In both entries and map, you can’t tell from a node at any level which dimension was being grouped at that level.
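To make that complaint concrete, here is a small sketch of the entries shape. It does not call D3: nestEntries is a hypothetical plain-JavaScript stand-in that builds the same {key, values} structure, and the sample records are made up (only the field names Physician and Unit come from the text above).

```javascript
// Made-up sample records; only the Physician and Unit field names
// come from the text. Charges is invented for illustration.
var data = [
  { Physician: "Adams", Unit: "preop",  Charges: 100 },
  { Physician: "Adams", Unit: "postop", Charges: 250 },
  { Physician: "Baker", Unit: "preop",  Charges: 75 }
];

// Hypothetical stand-in that builds the same {key, values} shape as
// d3.nest().key(...).key(...).entries(data). This is not D3's own code.
function nestEntries(records, fields) {
  if (fields.length === 0) { return records; } // bottom level: raw records
  var field = fields[0];
  var groups = [];
  var index = Object.create(null); // key -> position in groups
  records.forEach(function (rec) {
    var key = String(rec[field]);
    if (!(key in index)) {
      index[key] = groups.length;
      groups.push({ key: key, values: [] });
    }
    groups[index[key]].values.push(rec);
  });
  groups.forEach(function (group) {
    group.values = nestEntries(group.values, fields.slice(1));
  });
  return groups;
}

var entries = nestEntries(data, ["Physician", "Unit"]);

// Upper level: values holds more {key, values} pairs.
console.log(Object.keys(entries[0].values[0]).join(","));           // key,values
// Bottom level: values holds the original records.
console.log(Object.keys(entries[0].values[0].values[0]).join(",")); // Physician,Unit,Charges
// Nothing on a node says which field it was grouped on.
console.log(Object.keys(entries[0]).join(","));                     // key,values
```

Any code that walks this structure has to know which level it is on before it can tell what values contains.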

Supergroup gives you almost everything you’d want for every item in your nest (or in your single array if you have a one-level grouping):

Supergroup


Works as an Underscore (or Lo-Dash) mixin:



A plain Array of Strings, enhanced with children and records

_.supergroup(data, fieldname) returns an array whose elements are the distinct values of <fieldname> in the original data records. These elements, or Values, can be String or Number objects (Dates to be implemented eventually). Each Value holds a .records property, which is an array containing the subset of original records matching that Value.
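The idea of a String object that carries its own records can be sketched in a few lines of plain JavaScript. This is only an illustration of the concept, not Supergroup's implementation: groupByField is a hypothetical helper, the sample records are made up, and the sketch handles strings only (no Number objects).

```javascript
// Hypothetical helper, not Supergroup's code: group records by one field
// and return String objects that each carry their matching records.
function groupByField(records, fieldname) {
  var values = [];
  var seen = Object.create(null); // raw value -> String object
  records.forEach(function (rec) {
    var raw = String(rec[fieldname]);
    if (!(raw in seen)) {
      // new String(...) makes an object, so it can hold extra properties.
      seen[raw] = new String(raw);
      seen[raw].records = [];
      values.push(seen[raw]);
    }
    seen[raw].records.push(rec);
  });
  return values;
}

// Made-up sample records.
var sample = [
  { Physician: "Adams", Unit: "preop" },
  { Physician: "Baker", Unit: "preop" },
  { Physician: "Adams", Unit: "postop" }
];

var physicians = groupByField(sample, "Physician");
console.log(physicians.join(", "));        // Adams, Baker
console.log(physicians[0].records.length); // 2
console.log(physicians[0] + " MD");        // Adams MD
console.log(typeof physicians[0]);         // object
console.log(physicians[0] == "Adams");     // true
console.log(physicians[0] === "Adams");    // false
```

The last two lines show the usual catch with String objects: they behave like strings in most contexts, but strict equality against a primitive string fails, so convert with String(value) when a plain string is needed.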

In the example below we do a multi-level grouping by Physician and Unit. So sg = _.supergroup(data,['Physician','Unit']) returns a list of physicians (the top-level grouping). The first item in this list, sg[0], is “Adams”, a String object. sg[0].records is an array containing the records where Physician=“Adams”. sg[0].children is a list of the Units (our second-level grouping) in the records where Physician=“Adams”. sg[0].children[0].records would be the subset of records where Physician=“Adams” and Unit=“preop”.

Supergroup on physician and unit
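A rough sketch of how the multi-level result hangs together: each Value's .records is regrouped on the next field to produce its .children. groupLevels is a hypothetical helper written for this illustration, not the library's code, and every record here is made up apart from the names "Adams" and "preop" used in the text.

```javascript
// Made-up records; Patient is an invented field.
var hospital = [
  { Physician: "Adams", Unit: "preop",  Patient: 1 },
  { Physician: "Adams", Unit: "preop",  Patient: 2 },
  { Physician: "Adams", Unit: "postop", Patient: 3 },
  { Physician: "Baker", Unit: "preop",  Patient: 4 }
];

// Hypothetical helper: group on fields[0], then recurse into each
// group's own records for the remaining fields.
function groupLevels(records, fields) {
  var seen = Object.create(null);
  var values = [];
  records.forEach(function (rec) {
    var raw = String(rec[fields[0]]);
    if (!(raw in seen)) {
      seen[raw] = new String(raw);
      seen[raw].records = [];
      values.push(seen[raw]);
    }
    seen[raw].records.push(rec);
  });
  if (fields.length > 1) {
    values.forEach(function (value) {
      value.children = groupLevels(value.records, fields.slice(1));
    });
  }
  return values;
}

var sg = groupLevels(hospital, ["Physician", "Unit"]);
console.log(String(sg[0]));                    // Adams
console.log(sg[0].records.length);             // 3
console.log(sg[0].children.join(", "));        // preop, postop
console.log(sg[0].children[0].records.length); // 2
```

Unlike the nested key/value pairs shown earlier, every node has the same shape: a name, its records, and (above the bottom level) its children.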

It does a bunch more I still need to document.


Everything below is old documentation I’m trying to replace

var gradeBook = [
    {lastName: "Gold",    firstName: "Sigfried", class: "Remedial Programming",           grade: "C", num: 2},
    {lastName: "Gold",    firstName: "Sigfried", class: "Literary Posturing",             grade: "B", num: 3},
    {lastName: "Gold",    firstName: "Sigfried", class: "Documenting with Pretty Colors", grade: "B", num: 3},
    {lastName: "Sassoon", firstName: "Sigfried", class: "Remedial Programming",           grade: "A", num: 3},
    {lastName: "Androy",  firstName: "Sigfried", class: "Remedial Programming",           grade: "B", num: 3} 
];
var byLastName = _.supergroup(gradeBook, "lastName"); // an Array of Strings:  ["Gold","Sassoon","Androy"]
byLastName[0].records; // Array of Sigfried Gold's original 3 records
byLastName.rawValues(); // Array of native strings (easier to look at or use in contexts where you need a plain string)
var byName = _.supergroup(gradeBook, function(d) { return d.firstName + ' ' + d.lastName; });
// an Array of Strings:  ["Sigfried Gold","Sigfried Sassoon","Sigfried Androy"]
byName.lookup("Sigfried Gold").records.pluck("num").mean(); //  2.6666666666666665 

The above example shows how Supergroup can chain Underscore methods (and mixins), functionality it gets from underscore-unchained.
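For comparison, here is roughly what that chained lookup/pluck/mean expression computes, written as a plain-JavaScript sketch with no Supergroup or Underscore involved. The variable names are mine, and the records are a trimmed copy of gradeBook.

```javascript
// Trimmed copy of the gradeBook records above (only the fields used here).
var book = [
  { lastName: "Gold",    firstName: "Sigfried", num: 2 },
  { lastName: "Gold",    firstName: "Sigfried", num: 3 },
  { lastName: "Gold",    firstName: "Sigfried", num: 3 },
  { lastName: "Sassoon", firstName: "Sigfried", num: 3 },
  { lastName: "Androy",  firstName: "Sigfried", num: 3 }
];

// lookup("Sigfried Gold").records: the records whose computed name matches.
var goldRecords = book.filter(function (d) {
  return d.firstName + " " + d.lastName === "Sigfried Gold";
});

// pluck("num"): pull one field out of each record.
var nums = goldRecords.map(function (d) { return d.num; });

// mean(): sum divided by count.
var mean = nums.reduce(function (sum, n) { return sum + n; }, 0) / nums.length;

console.log(nums.join(", ")); // 2, 3, 3
console.log(mean);            // 2.6666666666666665
```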

var byClassGrade = _.supergroup(gradeBook, ["class", "grade"]); // Array of top-level groups: ["Remedial Programming", "Literary Posturing", "Documenting with Pretty Colors"]
byClassGrade[0].children; // Children of a single group: ["C", "A", "B"]
byClassGrade[0].records; // Array of original records for a single group
byClassGrade.lookup("Remedial Programming"); // lookup a top-level group by name
byClassGrade.lookup(["Remedial Programming","B"]); // lookup a second-level group by name path
byClassGrade.lookup(["Remedial Programming","B"]).namePath(' -> '); // "Remedial Programming -> B"
byClassGrade.lookup(["Remedial Programming","B"]).dimPath(); // "class/grade"

Supergroup can flatten a tree into an array of nodes much like D3’s hierarchy layout, but in a way that’s easier to use IMHO.

byClassGrade.flattenTree(); // ["Remedial Programming", "C", "A", "B", "Literary Posturing", "B", "Documenting with Pretty Colors", "B"]
byClassGrade.flattenTree().invoke('namePath'); // ["Remedial Programming", "Remedial Programming/C", "Remedial Programming/A", "Remedial Programming/B", "Literary Posturing", "Literary Posturing/B", "Documenting with Pretty Colors", "Documenting with Pretty Colors/B"]
// only want leaf nodes?
byClassGrade.leafNodes().invoke('namePath'); // ["Remedial Programming/C", "Remedial Programming/A", "Remedial Programming/B", "Literary Posturing/B", "Documenting with Pretty Colors/B"]
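The traversal behind flattening, leaf selection, and name paths can be sketched as a pre-order walk over nodes that know their children and parent. Everything here is an assumption made for illustration: makeNode, flatten, leaves, and namePath are hypothetical stand-ins, the parent and children property names are my choice, and the tree is a hand-built copy of the class/grade grouping above.

```javascript
// Hypothetical node constructor: a String object with children and
// a parent link set on each child.
function makeNode(name, children) {
  var node = new String(name);
  node.children = children || [];
  node.children.forEach(function (child) { child.parent = node; });
  return node;
}

// Hand-built copy of the class/grade tree from the example above.
var tree = [
  makeNode("Remedial Programming", [makeNode("C"), makeNode("A"), makeNode("B")]),
  makeNode("Literary Posturing", [makeNode("B")]),
  makeNode("Documenting with Pretty Colors", [makeNode("B")])
];

// Pre-order walk: each node, followed by all of its descendants.
function flatten(nodes) {
  return nodes.reduce(function (out, node) {
    return out.concat([node], flatten(node.children));
  }, []);
}

// Leaf nodes are the flattened nodes that have no children.
function leaves(nodes) {
  return flatten(nodes).filter(function (node) {
    return node.children.length === 0;
  });
}

// Collect names from this node up to the root, then join them top-down.
function namePath(node, delim) {
  var names = [];
  for (var n = node; n; n = n.parent) { names.unshift(String(n)); }
  return names.join(delim === undefined ? "/" : delim);
}

console.log(flatten(tree).join(", "));
// Remedial Programming, C, A, B, Literary Posturing, B, Documenting with Pretty Colors, B
console.log(leaves(tree).map(function (n) { return namePath(n); }).join(", "));
// Remedial Programming/C, Remedial Programming/A, Remedial Programming/B, Literary Posturing/B, Documenting with Pretty Colors/B
console.log(namePath(tree[0].children[2], " -> ")); // Remedial Programming -> B
```

Keeping a parent link on every node is what makes a full path available from any single node, which is handy when a callback hands you one datum with no surrounding context.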