The Step of the Conductor

There have been several async management libraries proposed and written. I'm guilty of at least three of them. The reason for this proliferation of code is that they're all trying to solve a very real problem with writing non-trivial applications that make heavy use of async callbacks.

Parallel and Serial

Most of the libraries to date help with two common patterns of function use: parallel execution and serial execution. In parallel execution you fire off several asynchronous functions and want a common callback to be called when they all finish. The serial pattern is when you have a chain of steps, each of which can't execute till the previous one is done. Combining these two patterns gives some pretty flexible uses of async functions without excessive boilerplate or nesting.
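To make the two patterns concrete, here is a minimal sketch of each in plain JavaScript. The helper names parallel, series and delayed are made up for this illustration and are not taken from any of the libraries discussed here; real libraries handle more edge cases.

```javascript
// Hypothetical helpers for illustration only.

// Parallel: start every task at once, call done(err, results) when all finish.
function parallel(tasks, done) {
  var results = [];
  var pending = tasks.length;
  var failed = false;
  if (pending === 0) return done(null, results);
  tasks.forEach(function (task, index) {
    task(function (err, value) {
      if (failed) return;
      if (err) {
        failed = true;
        return done(err);
      }
      results[index] = value; // keep task order, not finish order
      pending -= 1;
      if (pending === 0) done(null, results);
    });
  });
}

// Serial: run tasks one after another, feeding each the previous value.
function series(tasks, done) {
  var index = 0;
  function next(err, value) {
    if (err) return done(err);
    if (index === tasks.length) return done(null, value);
    var task = tasks[index];
    index += 1;
    task(value, next);
  }
  next(null, undefined);
}

// Usage: a fake async task that yields `value` after `ms` milliseconds.
function delayed(value, ms) {
  return function (callback) {
    setTimeout(function () { callback(null, value); }, ms);
  };
}

parallel([delayed("a", 20), delayed("b", 10)], function (err, results) {
  console.log(results); // [ 'a', 'b' ]
});
```

Note that the parallel results come back in the order the tasks were listed, even though the second task finishes first.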

Step

A very small library that I've been using for these simple cases is based on the idea from Will Conant's flow-js. I simplified the idea down to its core and made a few small assumptions to make it easier to use with node's error handling pattern. I call it step.

Here is a snippet of using Step in the wheat blogging engine I'm working on:

step1.js
function loadArticle(name, callback) {
  var props;
  Step(
    function readFile() {
      Git.readFile(path.join("articles", name + ".markdown"), this);
    },
    function getAuthor(err, markdown) {
      // A thrown error is handed to the next step as its err argument
      if (err) throw err;
      props = markdownPreParse(markdown);
      props.name = name;
      loadAuthor(props.author, this);
    },
    function finish(err, author) {
      // Last step: report back through the outer callback
      if (err) return callback(err);
      props.author = author;
      callback(null, props);
    }
  );
}

In this example, I pass three steps as functions to the Step helper. The first two end in a call to an asynchronous function. I pass the value this as the callback. This hooks into Step's system so that it knows to call the next step when the current one is done. The parameters given to the callback are passed through to the next step. Notice that I created a closure variable props. This is so that the third step has access to the props defined in the second step, but not passed through by the loadAuthor call. The third step then does some final processing and calls the main callback to the outer function.

In essence loadArticle is a composite asynchronous function that has two other asynchronous function calls mixed with synchronous logic within it.
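If passing this as a callback feels like magic, a stripped-down runner shows the idea. This is only a sketch of the mechanism and not Step's actual source; miniStep and fakeRead are names I made up for the illustration.

```javascript
// Hypothetical runner in the spirit of Step. Not the real implementation.
function miniStep() {
  var steps = Array.prototype.slice.call(arguments);

  function next() {
    var fn = steps.shift();
    if (!fn) {
      // No steps left: don't silently swallow an error.
      if (arguments[0]) throw arguments[0];
      return;
    }
    var result;
    try {
      // Each step runs with `this` bound to `next`, so handing `this` to an
      // async function makes that function's callback start the next step.
      result = fn.apply(next, arguments);
    } catch (e) {
      return next(e); // a thrown error becomes the next step's err
    }
    // A synchronous return value is forwarded like callback(null, value).
    if (result !== undefined) next(null, result);
  }

  next();
}

// Usage with a fake async read.
function fakeRead(name, callback) {
  setTimeout(function () { callback(null, "contents of " + name); }, 10);
}

miniStep(
  function read() {
    fakeRead("a.txt", this);
  },
  function upper(err, text) {
    if (err) throw err;
    return text.toUpperCase();
  },
  function show(err, text) {
    if (err) return console.error(err);
    console.log(text); // CONTENTS OF A.TXT
  }
);
```

The same function serves as callback, error channel and return-value channel, which is why a step can end in an async call, a throw, or a plain return.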

How about an example that makes use of the parallel feature of Step:

step2.js
// Reads the authors in the authors directory and passes a data structure
// to the callback
function loadAuthors(callback) {
  Step(
    function getFileNames() {
      Git.readDir("authors", this);
    },
    function readFileContents(err, results) {
      if (err) throw err;
      var parallel = this.parallel;
      results.files.forEach(function (filename) {
        var name = filename.replace(/\.markdown$/, '');
        loadAuthor(name, parallel());
      });
    },
    function parseFileContents(err) {
      if (err) return callback(err);
      var authors = {};
      // arguments[0] is err; the rest are the loaded authors
      Array.prototype.slice.call(arguments, 1).forEach(function (author) {
        authors[author.name] = author;
      });
      callback(null, authors);
    }
  );
}

This example is similar, but with the addition of the this.parallel function. Each call to parallel generates a new callback and increments an internal counter in the Step system, so Step knows how many callbacks to wait for. Though it's hard to see with this example, the arguments to parseFileContents are first a single err and then the second argument to each of the loadAuthor callbacks.

Perhaps this example will be clearer:

step3.js
var Step = require('step');

Step(
  function loadData() {
    Git.getTags(this.parallel());
    loadAuthors(this.parallel());
  },
  function renderContent(err, tags, authors) {
    if (err) return response.simpleText(500, err.stack);
    var data = {}; // Truncated for clarity
    renderTemplate('index', data, this);
  },
  function showPage(err, content) {
    if (err) return response.simpleText(500, err.stack);
    render(request, response, {
      title: "Index",
      content: content
    });
  }
);

This is the route handler for the front page of the blog. It needs data from two different async calls and can't render the main template till they're loaded. Then after the main template is rendered, the layout can be rendered. Both Git.getTags and loadAuthors output two arguments, but their error arguments are compressed into a single err. If both emitted errors, the latter would overwrite the former.
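The counting behind this.parallel, and the way the errors collapse into one, can be sketched in a few lines. This is a hypothetical illustration, not Step's implementation; makeGroup is an invented name, and the sketch assumes every callback is created before any of them fires.

```javascript
// Hypothetical sketch of the counting idea behind this.parallel().
// Assumes all callbacks are created before any of them is called.
function makeGroup(done) {
  var pending = 0;      // callbacks handed out but not yet called
  var results = [null]; // slot 0 is the shared err, then one slot per callback

  return function parallel() {
    var slot = results.length;
    results.push(undefined); // reserve a slot so order matches creation order
    pending += 1;
    return function (err, value) {
      if (err) results[0] = err; // a later error overwrites an earlier one
      results[slot] = value;
      pending -= 1;
      if (pending === 0) done.apply(null, results);
    };
  };
}

// Usage: the slower call was registered first, so its value comes first.
var parallel = makeGroup(function (err, first, second) {
  console.log(err, first, second); // prints: null slow fast
});
setTimeout(parallel(), 20, null, "slow");
setTimeout(parallel(), 10, null, "fast");
```

Each callback's second argument lands in its own slot, while every err shares slot 0, which is exactly why only one error survives.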

More Advanced Patterns

You'll notice in these patterns that there are a fair number of hacks to fit the cases where the logic isn't exactly parallel or serial. The closure variables are a kind of limited-scope global. The repeated error handling code is redundant. Wouldn't it be nice if we could specify which output went to what input and chain arbitrary flows?

Conductor is born!

The other night, while talking with tmpvar (Elijah Insua), we decided it would be great to make a system that could calculate arbitrary control flows when given a set of dependencies. A few productive hours later conductor was born.

Instead of shoe-horning a problem into a preset pattern to make it easier on the computer, why don't we just explain the problem to the computer and let it figure out how to handle it for us?

Loading an Article

The example from above that uses Step could be rewritten to use Conduct (the function exported by the conductor library):

conductor1.js
var Conduct = require('conductor');

// Define the loadArticle function using Conduct from conductor.
var loadArticle = Conduct({
  M: ["_1", function loadMarkdown(name, callback) {
    // Async function that loads the contents of the markdown file.
    var filename = path.join("articles", name + ".markdown");
    Git.readFile(filename, callback);
  }],
  P: ["_1", "M1", function parseMarkdown(name, markdown) {
    // Sync function that parses the markdown and adds the name property
    var props = markdownPreParse(markdown);
    props.name = name;
    return props;
  }],
  A: ["P1", function getAuthor(props, callback) {
    // Async function that loads the author based on props.author.
    // Named getAuthor so it doesn't shadow the outer loadAuthor it calls.
    loadAuthor(props.author, callback);
  }],
  F: ["P1", "A1", function finalize(props, author) {
    // Final sync function that attaches the author object.
    props.author = author;
    return props;
  }]
}, "F1");

At first glance this looks like a classic case of over-engineering. For this simple case you'd be right, but we're keeping it simple for purposes of explanation.
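To give a feel for what "let the computer figure it out" means, here is a tiny dependency-driven runner. The task format (needs and run) and the name runGraph are invented for this sketch; it is not conductor's API or its implementation, and it assumes every task is async and takes a callback last.

```javascript
// Hypothetical mini dependency runner, for illustration only.
// Each task lists the named values it needs; it starts as soon as they exist.
// If a dependency can never be satisfied, done is simply never called.
function runGraph(tasks, inputs, done) {
  var values = {};
  Object.keys(inputs).forEach(function (key) { values[key] = inputs[key]; });
  var started = {};
  var remaining = Object.keys(tasks).length;
  var failed = false;
  if (remaining === 0) return done(null, values);

  function ready(name) {
    return tasks[name].needs.every(function (dep) { return dep in values; });
  }

  function launch() {
    Object.keys(tasks).forEach(function (name) {
      if (started[name] || !ready(name)) return;
      started[name] = true;
      var args = tasks[name].needs.map(function (dep) { return values[dep]; });
      args.push(function (err, value) {
        if (failed) return;
        if (err) {
          failed = true;
          return done(err);
        }
        values[name] = value;
        remaining -= 1;
        if (remaining === 0) return done(null, values);
        launch(); // the new value may unblock other tasks
      });
      tasks[name].run.apply(null, args);
    });
  }

  launch();
}

// Usage: props needs both the original name and the loaded markdown.
runGraph({
  markdown: {
    needs: ["name"],
    run: function (name, callback) { callback(null, "# " + name); }
  },
  props: {
    needs: ["name", "markdown"],
    run: function (name, markdown, callback) {
      callback(null, { name: name, body: markdown });
    }
  }
}, { name: "intro" }, function (err, values) {
  console.log(values.props); // { name: 'intro', body: '# intro' }
});
```

There are no closure variables and only one place where errors are handled; the order of execution falls out of the declared dependencies.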

There is much to explain about the conductor library, so in an effort to get this article out this year, I'll end here. It's fully functional, but needs some serious documentation. Look for more in a future article.

The true power of conductor will be realized when tmpvar finishes his visual interface to it. For now, read the commented code and have fun.

Conclusion

So which is better, and why do I have three async libraries of my own? Well, I think that's just a testament to the fact that there is no one library that fits all use cases perfectly. Also, I've started to dive into the world of node Streams and this opens a whole new can of worms. Expect future articles about node streams now that node v0.1.90 is out!

I tend to use Step mostly in my projects because it fits well with my style. For some fun working examples of Step check out the source to my new blogging engine Wheat.



Showing 9 comments

  • Tzl2000
    conductor.js may contain a bug.
    It is at the end of the performerEngine() function body. (Step 1) First it dispatches the main function's parameters to the dependencies, so some dependencies may be triggered to run and may finish. (Step 2) Then performerEngine() executes the following code snippet:
    -----------------
    localPerformers.forEach(function (performer, name) {
          if (performer.counter === 0 && name !== '_')
    ...});
    -----------------

    Since Step 1 and Step 2 run in parallel, some of localPerformers[0..n] may already have been deleted during Step 1, so the if clause
    -------------------------------------------
    if (performer.counter === 0 && name !== '_')
    -------------------------------------------
    may crash.

    So a check should be added as follows:
    ---------------------------------------------------------
    if (performer && performer.counter === 0 && name !== '_')
    ---------------------------------------------------------
  • Nthalk
    Creating state machines like this is complicated and tough to debug. I looked at this example and your code, and came up with a simpler method of multi-event dependency chaining:

    http://github.com/Nthalk/Lnmm/...

    The source file for observable is here:
    http://github.com/Nthalk/Lnmm/...
  • newbie
    Hi, newbie here!
    How is Conductor different from webmachine?
  • Tim Caswell, JavaScript Hacker
    I've never used webmachine, but conductor isn't a web framework at all, it's a control-flow library for callback based programming like in nodejs. So it's an entirely different beast. Node itself is somewhat comparable to webmachine I think.
  • newbie
    It looks like I can build a webmachine flow (as seen in http://bitbucket.org/justin/we...) on top of Conductor.

    There's a webmachine port to nodejs called nodemachine (http://github.com/tautologisti...). As you can see from that code, webmachine looks like a cool use case for Conductor.
  • Alex Fowler
    MAN, YOU RULE!!! I am new to js and to nodejs. Used to synched programming, so had troubles with the asynch one. But you have brought the light and now I see!!!
  • Arran Schlosberg
    A few questions... sorry if I should have been able to figure these out myself but I'm quite new to Step but think I have a decent understanding of it.

    In the conductor example what is done with F1, the second argument passed to Conduct()?

    I'm assuming that "callback" is always passed as the final argument to each of the functions. Is this the equivalent of step's "this"? If so then why not still use "this"?

    I notice that each of the arguments being passed to the particular functions refer to the function that they must be sourced from (e.g. 'A: ["P1"' means pass the return value of P / P's callback to the function). What is the relevance of the 1? What would P2 mean?
  • Arran Schlosberg
    I think I solved my last question. While a function that returns values can only go up to 1 those that utilise the callback can pass as many arguments as they want to.
  • Jon Bomgardner
    I know it's been a couple of months since this was posted but hopefully will see this and help me out......

    I just started using node recently and am trying to use the Step module. My question is if it's possible to nest calls to step? for example:

    file 1:

    exports.init = function(next){
      Step(
        function step1(){},
        function step2(){},
        function step3(){},
        next
      );
    };

    then in file2:

    var m = require('file1');

    Step(
      m.init(this),
      function(){}
    );

    In the above example, I don't want the second part of the Step chain to run until all of m.init() finishes which contains a Step chain. When I try this the second function runs before the functions in m.init().

    Is this possible with this library or am I doing something wrong?

    Thanks,
    Jon Bomgardner
