Troubles with Asynchronous code flows in JavaScript, and the async/await solution of ES-2017

Date: Sat Jun 17 2017
Asynchronous coding is one of those powerful tools that can bite if you're not careful. Passing around anonymous so-called "callback" functions is easy, and we do it all the time in JavaScript. Asynchronous callback functions are called when needed, sometime in the future. The key result is that code executes in an order that differs from its order of appearance in the source.

NOTE: This is an excerpt from Asynchronous JavaScript with Promises, Generators and async/await, a new book that deeply studies asynchronous programming techniques in JavaScript.

For years, JavaScript's primary use-case has been event handlers in web pages, a form of asynchronous coding. The term AJAX describes asynchronous JavaScript execution associated with retrieving data from a server. In Node.js the whole runtime library is built around asynchronous callbacks invoked by asynchronous events. It is a powerful model, indeed, but there's a slippery slope down which it's easy to slide into a tangled mess of unmaintainable code.

Without asynchronous execution, JavaScript's single-threaded model would be a major performance problem. For example, on a web page that's to be dynamically filled with data retrieved from a server, each retrieval would block the whole browser until that data arrived from the server. The more data retrievals, the longer the browser is blocked. Today's world of "single-page applications" simply would not work, because such apps are constantly retrieving data from web services. Fortunately, JavaScript's model supports asynchronous code by way of these anonymous functions.

Browser events occur when they occur, trigger associated event handlers, executing event handler callbacks in a corresponding order. Likewise, a web service (for Node.js) asynchronously distributes incoming events to their handlers with the events dictating which callbacks are invoked in what order. This is useful and powerful, but often the programmer ends up struggling to organize the resulting code so it can be understood/maintained/debugged.

We start out by writing benign code like these snippets:


// Browser
$('.day').click(function() {
   window.location = "com_day_graph";
});
// Node.js
const fs = require('fs');
...
fs.readFile('example.txt', 'utf8', function(err, txt) {
    if (err) {
        console.error(err); // report the error
    } else {
        console.log(txt);   // do something with the text
    }
});

Each of those is pretty simple, and the idea is straightforward. You pass an anonymous function to another function so that the other function can call back to your code, hence the name "callback function".
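To make the "called sometime in the future" point concrete, here is a minimal sketch using setTimeout. The function name runDemo and the log labels are made up for illustration. The only point is that the callback body runs after the code that follows it in the source.

```javascript
// runDemo is a hypothetical name, used only for this illustration.
function runDemo(done) {
    const order = [];
    order.push('before');
    // setTimeout schedules the callback; it runs only after the
    // currently executing code has finished.
    setTimeout(function() {
        order.push('inside callback');
        done(order);
    }, 0);
    order.push('after');
}

runDemo(function(order) {
    console.log(order.join(' -> '));
    // prints: before -> after -> inside callback
});
```

Even with a delay of zero, the statement after the setTimeout call executes before the callback does.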

As your programming confidence grows you do more and more with callbacks. You might need to tack a new bit of functionality onto the end of an existing event handler, requiring a new callback function. Or processing a certain task may require several steps, all of which are asynchronous. Sooner or later you end up in Callback Hell, or what some call the Pyramid of Doom.

The first circle of Callback Hell is entered when you write a callback function and need to invoke another, and think to yourself it's far easier to just implement the second callback function right there, nested inside the first. JavaScript is perfectly fine with you nesting stuff inside other stuff. We do it all the time.

Then one day you must write a task where, before reading a file you must first verify the directory exists (using fs.stat), then read the directory (fs.readdir), look through the file list for the one or more files to read, check that each is readable (fs.stat again), and finally read each file (fs.readFile), and then do something with the file content like store it in a database (db.query). Each stage of that means a callback function, and might look like this:


fs.stat(dirname, function(err, stats) {
    if (err) console.error(err); // handle the error
    else {
        fs.readdir(dirname, function(err, filez) {
            if (err) console.error(err); // handle the error
            else {
                async.eachSeries(filez,
                function (filenm, next) {
                    fs.stat(path.join(dirname, filenm),
                                function(err, stats) {
                        if (err) next(err);
                        else {
                            // check stats
                            fs.readFile(path.join(dirname, filenm),
                                        'utf8', function(err, text) {
                                if (err) next(err);
                                else {
                                    db.query('INSERT INTO ... ', [ text ],
                                                function(err, results) {
                                        if (err) next(err);
                                        else {
                                            // do something with results,
                                            // indicate success, then move on
                                            next();
                                        }
                                    });
                                }
                            });
                        }
                    });
                },
                function(err) {
                    if (err) {
                        console.error(err);
                        // do something else
                    } else {
                        // all files were processed; carry on from here
                    }
                });
            }
        });
    }
});

This can get quite daunting. Because the fs.stat and fs.readFile and db.query operations are asynchronous, we cannot throw a simple for loop around the middle portion of this. We need a looping construct which can handle asynchronous execution. The async module has a large variety of asynchronous execution constructs, with async.eachSeries acting like a for loop executing the iteration function once for each entry in the provided array.
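To show what such a looping construct has to do, here is a minimal sketch of a serial asynchronous loop. The eachSeries function below is a hypothetical, simplified stand-in written for this illustration. It is not the async module's implementation, and the timer-based operation is an assumption that replaces real file or database calls.

```javascript
// Hypothetical, simplified stand-in for async.eachSeries.
// It is NOT the async module's implementation.
function eachSeries(items, iteratee, finished) {
    let index = 0;
    function step(err) {
        if (err) return finished(err);                     // stop at the first error
        if (index >= items.length) return finished(null);  // every item is done
        const item = items[index++];
        iteratee(item, step);                              // step acts as "next"
    }
    step();
}

// Usage: each simulated operation starts only after the previous one ends.
const seen = [];
eachSeries([30, 10, 20], function(ms, next) {
    setTimeout(function() {
        seen.push(ms);
        next();
    }, ms);
}, function(err) {
    if (err) console.error(err);
    else console.log(seen); // items appear in array order, not in order of delay
});
```

The next iteration begins only when the current iteration calls next, which is exactly what a plain for loop cannot express with callback-based operations.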

Each step in this operation is asynchronous, and therefore requires another level of asynchronous callback functions. After a while you're going to get lost in error handlers and in making sure the results bubble up to the right place.

The problem with this code example is that you have a sequence of steps, but those steps are expressed as nested function calls. The code is not written as a series of steps, and therefore the structure is all wrong for what your code describes. While it works, it's all sideways and kerflunky. Since it's a sequence of steps, shouldn't it be written as a sequence in your code?

This is the kind of code that works, but you don't want to touch it, and you hope that no bugs will be reported against it. But what if the Marketing Department wants a new feature: that every Thursday the code must insert tributes to the Norse God Thor because, well, Thursday is Thor's Day? Then a few weeks later they want Friday's tribute to be to fish Friars ... because. In other words, this code example is already a careful balancing act, and adding (or subtracting) anything is risky. Even fixing a bug is risky.

The generalized problem is this:


db.query('SELECT * FROM foobar', function(err, results) {
    if (err) {
        // errors arrive here
    } else {
        // data arrives here
    }
});
// We'd prefer both errors and data to arrive here, but they cannot:
// they are trapped inside the scope of the callback function

The very design of asynchronous coding in JavaScript leads us in an inconvenient direction. Errors and data end up in the wrong place. To compensate, the code inevitably takes on that pyramid shape.

Over the years programmers have developed various strategies to deal with the problem. Carefully chosen programming strategies help a lot. Douglas Crockford's book JavaScript: The Good Parts is a great example of that approach. From beginning to end it is full of carefully chosen strategies to enhance the good parts of JavaScript while ignoring the so-called "bad parts". In other cases libraries provide useful, well-tested asynchronous programming constructs one can easily use. Those are useful in the way a bandage helpfully stops the bleeding when you're cut. The bleeding might be staunched, but there is still pain.

One symptom is that the intent of our code is obscured. As an example, consider looping over a data structure with an asynchronous operation on each element. You can't do this with a simple loop, because the JavaScript we grew up with doesn't synchronize the loop with asynchronous code. Various libraries are available to make this easy, but the code is not expressed as a simple loop. Instead it's expressed as a library function call with several callback functions representing different stages of loop execution.
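A small sketch shows why the simple loop falls short. The slowDouble function is a hypothetical asynchronous operation simulated with a timer, and naiveLoop is a made-up name. The loop starts every operation and finishes before any callback has run.

```javascript
// Hypothetical asynchronous operation: reports its result via a callback.
function slowDouble(n, callback) {
    setTimeout(() => callback(null, n * 2), 10);
}

function naiveLoop(numbers, done) {
    const log = [];
    let pending = numbers.length;
    for (const n of numbers) {
        // This only STARTS the operation; the loop does not wait for it.
        slowDouble(n, (err, doubled) => {
            log.push('result ' + doubled);
            if (--pending === 0) done(log);
        });
    }
    // Reached before any of the callbacks above has been invoked.
    log.push('loop finished');
}

naiveLoop([1, 2, 3], log => console.log(log));
// 'loop finished' is the first entry; the results are logged after it
```

The for loop is done long before the results exist, so any code placed after it cannot use them.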

For example, with the async module an asynchronous do {..} while() loop structure might look like this:


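let count = 0;
async.doWhilst(
    next => {
        // ... asynchronous operation goes here
        count++;
        next(); // call next when done
    },
    () => {
        // Return true to run another iteration, false to stop.
        // (Synchronous test as in async 2.x; in async 3.x the test
        // receives a callback instead.)
        return count < 5;
    },
    err => {
        if (err) console.error(err.stack);
        else {
            // ... do something with the result
        }
    }
);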

While this is orderly, the intention is obscured. Instead of the simple loop structure everyone knows, we have a function call. The reader must know the library in order to interpret this function call, and to understand that it implements a loop.

Currently, the JavaScript language is being reworked to address the asynchronous programming problem. New language features, ushered in by the ECMAScript ES-2015, ES-2016, and subsequent revisions to JavaScript, make it easier to write asynchronous JavaScript code that's more reliable than older techniques. In this book we'll explore these new features, and how they solve (er... mitigate) these problems.

The Promise object came with ES-2015 and goes a long way towards simplifying most asynchronous patterns. Using Promises, it is easy to code a "Promise Chain" that is relatively flat, is written as a series of steps, and is much easier to understand. You're still left with the problem of errors and data arriving in places other than where they belong. But the code is more orderly and easier to understand.
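As a rough illustration of such a chain, consider the sketch below. The fetchUser and fetchOrders functions are hypothetical callback-style operations simulated with timers, and toPromise is a small helper written for this example rather than any library's API.

```javascript
// Hypothetical callback-style functions standing in for real I/O.
function fetchUser(id, callback) {
    setTimeout(() => {
        if (id > 0) callback(null, { id: id, name: 'user' + id });
        else callback(new Error('invalid id: ' + id));
    }, 10);
}

function fetchOrders(user, callback) {
    setTimeout(() => {
        callback(null, [user.name + '-order-1', user.name + '-order-2']);
    }, 10);
}

// Wrap a Node.js-style callback function so it returns a Promise instead.
function toPromise(fn) {
    return function(...args) {
        return new Promise((resolve, reject) => {
            fn(...args, (err, data) => {
                if (err) reject(err);
                else resolve(data);
            });
        });
    };
}

const fetchUserP = toPromise(fetchUser);
const fetchOrdersP = toPromise(fetchOrders);

// The chain reads top to bottom as a series of steps.
function orderCount(id) {
    return fetchUserP(id)
        .then(user => fetchOrdersP(user))
        .then(orders => orders.length);
}

orderCount(1)
    .then(count => console.log('orders:', count))
    .catch(err => console.error(err.message)); // one handler for any step's error
```

The nesting is gone, but the data still arrives inside the .then callbacks and the error inside the .catch callback, not in the surrounding code.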

Generators, another ES-2015 feature, are complex. Used properly, they let us write asynchronous code that looks like the synchronous code we'd write in other languages. When a Generator function is wrapped with the co library, the asynchronous magic happens inside it, allowing us to write clear code which matches our intention.
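The idea can be sketched without the library. The run function below is a hypothetical, minimal stand-in for the kind of work co does and should not be read as co's actual code. It resumes the Generator each time a yielded Promise settles. The delay and addLater helpers are also assumptions made for the example.

```javascript
// Hypothetical, minimal stand-in for what a library like co does.
function run(generatorFn, ...args) {
    return new Promise((resolve, reject) => {
        const gen = generatorFn(...args);
        function advance(method, value) {
            let result;
            try {
                // Resume the generator with a value, or throw an error into it.
                result = gen[method](value);
            } catch (err) {
                return reject(err);
            }
            if (result.done) return resolve(result.value);
            // Wait for the yielded Promise, then resume the generator.
            Promise.resolve(result.value).then(
                v => advance('next', v),
                e => advance('throw', e)
            );
        }
        advance('next');
    });
}

// Hypothetical asynchronous operation: resolves with value after ms.
const delay = (ms, value) =>
    new Promise(resolve => setTimeout(() => resolve(value), ms));

function* addLater(a, b) {
    const x = yield delay(10, a); // looks synchronous, but is not
    const y = yield delay(10, b);
    return x + y;
}

run(addLater, 2, 3).then(sum => console.log(sum)); // prints 5
```

Inside the Generator, each yield pauses the function until the Promise settles, so results land in ordinary local variables and a rejected Promise can be caught with an ordinary try/catch.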

The "async/await" proposal will, if approved, extend the JavaScript language with a similar facility. The programmer declares an asynchronous function by putting the async keyword in front of the function declaration. Inside an async function, the await keyword lets you easily write code which appears to be synchronous, while under the covers Promise objects are used to manage asynchronous code execution. The result is an amazing breath of fresh air. By the end of this book you should be breathing deeply the async/await magic. In the meantime Generators, coupled with the co library, perform most of the async/await magic today.
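Here is a minimal sketch of how this addresses the db.query problem described earlier, where errors and data were trapped inside a callback. The db object is a hypothetical stand-in for a callback-style database client, and queryAsync and countRows are names invented for this example.

```javascript
// Hypothetical stand-in for a callback-style database client.
const db = {
    query(sql, callback) {
        setTimeout(() => {
            if (/foobar/.test(sql)) callback(null, [{ id: 1 }]);
            else callback(new Error('no such table'));
        }, 10);
    }
};

// Wrap the callback API in a Promise so that it can be awaited.
function queryAsync(sql) {
    return new Promise((resolve, reject) => {
        db.query(sql, (err, results) => {
            if (err) reject(err);
            else resolve(results);
        });
    });
}

async function countRows(sql) {
    try {
        const results = await queryAsync(sql);
        // The data arrives here, in the same scope as the call.
        return results.length;
    } catch (err) {
        // Errors arrive here, in the same function.
        console.error('query failed:', err.message);
        return -1;
    }
}

countRows('SELECT * FROM foobar').then(n => console.log('rows:', n));
```

Both the result and the error are handled in the body of countRows, with ordinary variables and an ordinary try/catch.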

The Promise object, together with Generators and the async/await proposal, is a huge improvement over the older callback-oriented paradigm. It's such a big improvement that the whole JavaScript ecosystem should move to the Promise-based paradigm. This new paradigm makes code easier to read, improves error handling and helps to get rid of nasty “callback trees”. It's not a panacea, of course, and comes with some baggage of its own, but the Promise-based paradigm is so compelling it's worth the effort to make the change.

The earlier directory-reading example could be rewritten like so:


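// Assumes fs, path and db are loaded, and that fs has been promisified, for
// example with Bluebird's Promise.promisifyAll(fs), which adds the *Async methods.
async function doAsynchronously(dirname) {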
    var stats = await fs.statAsync(dirname);
    if (!stats.isDirectory()) {
        throw new Error("Not a directory: "+ dirname);
    }
    var filez = await fs.readdirAsync(dirname);
    for (var filenm of filez) {
        var filestats = await fs.statAsync(path.join(dirname, filenm));
        if (!filestats.isFile()) {
            // maybe instead, log an error and continue
            throw new Error("Not a file: "+ filenm);
        }
        var text = await fs.readFileAsync(path.join(dirname, filenm), 'utf8');
        await new Promise((resolve, reject) => {
            db.query('INSERT INTO ... ', [ text ],
            function(err, results) {
                if (err) reject(err);
                else {
                    // do something with results,
                    // indicate success
                    resolve();
                }
            });
        });
    }
}

What a breath of fresh air!

This landscape is discussed in a new book: Asynchronous JavaScript with Promises, Generators and async/await. The book goes deeply into several ways to tame asynchronous coding in JavaScript. The above is an excerpt from the first chapter.