Node.js: How Web Can Benefit from Non-Blocking I/O

A programming language which has been around web development for a long time is now revolutionizing the status quo with a new face: JavaScript.

After years of being considered fit only for the browser and for client-side programming, it has now arrived on the server side through Node.js. This gives us a solid new alternative for web application development: a full-stack language that can set a new bar for application performance and scalability.

How is this possible with a scripting language that many people (including me) have considered just another toy language? That’s what this blog post sets out to explain, written by a long-time supporter of the compiled-languages gang.

Google V8 makes JavaScript more than just a “toy” language

While JavaScript is by all means a dynamic language, labeling JavaScript running on Node.js as a fully interpreted language deserves a second thought. Google V8, the same JS engine used in the Google Chrome browser [7], is designed for fast JavaScript execution, and its strategies include, among others, generating machine code for JS at runtime [8]. So, while using the dynamic features of JS still carries some time penalty, it’s not crazy to expect performance from Node.js that approaches that of compiled languages. And this is just the start.

JavaScript callbacks, or how to deal with a single-threaded event loop

At the heart of any programming language lies a handful of concepts that shape the code written in it; thanks to their semantic power, they sometimes provide much more than syntactic sugar. In JavaScript, this is the case for one of the most used and most visible features of the language: functions are first-class citizens [1]. It shows up in one of the most common constructs on almost every web page nowadays, namely how to execute JS code when a page is ready (using jQuery):

$(document).ready (function() {
     // Here goes the code you want to be executed when the page and DOM is ready
});
Notice how a function is declared at the exact point where it is passed as a parameter to a different function. This or a similar construct can be written in almost any programming language, but callbacks sit at the heart of JS code much more than in other languages, especially in the web development stack. This is hardly a coincidence: the language was designed (at Netscape, in 1995) to run inside web browsers [1]. That environment has a single thread of execution with an event loop, so blocking on any operation, e.g. I/O, was off the table, because it would block ALL code executing on the current page. The language therefore makes it easy to write code in an asynchronous fashion, which must have been a primary goal of its design.
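To make the single-threaded event loop a bit more concrete, here is a minimal sketch that runs under Node.js. The helper name runLater and the order array are made up for this illustration; only setTimeout and console.log are standard.

```javascript
// A function is a value: it can be stored, passed around, and called later.
var order = [];

// Hypothetical helper: hands the callback to the event loop.
function runLater(callback) {
    // setTimeout only queues the callback; it cannot run until the
    // code that is currently executing has finished.
    setTimeout(callback, 0);
}

order.push('before');
runLater(function () {
    order.push('callback');
    console.log(order.join(' -> ')); // prints "before -> after -> callback"
});
order.push('after');
```

Even with a delay of zero, the callback runs last: the single thread has to finish the current code before the event loop can pick up the next queued function.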

Non-Blocking I/O comes into the scene

As you might already know, performing an I/O operation in most programming languages blocks execution until a response is received. Accessing a network resource, reading a file from disk, waiting for keyboard input or even displaying data on a screen temporarily blocks the current thread of execution. The operating system can schedule other processes in the meantime, but the blocked process itself does *nothing* until the operation completes. This is known as blocking I/O, and we have been living with it for decades. Web server land is no different: every time a file is read, a DB query is executed or a network resource is retrieved, the thread handling the request sits idle waiting for the I/O operation to complete instead of doing something else in the meantime.
Non-blocking I/O looks to overcome this situation by processing I/O asynchronously. Instead of blocking while waiting for an I/O operation to complete, the program defines a *callback* to be triggered when the operation completes, and the CPU is free to attend to other events or code in the meantime. This is by no means exclusive to JavaScript; it has been possible in Java and .NET for a long time.
However, Node.js is explicitly designed to use non-blocking I/O as its main strategy for building high-performance, real-time web applications. Picking JavaScript as the language to do so was a straightforward decision, given where JavaScript comes from and how deeply asynchronous programming is rooted in it. And while not every web application qualifies as real-time, regular web applications with demanding performance requirements can benefit from the same strategies.
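As a rough illustration of the difference, the sketch below reads a file with the non-blocking fs.readFile from the Node.js standard library. The scratch file name and the events array are assumptions made only to keep the example self-contained.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical scratch file, created only so the example can run anywhere.
var file = path.join(os.tmpdir(), 'nonblocking-demo.txt');
fs.writeFileSync(file, 'file contents');

var events = [];

// Non-blocking read: the call returns at once, and the callback runs
// later, when the data is available.
fs.readFile(file, 'utf8', function (error, text) {
    if (error) {
        return console.error(error.message);
    }
    events.push('read finished: ' + text);
    console.log(events.join('\n'));
    fs.unlinkSync(file); // clean up the scratch file
});

// This line runs before the file has been read.
events.push('read requested, doing other work');
```

The blocking counterpart, fs.readFileSync, would hold the only thread until the data arrived, so nothing else could be handled during that time.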

So… What’s the big deal about this?

Before we get into more complex territory, let’s take a look at some results and benchmarks published in recent years. Probably the most notable and most discussed case study came from PayPal, who used Node.js to build their account overview application, one of the most trafficked applications on a very high-traffic website. For that reason, and as a means of risk mitigation, they also built a backup Java version of the same application in parallel and shared their results in a blog post [2], comparing the performance of the Java application running on five cores with the Node.js application running on a single core. This is the graphic shown in that blog post:

Graphic comparison of the Java and Node.js web applications with the same features, as shown in [2].

Besides, the PayPal development team claims that they were able to build the Node.js version of this application almost twice as fast with fewer people, using 33% fewer lines of code and 40% fewer files than its Java counterpart [2]. Similar claims can be found in [3] and [6].

Of course, all of this has been controversial and has generated a lot of discussion online ([4], and the comments section of [6]), but that hasn’t prevented big players like Yahoo, Microsoft, LinkedIn and eBay from backing parts of their server platforms with Node.js [5].

There is enough evidence to predict that a good share of future web applications will be built in Node.js, and they will require knowledgeable developers to maintain and support them.

Enough of benchmarks… Show me the code

Instead of starting with a common web task for our Hello World example, we are going to use a console version, which is much more enlightening (at least it was for me).

This is what an improved version of Hello World, which asks the user to type their name for the greeting, looks like in Ruby:

puts "What's your name? "

puts "Hello, " << $stdin.readline
There is no surprise here. A message appears in the console asking us to type our name, and program execution holds until we press Enter, when it finally prints a personalized greeting. Let’s take a look at the Node.js version.

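// Register a one-time handler for the first chunk of input
process.stdin.once('data', function(data) {
     process.stdout.write('Hello, ' + data.toString());
     // Stop reading so the program can exit
     process.stdin.pause();
});
process.stdout.write("What's your name? ");
process.stdin.resume();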
Whoa! What is this code doing? If you run it, you’ll see it behaves exactly like its Ruby counterpart. However, several items need explaining:
1) stdin is an input stream whose default state is paused; it needs to be resumed before it delivers any data. (Recent Node.js versions resume the stream automatically when a ‘data’ listener is attached, so the explicit resume() call is harmless but optional there.)
2) ‘once’ is a method that adds a one-time listener for an event; the listener is removed automatically after its first execution.
3) ‘data’ is the event emitted by input streams in Node.js when new data is available.
If your experience is like mine, you will need to read this code at least twice to grasp what it’s doing. We are creating an asynchronous handler for reading a value from the keyboard, and then setting everything up for that event to occur.
The program ends when no event handlers remain and the stdin stream goes back to the paused state.
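The self-removing behavior of ‘once’ is not specific to stdin. The small sketch below shows it on a plain EventEmitter from the Node.js standard library; the names source and received are invented for this example.

```javascript
var EventEmitter = require('events').EventEmitter;

var source = new EventEmitter();
var received = [];

// 'once' registers a listener that is removed after its first call.
source.once('data', function (chunk) {
    received.push(chunk);
});

source.emit('data', 'first');  // the listener runs and is then removed
source.emit('data', 'second'); // no listener is left, so nothing happens

console.log(received);                     // prints [ 'first' ]
console.log(source.listenerCount('data')); // prints 0
```

This is why the Hello World program above greets only once: after the first line of input, no handler is left to react to further data.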
But let’s move to more common ground for web development: performing a query against a DB server.
This is what a direct query to a WordPress database would look like in PHP:
<?php
$mysqli = new mysqli('host', 'user', 'pass', 'wordpress');

// Stop early if the connection could not be established
if ($mysqli->connect_errno) {
    printf("Connection error: %s\n", $mysqli->connect_error);
    exit(1);
}

// The script blocks here until the database returns the result set
if ($result = $mysqli->query('SELECT * FROM wp_posts')) {
    while ($row = $result->fetch_array()) {
        echo 'Post title: ' . $row['post_title'] . "\n";
    }
    $result->close();
}

$mysqli->close();
And this is what it would look like in Node.js:
var mysql = require('mysql');
var connection = mysql.createConnection('mysql://user:pass@host/wordpress');
// The callback runs once the connection attempt has finished
connection.connect(function(error) {
     if (error) {
          return console.error(error.message);
     }
     var queryString = 'SELECT * FROM wp_posts';
     // The callback runs once the rows have arrived
     connection.query(queryString, function(err, rows, fields) {
          if (err) throw err;
          for (var i = 0; i < rows.length; i++) {
               console.log('Post title: ', rows[i].post_title);
          }
          connection.end();
     });
});
It’s important to notice how things are executed as callbacks, passed first to the connect function and then to the query function. While I/O is in progress for both the connect and query operations, the CPU is left available to attend to other events in the Node.js event loop.
What we’re seeing here is a completely different programming model.
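To see the event loop attend to another event while a connection is pending, here is a sketch that needs no database at all. fakeConnect and fakeQuery are hypothetical stand-ins for a driver, with timers simulating network latency; they are not part of the mysql module.

```javascript
// Hypothetical stand-ins for a database driver; timers simulate latency.
function fakeConnect(callback) {
    setTimeout(function () { callback(null); }, 20);
}

function fakeQuery(sql, callback) {
    setTimeout(function () {
        callback(null, [{ post_title: 'Hello world!' }]);
    }, 20);
}

var log = [];

fakeConnect(function (error) {
    if (error) {
        return console.error(error.message);
    }
    log.push('connected');
    fakeQuery('SELECT * FROM wp_posts', function (err, rows) {
        if (err) {
            return console.error(err.message);
        }
        log.push('rows received: ' + rows.length);
        console.log(log.join('\n'));
    });
});

// An unrelated event that fires while the "connection" is still pending.
setTimeout(function () {
    log.push('other event handled');
}, 5);

log.push('connect requested');
```

The log ends up in the order: connect requested, other event handled, connected, rows received: 1. The unrelated timer is served in the gap where a blocking program would simply have been waiting for the connection.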

The market will demand higher and higher performance

In recent years, enterprise web applications and internet startup web applications have gone somewhat different ways. Apps in the former category have been built with more traditional, time-tested technology, and have only shifted when new technology helped them improve runtime performance and scalability, even if the development process is not as rapid as with newer web scripting technology. Those in the latter category look to provide the best user experience, which of course includes site responsiveness and quick page loads, while keeping time-to-market to a minimum, as that’s critical for bootstrapping any startup business. Node.js, which as we’ve seen has captured the attention of big enterprise players, offers features that represent the best of both worlds, even if that means adopting a different programming model. But that’s a price a lot of companies will be happy to pay if it allows them to set a new record in their page load times. Once quick page loads become the norm, trying to sell slow web pages will be like selling a Pentium III processor in today’s world of fast multicore processors, like those that serve even the smallest web apps.
-----
This post was written by our Sr. Dev Arístides Castillo, and here are his references: