Chapter 7. Important External Modules

Although the Node core is extremely useful, many of its abstractions are very low-level. So a lot of development in Node is done using higher abstraction libraries built by the community, similar to how Ruby-based websites use Rails or Sinatra rather than custom-written Ruby code. Although these modules aren’t technically Node itself, they are extremely important for getting things done, and many of them are mature projects in themselves. This chapter explores some of the most popular and useful community modules for Node.

Express

Express, an MVC framework for Node, is probably the most widely used Node module. It was inspired by the Sinatra framework for Ruby and enables a lot of features that make it very easy to throw together a website with Node.

A Basic Express App

Express works by defining page handlers for routes. The routes can be as simple as a path, or much more complex. The handlers could be as simple as emitting “Hello, world” or as complex as a whole page-rendering system that interacts with a database. You’ll need to install Express using npm install express before you can start using it. Example 7-1 shows how to create a simple application with Express.

Example 7-1. Creating a simple Express app
var express = require('express');

var app = express.createServer();

app.get('/', function(req, res) {
  res.send('hello world');
});

app.listen(9001);

This code is obviously pretty similar to http in terms of creating a server. However, a few things are a lot more straightforward. First, app.get() is creating a response to a specific route—in this case, '/'. Unlike a regular http server, which provides a listener for generic requests, Express offers a listener for specific HTTP verbs. So get() will answer only GET requests, put() will answer only PUT requests, etc. Combine that with the route we specified, and you immediately have some powerful functionality. A typical Express program specifies a series of expressions, and Express matches the route in each incoming request against each expression in turn, executing the code associated with the first expression that matches.

It is possible to have Express skip over expressions under certain conditions, using the next() function discussed later in this section.

The next thing to notice in the example is how we responded. We still use the response object as in http, but Express has provided a send() method. We didn’t need to provide any HTTP headers or call end(). The send() method figures out things such as the HTTP headers that should be sent and includes end() automatically.

The point here is that Express takes the basic structure laid out by http and enriches it significantly with a lot of functionality to create real applications quickly. You shouldn’t have to create routing code every time you want to deal with HTTP requests, so Express takes care of that stuff.

Setting Up Routes in Express

Routes are one of the core concepts in Express, and one of the things that make it really useful. As mentioned in the previous section, routes are applied to an HTTP verb via a method with the same name, such as get() or post(). The routes consist of a simple string or a regex and can contain variable declarations, wildcards, and optional key flags. Let’s take a look at some examples, starting with Example 7-2.

Example 7-2. Route with variable and optional flag
var express = require('express');
var app = express.createServer();

app.get('/:id?', function(req, res) {
  if(req.params.id) {
    res.send(req.params.id);
  } else {
    res.send('oh hai');
  }
});

app.listen(9001);

This example shows a route that includes an optional variable called id. The variable name does not have any special meaning to Express, but it will be available to use inside the callback function. In Express routes, you use a preceding colon (:) to mark a variable you want preserved. The string passed in the URL will be captured into the variable. All routes in Express are actually turned into regular expressions (more on this later) and tokenized[16] for use by application code.[17] The regex used will match up to the next known token in your route. Notice that this variable is also optional. If you run this program and go to http://localhost:9001, you’ll just get “oh hai” back because you did not put a slash after the port, and the variable part of the route was optional. If you append anything else (so long as you don’t include another /), you’ll get it back as your response body; matching the id token, it will be stored in req.params.id.

Express routes will always treat / as a token, but they will also treat it as optional if it terminates the request. So our route /:id? will match localhost, localhost/ localhost/tom, and localhost/tom/, but not localhost/tom/tom.

Routes can also use wildcards, as shown in Example 7-3. (*) will match anything except the token following it (nongreedy regex matching).

Example 7-3. Using wildcards in routes
app.get('/a*', function(req,res) {
  res.send('a');
  //matches /afoo /a.bar /a/qux etc.
});

app.get('/b*/c*d', function(req,res) {
  res.send('b');
  //matches /b/cd /b/cfood /b//c/d/ etc.
  //does not match /b/c/d/foo
});

app.get('*', function(req, res) {
  res.send('*');
  //matches /a /c /b/cd /b/c/d /b/c/d/foo
  //does not match /afoo /bfoo/cbard
});

When you use a wildcard to make routes, any tokens between the wildcards must match, unless they are optional. Wildcards are often used for things such as filenames containing periods (.). It’s also important to notice that unlike in many regular expression languages, * does not mean zero or more characters; it means one or more characters. A forward slash (/) can be considered a character when matching with wildcards.

Another important thing to note is that routes are ordered. Multiple routes can match a given URL, but only the first one that matches will trigger the associated activity. This means that the order in which routes are defined is very significant. In the previous example, the general wildcard will catch everything that wasn’t already caught by a previous route, even though it matches all of them.

You can also use regexes to define routes (Example 7-4). If you do this, router won’t process the regex any further. Because you still might want to get variables out of the URL, you can use captures to define them.

Example 7-4. Using a regex to define a route
var express = require('express');
var app = express.createServer();

app.get(/\/(\d+)/, function(req, res) {
  res.send(req.params[0]);
});

app.listen(9001);

In this example, the regex will match only URLs that start with a number (\d matches any digit, and the + allows one or more to match). This means that / will not match, but /12 will. However, the regex checking uses RegExp.match(), which finds a regex inside a larger string. This means that /12abc will also match. If you want to make sure that a regex represents the complete route, use the $ token at the end of the regex, such as /\/(\d+)$/. $ checks for the end of the line, so the regex will match only if it terminates. You probably want to keep the default Express behavior of loosely matching a / at the end of URLs, though. Do this with \/?$ instead of just $, to allow an optional / at the end of the string.

Notice how we accessed the capture in our regex in Example 7-4. If you use a regex for your route, you can use req.params as an array to access the captures as variables. This also works when router converts your route to a regex, but you probably want to use the variable names in that case, as we showed earlier. You can also use regex to make better-named variables in routes by constraining what will match that variable, as in Example 7-5.

Example 7-5. Use regex to be more specific about variable types
var express = require('express');
var app = express.createServer();

app.get('/:id(\\d+)', function(req, res) {
  res.send(req.params[0]);
});

app.listen(9001);

This example constrains the id parameter to numbers by asking route to match only numbers with the regex \d+. The capture will still be exposed as req.params.id, but it will match only if the regex matched. Because the regex is highly flexible, you can use this technique to capture or restrict URL matching to pretty much anything while still getting named variables to use. Remember to escape any backslash (\) you use in JavaScript strings. (This was not necessary in Example 7-4, because it used a regular expression directly rather than inside a string.)

Sometimes there are multiple routes that match a URL that you want to use in various circumstances. We’ve already seen that the order in which routes are defined is significant in determining which will be selected. However, it is possible to pass control back to the next route if some criteria isn’t met (Example 7-6). This is a great option for a number of scenarios.

Example 7-6. Passing control to another route
app.get('/users/:id', function(req, res, next){
  var id = req.params.id;

  if (checkPermission(id)) {
    // show private page
  } else {
    next();
  }
});

app.get('/users/:id', function(req, res){
  // show public user page 
});

We’ve added another argument to the function that handles the routes. The next argument tells the router middleware (we’ll discuss middleware shortly in more depth) to call the next route. The argument is always passed to the callback, but this example is the first where we choose to name and use it. In this case, we can check the id to see whether the user has permission to view the private version of this page, and if not, send her to the next route, which has the public version.

This combines really well with app.all(), the method that describes all HTTP verbs. As Example 7-7 demonstrates, we can capture across a range of HTTP verbs and routes, apply some logic, and then pass control onto more specific routes.

Example 7-7. Using app.all( ) to select multiple HTTP verbs and routes and then pass control back
var express = require('express');

var app = express.createServer();

var users = [{ name: 'tj' }, { name: tom }];

app.all('/user/:id/:op?', function(req, res, next){
  req.user = users[req.params.id];

  if (req.user) {
    next();
  } else {
    next(new Error('Cannot find user with ID: ' + req.params.id));
  }
});

app.get('/user/:id', function(req, res){
  res.send('Viewing ' + req.user.name);
});

app.get('/user/:id/edit', function(req, res){
  res.send('Editing ' + req.user.name);
});

app.put('/user/:id', function(req, res){
  res.send('Updating ' + req.user.name);
});

app.get('*', function(req, res){
  res.send('Danger, Will Robinson!', 404);
});

app.listen(3000);

This example is similar to Example 7-6, in that we are validating whether a user exists before passing on control. However, we are not doing this only for all the subsequent routes; we are also doing it across all HTTP verbs. Normally when only one route matches, this doesn’t make any difference, but it’s important to note how you can pass state between routes.

When the req.user attribute is added in the app.all() method, it is available in all the subsequent methods because the middleware owns the request object. When each callback is fired, the variable .req is really a pointer to the request object owned by the middleware, and any changes to the request object are visible to every other function and route using the middleware.

Example 7-8 shows how a file extension can be made either optional or mandatory within a specific range. In the first get(), the :format parameter is optional (as denoted by the question mark), so Express will respond to requests for a user by ID, regardless of which format has been requested. It is up to the programmer to capture the formats (JSON, XML, text, etc.) via a switch statement in order to do special processing.

In the second example, the :format parameter looks for json or xml as predefined file types. If those are not found, the book request will not be processed, regardless of whether the :id parameter is valid. This gives us greater control over which requests are responded to and ensures that only formats for which a view can be generated are available to respond.

Example 7-8. Optional and required route extensions
var express = require('express');
var app = express.createServer();

app.get('/users/:id.:format?', function(req, res) {
  res.send(req.params.id + "<br/>" + req.params.format);
  // Responds to:
  // /users/15
  // /users/15.xml
  // /users/15.json
});

app.get('/books/:id.:format((json|xml))', function(req, res) {
  res.send(req.params.id + "<br/>" + req.params.format);
  // Responds to:
  // /books/7.json
  // /books/7.xml
  // But NOT to:
  // /books/7
  // /books/7.txt
});

app.listen(8080);

Handling Form Data

Most examples have demonstrated the GET verb, but Express is built to support RESTful architecture in the style of Ruby on Rails. Using hidden fields inside web forms, you can indicate whether a form’s intention is to PUT (replace data), POST (create data), DELETE (remove data) or GET (retrieve data). See Example 7-9.

Example 7-9. Handling forms using Express
var express = require('express');
var app = express.createServer();

app.use(express.limit('1mb'));
app.use(express.bodyParser());
app.use(express.methodOverride());

app.get('/', function(req, res) {
  res.send('<form method="post" action="/">' +
           '<input type="hidden" name="_method" value="put" />' +
           'Your Name: <input type="text" name="username" />' +
           '<input type="submit" />' +
           '</form>');
});

app.put('/', function(req, res) {
  res.send('Welcome, ' + req.body.username);
});

app.listen(8080);

This simple application demonstrates the use of a form. First, an Express application is created and configured to use the bodyParser() and methodOverride() functions. The bodyParser() function parses the request body sent by the web browser and translates form variables into objects usable by Express. The methodOverride() function allows the hidden _method variable in form posts to override the GET method in favor of the RESTful method types.

The express.limit() function instructs Express to limit the length of request bodies to 1 MB. This is an important security consideration because otherwise it would be possible to send a large post to the application to be processed by bodyParser(), making it very easy to launch a denial-of-service (DoS) attack.

Be sure to call methodOverride() after bodyParser(). Otherwise, the form variables will not be processed when Express checks to see whether it should be responding to a GET or some other command.

Template Engines

Clearly, it isn’t practical to continue writing HTML directly in application code. For starters, it is unreadable and unmaintainable; but more importantly, it is bad form to mix application logic with presentation markup. Template engines allow developers space to focus on how to present information to the user—often in different formats, such as screen or mobile—and inject specific data separately from processing.

Express is minimalist and does not come with built-in template engines, opting instead for community-supported modules. Some of the more popular engines are Haml, Jade, Embedded Javascript (EJ), CoffeeKup (a CoffeeScript-based engine), and jQuery templates.

In Example 7-10, an application is set up to render a simple Jade template.

Example 7-10. Using a basic Jade template in Express
var express = require('express');
var app = express.createServer();

app.get('/', function(req, res) {
  res.render('index.jade', { pageTitle: 'Jade Example', layout: false });
});

app.listen(8080);

To run this example, you will need to install the Jade template engine:

npm install jade

The first thing to notice is the lack of any reference to the Jade library. Express parses the view template’s filename and uses the extension (in this case, the jade from index.jade) to determine which view engine should be used. Therefore, it is possible to mix and match different view engines into the same project. You are not limited to using only Jade or only CoffeeKup, for example; you can use both.

This example passes two arguments into the render function. The first is the name of the view to display, and the second contains options and variables needed for the rendering. We’ll come back to the filename in a minute. There are two variables passed into the view in this example: pageTitle and layout. The layout variable is interesting in this case because it is set to false, which instructs the Jade view engine to render the contents of index.jade without first going through a master layout file (more on this later).

pageTitle is a local variable that will be consumed by the contents of the view. It represents the point of templating: whereas the HTML is specified mostly in index.jade file, that file has a placeholder named pageTitle where Jade will plug in the value we provide.

The file (index.jade) from the first parameter needs to be placed in the views folder (/views/index.jade) and looks like Example 7-11.

Example 7-11. A basic Jade file for Express
!!! 5
html(lang="en")
  head
    title =pageTitle
  body
    h1 Hello, World
    p This is an example of Jade.

After Jade plugs in the value for pageTitle that we supplied, the page renders as:

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Jade Example</title>
  </head>

  <body>
    <h1>Hello, World</h1>
    <p>This is an example of Jade.</p>
  </body>
</html>

The Jade template aims to make the page more succinct by paring down the markup to the bare minimum. Instead of the closing tags you may be accustomed to in HTML, Jade uses indentation to communicate position within the page’s hierarchy, resulting in a clean and generally easy-to-read file.

The very first line, "!!! 5", identifies the content type as HTML5, manifesting as an HTML5 doctype in the resulting output. The default document types supported by Jade are 5, xml, default (which is XHTML 1.0 Transitional), transitional (the default), strict, frameset, 1.1, basic, and mobile. You can supply your own, though, such as doctype html PUBLIC "-//W3C//DATA XHTML Custom 1.10a//DE".

Look in the title tag on the fourth line of the Jade input. The string =pageTitle is interpreted by Jade as “insert the contents of the variable named pageTitle here.” In the resulting output, this becomes Jade Example, the value provided by the previous application code.

As we mentioned, there are many other templating options, each of which does essentially what Jade does, but with different syntax and conventions.

Layouts and partial views

Layouts allow views to share common structural elements in your site, providing an even greater separation of content and data. By standardizing parts of the layout, such as navigation, header, and footer, you can focus your development efforts on the actual content for each view.

Example 7-12 takes the view engine concept already discussed and turns it into a “real” website.

Example 7-12. Defining global template engines in Express
var express = require('express');
var app = express.createServer();

app.set('view engine', 'jade');

app.get('/', function(req, res) {
  res.render('battlestar')
});

New to this example is the set command on the “view engine” parameter. The Jade view engine will now be considered the default by Express, although it is still possible to override it in the render method.

The render function is markedly different. Because the Jade engine has been set as the default view engine, this example does not need to specify the full filename, so battlestar actually refers to /views/battlestar.jade. The layout: false parameter from Example 7-10 is no longer needed, because this time Express will be making use of this layout file located at views/layout.jade, shown in Example 7-13.

Example 7-13. A Jade layout file in Express
html
  body
    h1 Battlestar Galactica Fan Page
    != body

The layout file is very similar to the view file created earlier, but in this case there is a special body variable. We’re talking here about the != body line; please don’t confuse that with the body keyword near the top of the file. The second body is not the name of a variable passed in through the application code, so where does it come from?

When the layout option is set to true (the default) in Express, the render method works by parsing the contents of the first parameter and passing the rendered output to the layout as a variable called body. The battlestar.jade file looks like Example 7-14.

Example 7-14. A Jade partial view in Express
p Welcome to the fan page.

This is called a partial view because it does not contain the full content needed to generate a page, and it needs to be combined with a layout to become useful output. The final web browser output for all this work looks like this:

<html>
  <body>
    <h1>Battlestar Galactica Fan Page</h1>
    <p>Welcome to the fan page.</p>
  </body>
</html>

Partial views are powerful because they allow developers to focus on the specific content being displayed, rather than the web page as a whole. This means the contents don’t have to be tied to a web page and can be output to mobile web pages, AJAX requests (for in-place page refreshes), and more.

Be careful not to confuse the variable named body, which contains the actual content of your view, with the keyword body, which is an HTML tag used by the web browser.

Middleware

Some of the examples up to this point have included a rather innocuous-looking function: app.use(). This function invokes the Connect library and exposes many powerful tools that make it simple to add functionality. Now it's time to take a step back and examine what all this glue—known as middleware—is, and why it is so important to developing with Express.

Although it might sound like one of those obscure buzzwords that programmers like to use when they want to appear “in the know,” middleware—as we’ve mentioned in previous chapters—refers to a piece of software that acts as a link between two other programs, typically between a higher-level application and a wider network. In the real world, middleware is analogous to the telephone lines you might find in your home or office building. All telephones (applications) connect to the same telephone lines (middleware), which in turn broker communication from the application to the underlying network.

Your phone may or may not support call waiting or voicemail, but the line behaves the same, regardless of which features are available to you. You may have voicemail built into your phone, or it may be provided by your telco (network); in either case, the line itself is happy to support your usage.

Connect provides the middleware functionality used by Express (see Table 7-1). As shown in Figure 7-1, Connect extends Node’s base http module, giving it all of the base capabilities provided by http, upon which it adds its own functionality. Express in turn inherits from Connect, gaining its abilities and, by extension, http’s as well. Any module plugged into Connect is automatically made available to Express. Connect is the middle layer between Express and the network, and as such exposes and uses a myriad of features that may not be used directly by Express, but are available all the same. Finally, because Express derives itself from Connect, most of Connect’s functionality is available directly from Express, allowing you to issue commands such as app.bodyParser() rather than connect.bodyParser().

Express’s middleware stack
Figure 7-1. Express’s middleware stack
Table 7-1. Middleware bundled with Connect
Name Description
basicAuth Accepts a callback function that accepts username and password parameters, then returns true if the credentials are permitted access to the site.
bodyParser Parses the contents of the request body.
compiler Compiles .sass and .less files to CSS and CoffeeScript files to JavaScript.
.cookieParser Parses the contents of cookies sent by the web browser in the request headers.
csrf Provides cross-site request forgery (CSRF) protection by mutating the request through an additional form variable. Requires session and bodyParser middleware.
directory Prints directory listings inside a root path, with options to display hidden files and icons.
errorHandler Traps errors encountered by the application and provides options to log errors to stderr or request output in multiple formats (JSON, plain text, or HTML).
favicon Serves favicon files from memory, with cache control.
limit Limits the size of requests accepted by the server to protect against DoS attacks.
logger Logs requests to output or a file, in multiple formats, either on response (default) or on request. Optional buffer size controls how many requests are collected before writing to disk.
methodOverride Combine with bodyParser to provide DELETE and PUT methods along with POST. Allows for more explicit route definitions; for example, use app.put() rather than detecting the user’s intention from app.post(). This technique enables RESTful application design.
profiler Typically placed before all other middleware, profiler records the response time and memory statistics for requests.
query Parses query strings and populates the req.query parameter.
responseTime Populates the X-Response-Time header with the time (in milliseconds) to generate a response.
router Provides advanced routing (discussed in “Setting Up Routes in Express”)
session The session manager for persisting user data across requests.
static Enables streaming of static files from a root directory. Allows for partial downloads and custom expiry aging.
staticCache Adds a caching layer to the static middleware, keeping the most popular downloaded files in memory for greatly improved response times.
vhost Enables multiple sites for different vhosts on a single machine.

Middleware factories

By now you may have noticed that middleware consists of little more than functions that are executed sequentially by Express. JavaScript closures give us a lot of power to implement the factory pattern[18] inside Node, which can be exploited to provide contextual functionality to your web routes.

Express’s routing functions use internal middleware during their processing cycle, which you can override to add extra functionality—for example, to add custom headers to your HTML output. Let’s look at Example 7-15 and see how we can use a middleware factory to intercept a page request and enforce role-based authentication.

Example 7-15. Middleware factories in Express
var express = require('express');
var app = express.createServer(
  express.cookieParser(),
  express.session({ secret: 'secret key' })
);

var roleFactory = function(role) {
  return function(req, res, next) {
    if ( req.session.role && req.session.role.indexOf(role) != -1 ) {
      next();    } else {
      res.send('You are not authenticated.');
    }
  }
};

app.get('/', roleFactory('admin'), function(req, res) {
  res.send('Welcome to Express!');
});

app.get('/auth', function(req, res) {
  req.session.role = 'admin';
  res.send('You have been authenticated.');
});

app.listen(8080);

Right off the bat, if you visit http://localhost:8080/ you will receive the message “You are not authenticated.” However, if you look at the contents of the route for '/', you will notice that the actual page contents are 'Welcome to Express!'. The second parameter, roleFactory('admin'), launched before the page was displayed and detected that there was no role property in your session, so it output its own message and stopped the page execution.

If you visit http://localhost:8080/auth followed by http://localhost:8080/ you will receive the “Welcome to Express!” message. In this circumstance, the /auth URL attached the 'admin' variable to your session’s role property, so when roleFactory was executed it passed the execution control to next(), which is the app.get('/') function.

Therefore, it could be said that by using internal middleware, we changed the order of execution to:

  1. roleFactory('admin')

  2. app.get('/')

What if we wanted to authenticate based on more than one role? In that case, we could change the route to:

var powerUsers = [roleFactory('admin'),roleFactory('client')];
app.get('/', powerUsers, function(req, res) {
  res.send('Welcome to Express!');
});

Because we passed an array as the middleware, we have limited the page execution to users belonging to the “admin” and “client” roles, and changed the order of execution to:

  1. roleFactory('admin')

  2. roleFactory('client')

  3. app.get('/')

Because each roleFactory demands that the role be present in the session, the user must be both a “client” and an “admin” in order to access the page.

Socket.IO

Socket.IO is a simple little library that’s a lot like Node’s core net library. Socket.IO allows you to send messages back and forth with browser clients that connect with your Node server, using an efficient, low-level socket mechanism. One of the nice things about the module is that it provides a shared interface between the browser and the server. That is, you can write the same JavaScript on both in order to do messaging work once you have a connection established.

Socket.IO is so named because it supports the HTML5 WebSockets standard on browsers that support it (and have it enabled). Fortunately, the library also supports a number of fallbacks:

  • WebSocket

  • WebSocket over Flash

  • XHR Polling

  • XHR Multipart Streaming

  • Forever Iframe

  • JSONP Polling

These options ensure that you’ll be able to have some kind of persistent connection to the browser in almost any environment. The Socket.IO module includes the code to power these connection paths on both the browser and the server side with the same API.

Instantiating Socket.IO is as simple as including the module and creating a server. One of the things that’s a little different about Socket.IO is that it requires an HTTP server as well; see Example 7-16.

Example 7-16. Creating a Socket.IO server
    var http = require('http'), 
    io = require('socket.io');
    
server = http.createServer();
server.on('request', function(req, res){
  //Typical HTTP server stuff
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World');
});

server.listen(80);

var socket = io.listen(server);

socket.on('connection', function(client){
  console.log('Client connected');
});

The HTTP server in this example could do anything. In this case, we simply return “Hello World.” However, Socket.IO doesn’t care what the HTTP server will do; it simply wraps its own event listener around all requests to your server. This listener will look for any requests for Socket.IO’s client libraries and service these requests. It passes on all others to be handled by the HTTP server, which will function as usual.

The example creates a socket.io server by calling io.listen(), which is a factory method for the Listener class. listen() takes a function as an argument, which it uses as a callback when a client connects to the server. Because the sockets are persistent connections, you aren’t dealing with a req and res object as you do with an HTTP server. As with net, you need to use the passed client object to communicate with each browser. Of course, it’s also important to have some code in the browser (Example 7-17) to interact with the server.

Example 7-17. A small web page to interact with a Socket.IO server
<!DOCTYPE html>
<html>
  <body>
    <script src="/socket.io/socket.io.js"></script> 
    <script> 
      var socket = io.connect('http://localhost:8080');
      socket.on('message', function(data){ console.log(data) }) 
    </script> 
  </body>
</html>

This simple page starts by loading the necessary Socket.IO client library directly from the Node server, which is localhost on port 8080 in this case.

Although port 80 is the standard HTTP port, port 8080 is more convenient during development because many developers run web servers locally for testing that would interfere with Node’s work. In addition, many Linux systems have built-in security policies preventing nonadministrator users from using port 80, so it is more convenient to use a higher number.

Next, we create a new Socket object with the hostname of the Socket.IO server we are connecting to. We ask the Socket to connect with socket.connect(). Then we add a listener for the message event. Notice how the API is like a Node API. Whenever the server sends this client a message, the client will output it to the browser’s console window.

Now let’s modify our server to send this page to clients so we can test it (Example 7-18).

Example 7-18. A simple Socket.IO server
    var http = require('http'), 
    io = require('socket.io'),
    fs = require('fs');

var sockFile = fs.readFileSync('socket.html');

server = http.createServer();
server.on('request', function(req, res){
  res.writeHead(200, {'content-type': 'text/html'});
  res.end(sockFile);  
});

server.listen(8080);

var socket = io.listen(server);

socket.on('connection', function(client){
  console.log('Client connected');
  client.send('Welcome client ' + client.sessionId);
});

The most striking change in this example is the addition of the fs.readFileSync function, which brings the web page’s external file into the socket server. Now instead of responding to web browser requests with “Hello World,” the Node server will respond with the contents of socket.html. Because readFileSync is a synchronous function, it will block Node’s event loop until the file is read, ensuring that the file is ready to be delivered to clients immediately when the server becomes available for connections.

Now whenever anyone requests anything from the server, unless it is a request to the Socket.IO client library, he will get a copy of socket.html (which might be the code in Example 7-17). The callback for connections has been extended to send a welcome message to clients, and a client running the code from Example 7-18 might get a message in its console like Welcome client 17844937089830637. Each client gets its own sessionId. Currently, the ID is an integer generated using Math.random().

Namespaces

Creating websockets as shown is fine when you are in full control of your application and architecture, but this will quickly lead to conflicts when you are attaching them to an existing application that uses sockets or when you are writing a service to be plugged into someone else’s project. Example 7-19 demonstrates how namespaces avoid this problem by effectively dividing Socket.IO’s listeners into channels.

Example 7-19. A modified web page to interact with Socket.IO namespaces
<!DOCTYPE html>
<html>
  <body>
    <script src="/socket.io/socket.io.js"></script>
    <script>
      var upandrunning = io.connect('http://localhost:8080/upandrunning');
      var weather = io.connect('http://localhost:8080/weather');
      upandrunning.on('message', function(data){
        document.write('<br /><br />Node: Up and Running Update<br />');
        document.write(data);
      });
      weather.on('message', function(data){
        document.write('<br /><br />Weather Update<br />');
        document.write(data);
      });
    </script>
  </body>
</html>

This updated socket.html makes two Socket.IO connections, one to http://localhost:8080/upandrunning and the other to http://localhost:8080/weather. Each connection has its own variable and its own .on() event listener. Apart from these differences, working with Socket.IO remains the same. Instead of logging to the console, Example 7-20 displays its message results within the web browser window.

Example 7-20. A namespace-enabled Socket.IO server
var sockFile = fs.readFileSync('socket.html');

server = http.createServer();
server.on('request', function(req, res){
  res.writeHead(200, {'content-type': 'text/html'});
  res.end(sockFile);
});

server.listen(8080);

var socket = io.listen(server);

socket.of('/upandrunning')
  .on('connection', function(client){
    console.log('Client connected to Up and Running namespace.');
    client.send("Welcome to 'Up and Running'");
});

socket.of('/weather')
  .on('connection', function(client){
    console.log('Client connected to Weather namespace.');
    client.send("Welcome to 'Weather Updates'");
});

The function socket.of splits the socket object into multiple unique namespaces, each with its own rules for handling connections. If a client were to connect to http://localhost:8080/weather and issue an emit() command, its results would be processed only within that namespace, and not within the /upandrunning namespace.

Using Socket.IO with Express

There are many cases where you would want to use Socket.IO by itself within Node as its own application or as a component of a larger website architecture that includes non-Node components. However, when it’s used as part of a full Node application using Express, you can gain an enormous amount of efficiency by writing the entire software stack—including the client-facing views—in the same language (JavaScript).

Save Example 7-21 as socket_express.html.

Example 7-21. Attaching Socket.IO to an Express application: client code
<script src="/socket.io/socket.io.js"></script>
<script>
var socket = io.connect('http://localhost:8080');
socket.on('news', function(data) {
  document.write('<h1>' + data.title + '</h1>' );
  document.write('<p>' + data.contents + '</p>' );
  if ( data.allowResponse ) {
    socket.emit('scoop', { contents: 'News data received by client.' });
  }
});
</script>

This example starts by connecting to the Socket.IO on port 8080. Whenever the Socket.IO server sends a “news” event, the client writes the new item’s title and contents to the browser page. If the news item allows a response, the client socket emits a “scoop” event. The scoop wouldn’t be very interesting to a real reporter; it only contains an acknowledgment that the client received the original news.

This being an example press, the news server responds to the “scoop” event by emitting another news story. The client will receive this new story and print it to the screen also. To prevent this cycle from continuing out of control, an allowResponse parameter is sent with the news story. If it is false or not present at all (see Example 7-22), the client will not send a scoop.

Example 7-22 shows the Express server.

Example 7-22. Attaching Socket.IO to an Express application: server code
var app = require('express').createServer(),
     io = require('socket.io').listen(app);

app.listen(8080);

app.get('/', function(req,res) {
  res.sendfile(__dirname + '/socket_express.html');
});

io.sockets.on('connection', function(socket) {
  socket.emit('news', {
    title: 'Welcome to World News',    contents: 'This news flash was sent from Node.js!',
    allowResponse: true
  });
  socket.on('scoop', function(data) {
    socket.emit('news', {
      title: 'Circular Emissions Worked',
      contents: 'Received this content: ' + data.contents
    });
  });
});

The Express server is created first and then passed into Socket.IO as a parameter. When the Express application is started with the listen() function, both the web server and socket server are activated. Next, a route for the base path (/) is defined as a pass-through for sending the client-side file created in Example 7-21.

The server-side code for the news broadcaster looks very similar to the client-side code for good reason. The same events (emit, on message, connection) behave similarly in Node and in the web browser, making connectivity straightforward. Because data is passed as JavaScript objects between both endpoints, no additional parsing or serialization is needed.

Clearly, we can very quickly gain a lot of power by plugging Socket.IO into Express, but astute programmers will realize that this is one-way communication of limited value, unless the connection initiated by the user’s web browser is represented in the socket stream. Any changes (logout, profile settings, etc.) should be reflected in any socket actions, and vice versa. How to accomplish that? Sessions.

To illustrate the use of a session for authentication, let’s look first at the client-side code, views/socket.html, shown in Example 7-23.

Example 7-23. Client HTML (Jade template): Socket.IO sessions
!!! 5
html(lang='en')
  head
    script(type='text/javascript', src='/socket.io/socket.io.js')
    script(type='text/javascript')
      var socket = io.connect('http://localhost:8080');
      socket.on('emailchanged', function(data) {
        document.getElementById('email').value = data.email;
      });
      var submitEmail = function(form) {
        socket.emit('emailupdate', {email: form.email.value});
        return false;
      };
  body
    h1 Welcome!

    form(onsubmit='return submitEmail(this);')
      input(id='email', name='email', type='text', value=locals.email)
      input(type='submit', value='Change Email')

When rendered in a web browser, this page will display a form text box with a “Change Email” call to action whose default value comes from Express’s session data through the locals.email variable. Upon user input, the application performs these actions:

  1. Create a Socket.IO connection and send all of the user’s email updates as an emailupdate event.

  2. Listen for emailchanged events and replace the contents of the text box with the new email from the server (more on this soon).

Next, have a look at the Node.js portion of Example 7-24.

Example 7-24. Sharing session data between Express and Socket.IO
var io = require('socket.io');
var express = require('express');
var app = express.createServer();
var store = new express.session.MemoryStore;
var utils = require('connect').utils;
var Session = require('connect').middleware.session.Session;

app.configure(function() {
  app.use(express.cookieParser());
  app.use(express.session({secret: 'secretKey', key: 'express.sid', store: store}));
  app.use(function(req, res) {
    var sess = req.session;
    res.render('socket.jade', {
      email: sess.email || ''
    });
  });
});

// Start the app
app.listen(8080);

var sio = io.listen(app);

sio.configure(function() {
  sio.set('authorization', function (data, accept ) {
    var cookies = utils.parseCookie(data.headers.cookie);
    data.sessionID = cookies['express.sid'];
    data.sessionStore = store;
    store.get(data.sessionID, function(err, session) {
      if ( err || !session ) {
        return accept("Invalid session", false);
      }
      data.session = new Session(data, session);
      accept(null,true);
    });
  });

  sio.sockets.on('connection', function(socket) {
    var session = socket.handshake.session;
    socket.join(socket.handshake.sessionId);
    socket.on('emailupdate', function(data) {
      session.email = data.email;
      session.save();
      sio.sockets.in(socket.handshake.sessionId).emit('emailchanged', {
        email: data.email
      });
    });
  });
});

This example uses Connect, a middleware framework that simplifies common tasks such as session management, working with cookies, authentication, caching, performance metrics, and more. In this example, the cookie and session tools are used to manipulate user data. Socket.IO is not aware of Express and vice versa, so Socket.IO is not aware of sessions when the user connects. However, both components need to use the Session object to share data. This is an excellent demonstration of the Separation of Concerns (SoC) programming paradigm.[19]

This example demonstrates using Socket.IO’s authorization, even after connection, to parse the user’s headers. Because the session ID is passed to the server as a cookie, you can use this value to read Express’s session ID.

This time, the Express setup includes a line for session management. The arguments used to build the session manager are a secret key (used to prevent session tampering), the session key (used to store the session ID in the web browser’s cookie), and a store object (used to store session data for later retrieval). The store object is the most important. Instead of letting Express create and manage the memory store, this example creates a variable and passes it into Express. Now the session store is available to the entire application, not just Express.

Next, a route is created for the default (/) web page. In previous Socket.IO examples, this function was used to output HTML directly to the web browser. This time, Express will render the contents of views/socket.jade to the web browser. The second variable in render() is the email address stored in the session, which is interpreted and used as the default text field value in Example 7-23.

The real action happens in Socket.IO’s 'authorization' event. When the web browser connects to the server, Socket.IO performs an authentication routine to determine whether the connection should proceed. The criteria in this case is a valid session, which was provided by Express when the user loaded the web page. Socket.IO reads the session ID from the request headers using parseCookie (part of the Connect framework), loads the session from the memory store, and creates a Session object with the information it receives.

The data passed to the authorization event is stored in the socket’s handshake property. Therefore, saving the session object into the data object makes it available later in the socket’s lifecycle. When creating the Session object, use the memory store that was created and passed into Express; now both Express and Socket.IO are able to access the same session data—Express by manipulating the req.session object, and sockets by manipulating the socket.handshake.session object.

Assuming all goes well, calling accept() authenticates the socket and allows the connection to continue.

Now suppose the user accesses your site from more than one tab in his web browser. There would be two connections from the same session created, so how would you handle events that need to update connected sockets? Socket.IO provides support for rooms, or channels if you prefer. By initiating a join() command with sessionId as the argument in Example 7-24, the socket transparently created a dedicated channel you can use to send messages to all connections currently in use by that user. Logging out is an obvious use for this technique: when the user logs out from one tab, the logout command will instantly transmit to all the others, leaving all of the user’s views of the application in a consistent state.

Always remember to execute session.save() after changing session data. Otherwise, the changes will not be reflected on subsequent requests.



[16] Tokenized refers to the process of breaking apart a string of text into chunks (or words) called tokens.

[17] This functionality is actually part of a submodule of Express called router. You can look at the source code of router to see the details of routing regexes.

[18] A factory is an object that creates other objects with specific parameters, whereas creating those objects manually would involve a lot of repetitive or complex program code.

[19] SoC refers to the practice of breaking down software into smaller single-purpose parts (concerns) that have as little overlapping functionality as possible. Middleware enables this style of design by allowing totally separate modules to interact in a common environment without needing to be aware of each other. Although, as we have seen with modules such as bodyParser(), it remains up to the programmer to understand how the concerns ultimately interact and use them in the appropriate order and context.