Chapter 6. Access Control

Access control models are responsible for granting or restricting access to resources. They depend on two things: user identification (verified by one or more authentication schemes) and feature authorization.

Before you grant access to a resource, you need to know that the user is who she claims to be (authentication) and whether or not the user should have access to a given resource (authorization).

Authentication

Authentication is the mechanism that confirms the identity of users trying to access a system. In order for users to be granted access to a resource, they must first prove that they are who they claim to be. Generally this is handled by passing a key with each request (often called an access token). The server verifies that the access token is genuine, and that the user does indeed have the required privileges to access the requested resource. Only then is the request granted.

There are many ways to grant a user an access token. The most common is a password challenge.

Passwords

Passwords should be stored with a one-way encryption hash, so that even if a malicious intruder obtains access to the user database, he still won't have access to user passwords. The hash should be long enough to prevent an attack from a single machine and to prevent an attack from a large cluster of machines. I recommend 512 bits (64 bytes).

Worms targeting vulnerable versions of popular website platforms such as WordPress and Drupal have become common. One such worm takes control of a website, installs its payload, recruits all of the site's traffic into a JavaScript botnet, and, among other things, uses visitor CPU power to crack stolen password databases that fail to implement the security precautions outlined here.

There are botnets that exist today with over 90,000 nodes. Such botnets could crack MD5 password hashes at a rate of nine billion per second.

Passwords are vulnerable to the following common attacks:

  • Rainbow tables

  • Brute force

  • Variable time equality

  • Passwords stolen from third parties

Rainbow tables

Rainbow tables are precomputed tables used to look up passwords using stolen hashes. Once bad guys get their hands on user passwords, they'll attempt to attack popular services such as email and bank accounts—which spells very bad PR for your service.

There are rainbow tables that exist today that can discover almost every possible password up to 14 characters. To prevent password theft by rainbow table, users should choose passwords of at least 14 characters. Sadly, such passwords are definitely not convenient, particularly on mobile devices. In other words, don't trust users to select appropriate passwords.

Rainbow tables can significantly reduce the time it takes to find a password, at the cost of memory, but with terabyte hard drives and gigabytes of RAM, it's a trade-off that is easily made. That said, it is possible to protect your service against rainbow table attacks.

Password salts

One defense you can employ against rainbow tables is password salting. A salt is a sequence of random characters that gets paired with a password during the hashing process. Salts should be cryptographically secure random values of a length equal to the hash size. Salts are not secrets and can be safely stored in plain text alongside the user's other credentials.

Salting can protect passwords in a couple of ways:

First, a uniquely generated salt can protect your password databases against existing rainbow tables. Using a random salt makes your site immune from these attacks. However, if you use the same salt for every password, a new rainbow table can be generated to attack the password database.

Second, if two different users utilize the same password, the compromised password will grant access to both user accounts. To prevent that, you must use a unique salt for each password. Doing so makes a rainbow table attack impractical.

Node.js supplies a suitable random generator called crypto.randomBytes(). It returns a buffer. Wrap it to get a suitable salt string:

    var crypto = require('crypto');

    /**
     * createSalt(keylength, callback) callback(err, salt)
     *
     * Generates a cryptographically secure random string for
     * use as a password salt using Node's built-in
     * crypto.randomBytes().
     *
     * @param  {Number} keyLength
     * @param  {Function} callback 
     * @return {undefined}
     */
    var createSalt = function createSalt(keyLength, callback) {
      crypto.randomBytes(keyLength, function (err, buff) {
        if (err) {
          return callback(err);
        }
        callback(null, buff.toString('base64'));
      });
    };

The operation is asynchronous because the cryptographically secure random-number generator takes time to collect enough entropy to complete the operation.
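As a rough illustration of why unique salts matter, the sketch below hashes the same password for two hypothetical users, each with its own salt. It uses the synchronous forms crypto.randomBytes() and crypto.pbkdf2Sync() for brevity. The hashWithSalt() name, the iteration count, and the key length are assumptions chosen for illustration, not recommendations.

```javascript
var crypto = require('crypto');

// Hypothetical helper: derive a hash from a password and a salt.
// The iteration count and key length are illustrative only.
var hashWithSalt = function hashWithSalt(password, salt) {
  return crypto.pbkdf2Sync(password, salt, 1000, 64, 'sha512')
    .toString('base64');
};

var password = 'correct horse battery staple';

// Each user gets an independent, random 64-byte salt.
var saltA = crypto.randomBytes(64).toString('base64');
var saltB = crypto.randomBytes(64).toString('base64');

var hashA = hashWithSalt(password, saltA);
var hashB = hashWithSalt(password, saltB);

// Same password, different salts: the stored hashes differ,
// so cracking one does not reveal the other.
console.log(hashA === hashB); // false

// Same password, same salt: the hash is reproducible, which is
// what makes verification at login possible.
console.log(hashA === hashWithSalt(password, saltA)); // true
```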

Brute force

Rainbow tables get all the blogger attention, but Moore's law is alive and well, and brute force has become a very real threat. Attackers are employing GPUs, super-computing clusters that cost less than $2,000, and JavaScript botnets comprised of tens of thousands of browsers visiting infected websites.

A brute-force attack will attempt to crack a password by seeking a match using every possible character combination. A simple single-iteration hash can be tested at the rate of millions of hashes per second on modern systems.
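To get a feel for the numbers, the following back-of-the-envelope sketch estimates how long an exhaustive search takes. The secondsToExhaust() name is hypothetical, and the guess rate is simply the nine-billion-per-second MD5 figure quoted earlier. Real rates vary with hardware and hash function.

```javascript
// Worst-case seconds to try every password of exactly `length`
// characters drawn from an alphabet of `alphabetSize` symbols.
var secondsToExhaust = function secondsToExhaust(alphabetSize,
    length, guessesPerSecond) {
  return Math.pow(alphabetSize, length) / guessesPerSecond;
};

var rate = 9e9; // guesses per second (assumed, from the botnet figure)

// 8 lowercase letters: 26^8 combinations, about 23 seconds.
console.log(secondsToExhaust(26, 8, rate));

// 8 printable ASCII characters: 95^8 combinations, about 8.5 days.
console.log(secondsToExhaust(95, 8, rate) / 86400);
```

Each extra character multiplies the search space by the alphabet size, which is why length helps more than swapping a letter for a symbol.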

One way to thwart brute-force attacks is to programmatically lock a user's account after a handful of failed login attempts. However, that strategy won't protect passwords if an attacker gains access to the password database.
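A minimal in-memory sketch of that lockout idea follows. The names (recordFailure(), isLocked(), recordSuccess()) and the thresholds are assumptions for illustration. A real service would persist the counters and decide how accounts get unlocked.

```javascript
var MAX_ATTEMPTS = 5;          // failures allowed before lockout
var LOCK_MS = 15 * 60 * 1000;  // lockout duration: 15 minutes

// username -> { count, lockedUntil }; no prototype, so any
// username string is a safe key.
var failures = Object.create(null);

// Record a failed login and lock the account once the limit is hit.
var recordFailure = function recordFailure(username, now) {
  var entry = failures[username] ||
    (failures[username] = { count: 0, lockedUntil: 0 });

  entry.count += 1;
  if (entry.count >= MAX_ATTEMPTS) {
    entry.lockedUntil = now + LOCK_MS;
    entry.count = 0;
  }
};

// True while the account's lockout window is still open.
var isLocked = function isLocked(username, now) {
  var entry = failures[username];
  return !!entry && now < entry.lockedUntil;
};

// Clear the counter after a successful login.
var recordSuccess = function recordSuccess(username) {
  delete failures[username];
};
```

Passing the current time in as an argument (for example, Date.now()) keeps the functions easy to test.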

Key stretching can make brute-force attacks impractical by increasing the time it takes to hash the password. This can be done by applying the hash function in a loop. The delay will be relatively unnoticed by a user trying to sign in, but will significantly hamper an attacker attempting to discover a password through brute force.

Don't pick any random hash function and apply it in a loop. You could unwittingly open up attack vectors. Instead, use an established standard for iterative hashing, such as bcrypt or PBKDF2.

I discovered 100 hashes in less than 1 ms using a simple MD5 algorithm, and then tried the same thing with Node's built-in crypto.pbkdf2() function (HMAC-SHA1) set to 80,000 iterations. PBKDF2 took 15.48 seconds, or roughly 155 ms per hash. To a user performing a single login attempt per request, the slowdown is barely noticeable, but it slows brute force to a crawl.

Usage is deceptively simple:

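    // The digest argument ('sha1' here, matching HMAC-SHA1 above)
    // is required in current versions of Node.
    crypto.pbkdf2(password, salt,
      iterations, keyLength, 'sha1', function (err, hash) {
        if (err) {
          return callback(err);
        }
        // hash is a Buffer in current versions of Node
        callback(null, hash.toString('base64'));
      });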

However, there are important considerations that shouldn't be overlooked, such as generating the appropriate unique, cryptographically secure random salt of the right length, and calculating the number of iterations in order to balance user experience and security.
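One way to approach the iteration count is to measure how long PBKDF2 takes on your own hardware and scale from there. The sketch below is an illustration under stated assumptions: estimateIterations(), the 10,000-iteration sample, and the target time are all hypothetical choices, and a single timing sample is noisy.

```javascript
var crypto = require('crypto');

// Estimate how many PBKDF2 iterations fit in `targetMs` on this
// machine by timing a fixed sample run and scaling linearly.
var estimateIterations = function estimateIterations(targetMs) {
  var sample = 10000,
    salt = crypto.randomBytes(64),
    start = process.hrtime(),
    elapsed, elapsedMs;

  crypto.pbkdf2Sync('benchmark password', salt, sample, 64, 'sha1');

  elapsed = process.hrtime(start); // [seconds, nanoseconds]
  elapsedMs = elapsed[0] * 1000 + elapsed[1] / 1e6;

  return Math.max(1, Math.round(sample * targetMs / elapsedMs));
};

// For example, aim for roughly 150 ms per login attempt:
console.log(estimateIterations(150));
```

The result depends on the machine, so it makes sense to run the measurement on hardware comparable to your production servers and to store the iteration count alongside each hash so it can be raised later.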

Variable time equality

If it takes your service longer to say no to a slightly wrong password than a mostly wrong password, attackers can use that data to guess the password, similar to how you guess a word when playing hangman. You might think that random time delays and network timing jitter would sufficiently mask those timing differences, but it turns out an attacker just needs to take more timing samples to filter out the noise and obtain statistically relevant timing data:

From Crosby et al. "Opportunities And Limits Of Remote Timing Attacks":

We have shown that, even though the Internet induces significant timing jitter, we can reliably distinguish remote timing differences as low as 20 µs. A LAN environment has lower timing jitter, allowing us to reliably distinguish remote timing differences as small as 100 ns (possibly even smaller). These precise timing differences can be distinguished with only hundreds or possibly thousands of measurements.

The best way to beat these attacks is to use a constant time hash equality check, rather than an optimized check. That is easily achieved by iterating through the full hash before returning the answer, regardless of how soon the answer is known.

For more information, see Coda Hale's "A Lesson in Timing Attacks".

Here is an example of a constant time string equality algorithm in JavaScript:

    /**
     * constantEquals(x, y)
     *
     * Compare two strings, x and y with a constant time
     * algorithm to prevent attacks based on timing statistics.
     */
    var constantEquals = function constantEquals(x, y) {
      var result = true,
        length = (x.length > y.length) ? x.length : y.length,
        i;

      for (i=0; i<length; i++) {
        if (x.charCodeAt(i) !== y.charCodeAt(i)) {
          result = false;
        }
      }
      return result;
    };
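Newer versions of Node also ship a built-in constant-time comparison, crypto.timingSafeEqual(), which operates on buffers of equal length. The sketch below is one hedged way to apply it to strings: both inputs are first run through an HMAC with a throwaway random key so that the buffers always have the same length. The safeEquals() name is hypothetical.

```javascript
var crypto = require('crypto');

var safeEquals = function safeEquals(x, y) {
  // A fresh random key per comparison; it never needs to be stored.
  var key = crypto.randomBytes(32),
    hx = crypto.createHmac('sha256', key).update(String(x)).digest(),
    hy = crypto.createHmac('sha256', key).update(String(y)).digest();

  // Both digests are 32 bytes, so timingSafeEqual() will not throw
  // on a length mismatch.
  return crypto.timingSafeEqual(hx, hy);
};

console.log(safeEquals('abc123', 'abc123')); // true
console.log(safeEquals('abc123', 'abc124')); // false
```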

Stolen passwords

By far the biggest threat to password security is the fact that these tactics have already worked against other websites, and users have a tendency to reuse passwords across different sites. Since you don't have access to the user's other accounts for verification, there's little you can do to enforce unique passwords on your website.

As you have seen, passwords alone are an ineffective authentication system, but they can still be useful in combination with other authentication factors.

Credential

I searched for a suitable open source password authentication module in npm, but I couldn't find one that met all of the criteria you should consider when you're implementing password authentication in your applications. This is a critical component of your system security, so it's important to get it right. I created a library to make it easy.

Credential was reviewed by a small army of security and JavaScript experts before publishing. Unless you're a security expert with access to a whole lot of other security experts, it's generally a really bad idea to roll your own security library. It's a much better idea to use something that's already well established and well tested.

Install credential:

    $ npm install --save credential

.hash():

    var pw = require('credential'),
      newPassword = 'I have a really great password.';

    pw.hash(newPassword, function (err, hash) {
      if (err) { throw err; }
      console.log('Store the password hash.', hash);
    });

.verify():

    var pw = require('credential'),
      storedHash = '{"hash":...', // truncated to fit on page
      userInput = 'I have a really great password.';

    pw.verify(storedHash, userInput, function (err, isValid) {
      var msg;
      if (err) { throw err; }
      msg = isValid ? 'Passwords match!' : 'Wrong password.';
      console.log(msg);
    });

You can wrap this to supply a simple verify() function that takes a username and password, and then calls a callback:

    var pw = require('credential'),
      users = require('./users.js');

    var verify = function verify(username, password, verified) {
      var user = users.findOne(username);
      if (!user) {
        // No unexpected error, no user, reason for failure
        return verified(null, false, {
          message: 'User not found'
        });
      }

      pw.verify(user.hash, password, function (err, isValid) {
        if (err) {
          // Unexpected error during verification
          return verified(err);
        }
        if (!isValid) {
          // No unexpected error, no user, reason for failure
          return verified(null, false, {
            message: 'Incorrect password.'
          });
        }
        return verified(null, user);
      });
    };

    module.exports = verify;

You can then plug that into something like passport-local:

    var express = require('express'),
      passport = require('passport'),
      LocalStrategy = require('passport-local'),
      verify = require('./lib/password-auth.js'),
      app = express();

    passport.use( new LocalStrategy(verify) );

    app.post('/login', 
      passport.authenticate('local', { failureRedirect: '/login' }),
      function(req, res) {
        res.redirect('/');
      });

    app.listen(3000);

Multifactor Authentication

Because of the threat of stolen passwords, any policy that relies solely on password protection is unsafe. In order to protect your system from intruders, another line of defense is necessary.

Multifactor authentication is an authentication strategy that requires the user to present authentication proof from two or more authentication factors: the knowledge factor (something the user knows, like a password); the possession factor (something the user has, like a mobile phone); and the inherence factor (something the user is, like a fingerprint).

Knowledge factor

A common secondary security mechanism that was widely implemented in the financial industry just a few years ago is the "security question." Pairing a password with security questions does not qualify as multifactor authentication, though, because you need the user to pass challenges from two or more authentication factors. Using multiple knowledge factor challenges does not prevent a determined snoop from breaking in.

Multifactor authentication means that an attacker would have to be both a snoop and a thief, for instance.

Possession factor

For corporate and government intranets, it's common to require some type of physical token or key to grant access to systems. Mechanisms include USB dongles and flash card keys.

OTPs (one-time passwords) are short-lived passwords that work only for a single use. They satisfy the possession factor because they're usually generated by a dedicated piece of hardware, or by an app on the user's mobile phone. The device is paired with the service that is being authenticated against in a way that cannot be easily spoofed by impostors.
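To make the mechanics less mysterious, here is a sketch of how time-based one-time passwords are commonly computed, following the HOTP (RFC 4226) and TOTP (RFC 6238) specifications. It is an illustration using Node's built-in crypto module, not the internals of any particular product. The function names, the 30-second step, and the 6-digit default are assumptions.

```javascript
var crypto = require('crypto');

// HOTP: HMAC-SHA1 over an 8-byte big-endian counter, then
// "dynamic truncation" down to a short decimal code.
var hotp = function hotp(secret, counter, digits) {
  var buf = Buffer.alloc(8),
    hmac, offset, code;

  digits = digits || 6;
  buf.writeUInt32BE(Math.floor(counter / 0x100000000), 0);
  buf.writeUInt32BE(counter % 0x100000000, 4);

  hmac = crypto.createHmac('sha1', secret).update(buf).digest();

  // The low 4 bits of the last byte select a 4-byte window.
  offset = hmac[hmac.length - 1] & 0x0f;
  code = ((hmac[offset] & 0x7f) << 24) |
    (hmac[offset + 1] << 16) |
    (hmac[offset + 2] << 8) |
    hmac[offset + 3];

  // Keep the last `digits` decimal digits, left-padded with zeros.
  code = String(code % Math.pow(10, digits));
  while (code.length < digits) { code = '0' + code; }
  return code;
};

// TOTP: HOTP where the counter is the number of time steps
// (30 seconds by default) since the Unix epoch.
var totp = function totp(secret, timeMs, stepSeconds) {
  var step = stepSeconds || 30;
  return hotp(secret, Math.floor(timeMs / 1000 / step));
};

// RFC 4226 test vector: counter 0 yields 755224.
console.log(hotp('12345678901234567890', 0)); // 755224
console.log(totp('12345678901234567890', Date.now()));
```

Because the server and the paired device share the secret and roughly agree on the time, both can compute the same code independently, and the code stops working once the time step rolls over.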

Google released a product called Google Authenticator that generates one-time passwords for mobile devices. There is a Node module called speakeasy that lets you take advantage of Google Authenticator to authenticate users using the possession factor.

Install Speakeasy:

    $ npm install --save speakeasy

Then take it for a spin:

    var speakeasy = require('speakeasy');

    // Returns a key object with ascii, hex, base32, and
    // QR code representations (the QR code value is a
    // Google image URL):
    var key = speakeasy.generate_key({
      length: 20,
      google_auth_qr: true
    });

    // This should match the number on your phone:
    speakeasy.time({key: key.base32, encoding: 'base32'});

Authy is a product similar to Google Authenticator that recently announced support for Bluetooth pairing. If your phone is near your computer, the Authy desktop agent will detect your phone's key over Bluetooth so that you won't have to type the token manually.

Inherence factor

The inherence factor is something the user is—in other words, some information that discloses some physical property of the user. Common examples include fingerprint and retina scanners. While technology does exist to scan fingerprints, retinas, palm prints, and more, it's possible to defeat security devices such as fingerprint scanners if an attacker can convince the scanner that a printed image is actually the user's fingerprint. Printed images have been used to defeat facial recognition and fingerprint scanners in cell phone devices.

Because of the risk of compromise and a user's inability to change her own fingerprints, security experts like to say that the inherence factor is equivalent to a username, not a password. In other words, inherence can be used to make user recognition more convenient but should not be used to prove the user's identity.

The closest I have seen to a security-enhancing inherence factor is a process known as geofencing. Geofencing allows you to use location APIs to determine what part of the world the current user is in. Because users travel, geofencing should be used as a warning mechanism. For example, it could be used to trigger an additional authentication using another factor. It's also worth mentioning that geofencing can be defeated by a simple web-proxy mechanism. It may discourage a casual or unsophisticated attacker, but a determined attacker may eventually defeat it.

You can use the new HTML Geolocation API to determine the location of the user, provided that the user grants permission. The following function will return the user's geolocation in latitude and longitude:

    var getLocation = function getLocation(cb) {
      if (!navigator.geolocation) {
        return cb(new Error('Geolocation is not supported by this browser.'));
      }
      navigator.geolocation.getCurrentPosition(function (position) {
        cb(null, position);
      }, function (err) {
        // Called if the user denies permission or the lookup fails
        cb(err);
      });
    };
    
    getLocation(function (err, position) {
      if (err) {
        return console.log(err);
      }
      console.log(position);
    });

To use the data for geofencing, simply save users’ preferred locations along with their profile, and ask them to authorize any new locations that they log in from. The size of the geofence perimeter should depend on your particular security needs. For example, a bank may choose a 5-mile radius, whereas a discussion forum may select a 50-mile radius.
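A sketch of the distance check behind such a geofence might look like the following. It uses the haversine great-circle formula. The function names, the shape of the saved locations, and the approximate mean Earth radius of 3,959 miles are assumptions for illustration.

```javascript
var EARTH_RADIUS_MILES = 3959; // mean radius, approximate

var toRadians = function toRadians(degrees) {
  return degrees * Math.PI / 180;
};

// Great-circle distance in miles between two
// { latitude, longitude } points (haversine formula).
var distanceInMiles = function distanceInMiles(a, b) {
  var dLat = toRadians(b.latitude - a.latitude),
    dLon = toRadians(b.longitude - a.longitude),
    h = Math.pow(Math.sin(dLat / 2), 2) +
      Math.cos(toRadians(a.latitude)) *
      Math.cos(toRadians(b.latitude)) *
      Math.pow(Math.sin(dLon / 2), 2);

  return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(h));
};

// True if `current` is within `radiusMiles` of any saved location.
var isInsideGeofence = function isInsideGeofence(current, saved,
    radiusMiles) {
  return saved.some(function (location) {
    return distanceInMiles(current, location) <= radiusMiles;
  });
};
```

The coords property of a Geolocation API position carries latitude and longitude, so position.coords can be passed as the current location. When the check fails, the app can ask for an additional authentication factor rather than refuse the login outright.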

Federated and Delegated Authentication

Federated authentication is a mechanism that allows users to share their identity across multiple services with a single sign-on (SSO) solution. OpenID is a good example of a federated SSO solution, but it hasn't been widely adopted by users due to usability concerns. That could change soon with OpenID Connect, an identity solution built on top of OAuth 2.0 (similar to Facebook Connect).

Mozilla Persona

Mozilla's Persona is an open source federated identity system that uses email addresses and short-lived tokens for identification. Persona allows you to add login and logout buttons to your website, and watch() for login and logout actions. Persona has identity provider bridge support for both Yahoo! and Gmail. Sadly, Persona has failed to catch fire, and Mozilla announced in March 2014 that it would no longer actively develop the service.

WebID

WebID is a W3C-backed federated identity system that works in all current browsers and is built on top of existing web standards. Unfortunately, it currently relies on certificate selection UIs that are built into browsers, parts of the browser user interface that have long been neglected and are rather ugly and cumbersome for users. Several years after the first WebID specification was published, the UIs have not improved much.

The primary advantages of WebID are that it works over TLS and does not rely on email to prove identity. That said, no website can afford to rely on an authentication mechanism that is unfriendly to its user community.

For this reason, I can't recommend WebID for production use today, and neither does the W3C. At the time of this writing, it has not yet evolved into an official working group, and it is probably years from reaching an official recommendation state, if that ever happens.

Delegated authentication

Delegated authentication allows you to delegate authentication to a specific third-party provider (such as Facebook or Twitter). Unlike federated authentication, delegated authentication systems are not distributed, meaning that there is a single, centralized provider for identity. Your site can implement multiple delegate providers, of course. For example, you can give your users the choice to log in with Twitter or Facebook, but from the perspective of end users, they're presented with a choice of providers, and they are forced to remember which one they use to log in to your site (unless you detect that the accounts belong to the same user and link the accounts).

A single federated login mechanism tends to present a better user experience than offering users a choice of mechanisms because the user doesn't have to remember which mechanism they chose. For example, if your site implements Persona, all the user has to remember is his email address.

Facebook login is the most successful delegated authentication system as of this writing, by virtue of the size of its user base. It has a huge market saturation, and most users will be familiar with both the branding and the Facebook login and authorization flow.

To enable Facebook login on your site, first, create a new Facebook app, and then retrieve the app ID and app secret. To create a Facebook app, visit the Facebook Apps page.

Facebook supplies a JavaScript SDK to help you interact with their API. Here's the code you need to work with it:

    <div id="fb-root"></div>
    <script>
      window.fbAsyncInit = function() {
        FB.init({
          appId      : 'YOUR_APP_ID', // App ID

          // Channel File
          channelUrl : '//WWW.YOUR_DOMAIN.COM/channel.html', 
          status     : true, // check login status
          // enable cookies to allow the server to access 
          // the session
          cookie     : true, 
          xfbml      : true  // parse XFBML
        });

        // Additional init code here

      };

      // Load the SDK asynchronously
      (function(d){
         var js,
           id = 'facebook-jssdk',
           ref = d.getElementsByTagName('script')[0];

         if (d.getElementById(id)) {return;}
         js = d.createElement('script');
         js.id = id;
         js.async = true;
         js.src = "//connect.facebook.net/en_US/all.js";
         ref.parentNode.insertBefore(js, ref);
       }(document));
    </script>

The channel file addresses cross-domain issues in some browsers. It only needs to be one line:

    <script src="//connect.facebook.net/en_US/all.js"></script>

The Facebook API is notorious for changing. Visit Facebook's “Getting Started With Facebook Login” webpage for the latest details.

Authorization

Authorization ensures that agents (users or applications) have access to only the resources they are allowed to access according to some attributes of the agent; resource policies, such as ACLs (access control lists); or both, as in MAC (mandatory access control) models.

An ACL is essentially a table that lists each user with access to a particular resource. ACLs can be stored at the system level, listing each user and what she can do or see within the system, or they can be stored at the resource level.

In a MAC system, each resource has an associated minimum trust level, and each user has an associated trust level. If the user is not trusted enough, access to the resource is denied.
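The two models can be sketched with plain objects. Everything here (the resource names, users, and numeric trust levels) is hypothetical and only meant to show the shape of each check.

```javascript
// Own-property test, so names like "constructor" never match
// something inherited from Object.prototype.
var has = function has(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
};

// Resource-level ACL: each resource lists who may do what.
var acl = {
  '/reports/q3': { alice: ['read', 'write'], bob: ['read'] }
};

var aclAllows = function aclAllows(user, resource, action) {
  var entry = has(acl, resource) ? acl[resource] : {},
    actions = has(entry, user) ? entry[user] : [];
  return actions.indexOf(action) !== -1;
};

// MAC: compare the user's trust level with the resource's minimum.
var minimumTrust = { '/reports/q3': 2, '/payroll': 4 };
var userTrust = { alice: 3, bob: 1 };

var macAllows = function macAllows(user, resource) {
  // Unknown users or resources are denied by default.
  if (!has(userTrust, user) || !has(minimumTrust, resource)) {
    return false;
  }
  return userTrust[user] >= minimumTrust[resource];
};

console.log(aclAllows('bob', '/reports/q3', 'write')); // false
console.log(macAllows('alice', '/reports/q3'));        // true
console.log(macAllows('alice', '/payroll'));           // false
```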

Role-based access controls (RBAC) allow you to authorize users with specific roles. A user can have any number of roles, and a user is granted privileges based on her role. For example, a blog might have a small number of administrators who can change anything, a larger set of editors who can edit and publish posts, an even larger number of contributors who can contribute blog posts, and an open membership whereby anybody can register to post comments.

It is possible to implement MAC using role-based access controls, and it is also possible to combine the use of RBAC and ACLs.
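As an illustration of the blog example, roles can be mapped to permissions and a user's roles checked against that map. The role names, permission strings, and hasPermission() helper are assumptions made for this sketch.

```javascript
var rolePermissions = {
  administrator: ['edit', 'publish', 'contribute', 'comment', 'manage'],
  editor: ['edit', 'publish', 'contribute', 'comment'],
  contributor: ['contribute', 'comment'],
  member: ['comment']
};

// A user may hold several roles; any one of them can grant access.
var hasPermission = function hasPermission(user, permission) {
  return (user.roles || []).some(function (role) {
    var known = Object.prototype.hasOwnProperty
        .call(rolePermissions, role),
      granted = known ? rolePermissions[role] : [];
    return granted.indexOf(permission) !== -1;
  });
};

var dana = { name: 'dana', roles: ['contributor', 'member'] };

console.log(hasPermission(dana, 'contribute')); // true
console.log(hasPermission(dana, 'publish'));    // false
```

Keeping permissions on roles rather than on individual users means that promoting someone is a matter of changing the roles array, not editing every resource.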

Protecting Express resources with an authorization strategy is easy:

    app.put('/posts/:id', authorize({ requiresAuth: true }), putPost);

The authorize() function returns a middleware function that can check to see whether or not the user has permission to access the requested resource before the route handler has the chance to run. For example, if you want to ensure that the user is logged in, you can use an authorize function that looks like this:

    var authorize = function authorize(options) {
      // Guard against authorize() being called with no options
      options = options || {};

      return function auth(req, res, next) {
        if (options.requiresAuth &&
            !(req.session && req.session.user)) {
          return res.send(403);
        }
        next();
      };
    };

Authorizing Applications

Applications can act as user agents in order to access a user's data from a third-party system or even perform actions on the user's behalf, such as sharing on social networks or posting content. In order for an app to be granted such access, it must first prove that the user has authorized it. Typically that is accomplished by sending a token along with the request.

Applications gain the token using an application authorization grant. The user is directed to the target application and presented with an authorization request. If the user grants the requested permissions, a token is set and delivered to the requesting application.

Facebook's authorization is tied to the authentication system (Facebook Login). When a user attempts to log in to your app for the first time with Facebook, he will be presented with an authorization screen which displays the permissions your app asked for (called scope).

Facebook recommends that you ask for as few permissions as possible upfront. Requesting more than four permissions severely impedes the success rate of the authorization grant. Ask for as few as possible for your app to function properly, and then request additional permissions on an as-needed basis.

For example, initially you may only want to request access to the user's email and likes. Later on, let user actions trigger permission requests. That way, the user has context and knowledge of why you need that particular permission. Say the user wants to share a photo from your app on her own timeline. You'll want to check to see if you have been granted that permission, and if you haven't yet, ask for the publish_actions permission in response to her request.

The user is much more likely to grant permission if you need it to complete an action that she directly requested.

This principle holds true whether you're dealing with Facebook or any other third-party authorization.

OAuth 2.0

OAuth 2.0 is an open standard for application authorization that allows clients to access resources on behalf of the resource owner. Essentially, it allows you to grant applications limited access to your accounts on third-party services. For example, Facebook, Google Docs, and Dropbox all allow you to access resources on behalf of their users via their OAuth 2.0 based public APIs.

By way of contrast, OpenID provides a means to request the user's identity from a federated ID provider (such as Google). That identity includes details such as the user's name and email address. Google returns proof to the application that the user is who he says he is (the owner of the identity in question).

OAuth 2.0, on the other hand, returns a token that grants the app access to a specific set of API resources. Think of it this way: using an authentication system like OpenID is similar to showing your driver's license or passport. Using OAuth 2.0 is like giving the app a temporary key that it can use to access your resources on another site (but only those resources you have explicitly authorized).

OAuth 2.0 is a framework that exposes several authorization grant flows. There are specific flows for desktop apps, web apps, native mobile apps, and other devices.

A basic OAuth 2.0 flow goes something like this:

  1. Client requests permissions from the user.

  2. The user is directed to the third-party service that provides those permissions, where the user is authenticated and the request is granted or rejected.

  3. The grant is passed back to the requesting client.

  4. The client exchanges the grant for an access token.

  5. In subsequent calls, the client provides the access token in order to prove that it has permission to access the requested resources.

  6. Optionally, the service can implement a token exchange, whereby a client can exchange an expiring token for a fresh token.
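Steps 1 through 4 can be sketched for the authorization code grant described in RFC 6749. The endpoint URLs, client ID, and helper names below are placeholders invented for illustration. A real provider documents its own endpoints and may require additional parameters.

```javascript
var crypto = require('crypto');
var querystring = require('querystring');

// Hypothetical provider endpoint and client registration.
var config = {
  authorizeUrl: 'https://provider.example/oauth/authorize',
  clientId: 'YOUR_CLIENT_ID',
  redirectUri: 'https://app.example/callback'
};

// Steps 1-2: send the user to the provider to approve the scopes.
// `state` is a random value the client checks on return to guard
// against cross-site request forgery.
var buildAuthorizationUrl = function buildAuthorizationUrl(scopes,
    state) {
  return config.authorizeUrl + '?' + querystring.stringify({
    response_type: 'code',
    client_id: config.clientId,
    redirect_uri: config.redirectUri,
    scope: scopes.join(' '),
    state: state
  });
};

// Step 4: the form-encoded body the client POSTs to the provider's
// token endpoint to exchange the grant (code) for an access token.
var buildTokenRequestBody = function buildTokenRequestBody(code,
    clientSecret) {
  return querystring.stringify({
    grant_type: 'authorization_code',
    code: code,
    redirect_uri: config.redirectUri,
    client_id: config.clientId,
    client_secret: clientSecret
  });
};

var state = crypto.randomBytes(16).toString('hex');
console.log(buildAuthorizationUrl(['email', 'profile'], state));
```

The sketch stops short of making network requests. In step 3 the provider redirects back to redirectUri with code and state query parameters, and the client should reject the response if state does not match the value it generated.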

If you'd like your app to be an OAuth 2.0 provider, check out oauth2orize.

To verify issued bearer tokens, you'll also need a strategy to authenticate bearer tokens. Take a look at passport-http-bearer.
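Bearer tokens are usually sent in the Authorization request header, as described in RFC 6750. The following sketch shows only the header parsing step. The getBearerToken() name is hypothetical, and this is not the API of passport-http-bearer.

```javascript
// Extract the token from a header such as
// "Authorization: Bearer mF_9.B5f-4.1JqM".
// Returns null when the header is missing or malformed.
// Node lowercases incoming header names, hence `authorization`.
var getBearerToken = function getBearerToken(headers) {
  var header = headers.authorization || '',
    match = /^Bearer +([A-Za-z0-9\-._~+\/]+=*)$/i.exec(header);
  return match ? match[1] : null;
};

console.log(getBearerToken({ authorization: 'Bearer mF_9.B5f-4.1JqM' }));
// mF_9.B5f-4.1JqM
console.log(getBearerToken({ authorization: 'Basic dXNlcjpwYXNz' }));
// null
```

Once extracted, the token still has to be looked up and validated (issuer, expiry, scope) before the request is allowed to proceed.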

OAuth 2.0 has been the subject of a lot of security criticism. Security experts cite the large attack surface and the difficulty of correct implementation as reasons to avoid using the specification as-is. However, Facebook and other providers offer bounties for users who are able to uncover security vulnerabilities in their implementations, and those bounties have led to vulnerability discoveries and patches.

Despite security concerns, OAuth 2.0 is currently the dominant form of third-party API authorization. It has the best tooling and community support.

Conclusion

Security is an essential component of your application, and it's very important to get it right. Here are some keys to remember:

  • Use a good authentication library, such as Credential.

  • Passwords alone are never secure. Allow multifactor authentication for your app.

  • HTTP uses plain-text communication. In order to protect user passwords and data, enable HTTPS site-wide.

  • As a general rule, it's a good idea to use an authorize() middleware for all of your routes.

  • Enable OAuth 2.0 support to discourage third-party application vendors from requesting your users' login credentials.