Chapter 15. Web Applications and Web Services

We’re now going to take a leap from the client side to the server side to learn how to write web-based Java applications and services. What distinguishes a web-based application from a regular Java program is that much of the code, logic, or data resides on the server, at least initially, and the user utilizes a web browser or a lightweight client to access it. This is a very appealing model of software deployment facilitated by the increased standardization and power of HTML and JavaScript in web browsers as well as higher-speed Internet connectivity and better application-to-application web service standards.

Most of this chapter is about the mechanics of the Servlet API, which is a Java framework for writing application components for servers. The Servlet API is used in both Java web applications and often in the implementation of application-to-application web services. We’ll deal with servlets directly in the first part of this chapter, when writing examples used from a web browser. Later, we’ll look at application-level web services that are designed to provide data and services to all types of client applications in a more behind-the-scenes fashion. The two types of server-side applications have some things in common, including how they can be deployed to an application server using a Web Archive (WAR) file and the fact that they are often combined in advanced applications that both render pages on the server and use JavaScript to pull data from web services on the client side.

The Servlet API lives in the javax.servlet package, which is a standard Java API extension. Deploying and running servlets requires an application server or servlet container—a Java-based server that acts like a web server and handles requests bound for servlet components—and so the Servlet API is not bundled with the standard edition of Java. We will recommend that you download Apache Tomcat to run the examples in this chapter and at that time, you can grab the Servlet API JAR file from that distribution in order to compile the example classes. Many Java IDEs can also install the necessary JAR file for you automatically.

The APIs used for building and deploying application-to-application web services are part of the javax.jws package. Although the JWS API is also technically a standard extension, it is bundled with the standard edition of Java and so you can write Java web service clients out-of-the-box, with no additional components. You can even deploy web services directly using a minimal built-in server functionality bundled with the standard edition of Java with no additional application server required. However, this feature is mostly useful for testing, as the built-in server does not perform as well as the various other application servers such as Tomcat. This chapter covers Java Servlet API 3.0 and JWS (JAX-WS) version 2.2.

Servers that support the full set of Java Enterprise APIs including servlets, web services, JSPs, and older technology like Enterprise JavaBeans are called application servers. JBoss is a free, open source Java application server, and BEA’s WebLogic is a popular commercial application server. The free Apache Tomcat server that we’ll use in this chapter started out primarily as a servlet container, but now runs web services and everything needed for serious application development. Tomcat can be used by itself or in conjunction with another web server such as Apache. Tomcat is easy to configure and is a pure Java application, so you can use it on any platform that has a Java VM. You can download it from http://jakarta.apache.org/tomcat/.

Web Application Technologies

Many different ways of writing server-side software for web applications have evolved over the years. Early on, the standard was CGI, which provided a way to service web browser requests with a scripting language such as Perl. Various web servers also offered native-language APIs, such as modules for the Apache web server written in C and C++. The Java Servlet API, however, rapidly became the most popular architecture for building web-based applications because it offered portability, security, and high performance. Today, Java-based web services compete with similar services offered by Microsoft .NET and alternatives such as Ruby on Rails for building web application components. However, the overriding trend in web applications today is to focus less on the server technology and more on client-side technologies such as JavaScript and HTML5 in communication with server-side components and web services regardless of the implementation language. We’ll try to offer some perspective on this throughout this chapter.

Page-Oriented Versus “Single Page” Applications

For most of the lifetime of Java, web-based applications followed the same basic paradigm: the browser makes a request to a particular URL; the server generates a page of HTML in response; and actions by the user drive the browser to the next page. In this exchange, most or all of the work is done on the server side, which is seemingly logical given that that’s where data and services often reside. The problem with this application model is that it is inherently limited by the loss of responsiveness, continuity, and state experienced by the user when loading new “pages” in the browser. It’s difficult to make a web-based application as seamless as a desktop application when the user must jump through a series of discrete pages, and it is technically more challenging to maintain application data across those pages. After all, web browsers were not designed to host applications; they were designed to host documents.

But a lot has changed in web application development in recent years. Standards for HTML and JavaScript have matured to the point where it is practical to write applications in which most of the user interface and logic reside on the client side and background calls are made to the server for data and services. In this paradigm, the server effectively returns just a single “page” of HTML that references the bulk of the JavaScript, CSS, and other resources used to render the application interface. JavaScript then takes over, manipulating elements on the page or creating new ones dynamically using advanced HTML DOM features to produce the UI. JavaScript also makes asynchronous (background) calls to the server to fetch data and invoke services. In many cases, the results are returned as XML, leading to the term Asynchronous JavaScript and XML (AJAX) for this style of interaction.

This new model simplifies and empowers web development in many ways. No longer must the client work in a page-by-page, request-response regime where views and requests are ping-ponged back and forth. The client is now more equivalent to a desktop application in that it can respond to user input fluidly and manage remote data and services without interrupting the user.

Before we move on to our discussion of the Servlet API, we will briefly describe Java’s relationship to some related web technologies, old and new.

JSPs

JavaServer Pages (JSPs) are a document-centric (page-oriented) way to write server-side applications. They consist of HTML source utilizing custom tag libraries along with a Java-like syntax embedded within the pages. JSPs are compiled dynamically by the web server into Java servlets and can work with Java APIs directly and indirectly in order to generate dynamic content for the pages. Although all of the work still occurs on the server side, JSPs allow the developer to work as if code were running directly in the page, which has both benefits and drawbacks. The benefit of this sort of “immediate mode” programming style is that it is easy to grasp and quick to crank out. The drawback is that it can lead to an unmanageable mix of business logic and presentation logic in the pages. The more code that appears mixed in with the static content, the greater the maintenance headache.

Most large-scale JSP projects utilize custom tag libraries to minimize ad hoc code in the pages. JSPs are also used in combination with controller servlets that can do the heavy lifting and business logic for them. In this case, the term controller refers to the Model-View-Controller (MVC) separation of concerns that we introduced earlier when talking about Swing GUIs. Maintaining this separation leverages the advantages of JSP while avoiding its pitfalls.

XML and XSL

XML is a set of standards for working with structured information in text form. The Extensible Stylesheet Language (XSL) is a language for transforming XML documents into other kinds of documents, including HTML. Servlets that generate XML content, paired with XSL stylesheets that transform that content for presentation, make a very powerful combination, covered in detail in Chapter 24. As we’ll discuss later, web services also use XML as their native data format, making them completely portable across platforms and languages. And, of course, XML is the basis for returning data to JavaScript applications in the original AJAX style of web development.

Web Application Frameworks

If we think about web applications in terms of the classic MVC model, then a traditional page-oriented application generally has “view” components rendered in the browser, while the model (data) and controllers (logic) reside on the server side. We’ve mentioned some reasons why this style of web application is fading in favor of “single page” applications where more of these components move into the browser; however, over the years many frameworks have been developed to support this classic web app arrangement. Generally these frameworks work at a higher level than servlets, providing a convenient way to write controller components, connect them, and configure page views for the results.

One of the most popular frameworks for building page-oriented web applications has been the Apache Foundation’s Struts Web Application Framework. Struts implements the MVC paradigm by providing both a modular controller component architecture and an extensive tag library for JSP page view development. Struts abstracts some of the mapping and navigation aspects required to glue together a web application through the use of an XML-based configuration file and also adds the ability to do declarative mapping of HTML forms to Java objects as well as automated validation of form fields.

JavaServer Faces (JSF) was Sun’s response to Struts. Developed through the Java Community Process (including some of the original Struts people), it was intended to become the “official” Java-sanctioned web-application framework. JSF built upon lessons learned with Struts and refined the MVC model with server-side application components and more fine-grained navigation and event management. JSF met mixed reviews and never really surpassed Struts in popularity.

Spring Web Flow is another popular web application MVC system that is based on the Spring application framework. There are many, many examples of Java web application frameworks.

Google Web Toolkit

Google Web Toolkit, or GWT, is a free framework produced by Google that allows developers to write web applications using the Java programming language. GWT compiles Java components to JavaScript that runs in a web browser and communicates with the server via a custom RPC mechanism that acts something like Java RMI. The GWT environment provides its own set of Java GUI classes and a substantial subset of the standard Java libraries. GWT is a very powerful framework that makes it possible to write large and complex applications with most of the benefits of the Java programming language while running in a web browser. However, GWT has a somewhat steeper learning curve than some other web frameworks (especially for those unfamiliar with both Java and JavaScript).

HTML5, AJAX, and More...

Java lives on the server side of web applications. To build the client pieces of applications that run in browsers, we must cooperate with Java’s namesake, JavaScript. As we’ve mentioned, in recent years efforts to standardize advanced features of HTML and JavaScript have paid off in a real revolution in the capabilities of web applications and the way in which they are built. Much of this began with adding more dynamic behavior to clients via AJAX calls. More recently, the explosion of mobile browsers has fueled the adoption of the HTML5 standard, bringing web browsers a richer feature set including a more complete DOM, native video and audio media support, general canvas drawing and vector graphics support, and offline data storage. Even more exciting technologies can be used today while working their way through the standards process. One to keep an eye on is WebSockets, which provides for low-latency messaging between the browser and server and should enable many new types of applications.

Java Web Applications

So far we’ve used the term web application generically, referring to any kind of browser-based application that is located on a web server. Now we are going to be more precise with that term. In the context of the Java Servlet API, a web application is a collection of servlets and Java web services together with their supporting Java classes, content such as HTML or JSP pages and images, and configuration information. For deployment (installation on a web server), a web application is bundled into a WAR file. We’ll discuss WAR files in detail later, but suffice it to say that they are really just JAR archives that contain all the application files along with some deployment information. The important thing is that the standardization of WAR files means not only that the Java code is portable, but also that the process of deploying the application to a server is standardized.

Most WAR archives have at their core a web.xml file. This is an XML configuration file that describes which servlets are to be deployed, their names and URL paths, their initialization parameters, and a host of other information, including security and authentication requirements. In recent years, however, the web.xml file has become optional for many applications due to the introduction of Java annotations that take the place of the XML configuration. In most cases, you can now deploy your servlets and Java web services simply by annotating the classes with the necessary information and packaging them into the WAR file, or using a combination of the two. We’ll discuss this in detail later in the chapter.

Web applications, or web apps, also have a well-defined runtime environment. Each web app has its own “root” path on the web server, meaning that all the URLs addressing its servlets and files start with a common unique prefix (e.g., http://www.oreilly.com/someapplication/). The web app’s servlets are also isolated from those of other web applications. Web apps cannot directly access each other’s files (although they may be allowed to do so through the web server, of course). Each web app also has its own servlet context. We’ll discuss the servlet context in more detail, but in brief, it is a common area for servlets within an application to share information and get resources from the environment. The high degree of isolation between web applications is intended to support the dynamic deployment and updating of applications required by modern business systems and to address security and reliability concerns. Web apps are intended to be coarse-grained, relatively complete applications—not to be tightly coupled with other web apps. Although there’s no reason you can’t make web apps cooperate at a high level, for sharing logic across applications you might want to consider web services, which we’ll discuss later in this chapter.

The Servlet Lifecycle

Let’s jump now to the Servlet API and get started building servlets. We’ll fill in the gaps later when we discuss various parts of the APIs and WAR file structure in more detail. The Servlet API is very simple (reminiscent of the old Applet API). The base Servlet interface defines three lifecycle methods, init(), service(), and destroy(), along with some methods for getting configuration parameters and servlet resources. However, these methods are not often used directly by developers. Generally developers will implement the doGet() and doPost() methods of the HttpServlet subclass and access shared resources through the servlet context, as we’ll discuss shortly.

Generally, only one instance of each deployed servlet class is instantiated per container. More precisely, it is one instance per servlet entry in the web.xml file, but we’ll talk more about servlet deployment later. In the past, there was an exception to that rule when using the special SingleThreadModel type of servlet. As of Servlet API 2.4, single-threaded servlets have been deprecated.

By default, servlets are expected to handle requests in a multithreaded way; that is, the servlet’s service methods may be invoked by many threads at the same time. This means that you should not store per-request or per-client data in instance variables of your servlet object. (Of course, you can store general data related to the servlet’s operation, as long as it does not change on a per-request basis.) Per-client state information can be stored in a client session object on the server or in a client-side cookie, which persists across client requests. We’ll talk about client state later as well.
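
The rule about instance variables can be illustrated without a servlet container. The following is a minimal sketch in plain Java; the class SharedHandlerSketch and its methods are hypothetical stand-ins for a servlet and are not part of the Servlet API. One shared instance is called from several threads: the servlet-wide counter is kept in a thread-safe AtomicLong, while per-request values live only in parameters and local variables.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a servlet: one shared instance, many request threads.
class SharedHandlerSketch {
    // Servlet-wide data is fine in a field as long as access is thread-safe.
    private final AtomicLong requestCount = new AtomicLong();

    // Per-request data is kept in parameters and local variables only.
    String handle(String clientName) {
        long n = requestCount.incrementAndGet();
        String greeting = "Hello " + clientName; // local: private to this call
        return greeting + " (request " + n + ")";
    }

    long getRequestCount() {
        return requestCount.get();
    }

    // Simulates a container invoking the same instance from several threads
    // and returns the total number of requests that were counted.
    static long simulate(int threads, int requestsPerThread) {
        SharedHandlerSketch handler = new SharedHandlerSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final String client = "client-" + t;
            pool.execute(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    handler.handle(client);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handler.getRequestCount();
    }

    public static void main(String[] args) {
        System.out.println("Requests counted: " + simulate(8, 1000));
    }
}
```

Had the greeting been stored in an instance field instead of a local variable, two concurrent requests could overwrite each other’s value before it was returned.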

The service() method of a servlet accepts two parameters: a servlet “request” object and a servlet “response” object. These provide tools for reading the client request and generating output; we’ll talk about them (or rather their HttpServlet versions) in detail in the examples.

Servlets

The package of primary interest to us here is javax.servlet.http, which contains APIs specific to servlets that handle HTTP requests for web servers. In theory, you can write servlets for other protocols, but nobody really does that and we are going to discuss servlets as if all servlets were HTTP-related.

The primary tool provided by the javax.servlet.http package is the HttpServlet base class. This is an abstract servlet that provides some basic implementation details related to handling an HTTP request. In particular, it overrides the generic servlet service() method and breaks it out into several HTTP-related methods, including doGet(), doPost(), doPut(), and doDelete(). The default service() method examines the request to determine what kind it is and dispatches it to one of these methods, so you can override one or more of them to implement the specific protocol behavior you need.
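
To make the dispatching idea concrete, here is a heavily simplified sketch in plain Java. The classes MiniHttpServlet and HelloMini are hypothetical and only loosely modeled on the behavior described above; the real HttpServlet works with request and response objects rather than strings, and the status strings returned here are assumptions chosen for illustration.

```java
import java.util.Locale;

// Hypothetical, simplified model of HttpServlet-style dispatch.
// This is NOT the real javax.servlet.http.HttpServlet.
abstract class MiniHttpServlet {
    // Generic entry point: look at the HTTP method name and dispatch.
    final String service(String httpMethod) {
        switch (httpMethod.toUpperCase(Locale.ROOT)) {
            case "GET":    return doGet();
            case "POST":   return doPost();
            case "PUT":    return doPut();
            case "DELETE": return doDelete();
            default:       return "501 Not Implemented";
        }
    }

    // Defaults reject the request; subclasses override what they support.
    String doGet()    { return "405 Method Not Allowed"; }
    String doPost()   { return "405 Method Not Allowed"; }
    String doPut()    { return "405 Method Not Allowed"; }
    String doDelete() { return "405 Method Not Allowed"; }
}

// A subclass that supports only GET, in the spirit of a simple servlet.
class HelloMini extends MiniHttpServlet {
    @Override
    String doGet() {
        return "200 Hello Client!";
    }

    public static void main(String[] args) {
        MiniHttpServlet servlet = new HelloMini();
        System.out.println(servlet.service("GET"));
        System.out.println(servlet.service("POST"));
    }
}
```

The design point is that a subclass never touches service() itself; it overrides only the methods for the HTTP operations it wants to support.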

doGet() and doPost() correspond to the standard HTTP GET and POST operations. GET is the standard request for retrieving a file or document at a specified URL. POST is the method by which a client sends an arbitrary amount of data to the server. HTML forms utilize POST to send data as do most web services.

To round these out, HttpServlet provides the doPut() and doDelete() methods. These methods correspond to a less widely used part of the HTTP protocol, which is meant to provide a way to upload and remove files or file-like entities. doPut() is supposed to be like POST but with slightly different semantics (a PUT is supposed to logically replace the item identified by the URL, whereas POST presents data to it); doDelete() would be its opposite.

HttpServlet also implements three other HTTP-related methods for you: doHead(), doTrace(), and doOptions(). You don’t normally need to override these methods. doHead() implements the HTTP HEAD request, which asks for the headers of a GET request without the body. HttpServlet implements this by default in the trivial way, by performing the GET method and then sending only the headers. You may wish to override doHead() with a more efficient implementation if you can provide one as an optimization. doTrace() and doOptions() implement other features of HTTP that allow for debugging and simple client/server capabilities negotiation. You shouldn’t normally need to override these.

Along with HttpServlet, javax.servlet.http also includes HttpServletRequest and HttpServletResponse, the HTTP-specific extensions of the ServletRequest and ServletResponse types. These provide, respectively, the input and output streams needed to read and write client data. They also provide the APIs for getting or setting HTTP header information and, as we’ll see, client session information. Rather than document these dryly, we’ll show them in the context of some examples. As usual, we’ll start with the simplest possible example.

The HelloClient Servlet

Here’s our servlet version of “Hello, World,” HelloClient:

import java.io.*;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.*;

@WebServlet(urlPatterns={"/hello"})
public class HelloClient extends HttpServlet
{
    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException
    {
        response.setContentType("text/html"); // must come before getWriter()
        PrintWriter out = response.getWriter();
        out.println(
            "<html><head><title>Hello Client!</title></head><body>"
            + "<h1>Hello Client!</h1>"
            + "</body></html>" );
    }
}

If you want to try this servlet right away, skip ahead to “WAR Files and Deployment”, where we walk through the process of deploying this servlet. Because we’ve included the WebServlet annotation in our class, this servlet does not need a web.xml file for deployment. All you have to do is bundle the class file into a particular folder within a WAR archive (a fancy ZIP file) and drop it into a directory monitored by the Tomcat server. For now, we’re going to focus on just the servlet example code itself, which is pretty simple in this case.

Let’s have a look at the example. HelloClient extends the base HttpServlet class and overrides the doGet() method to handle simple requests. In this case, we want to respond to any GET request by sending back a one-line HTML document that says “Hello Client!” First, we tell the container what kind of response we are going to generate, using the setContentType() method of the HttpServletResponse object. We specify the MIME type “text/html” for our HTML response. Then, we get the output stream using the getWriter() method and print the message to it. It is not necessary for us to explicitly close the stream. We’ll talk more about managing the output stream throughout this chapter.

ServletExceptions

The doGet() method of our example servlet declares that it can throw a ServletException. All of the service methods of the Servlet API may throw a ServletException to indicate that a request has failed. A ServletException can be constructed with a string message and an optional Throwable parameter that can carry any corresponding exception representing the root cause of the problem:

    throw new ServletException("utter failure", someException );

By default, the web server determines exactly what is shown to the user whenever a ServletException is thrown; often there is a “development mode” where the exception and its stack trace are displayed. Using the web.xml file, you can designate custom error pages. (See the section “Error and Index Pages” for details.)

Alternatively, a servlet may throw an UnavailableException, a subclass of ServletException, to indicate that it cannot handle requests. This exception can be thrown to indicate that the condition is permanent or that it should last for a specified number of seconds.

Content type

Before fetching the output stream and writing to it, we must specify the kind of output we are sending by calling the response parameter’s setContentType() method. In this case, we set the content type to text/html, which is the proper MIME type for an HTML document. In general, though, it’s possible for a servlet to generate any kind of data, including audio, video, or some other kind of text or binary document. If we were writing a generic FileServlet to serve files like a regular web server, we might inspect the filename extension and determine the MIME type from that or from direct inspection of the data. (This is a good use for the java.nio.file.Files.probeContentType() method!) For writing binary data, you can use the getOutputStream() method to get an OutputStream as opposed to a Writer.
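
As a rough illustration of how a file-serving servlet might pick a MIME type, here is a minimal sketch using only standard JDK calls. The class ContentTypeSketch and the method contentTypeFor() are hypothetical names. The fallback to URLConnection.guessContentTypeFromName() and the final default of application/octet-stream are assumptions of this sketch, added because Files.probeContentType() may return null and its results vary by platform.

```java
import java.io.IOException;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for choosing a MIME type for a file to be served.
class ContentTypeSketch {
    static String contentTypeFor(Path file) {
        String type = null;
        try {
            // Platform-dependent; may return null if the type is unknown.
            type = Files.probeContentType(file);
        } catch (IOException e) {
            // Ignore and fall through to the name-based guess below.
        }
        if (type == null) {
            // Guess from the filename extension using the JDK's built-in table.
            type = URLConnection.guessContentTypeFromName(file.toString());
        }
        // Assumed default for anything we cannot identify.
        return (type != null) ? type : "application/octet-stream";
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFor(Paths.get("index.html")));
        System.out.println(contentTypeFor(Paths.get("archive.zip")));
    }
}
```

In a servlet, the returned string would be passed to setContentType() before any output is written.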

The content type is used in the Content-Type: header of the server’s HTTP response, which tells the client what to expect even before it starts reading the result. This allows your web browser to prompt you with the “Save File” dialog when you click on a ZIP archive or executable program. When the content-type string is used in its full form to specify the character encoding (for example, text/html; charset=ISO-8859-1), the information is also used by the servlet engine to set the character encoding of the PrintWriter output stream. As a result, you should always call the setContentType() method before fetching the writer with the getWriter() method. The character encoding can also be set separately via the servlet response setCharacterEncoding() method.
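
The effect of the charset portion of the content type can be seen in plain Java. The sketch below is not container code; the class CharsetWriterSketch and its methods are hypothetical. It pulls the charset out of a content-type string and builds a PrintWriter with that encoding, roughly the way the text describes the servlet engine doing. The ISO-8859-1 default and the simplistic parsing (no quoted values) are assumptions of this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical illustration of how a charset parameter drives the writer's encoding.
class CharsetWriterSketch {
    // Returns the charset named in the content type, or the given default.
    static Charset charsetOf(String contentType, Charset defaultCharset) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return Charset.forName(p.substring("charset=".length()).trim());
            }
        }
        return defaultCharset;
    }

    // Encodes text the way a writer configured from the content type would.
    static byte[] encode(String text, String contentType) {
        Charset cs = charsetOf(contentType, StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(bytes, cs));
        out.print(text);
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // a single accented character
        System.out.println(encode(eAcute, "text/html; charset=ISO-8859-1").length);
        System.out.println(encode(eAcute, "text/html; charset=UTF-8").length);
    }
}
```

The same character occupies one byte in ISO-8859-1 and two in UTF-8, which is why the encoding has to be settled before the writer is created.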

The Servlet Response

In addition to providing the output stream for writing content to the client, the HttpServletResponse object provides methods for controlling other aspects of the HTTP response, including headers, error result codes, redirects, and servlet container buffering.

HTTP headers are metadata name/value pairs sent with the response. You can add headers (standard or custom) to the response with the setHeader() and addHeader() methods (headers may have multiple values). There are also convenience methods for setting headers with integer and date values:

    response.setIntHeader("MagicNumber", 42);
    response.setDateHeader("CurrentTime", System.currentTimeMillis() );

When you write data to the client, the servlet container automatically sets the HTTP response code to a value of 200, which means OK. Using the sendError() method, you can generate other HTTP response codes. HttpServletResponse contains predefined constants for all of the standard codes. Here are a few common ones:

    HttpServletResponse.SC_OK
    HttpServletResponse.SC_BAD_REQUEST
    HttpServletResponse.SC_FORBIDDEN
    HttpServletResponse.SC_NOT_FOUND
    HttpServletResponse.SC_INTERNAL_SERVER_ERROR
    HttpServletResponse.SC_NOT_IMPLEMENTED
    HttpServletResponse.SC_SERVICE_UNAVAILABLE

When you generate an error with sendError(), the response is over and you can’t write any actual content to the client. You can specify a short error message, however, which may be shown to the client. (See the section “A Simple Filter”.)

An HTTP redirect is a special kind of response that tells the client web browser to go to a different URL. Normally this happens quickly and without any interaction from the user. You can send a redirect with the sendRedirect() method:

    response.sendRedirect("http://www.oreilly.com/");

While we’re talking about the response, we should say a few words about buffering. Most responses are buffered internally by the servlet container until the servlet service method has exited or a preset maximum size has been reached. This allows the container to set the HTTP content-length header automatically, telling the client how much data to expect. You can control the size of this buffer with the setBufferSize() method, specifying a size in bytes. You can even clear it and start over if no data has been written to the client. To clear the buffer, use isCommitted() to test whether any data has been sent, then use resetBuffer() to dump the data if none has been sent. If you are sending a lot of data, you may wish to set the content length explicitly with the setContentLength() method.
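
The relationship between buffering, the committed state, and resetting can be modeled with a toy class. The following sketch is purely illustrative: ResponseBufferSketch is a hypothetical class, it counts characters rather than bytes, and it is not how any particular container implements its buffer.

```java
// Hypothetical toy model of a buffered response; not a real container class.
class ResponseBufferSketch {
    private final StringBuilder buffer = new StringBuilder();
    private final StringBuilder sentToClient = new StringBuilder();
    private final int bufferSize;
    private boolean committed = false;

    ResponseBufferSketch(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    // Output accumulates in the buffer until it grows past the limit.
    void write(String data) {
        buffer.append(data);
        if (buffer.length() > bufferSize) {
            flush();
        }
    }

    // Sending anything to the client commits the response.
    void flush() {
        sentToClient.append(buffer);
        buffer.setLength(0);
        committed = true;
    }

    boolean isCommitted() {
        return committed;
    }

    // Discarding buffered output is only possible before anything was sent.
    void resetBuffer() {
        if (committed) {
            throw new IllegalStateException("response already committed");
        }
        buffer.setLength(0);
    }

    String sent() {
        return sentToClient.toString();
    }

    public static void main(String[] args) {
        ResponseBufferSketch response = new ResponseBufferSketch(16);
        response.write("oops");
        if (!response.isCommitted()) {
            response.resetBuffer(); // safe: nothing has reached the client yet
        }
        response.write("fine");
        response.flush();
        System.out.println(response.sent());
    }
}
```

The check-then-reset pattern in main() mirrors the isCommitted() and resetBuffer() sequence described above.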

Servlet Parameters

Our first example showed how to accept a basic request. Of course, to do anything really useful, we’ll need to get some information from the client. Fortunately, the servlet engine handles this for us, interpreting both GET and POST form-encoded data from the client and providing it to us through the simple getParameter() method of the servlet request.

GET, POST, and “extra path”

There are two common ways to pass information from your web browser to a servlet or CGI program. The most general is to “post” it, meaning that your client encodes the information and sends it as a stream to the program, which decodes it. Posting can be used to upload large amounts of form data or other data, including files. The other way to pass information is to somehow encode the information in the URL of your client’s request. The primary way to do this is to use GET-style encoding of parameters in the URL string. In this case, the web browser encodes the parameters and appends them to the end of the URL string. The server decodes them and passes them to the application.

As we described in Chapter 14, GET-style encoding takes the parameters and appends them to the URL in a name/value fashion, with the first parameter preceded by a question mark (?) and the rest separated by ampersands (&). The entire string is expected to be URL-encoded: any special characters (such as spaces, ?, and & in the string) are specially encoded.
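
A browser normally does this encoding for you, but the rules are easy to reproduce with the standard java.net.URLEncoder class. The sketch below builds a GET-style URL from a map of parameters; the class QueryStringSketch, the method withQuery(), and the parameter names are hypothetical and chosen only for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that appends GET-style parameters to a URL.
class QueryStringSketch {
    static String withQuery(String baseUrl, Map<String, String> params) {
        if (params.isEmpty()) {
            return baseUrl;
        }
        StringJoiner query = new StringJoiner("&"); // ampersands between pairs
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return baseUrl + "?" + query; // question mark before the first pair
    }

    // URL-encodes one name or value so that special characters are escaped.
    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name", "Pat Niemeyer");
        params.put("q", "a&b?");
        System.out.println(
            withQuery("http://www.myserver.example/servlets/MyServlet", params));
    }
}
```

Note that the space becomes a plus sign and the literal ampersand and question mark inside the value are escaped, so they cannot be confused with the separators.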

Another way to pass data in the URL is called extra path. This simply means that when the server has located your servlet or CGI program as the target of a URL, it takes any remaining path components of the URL string and hands them over as an extra part of the URL. For example, consider these URLs:

    http://www.myserver.example/servlets/MyServlet
    http://www.myserver.example/servlets/MyServlet/foo/bar

Suppose the server maps the first URL to the servlet called MyServlet. When given the second URL, the server also invokes MyServlet, but considers /foo/bar to be “extra path” that can be retrieved through the servlet request getPathInfo() method. This technique is useful for making more human-readable and meaningful URL pathnames, especially for document-centric content.
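
The split between the part of the URL that selects the servlet and the extra path can be sketched in a few lines of plain Java. The class ExtraPathSketch and the method extraPath() are hypothetical; returning null when there is no extra path is an assumption of this sketch, and a real container performs this matching for you based on the servlet’s URL mapping.

```java
// Hypothetical helper that separates "extra path" from the servlet's own path.
class ExtraPathSketch {
    // Returns the path components that follow servletPath, or null if none.
    static String extraPath(String servletPath, String requestPath) {
        if (!requestPath.startsWith(servletPath)) {
            throw new IllegalArgumentException("request is not for this servlet");
        }
        String rest = requestPath.substring(servletPath.length());
        if (rest.isEmpty()) {
            return null; // exact match: no extra path
        }
        if (!rest.startsWith("/")) {
            // e.g. /servlets/MyServletX is a different resource, not extra path
            throw new IllegalArgumentException("request is not for this servlet");
        }
        return rest;
    }

    public static void main(String[] args) {
        System.out.println(
            extraPath("/servlets/MyServlet", "/servlets/MyServlet"));
        System.out.println(
            extraPath("/servlets/MyServlet", "/servlets/MyServlet/foo/bar"));
    }
}
```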

Both GET and POST encoding can be used with HTML forms on the client by specifying get or post in the method attribute of the form tag. The browser handles the encoding; on the server side, the servlet engine handles the decoding.

The content type used by a client to post form data to a servlet is application/x-www-form-urlencoded. The Servlet API automatically parses this kind of data and makes it available through the getParameter() method. However, if you do not call the getParameter() method, the data remains available, unparsed, in the input stream and can be read by the servlet directly.
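
To show what that automatic parsing involves, here is a minimal sketch that decodes a form-encoded body by hand using the standard java.net.URLDecoder class. The class FormDataSketch and the method parse() are hypothetical, and the use of UTF-8 is an assumption; a real servlet would simply call getParameter() and let the container do this work.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for application/x-www-form-urlencoded data.
class FormDataSketch {
    // Maps each parameter name to all of its values, in order of appearance.
    static Map<String, List<String>> parse(String body) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (body.isEmpty()) {
            return params;
        }
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String name = decode(eq < 0 ? pair : pair.substring(0, eq));
            String value = (eq < 0) ? "" : decode(pair.substring(eq + 1));
            // A repeated name adds another value rather than replacing the first.
            params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            parse("name=Pat+Niemeyer&color=red&color=blue%26green"));
    }
}
```

Because a name may repeat, each entry holds a list of values rather than a single string.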

GET or POST: Which one to use?

To users, the primary difference between GET and POST is that they can see the GET information in the encoded URL shown in their web browser. This can be useful because the user can cut and paste that URL (the result of a search, for example) and mail it to a friend or bookmark it for future reference. POST information is not visible to the user and ceases to exist after it’s sent to the server. This behavior goes along with the protocol’s intent that GET and POST are to have different semantics. By definition, the result of a GET operation is not supposed to have any side effects; that is, it’s not supposed to cause the server to perform any persistent operations (such as making a purchase in a shopping cart). In theory, that’s the job of POST. That’s why your web browser warns you about reposting form data if you hit reload on a page that was the result of a form posting.

The extra path style would be useful for a servlet that retrieves files or handles a range of URLs in a human-readable way. Extra path information is often useful for URLs that the user must see or remember, because it looks like any other path.

The ShowParameters Servlet

Our first example didn’t do much. This next example prints the values of any parameters that were received. We’ll start by handling GET requests and then make some trivial modifications to handle POST as well. Here’s the code:

import java.io.*;
import javax.servlet.http.*;
import java.util.*;

public class ShowParameters extends HttpServlet
{
    public void doGet(HttpServletRequest request, HttpServletResponse response)
      throws IOException
    {
        showRequestParameters( request, response );
    }

    void showRequestParameters(HttpServletRequest request,
        HttpServletResponse response)
        throws IOException
    {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();

        out.println(
          "<html><head><title>Show Parameters</title></head><body>"
          + "<h1>Parameters</h1><ul>");

        Map<String, String[]> params = request.getParameterMap();
        for ( String name : params.keySet() )
        {
            String [] values = params.get( name );
            out.println("<li>"+ name +" = "+ Arrays.asList(values) );
        }

        out.close(  );
    }
}

As in the first example, we override the doGet() method. We delegate the request to a helper method that we’ve created, called showRequestParameters(). This method fetches the parameters using the request object’s getParameterMap() method, which returns a map of parameter names to values, and prints the names and values. Note that a parameter may have multiple values if it is repeated in the request from the client, hence the map contains String []. To make things pretty, we list each parameter in an HTML <li> tag.

As it stands, our servlet would respond to any URL that contains a GET request. Let’s round it out by adding our own form to the output and also accommodating POST method requests. To accept posts, we override the doPost() method. The implementation of doPost() could simply call our showRequestParameters() method, but we can make it simpler still. The API lets us treat GET and POST requests interchangeably because the servlet engine handles the decoding of request parameters. So we simply delegate the doPost() operation to doGet().

Add the following method to the example:

    public void doPost( HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException 
    {
        doGet( request, response );
    }

Now, let’s add an HTML form to the output. The form lets the user fill in some parameters and submit them to the servlet. Add this line to the showRequestParameters() method before the call to out.close():

    out.println("</ul><p><form method=\"POST\" action=\"" 
            + request.getRequestURI() + "\">"
      + "Field 1 <input name=\"Field 1\" size=20><br>"
      + "Field 2 <input name=\"Field 2\" size=20><br>"
      + "<br><input type=\"submit\" value=\"Submit\"></form>"
    );

The form’s action attribute is the URL of our servlet so that our servlet will get the data back. We use the getRequestURI() method to get the location of our servlet. For the method attribute, we’ve specified a POST operation, but you can try changing the operation to GET to see both styles.

So far, we haven’t done anything terribly exciting. In the next example, we’ll add some power by introducing a user session to store client data between requests. But before we go on, we should mention a useful standard servlet, SnoopServlet, that is akin to our previous example.

User Session Management

One of the nicest features of the Servlet API is its simple mechanism for managing a user session. By a session, we mean that the servlet can maintain information over multiple pages and through multiple transactions as navigated by the user; this is also called maintaining state. Providing continuity through a series of web pages is important in many kinds of applications, such as handling a login process or tracking purchases in a shopping cart. In a sense, session data takes the place of instance data in your servlet object. It lets you store data between invocations of your service methods.

Session tracking is supported by the servlet container; you normally don’t have to worry about the details of how it’s accomplished. It’s done in one of two ways: using client-side cookies or URL rewriting. Client-side cookies are a standard HTTP mechanism for getting the client web browser to cooperate in storing state information for you. A cookie is basically just a name/value attribute that is issued by the server, stored on the client, and returned by the client whenever it is accessing a certain group of URLs on a specified server. Cookies can track a single session or multiple user visits.

URL rewriting appends session-tracking information to the URL, using GET-style encoding or extra path information. The term rewriting applies because the server rewrites the URL before it is seen by the client and absorbs the extra information before it is passed back to the servlet. In order to support URL rewriting, a servlet must take the extra step to encode any URLs it generates in content (e.g., HTML links that may return to the page) using a special method of the HttpServletResponse object. We’ll describe this later. You need to allow for URL rewriting by the server if you want your application to work with browsers that do not support cookies or have them disabled. Many sites simply choose not to work without cookies.

To the servlet programmer, state information is made available through an HttpSession object, which acts like a hashtable for storing any objects you would like to carry through the session. The objects stay on the server side; a special identifier is sent to the client through a cookie or URL rewriting. On the way back, the identifier is mapped to a session, and the session is associated with the servlet again.
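
The bookkeeping described above can be pictured with a small sketch that uses only the standard JDK. SessionRegistrySketch and its methods are hypothetical names of our own; an actual servlet container also handles timeouts, secure ID generation, and much more.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; none of these names come from the Servlet API.
class SessionRegistrySketch
{
    // session ID -> that session's table of named attributes
    private final Map<String, Map<String, Object>> sessions =
        new ConcurrentHashMap<>();

    // Create an empty session and return the ID that would be sent to the client
    String createSession()
    {
        String id = UUID.randomUUID().toString();
        sessions.put( id, new ConcurrentHashMap<>() );
        return id;
    }

    // Map an ID that came back from the client to its server-side data;
    // returns null if the ID is unknown or the session was invalidated
    Map<String, Object> lookup( String id )
    {
        return id == null ? null : sessions.get( id );
    }

    void invalidate( String id )
    {
        sessions.remove( id );
    }

    public static void main( String [] args )
    {
        SessionRegistrySketch registry = new SessionRegistrySketch();
        String id = registry.createSession();
        registry.lookup( id ).put( "user", "pat" );
        System.out.println( registry.lookup( id ) );
    }
}
```

Only the identifier travels to the browser; the attribute table never leaves the server.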

The ShowSession Servlet

Here’s a simple servlet that shows how to store some string information to track a session:

    import java.io.*;
    import javax.servlet.ServletException;
    import javax.servlet.http.*;
    import java.util.Enumeration;

    public class ShowSession extends HttpServlet {

        public void doPost(
            HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException
        {
            doGet( request, response );
        }

        public void doGet(
            HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException
        {
            HttpSession session = request.getSession();
            boolean clear = request.getParameter("clear") != null;
            if ( clear )
                session.invalidate();
            else {
                String name = request.getParameter("Name");
                String value = request.getParameter("Value");
                if ( name != null && value != null )
                    session.setAttribute( name, value );
            }

            response.setContentType("text/html");
            PrintWriter out = response.getWriter();
            out.println(
              "<html><head><title>Show Session</title></head><body>");

            if ( clear )
                out.println("<h1>Session Cleared:</h1>");
            else {
                out.println("<h1>In this session:</h1><ul>");
                Enumeration names = session.getAttributeNames();
                while ( names.hasMoreElements() ) {
                    String name = (String)names.nextElement();
                    out.println( "<li>"+name+" = " +session.getAttribute( 
                        name ) );
                }
            }

            out.println(
              "</ul><p><hr><h1>Add String</h1>"
              + "<form method=\"POST\" action=\""
              + request.getRequestURI() +"\">"
              + "Name: <input name=\"Name\" size=20><br>"
              + "Value: <input name=\"Value\" size=20><br>"
              + "<br><input type=\"submit\" value=\"Submit\">"
              + "<input type=\"submit\" name=\"clear\" value=\"Clear\"></form>"
            );
        }
    }

When you invoke the servlet, you are presented with a form that prompts you to enter a name and a value. The value string is stored in a session object under the name provided. Each time the servlet is called, it outputs the list of all data items associated with the session. You will see the session grow as each item is added (in this case, until you restart your web browser or the server).

The basic mechanics are much like our ShowParameters servlet. Our doGet() method generates the form, which points back to our servlet via a POST method. We override doPost() to delegate back to our doGet() method, allowing it to handle everything. Once in doGet(), we attempt to fetch the user session object from the request object using getSession(). The HttpSession object supplied by the request functions like a hashtable. There is a setAttribute() method, which takes a string name and an Object argument, and a corresponding getAttribute() method. In our example, we use the getAttributeNames() method to enumerate the names of the values currently stored in the session and print each name with its value.

By default, getSession() creates a session if one does not exist. If you want to test for a session or explicitly control when one is created, you can call the overloaded version getSession(false), which does not automatically create a new session and returns null if there is no session. Alternatively, you can check to see if a session was just created with the isNew() method. To clear a session immediately, we can use the invalidate() method. After calling invalidate() on a session, we are not allowed to access it again, so we set a flag in our example and show the “Session Cleared” message. Sessions may also become invalid on their own by timing out. You can control session timeout in the application server or through the web.xml file (via the “session-timeout” value of the “session-config” section). It is possible, through an interface we’ll talk about later in this chapter, to find out when a session times out. In general, this appears to the application as either no session or a new session on the next request. User sessions are private to each web application and are not shared across applications.

We mentioned earlier that an extra step is required to support URL rewriting for web browsers that don’t support cookies. To do this, we must make sure that any URLs we generate in content are first passed through the HttpServletResponse encodeURL() method. This method takes a string URL and returns a modified string only if URL rewriting is necessary. Normally, when cookies are available, it returns the same string. In our previous example, we could have encoded the server form URL that was retrieved from getRequestURI() before passing it to the client if we wanted to allow for users without cookies.
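
As a rough picture of what URL rewriting produces, the following JDK-only sketch adds a session identifier to a URL as a jsessionid path parameter, placed ahead of any query string. The class and method names are hypothetical, and the real encodeURL() makes its own decision about whether rewriting is needed; here we pass that decision in as a flag and ignore details such as URL fragments.

```java
// Hypothetical sketch, not the container's actual implementation.
class UrlRewritingSketch
{
    static String rewrite( String url, String sessionId, boolean clientHasCookie )
    {
        // when cookies work (or there is no session), the URL is returned unchanged
        if ( clientHasCookie || sessionId == null )
            return url;

        // insert the identifier after the path and before any query string
        int query = url.indexOf('?');
        String path = query < 0 ? url : url.substring( 0, query );
        String rest = query < 0 ? "" : url.substring( query );
        return path + ";jsessionid=" + sessionId + rest;
    }

    public static void main( String [] args )
    {
        System.out.println( rewrite( "/shop/cart?add=1", "ABC123", false ) );
        System.out.println( rewrite( "/shop/cart?add=1", "ABC123", true ) );
    }
}
```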

The ShoppingCart Servlet

Now we build on the previous example to make a servlet that could be used as part of an online store. ShoppingCart lets users choose items and add them to their basket until checkout time. The page generated is not that pretty, but you can have your web designer clean that up with some CSS. Here we are just concentrating on the Servlet API:

    import java.io.*;
    import javax.servlet.ServletException;
    import javax.servlet.http.*;
    import java.util.Enumeration;

    public class ShoppingCart extends HttpServlet
    {
        String [] items = new String [] {
            "Chocolate Covered Crickets", "Raspberry Roaches",
            "Buttery Butterflies", "Chicken Flavored Chicklets(tm)" };

        public void doPost(
            HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException
        {
            doGet( request, response );
        }

        public void doGet(
            HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException
        {
            response.setContentType("text/html");
            PrintWriter out = response.getWriter();

            // get or create the session information
            HttpSession session = request.getSession();
            int [] purchases = (int [])session.getAttribute("purchases");
            if ( purchases == null ) {
                purchases = new int [ items.length ];
                session.setAttribute( "purchases", purchases );
            }

            out.println( "<html><head><title>Shopping Cart</title>"
                         + "</head><body><p>" );

            if ( request.getParameter("checkout") != null )
                out.println("<h1>Thanks for ordering!</h1>");
            else  {
                if ( request.getParameter("add") != null ) {
                    addPurchases( request, purchases );
                    out.println(
                        "<h1>Purchase added.  Please continue</h1>");
                } else {
                    if ( request.getParameter("clear") != null )
                        for (int i=0; i<purchases.length; i++)
                             purchases[i] = 0;
                    out.println("<h1>Please Select Your Items!</h1>");
                }
                doForm( out, request.getRequestURI() );
            }
            showPurchases( out, purchases );
            out.close();
        }

        void addPurchases( HttpServletRequest request, int [] purchases ) {
            for (int i=0; i<items.length; i++) {
                String added = request.getParameter( items[i] );
                if ( added !=null && !added.equals("") )
                    purchases[i] += Integer.parseInt( added );
            }
        }

        void doForm( PrintWriter out, String requestURI ) {
            out.println( "<form method=POST action="+ requestURI +">" );

            for(int i=0; i< items.length; i++)
                out.println( "Quantity <input name=\"" + items[i]
                  + "\" value=0 size=3> of: " + items[i] + "<br>");
            out.println(
              "<p><input type=submit name=add value=\"Add To Cart\">"
              + "<input type=submit name=checkout value=\"Check Out\">"
              + "<input type=submit name=clear value=\"Clear Cart\">"
              + "</form>" );
        }

        void showPurchases( PrintWriter out, int [] purchases )
            throws IOException {

            out.println("<hr><h2>Your Shopping Basket</h2>");
            for (int i=0; i<items.length; i++)
                if ( purchases[i] != 0 )
                    out.println( purchases[i] +"  "+ items[i] +"<br>" );
        }
    }

Note that ShoppingCart has some instance data: a String array that holds a list of products. We’re making the assumption that the product selection is the same for all customers. If it’s not, we’d have to generate the product list on the fly or put it in the session for the user. We cannot store any per-request or per-user data in instance variables.

We see the same basic pattern as in our previous servlets, with doPost() delegating to doGet(), and doGet() generating the body of the output and a form for gathering new data. We’ve broken down the work using a few helper methods: doForm(), addPurchases(), and showPurchases(). Our shopping cart form has three submit buttons: one for adding items to the cart, one for checkout, and one for clearing the cart. In each case, we display the contents of the cart. Depending on the button pressed (indicated by the name of the parameter), we add new purchases, clear the list, or show the results as a checkout window.

The form is generated by our doForm() method, using the list of items for sale. As in the other examples, we supply our servlet’s address as the target of the form. Next, we placed an integer array called purchases into the user session. Each element in purchases holds a count of the number of each item the user wants to buy. We create the array after retrieving the session simply by asking the session for it. If this is a new session, and the array hasn’t been created, getAttribute() gives us a null value and we create an empty array to populate. Because we generate the form using the names from the items array, it’s easy for addPurchases() to check for each name using getParameter() and increment the purchases array for the number of items requested. We also test for the value being equal to the empty string, because some web browsers send empty strings for unused field values. Finally, showPurchases() loops over the purchases array and prints the name and quantity for each item that the user has purchased.
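
One caveat: addPurchases() passes the field value straight to Integer.parseInt(), which throws a NumberFormatException if the user types something like "two". A more forgiving version might route each value through a helper such as the following sketch. The parseQuantity() name and the choice to treat bad or negative input as zero are our own assumptions, not part of the example above.

```java
// Hypothetical helper for turning a form field into a quantity.
class QuantitySketch
{
    // Returns 0 for missing, empty, non-numeric, or negative input
    static int parseQuantity( String field )
    {
        if ( field == null )
            return 0;
        String trimmed = field.trim();
        if ( trimmed.isEmpty() )
            return 0;
        try {
            // never let a negative number reduce the cart
            return Math.max( Integer.parseInt( trimmed ), 0 );
        } catch ( NumberFormatException e ) {
            return 0;
        }
    }

    public static void main( String [] args )
    {
        System.out.println( parseQuantity( " 4 " ) );
        System.out.println( parseQuantity( "two" ) );
    }
}
```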

Cookies

In our previous examples, a session lived only until you shut down your web browser or the server. You can do more long-term user tracking or identification that lasts beyond a single browser session by managing cookies explicitly. You can send a cookie to the client by creating a javax.servlet.http.Cookie object and adding it to the servlet response using the addCookie() method. Later, you can retrieve the cookie information from the servlet request and use it to look up persistent information in a database. The following servlet sends a “Learning Java” cookie to your web browser and displays it when you return to the page:

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class CookieCutter extends HttpServlet
{
    public void doGet(HttpServletRequest request, HttpServletResponse response)
      throws IOException, ServletException
    {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter(  );

        if ( request.getParameter("setcookie") != null ) {
            Cookie cookie = new Cookie("Learningjava", "Cookies!");
            cookie.setMaxAge(3600);
            response.addCookie(cookie);
            out.println("<html><body><h1>Cookie Set...</h1>");
        } else {
            out.println("<html><body>");
            Cookie[] cookies = request.getCookies(  );
            // getCookies() returns null when the request carries no cookies
            if ( cookies == null || cookies.length == 0 ) {
                out.println("<h1>No cookies found...</h1>");
            } else {
                for (int i = 0; i < cookies.length; i++)
                    out.print("<h1>Name: "+ cookies[i].getName() + "<br>"
                              + "Value: " + cookies[i].getValue() + "</h1>" );
            }
            out.println("<p><a href=\""+ request.getRequestURI()
              +"?setcookie=true\">"
              +"Reset the Learning Java cookie.</a>");
        }
        out.println("</body></html>");
    }
}

This example simply enumerates the cookies supplied by the request object using the getCookies() method and prints their names and values. We provide a GET-style link that points back to our servlet with a parameter setcookie, indicating that we should set the cookie. In that case, we create a Cookie object using the specified name and value and add it to the response with the addCookie() method. We set the maximum age of the cookie to 3,600 seconds, so it remains in the browser for an hour before being discarded (we’ll talk about tracking a cookie across multiple sessions later). Specifying a negative time period indicates that the cookie should not be stored persistently and should be erased when the browser exits. A time period of 0 deletes any existing cookie immediately.

Two other Cookie methods are of interest: setDomain() and setPath(). These methods allow you to specify the domain name and path component that determines where the client will send the cookie. If you’re writing some kind of purchase applet for L.L. Bean, you don’t want clients sending your cookies over to Eddie Bauer. In practice, however, this cannot happen. The default domain is the domain of the server sending the cookie. (You cannot in general specify other domains for security reasons.) The path parameter defaults to the base URL of the servlet, but you can specify a wider (or narrower) range of URLs on the host server by manually setting this parameter.
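
To see how these settings might look on the wire, here is a simplified, JDK-only sketch that assembles a Set-Cookie header value from a name, value, maximum age, domain, and path. The SetCookieSketch class is hypothetical; a real container builds this header for you from the Cookie object, may add other attributes, and validates or escapes the values, which this sketch does not.

```java
// Hypothetical sketch of a Set-Cookie header value; no validation or escaping.
class SetCookieSketch
{
    static String header( String name, String value, int maxAge,
        String domain, String path )
    {
        StringBuilder sb = new StringBuilder( name ).append('=').append( value );
        // a negative age means "session cookie": no Max-Age attribute at all;
        // zero asks the browser to delete the cookie immediately
        if ( maxAge >= 0 )
            sb.append("; Max-Age=").append( maxAge );
        if ( domain != null )
            sb.append("; Domain=").append( domain );
        if ( path != null )
            sb.append("; Path=").append( path );
        return sb.toString();
    }

    public static void main( String [] args )
    {
        System.out.println( header( "Learningjava", "Cookies!", 3600, null, "/shop" ) );
        System.out.println( header( "Learningjava", "Cookies!", -1, null, null ) );
    }
}
```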

The ServletContext API

Web applications have access to the server environment through the ServletContext API, a reference to which can be obtained from the HttpServlet getServletContext() method:

    ServletContext context = getServletContext();

Each web app has its own ServletContext. The context provides a shared space in which a web app’s servlets may rendezvous and share objects. Objects may be placed into the context with the setAttribute() method and retrieved by name with the getAttribute() method:

    context.setAttribute("myapp.statistics", myObject);
    Object stats = context.getAttribute("myapp.statistics");

Attribute names beginning with “java.” and “javax.” are reserved for use by Java. You can opt to use the standard package-naming conventions for your attributes to avoid conflicts.

The ServletContext provides a listener API that can be used to add items to the servlet context when the application server starts up and to tear them down when it shuts down. This is a good way to initiate shared services. We’ll show an example of this in the next section when we talk about asynchronous servlets.

One standard attribute that can be accessed through the servlet context is a reference to a private working directory represented by a java.io.File object. This temp directory is guaranteed unique to the web app. No guarantees are made about it being cleared upon exit, however, so you should use the temporary file API to create files here (unless you wish to try to keep them beyond the server exit). For example:

    File tmpDir = (File)context.getAttribute("javax.servlet.context.tempdir");
    File tmpFile = File.createTempFile( "appprefix", "appsuffix", tmpDir );

The servlet context also provides direct access to the web app’s files from its root directory. The getResource() method is similar to the Class getResource() method (see Chapter 12). It takes a pathname and returns a special local URL for accessing that resource. In this case, it takes a path rooted in the servlet base directory (WAR file). The servlet may obtain references to files, including those in the WEB-INF directory, using this method. For example, a servlet could fetch an input stream for its own web.xml file:

    InputStream in = context.getResourceAsStream("/WEB-INF/web.xml");

It could also use a URL reference to get one of its images:

    URL bunnyURL = context.getResource("/images/happybunny.gif");

The method getResourcePaths() may be used to fetch a directory-style listing of all the resource files available matching a specified path. The return value is a java.util.Set collection of strings naming the resources available under the specified path. For example, the path / lists all files in the WAR; the path /WEB-INF/ lists at least the web.xml file and classes directory.

The ServletContext is also a factory for RequestDispatcher objects, which we won’t cover here, but which allow for servlets to forward to or include the results of other servlets in their responses.

Asynchronous Servlets

The following is a somewhat advanced topic, but we’ll cover it now to round out our discussion of the Servlet API. Servlets may run in an asynchronous mode, where the servlet service method is allowed to exit, but the response to the user is held open until it can be completed efficiently. While the response is held open, it does not actively consume resources or block threads in the servlet container. This is intended to support nonblocking, NIO-style services as discussed in Chapters 13 and 14.

Asynchronous servlets are an excellent way to handle very slow servlet processes, as long as there is a way to efficiently poll for or receive some truly asynchronous notification of their completion. As we discussed when talking about NIO, one of the limiting factors in the scalability of web services is thread consumption. Threads hold a lot of resources and so simply allowing them to block and wait for completion of a task is inefficient. As we saw earlier, NIO supports a style of programming where one thread can manage a large number of network connections. Asynchronous servlets allow servlets to participate in this model. The basic idea is that you pass a job to a background service and put the servlet request on the shelf until it can be completed. As long as the background processor is implemented in such a way that it can manage the jobs without waiting (via polling or receiving updates asynchronously), then there is no point where threads must block.

Later in this chapter, we’ll utilize a simple test servlet called WaitServlet that simply goes to sleep for a specified period of time before returning a result. This is a prime example of an inefficient use of threads. Our dumb WaitServlet blocks a thread (by sleeping) until it is “ready” to complete the transaction. In the following example, we’ll get ahead of ourselves a bit and create a more efficient version of this tool, BackgroundWaitServlet, that will not block any threads in the servlet container while it waits.

Before we start, let’s check our preconditions for whether an asynchronous servlet will be useful: do we have an efficient way to poll or receive notification when our “task” is complete without blocking a thread? (It’s important to ask this to avoid simply moving thread blocking from the servlet to another location.) Yes, in our case, we can use a timer to notify us when the time has passed. An efficient timer implementation like java.util.Timer will use only one thread to manage many timed requests. We’ll choose to use a ScheduledExecutorService from the java.util.concurrent package for this. It will execute any Runnable for us after a specified delay and makes a perfect shared background service for our asynchronous servlet.
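
Outside of any servlet, the claim that one timer thread can manage many delayed jobs can be tried with a small standalone sketch. The DelayedJobsSketch class and its method are hypothetical names; the job counts and delays are arbitrary values chosen for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demonstration: many delayed jobs sharing a single worker thread.
class DelayedJobsSketch
{
    // Schedules 'jobs' tasks, each due after delayMillis, on one thread and
    // returns true if all of them ran within timeoutMillis
    static boolean runDelayedJobs( int jobs, long delayMillis, long timeoutMillis )
    {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool( 1 );
        CountDownLatch done = new CountDownLatch( jobs );
        try {
            for ( int i = 0; i < jobs; i++ )
                executor.schedule( () -> done.countDown(), delayMillis,
                    TimeUnit.MILLISECONDS );
            // the caller waits here, but no thread is parked per job
            return done.await( timeoutMillis, TimeUnit.MILLISECONDS );
        } catch ( InterruptedException e ) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main( String [] args )
    {
        System.out.println( runDelayedJobs( 1000, 100, 5000 ) );
    }
}
```

The waiting jobs sit in the executor's queue rather than each occupying a sleeping thread, which is the property we want from our background service.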

The following example servlet returns a generic response after a delay of five seconds. The difference between this servlet and the naive one we use elsewhere in this chapter would become apparent if we flooded our server with requests. We should find that the asynchronous version would be limited primarily by TCP/IP resources in the host OS and not by more valuable memory on the server.

import javax.servlet.*;
import javax.servlet.annotation.*;
import javax.servlet.http.*;
import java.io.*;
import java.util.concurrent.*;

@WebServlet(
    urlPatterns={"/bgwait"},
    asyncSupported = true
)
public class BackgroundWaitServlet extends HttpServlet
{
    public void doGet( HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException
    {
        final AsyncContext asyncContext = request.startAsync();
        ScheduledExecutorService executor =
            (ScheduledExecutorService)request.getServletContext().getAttribute(
            "BackgroundWaitExecutor");
        executor.schedule( new RespondLaterJob( asyncContext ), 5,
            TimeUnit.SECONDS );
    }
}

class RespondLaterJob implements Runnable
{
    private AsyncContext asyncContext;

    RespondLaterJob( AsyncContext asyncContext ) {
        this.asyncContext = asyncContext;
    }

    @Override
    public void run()
    {
        try {
            ServletResponse response = asyncContext.getResponse();
            response.setContentType("text/html");
            PrintWriter out = response.getWriter();
            out.println(
                "<html><body><h1>WaitServlet Response</h1></body></html>"
            );
        } catch ( IOException e ) { throw new RuntimeException( e ); }

        asyncContext.complete();
    }
}

We’ve included the WebServlet annotation in this example in order to show the asyncSupported attribute. This attribute must be set on any servlets and servlet filters (discussed later) that will be involved in the request.

The implementation of our doGet() method is straightforward: we initiate the asynchronous behavior by calling the startAsync() method on the servlet request. That method returns to us an AsyncContext object that represents the caller context and includes the servlet request and response objects. At this point, we are free to arrange to service the request using any means we wish; the only requirement is that we must keep the AsyncContext object with our task so that it can be used later to send the results and close the transaction.

In our example, we look up our shared ScheduledExecutorService from the servlet context by name (“BackgroundWaitExecutor”) and pass it a custom Runnable object. (We’ll talk about how the service got there in a bit.) We’ve created a RespondLaterJob that implements Runnable and holds onto the AsyncContext for later use. When the job runs in the future, we simply get the servlet response from the AsyncContext and send our response as usual. The final step is to call the complete() method on AsyncContext in order to close the call and return to the client.

The final step raises a couple of interesting issues: first, we do not necessarily have to call complete() immediately after writing to the response. Instead, we could write part of the result and go back to sleep, waiting for our service to wake us up when there is more data. Indeed, this is how we might work with an NIO data source. Second, instead of calling complete() to finalize the results for the client, we could use an alternate method, dispatch(), to forward the servlet request to another servlet, perhaps in a chain of servlets. The next servlet could write additional content or perhaps simply use resources put into the servlet context by the first servlet to handle the request. The dispatch() method accepts a URL string for the target servlet or, when called with no arguments, sends the request back to the original servlet.

OK, so how did our ScheduledExecutorService get into the servlet context? The best way to manage shared services and resources in the servlet context is via a ServletContextListener. A context listener has two lifecycle methods that can be used to set up and tear down services when the servlet container starts up and shuts down, respectively. We can deploy our listener simply by marking the class with a WebListener annotation and placing it in the WAR file as usual.

import javax.servlet.*;
import javax.servlet.annotation.*;
import java.util.concurrent.*;

@WebListener
public class BackgroundWaitService implements ServletContextListener
{
    ScheduledExecutorService executor;

    public void contextInitialized( ServletContextEvent sce )
    {
        this.executor = Executors.newScheduledThreadPool( 3 );
        sce.getServletContext().setAttribute( "BackgroundWaitExecutor",
            executor );
    }

    public void contextDestroyed(ServletContextEvent sce)
    {
        // shut down the executor created in contextInitialized(),
        // not a newly created pool
        if ( executor != null )
            executor.shutdownNow();
    }
}

WAR Files and Deployment

As we described in the introduction to this chapter, a WAR file is an archive that contains all the parts of a web application: Java class files for servlets and web services, JSPs, HTML pages, images, and other resources. The WAR file is simply a JAR file (which is itself a fancy ZIP file) with specified directories for the Java code and one designated configuration file: the web.xml file, which tells the application server what to run and how to run it. WAR files always have the extension .war, but they can be created and read with the standard jar tool.
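
Because a WAR is just a ZIP archive with a conventional layout, it can also be written and read with the standard java.util.zip classes. The following sketch builds a tiny archive in memory and lists its entries; the WarListingSketch class, its method names, and the placeholder entry contents are hypothetical and only meant to illustrate the format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: treats a WAR as the ZIP archive it really is.
class WarListingSketch
{
    // Build a small archive in memory containing the given entry names
    static byte[] buildArchive( String... entryNames )
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try ( ZipOutputStream zip = new ZipOutputStream( bytes ) ) {
            for ( String name : entryNames ) {
                zip.putNextEntry( new ZipEntry( name ) );
                // placeholder contents; a real WAR holds pages, classes, etc.
                zip.write( "placeholder".getBytes( StandardCharsets.UTF_8 ) );
                zip.closeEntry();
            }
        } catch ( IOException e ) {
            throw new UncheckedIOException( e );
        }
        return bytes.toByteArray();
    }

    // List entry names in archive order
    static List<String> listEntries( byte[] archive )
    {
        List<String> names = new ArrayList<>();
        try ( ZipInputStream zip =
                new ZipInputStream( new ByteArrayInputStream( archive ) ) ) {
            ZipEntry entry;
            while ( (entry = zip.getNextEntry()) != null )
                names.add( entry.getName() );
        } catch ( IOException e ) {
            throw new UncheckedIOException( e );
        }
        return names;
    }

    public static void main( String [] args )
    {
        byte[] war = buildArchive( "index.html", "WEB-INF/web.xml" );
        System.out.println( listEntries( war ) );
    }
}
```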

The contents of a typical WAR might look like this, as revealed by the jar tool:

    $ jar tf shoppingcart.war

        index.html
        purchase.html
        receipt.html
        images/happybunny.gif
        WEB-INF/web.xml
        WEB-INF/classes/com/mycompany/PurchaseServlet.class
        WEB-INF/classes/com/mycompany/ReturnServlet.class
        WEB-INF/lib/thirdparty.jar

When deployed, the name of the WAR becomes, by default, the root path of the web application—in this case, shoppingcart. Thus, the base URL for this web app, if deployed on http://www.oreilly.com, is http://www.oreilly.com/shoppingcart/, and all references to its documents, images, and servlets start with that path. The top level of the WAR file becomes the document root (base directory) for serving files. Our index.html file appears at the base URL we just mentioned, and our happybunny.gif image is referenced as http://www.oreilly.com/shoppingcart/images/happybunny.gif.

The WEB-INF directory (all caps, hyphenated) is a special directory that contains all deployment information and application code. This directory is protected by the web server, and its contents are not visible to outside users of the application, even if you add WEB-INF to the base URL. Your application classes can load additional files from this area using getResource() on the servlet context, however, so it is a safe place to store application resources. The WEB-INF directory also contains the web.xml file, which we’ll talk more about in the next section.
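The way WAR entries map to URLs can be sketched in a few lines of plain Java. The WarPaths class and its publicUrl() method are hypothetical names used only for illustration; a real container does this mapping internally and also applies any servlet mappings.

```java
import java.util.Optional;

// Hypothetical helper (not part of the Servlet API) that models how a
// container exposes the entries of a deployed WAR file.
class WarPaths {
    /**
     * Returns the public URL for an entry in the WAR, or an empty Optional
     * if the entry lives under WEB-INF and is never served directly.
     */
    static Optional<String> publicUrl(String server, String warFile, String entry) {
        // By default, the WAR name minus ".war" becomes the root path.
        String context = warFile.endsWith(".war")
            ? warFile.substring(0, warFile.length() - 4)
            : warFile;
        // Anything under WEB-INF is hidden from outside users.
        if (entry.equals("WEB-INF") || entry.startsWith("WEB-INF/")) {
            return Optional.empty();
        }
        return Optional.of(server + "/" + context + "/" + entry);
    }

    public static void main(String[] args) {
        System.out.println(publicUrl("http://www.oreilly.com",
            "shoppingcart.war", "images/happybunny.gif"));
        System.out.println(publicUrl("http://www.oreilly.com",
            "shoppingcart.war", "WEB-INF/web.xml"));
    }
}
```

Running main() with the shoppingcart.war entries from the listing produces a URL for the image and an empty result for web.xml.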

The WEB-INF/classes and WEB-INF/lib directories contain Java class files and JAR libraries, respectively. The WEB-INF/classes directory is automatically added to the classpath of the web application, so any class files placed here (using the normal Java package conventions) are available to the application. After that, any JAR files located in WEB-INF/lib are appended to the web app’s classpath (the order in which they are appended is, unfortunately, not specified). You can place your classes in either location. During development, it is often easier to work with the “loose” classes directory and use the lib directory for supporting classes and third-party tools. It’s also possible to install JAR files directly in the servlet container to make them available to all web apps running on that server. This is often done for common libraries that will be used by many web apps. The location for placing the libraries, however, is not standard, and any classes that are deployed in this way cannot be automatically reloaded if changed, a feature of WAR files that we’ll discuss later. The Servlet API does require that each server provide a directory for these extension JARs and that the classes there be loaded by a single classloader and made visible to the web application.

Configuration with web.xml and Annotations

The web.xml file is an XML configuration file that lists servlets and related entities to be deployed, the relative names (URL paths) under which to deploy them, their initialization parameters, and their deployment details, including security and authorization. For most of the history of Java web applications, this was the only deployment configuration mechanism. However, as of the Servlet 3.0 API, there are additional options. Most configuration can now be done using Java annotations. We saw the WebServlet annotation used in the first example, HelloClient, to declare the servlet and specify its deployment URL path. Using the annotation, we could deploy the servlet to the Tomcat server without any web.xml file. Another option with the Servlet 3.0 API is to deploy servlets procedurally, using Java code at runtime.

In this section we will describe both the XML and annotation style of configuration. For most purposes, you will find it easier to use the annotations, but there are a couple of reasons to understand the XML configuration as well. First, the web.xml can be used to override or extend the hardcoded annotation configuration. Using the XML, you can change configuration at deployment time without recompiling the classes. In general, configuration in the XML will take precedence over the annotations. It is also possible to tell the server to ignore the annotations completely, using an attribute called metadata-complete in the web.xml. Next, there may be some residual configuration, especially relating to options of the servlet container, which can only be done through XML.

We will assume that you have at least a passing familiarity with XML, but you can simply copy these examples in a cut-and-paste fashion. (For details about working with Java and XML, see Chapter 24.) Let’s start with a simple web.xml file for our HelloClient servlet example. It looks like this:

    <web-app>
        <servlet>
            <servlet-name>helloclient1</servlet-name>
            <servlet-class>HelloClient</servlet-class>
        </servlet>
        <servlet-mapping>
            <servlet-name>helloclient1</servlet-name>
            <url-pattern>/hello</url-pattern>
        </servlet-mapping>
    </web-app>

The top-level element of the document is called <web-app>. Many types of entries may appear inside the <web-app>, but the most basic are <servlet> declarations and <servlet-mapping> deployment mappings. The <servlet> declaration tag is used to declare an instance of a servlet and, optionally, to give it initialization and other parameters. One instance of the servlet class is instantiated for each <servlet> tag appearing in the web.xml file.

At minimum, the <servlet> declaration requires two pieces of information: a <servlet-name>, which serves as a handle to reference the servlet elsewhere in the web.xml file, and the <servlet-class> tag, which specifies the Java class name of the servlet. Here, we named the servlet helloclient1. We named it like this to emphasize that we could declare other instances of the same servlet if we wanted to, possibly giving them different initialization parameters, etc. The class name for our servlet is, of course, HelloClient. In a real application, the servlet class would likely have a full package name, such as com.oreilly.servlets.HelloClient.

A servlet declaration may also include one or more initialization parameters, which are made available to the servlet through the ServletConfig object’s getInitParameter() method:

    <servlet>
        <servlet-name>helloclient1</servlet-name>
        <servlet-class>HelloClient</servlet-class>
        <init-param>
            <param-name>foo</param-name>
            <param-value>bar</param-value>
        </init-param>
    </servlet>

Next, we have our <servlet-mapping>, which associates the servlet instance with a path on the web server:

    <servlet-mapping>
        <servlet-name>helloclient1</servlet-name>
        <url-pattern>/hello</url-pattern>
    </servlet-mapping>

Here we mapped our servlet to the path /hello. (We could include additional url-patterns in the mapping if desired.) If we later name our WAR learningjava.war and deploy it on www.oreilly.com, the full path to this servlet would be http://www.oreilly.com/learningjava/hello. Just as we could declare more than one servlet instance with the <servlet> tag, we could declare more than one <servlet-mapping> for a given servlet instance. We could, for example, redundantly map the same helloclient1 instance to the paths /hello and /hola. The <url-pattern> tag provides some very flexible ways to specify the URLs that should match a servlet. We’ll talk about this in detail in the next section.

Finally, we should mention that although the web.xml example listed earlier will work on some application servers, it is technically incomplete because it is missing formal information that specifies the version of XML it is using and the version of the web.xml file standard with which it complies. To make it fully compliant with the standards, add a line such as:

    <?xml version="1.0" encoding="ISO-8859-1"?>

As of Servlet API 2.5, the web.xml version information takes advantage of XML Schemas. (We’ll talk about XML DTDs and XML Schemas in Chapter 24.) The additional information is inserted into the <web-app> element:

    <web-app
        xmlns="http://java.sun.com/xml/ns/javaee"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
        http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
        version="2.5">

If you leave them out, the application may still run, but it will be harder for the servlet container to detect errors in your configuration and give you clear error messages.

The equivalent of the preceding servlet declaration and mapping (including the redundant /hola path) is, as we saw earlier, our one-line annotation:

@WebServlet(urlPatterns={"/hello", "/hola"})
public class HelloClient extends HttpServlet {
   ...
}

Here the WebServlet attribute urlPatterns allows us to specify one or more URL patterns that are the equivalent to the url-pattern declaration in the web.xml.

URL Pattern Mappings

The <url-pattern> specified in the previous example was a simple string, /hello. For this pattern, only an exact match of the base URL followed by /hello would invoke our servlet. The <url-pattern> tag is capable of more powerful patterns, however, including wildcards. For example, specifying a <url-pattern> of /hello/* allows our servlet to be invoked by URLs such as http://www.oreilly.com/learningjava/hello/world or .../hello/baby. (A prefix wildcard must take the form of a path ending in /*; a pattern such as /hello* is treated as a literal string and matched exactly.) You can even specify wildcards with extensions (e.g., *.html or *.foo, meaning that the servlet is invoked for any path that ends with those characters).

Using wildcards can result in more than one match. Consider the patterns /scooby/* and /scooby/doo/*. Which should be matched for a URL ending in .../scooby/doo/biedoo? What if we have a third possible match because of a wildcard suffix extension mapping? The rules for resolving these are as follows.

First, any exact match is taken. For example, /hello matches the /hello URL pattern in our example regardless of any additional /hello/* pattern. Failing that, the container looks for the longest prefix match. So /scooby/doo/biedoo matches the second pattern, /scooby/doo/*, because it is longer and presumably more specific. Failing any matches there, the container looks at wildcard suffix mappings. A request ending in .foo matches a *.foo mapping at this point in the process. Finally, failing any matches there, the container looks for a default mapping, which is declared with the pattern / (a single slash). A servlet mapped to / picks up anything unmatched by this point. (A servlet mapped to /* also receives every request, but it does so as a prefix match, which means it wins over any extension mappings.) If nothing matches, the request fails with a “404 not found” message.
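The resolution order can be sketched as a small, container-independent Java class. This is only an illustration, not how any particular server implements it. It assumes the pattern syntax of the Servlet specification, in which a prefix pattern ends in /*, an extension pattern starts with *. and the default mapping is the single character /. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical sketch of the URL pattern resolution order.
class UrlPatternMatcher {
    /** Returns the winning pattern for a path, or null if nothing matches. */
    static String match(Collection<String> patterns, String path) {
        // Rule 1: an exact match always wins.
        for (String p : patterns) {
            if (isExact(p) && p.equals(path)) return p;
        }
        // Rule 2: the longest matching prefix pattern ("/some/path/*").
        String best = null;
        for (String p : patterns) {
            if (!p.endsWith("/*")) continue;
            String prefix = p.substring(0, p.length() - 2);
            boolean matches = path.equals(prefix) || path.startsWith(prefix + "/");
            if (matches && (best == null || p.length() > best.length())) best = p;
        }
        if (best != null) return best;
        // Rule 3: an extension pattern ("*.foo") for the last path segment.
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot > slash) {
            String ext = "*" + path.substring(dot);
            if (patterns.contains(ext)) return ext;
        }
        // Rule 4: the default mapping, if one is declared.
        return patterns.contains("/") ? "/" : null;
    }

    private static boolean isExact(String p) {
        return !p.endsWith("/*") && !p.startsWith("*.") && !p.equals("/");
    }

    public static void main(String[] args) {
        Collection<String> patterns =
            Arrays.asList("/hello", "/scooby/*", "/scooby/doo/*", "*.foo", "/");
        System.out.println(match(patterns, "/scooby/doo/biedoo"));
    }
}
```

With these five patterns, a request for /scooby/doo/biedoo resolves to /scooby/doo/*, and a request for /scooby/report.foo resolves to /scooby/* rather than *.foo, because prefix matches are tried before extension matches.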

Deploying HelloClient

Once you’ve deployed the HelloClient servlet, it should be easy to add examples to the WAR as you work with them in this chapter. In this section, we’ll show you how to build a WAR by hand. In “Building WAR Files with Ant” later in this chapter, we’ll show a more realistic way to manage your applications using the popular build tool, Ant. You can also grab the full set of examples, along with their source code, in the learningjava.war file from this book’s website at http://oreil.ly/Java_4E.

To create the WAR by hand, we first create the WEB-INF and WEB-INF/classes directories. If you are using a web.xml file, place it into WEB-INF. Put the HelloClient.class into WEB-INF/classes. Use the jar command to create learningjava.war (WEB-INF at the “top” level of the archive):

    $ jar cvf learningjava.war WEB-INF

You can also include documents and other resources in the WAR by adding their names after the WEB-INF directory. This command produces the file learningjava.war. You can verify the contents using the jar command:

    $ jar tvf learningjava.war
    document1.html
    WEB-INF/web.xml
    WEB-INF/classes/HelloClient.class
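Because a WAR is just a JAR, the same archive can also be written and listed from Java with the standard java.util.jar classes. The following is a minimal sketch, not a replacement for the jar tool: the WarByHand class is a hypothetical name, the entries hold placeholder bytes rather than real files, and no manifest is written.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical sketch: build and list a WAR with the standard JAR classes.
class WarByHand {
    /** Writes an archive holding the named entries with placeholder content. */
    static void write(Path war, List<String> entryNames) throws IOException {
        try (OutputStream out = Files.newOutputStream(war);
             JarOutputStream jar = new JarOutputStream(out)) {
            for (String name : entryNames) {
                jar.putNextEntry(new JarEntry(name));
                jar.write("placeholder".getBytes(StandardCharsets.UTF_8));
                jar.closeEntry();
            }
        }
    }

    /** Lists the entry names in archive order, much as "jar tf" does. */
    static List<String> list(Path war) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(war.toFile())) {
            jar.stream().forEach(e -> names.add(e.getName()));
        }
        return names;
    }

    /** Writes a temporary WAR, reads its entry names back, and deletes it. */
    static List<String> roundTrip(List<String> entryNames) {
        try {
            Path war = Files.createTempFile("learningjava", ".war");
            try {
                write(war, entryNames);
                return list(war);
            } finally {
                Files.delete(war);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        roundTrip(Arrays.asList("document1.html", "WEB-INF/web.xml",
            "WEB-INF/classes/HelloClient.class")).forEach(System.out::println);
    }
}
```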

Now all that is necessary is to drop the WAR into the correct location for your server. If you have not already, you should download and install Apache Tomcat. The location for WAR files is the webapps directory within your Tomcat installation directory. Place your WAR here, and start the server. If Tomcat is configured with the default port number, you should be able to point to the HelloClient servlet with one of two URLs: http://localhost:8080/learningjava/hello or http://<yourserver>:8080/learningjava/hello, where <yourserver> is the name or IP address of your server. If you have trouble, look in the logs directory of the Tomcat folder for errors.

Reloading web apps

All servlet containers are supposed to provide a facility for reloading WAR files; many support reloading of individual servlet classes after they have been modified. Reloading WARs is part of the servlet specification and is especially useful during development. Support for reloading web apps varies from server to server. Normally, all that you have to do is drop a new WAR in place of the old one in the proper location (e.g., the webapps directory for Tomcat) and the container shuts down the old application and deploys the new version. This works in Tomcat when the “autoDeploy” attribute is set (it is on by default) and also in BEA’s WebLogic application server when it is configured in development mode.

Some servers, including Tomcat, “explode” WARs by unpacking them into a directory under the webapps directory, or they allow you explicitly to configure a root directory (or “context”) for your unpacked web app through their own configuration files. In this mode, they may allow you to replace individual files, which can be especially useful for tweaking HTML or JSPs. Tomcat automatically reloads WAR files when they change (unless configured not to), so all you have to do is drop an updated WAR over the old one and it will redeploy it as necessary. In some cases, it may be necessary to restart the server to make all changes take effect. When in doubt, shut down and restart.

Tomcat also provides a client-side “deployer” package that integrates with Ant to automate building, deploying, and redeploying applications. We’ll discuss Ant later in this chapter.

Error and Index Pages

One of the finer points of writing a professional-looking web application is taking care to handle errors well. Nothing annoys a user more than getting a funny-looking page with some technical mumbo-jumbo error information on it when he expected the receipt for his Christmas present. Through the web.xml file, it is possible to specify documents or servlets to handle error pages that are shown for various conditions, as well as the special case of welcome files (index files) that are invoked for paths corresponding to directories. At this time, there is no corresponding way to declare error pages or welcome files using annotations.

You can designate a page or servlet that can handle various HTTP error status codes, such as “404 Not Found” and “403 Forbidden,” using one or more <error-page> declarations:

    <web-app>
    ...
        <error-page>
            <error-code>404</error-code>
            <location>/notfound.html</location>
        </error-page>
        <error-page>
            <error-code>403</error-code>
            <location>/secret.html</location>
        </error-page>

Additionally, you can designate error pages based on Java exception types that may be thrown from the servlet. For example:

    <error-page>
        <exception-type>java.io.IOException</exception-type>
        <location>/ioexception.html</location>
    </error-page>

This declaration catches any IOExceptions generated from servlets in the web app and displays the ioexception.html page. If no matching exceptions are found in the <error-page> declarations, and the exception is of type ServletException (or a subclass), the container makes a second try to find the correct handler. It looks for a wrapped exception (the “cause” exception) contained in the ServletException and attempts to match it to an error page declaration.

In the Servlet 3.0 API, you can also designate a catchall error page that will handle any unhandled error codes and exception types as follows:

    <error-page>
        <location>/anyerror.html</location>
    </error-page>

As we’ve mentioned, you can use a servlet to handle your error pages, just as you can use a static document. In fact, the container supplies several helpful pieces of information to an error-handling servlet, which the servlet can use in generating a response. The information is made available in the form of servlet request attributes through the method getAttribute():

    Object requestAttribute = servletRequest.getAttribute("name");

Attributes are like servlet parameters, except that they can be arbitrary objects. We have seen attributes of the ServletContext in “The ServletContext API” section. In this case, we are talking about attributes of the request. When a servlet (or JSP or filter) is invoked to handle an error condition, the following string attributes are set in the request:

    javax.servlet.error.servlet_name
    javax.servlet.error.request_uri
    javax.servlet.error.message

Depending on whether the <error-page> declaration was based on an <error-code> or <exception-type> condition, the request also contains one of the following two attributes:

    // status code Integer or Exception object
    javax.servlet.error.status_code
    javax.servlet.error.exception

In the case of a status code, the attribute is an Integer representing the code. In the case of the exception type, the object is the actual instigating exception.
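As a rough sketch of how an error handler might use these attributes, the following helper builds a one-line description from them. To keep the example free of the Servlet API, the attribute lookup is passed in as a function; in a real error-handling servlet you would pass request::getAttribute. The ErrorReport class and the message format are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper that formats the error attributes set by the container.
class ErrorReport {
    /** attr stands in for the request's getAttribute() method. */
    static String describe(Function<String, Object> attr) {
        Object uri = attr.apply("javax.servlet.error.request_uri");
        Object servlet = attr.apply("javax.servlet.error.servlet_name");
        Object message = attr.apply("javax.servlet.error.message");
        Object status = attr.apply("javax.servlet.error.status_code");
        Object exception = attr.apply("javax.servlet.error.exception");

        StringBuilder sb = new StringBuilder("Problem serving ").append(uri);
        if (servlet != null) {
            sb.append(" (servlet ").append(servlet).append(")");
        }
        // Prefer the instigating exception; otherwise report the status code.
        if (exception instanceof Throwable) {
            sb.append(": exception ").append(exception.getClass().getName());
        } else if (status instanceof Integer) {
            sb.append(": status ").append(status);
        }
        if (message != null) {
            sb.append(" - ").append(message);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>();
        attrs.put("javax.servlet.error.request_uri", "/learningjava/missing");
        attrs.put("javax.servlet.error.status_code", 404);
        attrs.put("javax.servlet.error.message", "Not Found");
        System.out.println(describe(attrs::get));
    }
}
```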

Indexes for directory paths can be designated in a similar way. Normally, when a user specifies a directory URL path, the web server searches for a default file in that directory to be displayed. The most common example of this is the ubiquitous index.html file. You can designate your own ordered list of files to look for by adding a <welcome-file-list> entry to your web.xml file. For example:

    <welcome-file-list>
        <welcome-file>index.html</welcome-file>
        <welcome-file>index.htm</welcome-file>
    </welcome-file-list>

<welcome-file-list> specifies that when a partial request (directory path) is received, the server should search first for a file named index.html and, if that is not found, a file called index.htm. If none of the specified welcome files is found, it is left up to the server to decide what kind of page to display. Servers are generally configured to display a directory-like listing or to produce an error message.

Security and Authentication

One of the most powerful features of web app deployment with the Servlet API is the ability to define declarative security constraints, meaning that you can spell out in the web.xml file exactly which areas of your web app (URL paths to documents, directories, servlets, etc.) are login-protected, the types of users allowed access to them, and the class of security protocol required for communications. It is not necessary to write code in your servlets to implement these basic security procedures.

There are two types of entries in the web.xml file that control security and authentication. First are the <security-constraint> entries, which provide authorization based on user roles and secure transport of data, if desired. Second is the <login-config> entry, which determines the kind of authentication used for the web application.

Protecting Resources with Roles

Let’s take a look at a simple example. The following web.xml excerpt defines an area called “Secret documents” with a URL pattern of /secret/* and designates that only users with the role “secretagent” may access them. It specifies the simplest form of login process: the BASIC authentication model, which causes the browser to prompt the user with a simple pop-up username and password dialog box:

    <web-app>
    ...
        <security-constraint>
            <web-resource-collection>
                <web-resource-name>Secret documents</web-resource-name>
                <url-pattern>/secret/*</url-pattern>
            </web-resource-collection>
            <auth-constraint>
                <role-name>secretagent</role-name>
            </auth-constraint>
        </security-constraint>

        <login-config>
            <auth-method>BASIC</auth-method>
        </login-config>

Each <security-constraint> block has one <web-resource-collection> section that designates a named list of URL patterns for areas of the web app, followed by an <auth-constraint> section listing user roles that are allowed to access those areas.

We can do the equivalent configuration for a given servlet using the ServletSecurity annotation with an HttpConstraint annotation element as follows:

@ServletSecurity(
    @HttpConstraint(rolesAllowed = "secretagent")
)
public class SecureHelloClient extends HttpServlet
{ ...

You can add this annotation to our test servlet or add the XML example setup to the web.xml file for the learningjava.war file and prepare to try it out. However, there is one additional step that you’ll have to take to get this working: create the user role “secretagent” and an actual user with this role in our application server environment.

Access to protected areas is granted to user roles, not individual users. A user role is effectively just a group of users; instead of granting access to individual users by name, you grant access to roles, and users are assigned one or more roles. Actual user information (name and password, etc.) is handled outside the scope of the web app, in the application server environment (possibly integrated with the host platform operating system). Generally, application servers have their own tools for creating users and assigning individuals (or actual groups of users) their roles. A given username may have many roles associated with it.

When attempting to access a login-protected area, the user’s valid login will be assessed to see if she has the correct role for access. For the Tomcat server, adding test users and assigning them roles is easy; simply edit the file conf/tomcat-users.xml. To add a user named “bond” with the “secretagent” role, you’d add an entry such as:

  <user username="bond" password="007" roles="secretagent"/>

For other servers, you’ll have to refer to the documentation to determine how to add users and assign security roles.
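The relationship between users, roles, and protected areas can be modeled in a few lines of plain Java. This is a deliberately simplified sketch of the idea, not the container's algorithm: the RoleCheck class is hypothetical, it matches protected areas by path prefix only, and the user “moneypenny” and role “staff” are invented for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of role-based access to protected paths.
class RoleCheck {
    // user name -> roles, as the server's user database might hold them
    private final Map<String, Set<String>> rolesByUser = new HashMap<>();
    // protected path prefix -> roles allowed to access it
    private final Map<String, Set<String>> rolesByPrefix = new HashMap<>();

    void addUser(String user, String... roles) {
        rolesByUser.put(user, new HashSet<>(Arrays.asList(roles)));
    }

    void protect(String prefix, String... roles) {
        rolesByPrefix.put(prefix, new HashSet<>(Arrays.asList(roles)));
    }

    /** Access is decided by the user's roles, never by the user's name. */
    boolean mayAccess(String user, String path) {
        Set<String> userRoles =
            rolesByUser.getOrDefault(user, Collections.<String>emptySet());
        for (Map.Entry<String, Set<String>> area : rolesByPrefix.entrySet()) {
            String prefix = area.getKey();
            boolean covered = path.equals(prefix) || path.startsWith(prefix + "/");
            // Deny if the path is protected and the user holds no allowed role.
            if (covered && Collections.disjoint(userRoles, area.getValue())) {
                return false;
            }
        }
        return true; // unprotected paths are open to everyone
    }

    public static void main(String[] args) {
        RoleCheck check = new RoleCheck();
        check.addUser("bond", "secretagent");
        check.protect("/secret", "secretagent");
        System.out.println(check.mayAccess("bond", "/secret/plans.html"));
    }
}
```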

Secure Data Transport

Before we move on, there is one more piece of the security constraint to discuss: the transport guarantee. Each <security-constraint> block may end with a <user-data-constraint> entry, which designates one of three levels of transport security for the protocol used to transfer data to and from the protected area over the Internet. For example:

    <security-constraint>
    ...
        <user-data-constraint>
            <transport-guarantee>CONFIDENTIAL</transport-guarantee>
        </user-data-constraint>
    </security-constraint>

The three levels are NONE, INTEGRAL, and CONFIDENTIAL. NONE is equivalent to leaving out the section, which indicates that no special transport is required. This is the standard for normal web traffic, which is generally sent in plain text over the network. The INTEGRAL level of security specifies that any transport protocol used must guarantee the data sent is not modified in transit. This implies the use of digital signatures or some other method of validating the data at the receiving end, but it does not require that the data be encrypted and hidden while it is transported. Finally, CONFIDENTIAL implies both INTEGRAL and encrypted. In practice, the only widely used secure transport in web browsers is SSL. Requiring a transport guarantee other than NONE typically forces the use of SSL by the client browser.

We can configure the equivalent transport security for a servlet using the ServletSecurity annotation along with the HttpMethodConstraint annotation, as follows:

@ServletSecurity(
    httpMethodConstraints = @HttpMethodConstraint( value="GET",
        transportGuarantee = ServletSecurity.TransportGuarantee.CONFIDENTIAL)
)
public class SecureHelloClient extends HttpServlet { ... }

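@ServletSecurity(
    value = @HttpConstraint(rolesAllowed = "secretagent"),
    httpMethodConstraints = @HttpMethodConstraint( value="GET",
        rolesAllowed = "secretagent",
        transportGuarantee = ServletSecurity.TransportGuarantee.CONFIDENTIAL)
)
public class SecureHelloClient extends HttpServlet { ... }

Here we use the httpMethodConstraints attribute with an HttpMethodConstraint annotation to designate that requests using the HTTP GET method require CONFIDENTIAL level security. Methods that are not named in an HttpMethodConstraint fall under the default HttpConstraint given by the value attribute. Combining the transport security with a rolesAllowed annotation can be done as shown in the preceding example. Note that a method-specific constraint replaces the default constraint for that method rather than adding to it, which is why the second example repeats rolesAllowed for GET.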

Authenticating Users

This section shows how to declare a custom login form to perform user login. First, we’ll show the web.xml style and then discuss the Servlet 3.0 alternative, which gives us more flexibility.

The <login-config> section determines exactly how a user authenticates herself (logs in) to the protected area. The <auth-method> tag allows four types of login authentication to be specified: BASIC, DIGEST, FORM, and CLIENT-CERT. In our example, we showed the BASIC method, which uses the standard web browser login and password dialog. BASIC authentication sends the user’s name and password in plain text over the Internet unless a transport guarantee has been used separately to start SSL and encrypt the data stream. DIGEST is a variation on BASIC that obscures the text of the password but adds little real security; it is not widely used. FORM is equivalent to BASIC, but instead of using the browser’s dialog, we can use our own HTML form to post the username and password data to the container. The form data can come from a static HTML page or from one generated by a servlet. Again, form data is sent in plain text unless otherwise protected by a transport guarantee (SSL). CLIENT-CERT is an interesting option. It specifies that the client must be identified using a client-side public key certificate. This implies the use of a protocol like SSL, which allows for secure exchange and mutual authentication using digital certificates. The exact method of setting up a client-side certificate is browser-dependent.
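To see why BASIC authentication offers no secrecy on its own, it helps to look at what the browser actually sends: an Authorization header holding the word Basic followed by the Base64 encoding of the username, a colon, and the password. Base64 is an encoding, not encryption, so anyone who sees the header can reverse it. The following sketch uses only the standard java.util.Base64 class; the BasicAuthHeader class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper showing the value of a BASIC Authorization header.
class BasicAuthHeader {
    /** Builds the header value a browser would send for these credentials. */
    static String encode(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder()
            .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    /** Recovers "user:password" from a header value; no key is needed. */
    static String decode(String headerValue) {
        String encoded = headerValue.substring("Basic ".length());
        return new String(Base64.getDecoder().decode(encoded),
            StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String header = encode("bond", "007");
        System.out.println(header);          // prints "Basic Ym9uZDowMDc="
        System.out.println(decode(header));  // prints "bond:007"
    }
}
```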

The FORM method is most useful because it allows us to customize the look of the login page (we recommend using SSL to secure the data stream). We can also specify an error page to use if the authentication fails. Here is a sample <login-config> using the form method:

    <login-config>
        <auth-method>FORM</auth-method>
        <form-login-config>
            <form-login-page>/login.html</form-login-page>
            <form-error-page>/login_error.html</form-error-page>
        </form-login-config>
    </login-config>

The login page must contain an HTML form with a specially named pair of fields for the name and password. Here is a simple login.html file:

    <html>
    <head><title>Login</title></head>
    <body>
        <form method="POST" action="j_security_check">
            Username: <input type="text" name="j_username"><br>
            Password: <input type="password" name="j_password"><br>
            <input type="submit" value="submit">
        </form>
    </body>
    </html>

The username field is called j_username, the password field is called j_password, and the URL used for the form action attribute is j_security_check. There are no special requirements for the error page, but normally you will want to provide a “try again” message and repeat the login form.

In the Servlet 3.0 API, the HttpServletRequest API contains methods for explicitly logging in and logging out a user. However, it is also specified that a user’s login is no longer valid after the user session times out or is invalidated. Therefore, you can effectively log out the user by calling invalidate() on the session:

    request.logout();
    request.getSession().invalidate();

With Servlet 3.0, we can also take control of the login process ourselves by utilizing the HttpServletRequest login() method to perform our own login operation. All we have to do is arrange our own login servlet that accepts a username and password (securely) and then calls the login method. This gives you great flexibility over how and when the user login occurs. And, of course, you can log the user out with the corresponding logout() method.

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.annotation.*;
import javax.servlet.http.*;

@ServletSecurity(
    httpMethodConstraints = @HttpMethodConstraint( value="POST",
        transportGuarantee = ServletSecurity.TransportGuarantee.CONFIDENTIAL)
)
@WebServlet( urlPatterns={"/mylogin"} )
public class MyLogin extends HttpServlet
{
    // Handle POST, the method secured above, so that the credentials
    // travel in the request body rather than in the URL
    public void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException
    {
       String user = request.getParameter("user");
       String password = request.getParameter("pass");
       request.login( user, password );
       // Dispatch or redirect to the next page...
    }
}

Procedural Authorization

We should mention that in addition to the declarative security offered by the web.xml file, servlets may perform their own active procedural (or programmatic) security using all the authentication information available to the container. We won’t cover this in detail, but here are the basics.

The name of the authenticated user is available through the method HttpServletRequest getRemoteUser(), and the type of authentication provided can be determined with the getAuthType() method. Servlets can work with security roles using the isUserInRole() method. (Doing this requires adding some additional mappings in the web.xml file, which allows the servlet to refer to the security roles by reference names.)

For advanced applications, a java.security.Principal object for the user can be retrieved with the getUserPrincipal() method of the request. In the case where a secure transport like SSL was used, the method isSecure() returns true, and detailed information about how the principal was authenticated—the cipher type, key size, and certificate chain—is made available through request attributes. It is useful to note that the notion of being “logged in” to a web application, from the servlet container’s point of view, is defined as there being a valid (non-null) value returned by the getUserPrincipal() method.

Servlet Filters

The servlet Filter API generalizes the Java Servlet API to allow modular component “filters” to operate on the servlet request and responses in a sort of pipeline. Filters are chained, meaning that when more than one filter is applied, the servlet request is passed through each filter in succession, with each having an opportunity to act upon or modify the request before passing it to the next filter. Similarly, upon completion, the servlet result is effectively passed back through the chain on its return trip to the browser. Servlet filters may operate on any requests to a web application, not just those handled by the servlets; they may filter static content, as well. You can also control whether filters are applied to error and welcome pages as well as pages forwarded or included using the request dispatcher (from servlet to servlet).
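The chaining idea can be illustrated without a servlet container. The sketch below uses two small stand-in interfaces, MiniFilter and MiniChain, which are hypothetical and only loosely modeled on javax.servlet.Filter and FilterChain. A list of strings plays the role of the request and response so that we can record the order of calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-ins for illustration only; the real types live in javax.servlet.
interface MiniFilter {
    void doFilter(List<String> trace, MiniChain chain);
}

interface MiniChain {
    void doFilter(List<String> trace);
}

class MiniPipeline implements MiniChain {
    private final Iterator<MiniFilter> remaining;
    private final MiniChain target; // plays the part of the servlet

    MiniPipeline(List<MiniFilter> filters, MiniChain target) {
        this.remaining = filters.iterator();
        this.target = target;
    }

    /** Invokes the next filter, or the target once the filters run out. */
    public void doFilter(List<String> trace) {
        if (remaining.hasNext()) {
            remaining.next().doFilter(trace, this);
        } else {
            target.doFilter(trace);
        }
    }

    /** A filter that records when it sees the request and the response. */
    static MiniFilter named(String name) {
        return (trace, chain) -> {
            trace.add(name + " in");   // on the way to the servlet
            chain.doFilter(trace);     // pass control down the chain
            trace.add(name + " out");  // on the return trip
        };
    }

    static List<String> run(List<MiniFilter> filters) {
        List<String> trace = new ArrayList<>();
        new MiniPipeline(filters, t -> t.add("servlet")).doFilter(trace);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(named("A"), named("B"))));
    }
}
```

Running main() prints [A in, B in, servlet, B out, A out], which shows each filter acting on the way in and again, in reverse order, on the way out.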

Filters can be declared and mapped to servlets in the web.xml file or using annotations. There are two ways to map a filter: using a URL pattern like those used for servlets or by specifying a servlet by its servlet name as defined in its servlet config. Filters obey the same basic rules as servlets when it comes to URL matching, but when multiple filters match a path, they are each invoked.

When using web.xml, the order of the chain is determined by the order in which matching filter mappings appear in the web.xml file, with <url-pattern> matches taking precedence over <servlet-name> matches. This is contrary to the way in which servlet URL matching is done, with specific matches taking the highest priority. Filter chains are constructed as follows. First, each filter with a matching URL pattern is called in the order in which it appears in the web.xml file; next, each filter with a matching servlet name is called, also in order of appearance. URL patterns take a higher priority than filters specifically associated with a servlet, so in this case, patterns such as /* have first crack at an incoming request.
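The ordering rule amounts to two passes over the matching mappings, as in the following sketch. The FilterMappingEntry and FilterChainOrder classes are hypothetical; they model only the information needed to show the order and assume the list holds just the mappings that already match the request, in the order they appear in web.xml.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of one matching <filter-mapping> entry.
class FilterMappingEntry {
    final String filterName;
    final boolean byUrlPattern; // true: <url-pattern>, false: <servlet-name>

    FilterMappingEntry(String filterName, boolean byUrlPattern) {
        this.filterName = filterName;
        this.byUrlPattern = byUrlPattern;
    }
}

class FilterChainOrder {
    /** Returns filter names in the order the chain would invoke them. */
    static List<String> order(List<FilterMappingEntry> matchesInFileOrder) {
        List<String> chain = new ArrayList<>();
        // First pass: URL pattern matches, in order of appearance.
        for (FilterMappingEntry m : matchesInFileOrder) {
            if (m.byUrlPattern) chain.add(m.filterName);
        }
        // Second pass: servlet name matches, also in order of appearance.
        for (FilterMappingEntry m : matchesInFileOrder) {
            if (!m.byUrlPattern) chain.add(m.filterName);
        }
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(order(Arrays.asList(
            new FilterMappingEntry("audit", false),
            new FilterMappingEntry("logging", true),
            new FilterMappingEntry("compress", true))));
    }
}
```

Even though the hypothetical audit filter is declared first here, it runs last because it is mapped by servlet name.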

Servlet filters may be declared and mapped using the WebFilter annotation. There is no corresponding way to control filter ordering using annotations; however, as always you can mix annotations and web.xml to minimize the XML configuration by only declaring the filter mappings in the XML. (We’ll discuss configuration more later in this chapter.)

The Filter API is very simple and mimics the Servlet API. A servlet filter implements the javax.servlet.Filter interface and implements three methods: init(), doFilter(), and destroy(). The doFilter() method is where the work is performed. For each incoming request, the ServletRequest and ServletResponse objects are passed to doFilter(). Here, we have a chance to examine and modify these objects—or even substitute our own objects for them—before passing them to the next filter and, ultimately, the servlet (or user) on the other side. Our link to the rest of the filter chain is another parameter of doFilter(), the FilterChain object. With FilterChain, we can invoke the next element in the pipeline. The following section presents an example.

A Simple Filter

For our first filter, we’ll do something easy but practical: create a filter that limits the number of concurrent connections to its URLs. We’ll simply have our filter keep a counter of the active connections passing through it and turn away new requests when they exceed a specified limit:

import java.io.*;
import java.util.concurrent.atomic.AtomicInteger;
import javax.servlet.*;
import javax.servlet.annotation.*;
import javax.servlet.http.*;

public class ConLimitFilter implements Filter
{
    int limit;
    // Many request threads pass through the filter at once, so the
    // counter must be updated atomically
    final AtomicInteger count = new AtomicInteger();

    public void init( FilterConfig filterConfig )
        throws ServletException
    {
        String s = filterConfig.getInitParameter("limit");
        if ( s == null )
            throw new ServletException("Missing init parameter: limit");
        limit = Integer.parseInt( s );
    }

    public void doFilter (
        ServletRequest req, ServletResponse res, FilterChain chain )
            throws IOException, ServletException
    {
        try {
            // Count this request first, then test against the limit
            if ( count.incrementAndGet() > limit ) {
                HttpServletResponse httpRes = (HttpServletResponse)res;
                httpRes.sendError(
                    HttpServletResponse.SC_SERVICE_UNAVAILABLE, "Too Busy." );
            } else {
                chain.doFilter( req, res );
            }
        } finally {
            // Always release our slot, even if the chain throws
            count.decrementAndGet();
        }
    }

    public void destroy() { }
}

ConLimitFilter implements the three lifecycle methods of the Filter interface: init(), doFilter(), and destroy(). In our init() method, we use the FilterConfig object to look for an initialization parameter named “limit” and turn it into an integer. Users can set this value in the section of the web.xml file where the instance of our filter is declared or in a WebFilter annotation, as we’ll show shortly. The doFilter() method implements all our logic. First, it receives the ServletRequest and ServletResponse objects for an incoming request. Depending on the counter, it then either passes them down the chain by invoking the doFilter() method on the FilterChain object, or rejects the request by generating its own response. We use the standard HTTP status “503 Service Unavailable” when we deny new connections.
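The limiting logic does not depend on the Servlet API, so it can be tried out in isolation. The following is a minimal sketch that swaps the filter’s counter for a java.util.concurrent.Semaphore. ConnectionLimiter and its method names are ours, invented for illustration, and it accepts a Supplier only to keep the example free of checked exceptions.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper: admits at most 'limit' callers at a time.
class ConnectionLimiter {
    private final Semaphore permits;

    ConnectionLimiter(int limit) {
        permits = new Semaphore(limit);
    }

    // Runs the work if a slot is free and returns its result; returns
    // 'busy' immediately, without blocking, when the limit is reached.
    <T> T call(Supplier<T> work, T busy) {
        if (!permits.tryAcquire())
            return busy;
        try {
            return work.get();
        } finally {
            permits.release();   // free the slot even if the work throws
        }
    }

    int available() {
        return permits.availablePermits();
    }
}
```

A filter built on this idea would call tryAcquire() where ours tests the counter and release() after the chain returns.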

Calling doFilter() on the FilterChain object continues processing by invoking the next filter in the chain or by invoking the servlet if ours is the last filter. Alternatively, when we choose to reject the call, we use the ServletResponse to generate our own response and then simply allow doFilter() to exit. This stops the processing chain at our filter, although any filters called before us still have an opportunity to intervene as the request effectively traverses back to the client.

Notice that ConLimitFilter increments the count before calling doFilter() and decrements it after. Prior to calling doFilter(), we can work on the request before it reaches the rest of the chain and the servlet. After the call to doFilter() returns, the rest of the chain and the servlet have completed, and the response is on its way back to the client. This is our opportunity to do any post-processing of the response.
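The before-and-after structure is easier to see in a stripped-down model. The sketch below uses two hypothetical interfaces, MiniFilter and MiniChain, that imitate the shape of Filter and FilterChain without the request and response objects. It is not the Servlet API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of FilterChain; proceed() plays the role of doFilter().
interface MiniChain {
    void proceed();
}

// Hypothetical miniature of Filter.
interface MiniFilter {
    void doFilter(MiniChain chain);
}

class ChainDemo {
    // Builds a chain that calls each filter in order and, last, the "servlet".
    static MiniChain build(List<MiniFilter> filters, int index, Runnable servlet) {
        if (index == filters.size())
            return servlet::run;
        return () -> filters.get(index).doFilter(
            build(filters, index + 1, servlet));
    }

    // A filter that records when it is entered and when the chain returns.
    static MiniFilter tracing(String name, List<String> log) {
        return chain -> {
            log.add(name + " before");   // pre-processing on the way in
            chain.proceed();             // rest of the chain, then the servlet
            log.add(name + " after");    // post-processing on the way out
        };
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<MiniFilter> filters = new ArrayList<>();
        filters.add(tracing("A", log));
        filters.add(tracing("B", log));
        build(filters, 0, () -> log.add("servlet")).proceed();
        return log;
    }
}
```

Calling ChainDemo.run() produces the log A before, B before, servlet, B after, A after: filters are entered in chain order and resume in reverse order.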

Finally, we should mention that although we’ve been talking about the servlet request and response as if they were HttpServletRequest and HttpServletResponse, the doFilter() method actually takes the more generic ServletRequest and ServletResponse objects as parameters. As filter implementers, we are expected to determine when it is safe to treat them as HTTP traffic and perform the cast as necessary (which we do here in order to use the sendError() HTTP response method).

A Test Servlet

Before we go on, here is a simple test servlet you can use to try out this filter and the other filters we’ll develop in this section. It’s called WaitServlet and, as its name implies, it simply waits. You can specify how long it waits as a number of seconds with the servlet parameter time. (This is the “dumb” version of the BackgroundWaitServlet that we created earlier in this chapter when discussing asynchronous servlets.)

    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class WaitServlet extends HttpServlet
    {
        public void doGet( HttpServletRequest request,
            HttpServletResponse response )
            throws ServletException, IOException
        {
            String waitStr = request.getParameter("time");
            if ( waitStr == null )
                throw new ServletException("Missing parameter: time");
            int wait = Integer.parseInt(waitStr);

            try {
                Thread.sleep( wait * 1000 );
            } catch( InterruptedException e ) {
                throw new ServletException(e);
            }

            response.setContentType("text/html");
            PrintWriter out = response.getWriter();
            out.println(
                "<html><body><h1>WaitServlet Response</h1></body></html>");
            out.close();
        }
    }

By making multiple simultaneous requests to the WaitServlet, you can try out the ConLimitFilter. Note that some web browsers won’t open multiple requests to the same URL or may delay opening multiple tabs. You may have to add extraneous parameters to trick the web browser. Alternatively, you may wish to use the curl command-line utility to make the requests if you have it.

Declaring and Mapping Filters

In the web.xml file, filters are declared and mapped much as servlets are. As with servlets, one instance of a filter class is created for each filter declaration in the web.xml file. A filter declaration looks like this:

    <filter>
        <filter-name>defaultsfilter1</filter-name>
        <filter-class>RequestDefaultsFilter</filter-class>
    </filter>

It specifies a filter handle name to be used for reference within the web.xml file and the filter’s Java class name. Filter declarations may also contain <init-param> parameter sections, just like servlet declarations.

Filters are mapped to resources with <filter-mapping> declarations that specify the filter handle name and either the specific servlet handle name or a URL pattern, as we discussed earlier:

    <filter-mapping>
        <filter-name>conlimitfilter1</filter-name>
        <servlet-name>waitservlet1</servlet-name>
    </filter-mapping>

    <filter-mapping>
        <filter-name>conlimitfilter1</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>

The corresponding WebFilter annotation can declare and map filters as well as supply filter parameters. The annotation will accept either a urlPatterns or a servletNames attribute for the mapping.

@WebFilter(
    urlPatterns = "/*",
    initParams = {
        @WebInitParam(name="limit", value="3")
    }
)

Filtering the Servlet Request

Our first filter example was not very exciting because it did not actually modify any information going to or coming from the servlet. Next, let’s do some actual “filtering” by modifying the incoming request before it reaches a servlet. In this example, we’ll create a request “defaulting” filter that automatically supplies default values for specified servlet parameters when they are not provided in the incoming request. Here is the RequestDefaultsFilter:

    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class RequestDefaultsFilter implements Filter
    {
        FilterConfig filterConfig;

        public void init( FilterConfig filterConfig ) throws ServletException
        {
            this.filterConfig = filterConfig;
        }

        public void doFilter (
            ServletRequest req, ServletResponse res, FilterChain chain )
                throws IOException, ServletException
        {
            WrappedRequest wrappedRequest =
                new WrappedRequest( (HttpServletRequest)req );
            chain.doFilter( wrappedRequest, res );
        }

        public void destroy() { }

        class WrappedRequest extends HttpServletRequestWrapper
        {
            WrappedRequest( HttpServletRequest req ) {
                super( req );
            }

            public String getParameter( String name ) {
                String value = super.getParameter( name );
                if ( value == null )
                    value = filterConfig.getInitParameter( name );
                return value;
            }
        }
    }

To interpose ourselves in the data flow, we must do something drastic. We kidnap the incoming HttpServletRequest object and replace it with an imposter that does our bidding. The technique, which we’ll use here for modifying the request object and later for modifying the response, is to wrap the real request with an adapter, allowing us to override some of its methods. Here, we will take control of the HttpServletRequest’s getParameter() method, modifying it to look for default values where it would otherwise return null.

Again, we implement the three lifecycle methods of Filter, but this time, before invoking doFilter() on the filter chain to continue processing, we wrap the incoming HttpServletRequest in our own class, WrappedRequest. WrappedRequest extends a special adapter called HttpServletRequestWrapper. This wrapper class is a convenience utility that implements the HttpServletRequest interface. It accepts a reference to a target HttpServletRequest object and, by default, delegates all of its methods to that target. This makes it very convenient for us to simply override one or more methods of interest to us. All we have to do is override getParameter() in our WrappedRequest class and add our functionality. Here, we simply call our parent’s getParameter(), and in the case where the value is null, we try to substitute a filter initialization parameter of the same name.
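The wrapper idea is independent of servlets. As an illustration only, the following sketch uses a hypothetical one-method Params interface in place of HttpServletRequest, a ParamsWrapper class that plays the role of HttpServletRequestWrapper, and a Map standing in for the filter’s initialization parameters.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the request interface.
interface Params {
    String getParameter(String name);
}

// Plays the role of HttpServletRequestWrapper: delegates to the target.
class ParamsWrapper implements Params {
    private final Params target;

    ParamsWrapper(Params target) { this.target = target; }

    public String getParameter(String name) {
        return target.getParameter(name);
    }
}

// Overrides one method to fall back on a default when the target has no value.
class DefaultingParams extends ParamsWrapper {
    private final Map<String, String> defaults;

    DefaultingParams(Params target, Map<String, String> defaults) {
        super(target);
        this.defaults = defaults;
    }

    @Override
    public String getParameter(String name) {
        String value = super.getParameter(name);
        return value != null ? value : defaults.get(name);
    }
}

class DefaultsDemo {
    static Params sample() {
        Map<String, String> request = new HashMap<>();
        request.put("user", "pat");           // supplied by the "client"
        Map<String, String> defaults = new HashMap<>();
        defaults.put("time", "3");            // used only when missing
        defaults.put("user", "nobody");       // ignored: the request has a value
        return new DefaultingParams(request::get, defaults);
    }
}
```

With the sample data, getParameter("user") returns the request’s own value, getParameter("time") falls back to the default, and a name found in neither place still yields null.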

Try this example using the WaitServlet with a filter declaration and mapping or annotation as follows:

<filter>
    <filter-name>defaultsfilter1</filter-name>
    <filter-class>RequestDefaultsFilter</filter-class>
    <init-param>
        <param-name>time</param-name>
        <param-value>3</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>defaultsfilter1</filter-name>
    <servlet-name>waitservlet1</servlet-name>
</filter-mapping>

@WebFilter(
    servletNames = "waitservlet1",
    initParams = {
        @WebInitParam(name="time", value="3")
    }
)

Now the WaitServlet receives a default time value of three seconds even when you don’t specify one.

Filtering the Servlet Response

Filtering the request was fairly easy, and we can do something similar with the response object using exactly the same technique. There is a corresponding HttpServletResponseWrapper that we can use to wrap the response before the servlet uses it to communicate back to the client. By wrapping the response, we can intercept methods that the servlet uses to write the response, just as we intercepted the getParameter() method that the servlet used in reading the incoming data. For example, we could override the sendError() method of the HttpServletResponse object and modify it to redirect to a specified page. In this way, we could create a servlet filter that emulates the programmable error page control offered in the web.xml file. But the most interesting technique available to us, and the one we’ll show here, involves actually modifying the data written by the servlet before it reaches the client. In order to do this, we have to pull a double “switcheroo.” We wrap the servlet response to override the getWriter() method and then create our own wrapper for the client’s PrintWriter object supplied by this method, one that buffers the data written and allows us to modify it. This is a useful and powerful technique, but it can be tricky.

Our example, LinkResponseFilter, is an automatic hyperlink-generating filter that reads HTML responses and searches them for patterns supplied as regular expressions. When it matches a pattern, it turns it into an HTML link. The pattern and links are specified in the filter initialization parameters. You could extend this example with access to a database or XML file and add more rules to make it into a useful site-management helper. Here it is:

    import java.io.*;
    import java.util.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class LinkResponseFilter implements Filter
    {
        FilterConfig filterConfig;

        public void init( FilterConfig filterConfig )
            throws ServletException
        {
            this.filterConfig = filterConfig;
        }

        public void doFilter (
            ServletRequest req, ServletResponse res, FilterChain chain )
                throws IOException, ServletException
        {
            WrappedResponse wrappedResponse =
                new WrappedResponse( (HttpServletResponse)res );
            chain.doFilter( req, wrappedResponse );
            wrappedResponse.close();
        }

        public void destroy() { }

        class WrappedResponse extends HttpServletResponseWrapper
        {
            boolean linkText;
            PrintWriter client;

            WrappedResponse( HttpServletResponse res ) {
                super( res );
            }

            public void setContentType( String mime ) {
                super.setContentType( mime );
                if ( mime.startsWith("text/html") )
                    linkText = true;
            }

            public PrintWriter getWriter() throws IOException {
                if ( client == null )
                    if ( linkText )
                        client = new LinkWriter(
                            super.getWriter(), new ByteArrayOutputStream() );
                    else
                        client = super.getWriter();
                return client;
            }

            void close() {
                if ( client != null )
                    client.close();
            }
        }

        class LinkWriter extends PrintWriter
        {
            ByteArrayOutputStream buffer;
            Writer client;

            LinkWriter( Writer client, ByteArrayOutputStream buffer ) {
                super( buffer );
                this.buffer = buffer;
                this.client = client;
            }

            public void close() {
                try {
                    flush();
                    client.write( linkText( buffer.toString() ) );
                    client.close();
                } catch ( IOException e ) {
                    setError();
                }
            }

            String linkText( String text ) {
                Enumeration en = filterConfig.getInitParameterNames();
                while ( en.hasMoreElements() ) {
                    String pattern = (String)en.nextElement();
                    String value = filterConfig.getInitParameter( pattern );
                    text = text.replaceAll(
                        pattern, "<a href="+value+">$0</a>" );
                }
                return text;
            }
        }
    }

That was a bit longer than our previous examples, but the basics are the same. We wrapped the HttpServletResponse object with our own WrappedResponse class using the HttpServletResponseWrapper helper class. Our WrappedResponse overrides two methods: getWriter() and setContentType(). We override setContentType() in order to set a flag that indicates whether the output is of type “text/html” (an HTML document). We don’t want to be performing regular-expression replacements on binary data such as images, for example, should they happen to match our filter. We also override getWriter() to provide our substitute writer, LinkWriter. Our LinkWriter class is a PrintWriter that takes as arguments the client PrintWriter and a ByteArrayOutputStream that serves as a buffer for storing output data before it is written. We are careful to substitute our LinkWriter only if the linkText Boolean set by setContentType() is true. When we do use our LinkWriter, we cache the writer so that any subsequent calls to getWriter() return the same object. Finally, we have added one method to the response object: close(). A normal HttpServletResponse does not have a close() method. We use ours on the return trip to the client to indicate that the LinkWriter should complete its processing and write the actual data to the client. We do this in case the servlet does not explicitly close the output stream before exiting its service methods.

This explains the important parts of our filter-writing example. Let’s wrap up by looking at the LinkWriter, which does the magic in this example. LinkWriter is a PrintWriter that holds references to two other streams: the true client PrintWriter and a ByteArrayOutputStream. The LinkWriter calls its superclass constructor, passing the ByteArrayOutputStream as the target stream, so all of its default functionality (its print() methods) writes to the byte array. Our only real job is to intercept the close() method of the PrintWriter and add our text linking before sending the data. When LinkWriter is closed, it flushes itself to force any data buffered in its superclass out to the ByteArrayOutputStream. It then retrieves the buffered data (with the ByteArrayOutputStream toString() method) and invokes its linkText() method to create the hyperlinks before writing the linked data to the client. The linkText() method simply loops over all the filter initialization parameters, treating them as patterns, and uses the String replaceAll() method to turn them into hyperlinks. (See Chapter 1 for more about replaceAll().)
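The buffering technique can be exercised outside a servlet container using only java.io. The sketch below is a simplified variation on LinkWriter, not a replacement for it: it buffers characters in a StringWriter rather than bytes, which sidesteps character-encoding questions, and it takes a single pattern and URL for brevity. The class name BufferedLinkWriter is ours.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;
import java.util.regex.Matcher;

// Buffers everything printed to it; on close(), rewrites the text and
// sends the result to the real destination.
class BufferedLinkWriter extends PrintWriter {
    private final StringWriter buffer;
    private final Writer client;
    private final String pattern, url;

    BufferedLinkWriter(Writer client, String pattern, String url) {
        this(client, new StringWriter(), pattern, url);
    }

    private BufferedLinkWriter(Writer client, StringWriter buffer,
                               String pattern, String url) {
        super(buffer);            // all print() calls land in the buffer
        this.buffer = buffer;
        this.client = client;
        this.pattern = pattern;
        this.url = url;
    }

    @Override
    public void close() {
        try {
            flush();              // push pending output into the buffer
            // $0 refers to the text matched by the whole pattern;
            // quoteReplacement() protects any '$' or '\' in the URL
            String link = "<a href=\"" + Matcher.quoteReplacement(url)
                + "\">$0</a>";
            client.write(buffer.toString().replaceAll(pattern, link));
            client.close();
        } catch (IOException e) {
            setError();
        }
    }
}
```

Nothing reaches the destination until close() is called, which is exactly why the filter needs its own close() hook on the return trip.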

This example works, but it has limitations. First, we cannot buffer an infinite amount of data. A better implementation would make a decision about when to start writing data to the client, potentially based on the client-specified buffer size of the HttpServletResponse API. Next, our implementation of linkText() could probably be sped up by constructing one large regular expression using alternation. You will undoubtedly find other ways in which it can be improved.
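As a rough illustration of the alternation idea, the sketch below compiles all of the patterns into one regular expression and replaces matches in a single pass. It makes a simplifying assumption that the filter above does not: each key is treated as a literal word (quoted with Pattern.quote()) rather than as a regular expression, so the matched text can be looked up directly in the map. The names LinkRules and link() are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkRules {
    private final Map<String, String> links;   // literal word -> URL
    private final Pattern combined;

    LinkRules(Map<String, String> links) {
        this.links = new LinkedHashMap<>(links);
        StringBuilder regex = new StringBuilder();
        for (String word : this.links.keySet()) {
            if (regex.length() > 0)
                regex.append('|');              // word1|word2|...
            regex.append(Pattern.quote(word));
        }
        combined = Pattern.compile(regex.toString());
    }

    String link(String text) {
        if (links.isEmpty())
            return text;                        // nothing to match
        Matcher m = combined.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String anchor = "<a href=\"" + links.get(m.group()) + "\">"
                + m.group() + "</a>";
            // quoteReplacement() keeps '$' and '\' in the anchor literal
            m.appendReplacement(out, Matcher.quoteReplacement(anchor));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Besides scanning the text once, a single pass avoids matching one rule against link markup inserted by another rule.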

Building WAR Files with Ant

Thus far in this book, we have not become too preoccupied with special tools to help you construct Java applications. Partly, this is because it’s outside the scope of this text, and partly it reflects a small bias of the authors against getting too entangled with particular development environments. There is, however, one universal tool that should be in the arsenal of every Java developer: Apache Ant. Ant is a project builder for Java, a pure Java application that fills the role that make does for C applications. Ant has many advantages over make when building Java code, not the least of which is that it comes with a wealth of special “tasks” (declarative commands) to perform common Java-related operations such as building WAR files. Ant is fast, portable, and easy to install and use. Make it your friend.

We won’t cover the usage of Ant in detail here. You can learn more and download it from its home page. To get you started, we give you a sample build file here. The Ant build file supplied with the examples for this chapter will compile the source and build the completed WAR file for you. You can find it with the example source.

A Development-Oriented Directory Layout

At the beginning of this chapter, we described the layout of a WAR, including the standard files and directories that must appear inside the archive. While this file organization is necessary for deployment inside the archive, it may not be the best way to organize your project during development. Maintaining web.xml and libraries inside a directory named WEB-INF under all of your content may be convenient for running the jar command, but it doesn’t line up well with how those areas are created or maintained from a development perspective. Fortunately, with a simple Ant build file, we can create our WAR from an arbitrary project layout.

Let’s choose a directory structure that is a little more oriented toward project development. For example:

    myapplication
    |
    |-- src
    |-- lib
    |-- docs
    |-- web.xml

We place our source code tree under src, our required library JAR files under lib, and our content under docs. We leave web.xml at the top where it’s easy to tweak parameters, etc.

Here is a simple Ant build.xml file for constructing a WAR from the new directory structure:

    <project name="myapplication" default="compile" basedir=".">

        <property name="war-file" value="${ant.project.name}.war"/>
        <property name="src-dir" value="src" />
        <property name="build-dir" value="classes" />
        <property name="docs-dir" value="docs" />
        <property name="webxml-file" value="web.xml" />
        <property name="lib-dir" value="lib" />

        <target name="compile" depends="">
            <mkdir dir="${build-dir}"/>
            <javac srcdir="${src-dir}" destdir="${build-dir}"/>
        </target>

        <target name="war" depends="compile">
            <war warfile="${war-file}" webxml="${webxml-file}">
                <classes dir="${build-dir}"/>
                <fileset dir="${docs-dir}"/>
                <lib dir="${lib-dir}"/>
            </war>
        </target>

        <target name="clean">
            <delete dir="${build-dir}"/>
            <delete file="${war-file}"/>
        </target>

    </project>

A build.xml file such as this comes with the source code for the examples from this chapter. You can use it to compile your code (the default target) simply by running ant, or you can compile and build the WAR by specifying the war target like this:

    % ant war

Our build.xml file tells Ant to find all the Java files under the src tree that need building and compile them into a “build” directory named classes. Running ant war creates the file myapplication.war, placing all of the docs and the web.xml file in the correct locations. You can clean up everything and remove the generated classes directory and WAR by typing ant clean on the command line.

There is nothing really project-specific in this sample build file except the project name attribute in the first line, which you replace with your application’s name. And we reference that name only to specify the name of the WAR to generate. You can customize the names of any of the files or directories for your own layout by changing the Ant <property> declarations. The learningjava.war file example for this chapter comes with a version of this Ant build.xml file.

Deploying and Redeploying WARs with Ant

With Tomcat, you can download a client-side “deployer” package, which provides Ant targets for deploying, redeploying, starting, stopping, and undeploying a web app on a running Tomcat server. The deployer package utilizes the Tomcat manager. Similar Ant tasks exist for other servers, such as WebLogic. Making these tasks part of your Ant build script can save a great deal of time and effort. The deployer package can be found along with the main Tomcat download.

Implementing Web Services

Now that we’ve covered servlets and web applications in detail, we’d like to return to the topic of web services. In the previous chapter, we introduced the concept of a web service as an extension of the basic HTTP web transaction, using XML content for application-to-application communication instead of consumption by a web browser client. In that chapter, we showed how easy it is to invoke an RPC-style web service, by using client-side classes generated from a WSDL description file. In this section, we’ll show the other side of that equation and demonstrate how to implement and deploy a web service.

The world of web services has evolved quickly, as have the APIs, buzzwords, and hype. The appeal of this style of interapplication communication using simple web protocols has, to some extent, been tarnished by the design-by-committee approach of many standards bodies and competitors adding features and layers to the web services concept. The truth is that web services were originally simple and elegant when compared to more elaborate protocols, largely because they did not support all of the same semantics: state management, callbacks, transactions, authentication, and security. As these features were added, the complexity returned. We will not cover all aspects of web services in detail but instead focus on the basic RPC style that is appealing for a wide variety of simple applications.

In Chapter 14, we walked through generating and running the client side of a web service (the weather service). In this chapter, we’ll build and deploy our own web service, a simple one that echoes parameters back to the client: EchoService. We’ll be using the built-in JAX-WS APIs, tools, and service container to run this example, although you could deploy the service to Tomcat as well with some additional configuration and packaging into a WAR file.

Defining the Service

To build our client-side API in Chapter 14, we began by downloading the WSDL description file for the (existing) weather service. The WSDL, again, is an XML file that describes the functions of the service and the types of arguments and return values they use. From this description, the wsimport command was able to generate the client-side classes that we needed to invoke the service remotely from Java.

In creating our own web service, we have (at least) two choices. We could follow an analogous process, writing a WSDL document describing our service and using it to generate the necessary server-side framework. The wsimport command that we used before can generate the necessary annotated service interface for us, and we could then implement it with our code. However, there is a much easier way: going code-first.

The wsgen command complements wsimport by adding the capability to read annotated Java classes and generate WSDL and related service classes for us. Even better, if we deploy our class using the built-in JAX-WS endpoint publisher, it will take care of generating all of this for us. This means that to test a simple web service, all we really have to do is write a service class that marks the class and service methods with the correct annotations and invoke the publisher. It really couldn’t get much easier.

Our Echo Service

We’ll create a simple service that echoes a few different kinds of values: an int, a String, and one of our own object types (a data holder object), MyObject. In the next section, we’ll examine the data types and how they are handled in more detail. Here is the code:

package learningjava.service;

import javax.jws.*;
import javax.xml.ws.Endpoint;

@WebService
public class Echo
{
    @WebMethod
    public int echoInt( int value ) { return value; }

    @WebMethod
    public String echoString( String value ) { return value; }

    @WebMethod
    public MyObject echoMyObject( MyObject value ) { return value; }

    public static void main( String[] args )
    {
        Endpoint endpoint = Endpoint.publish( "http://localhost:8080/echo", 
            new Echo() );
    }
}

public class MyObject 
{
    int intValue;
    String stringValue;

    public MyObject() { }

    public MyObject( int i, String s ) {
        this.intValue = i;
        this.stringValue = s;
    }

    public int getIntValue() { return intValue; }
    public void setIntValue( int intValue ) { this.intValue = intValue; }

    public String getStringValue() { 
        return stringValue; 
    }
    public void setStringValue( String stringValue ) { 
        this.stringValue = stringValue;
    }
}

We’ve named our “echo” methods individually to differentiate them because WSDL doesn’t really handle overloaded methods. (If we’d had a name collision, JAX-WS would give us a runtime warning and choose one for us.) We’ve placed these into a learningjava.service package because it will be easier to work with the tools that way. This package name will be used in the default namespace and package name for generated client code. We could override the default using the targetNamespace attribute of the WebService annotation (and it would probably be wise to do so in order to keep your interface stable).

To deploy our web service, we use the JAX-WS Endpoint class publish() method. This method takes a URI string that indicates the desired host, port, and service path as well as an instance of our class. Obviously, the only host that will work in this arrangement is our local computer, which can normally be accessed by the name: “localhost.” Here, we ran the service on port 8080 under the path “/echo”.

Using the Service

After running the service, drive your web browser to the service URL to get a test page. If you are running the server on the same machine, the URL should be the same as the URI you passed to the publish() method. However, under some circumstances you may have to substitute “127.0.0.1” for “localhost.”

http://localhost:8080/echo
http://127.0.0.1:8080/echo

You should see a description of the service similar to the one shown in Figure 15-1. This tells you that the service is active and gives you its configuration information. You can click on the WSDL link to view the WSDL description file that was generated for our service. The WSDL URL should be your base service URL with “?wsdl” appended.

Figure 15-1. Web services description

We can use the WSDL to generate a client and test our service, just as we did in Chapter 14. In the following command, we’ve specified that the generated classes should go into a separate package, learningjava.client.impl, to avoid confusion between the generated classes and our original. We’ve also used the -keep option to retain the source code instead of just the compiled class files (you may want to look at them). The final argument is the URL for our generated WSDL, which you can copy from the test page as shown previously.

    % wsimport -p learningjava.client.impl -keep http://localhost:8080/echo?wsdl

Next, we’ll create a small client that uses these generated classes to test the service:

package learningjava.client;

import learningjava.client.impl.*;

public class EchoClient
{
    public static void main( String [] args ) throws java.rmi.RemoteException
    {
        Echo service = new EchoService().getEchoPort();
        int i = service.echoInt( 42 );
        System.out.println( i );
        String s = service.echoString( "Hello!" );
        System.out.println( s );
        MyObject myObject = new MyObject();
        myObject.setIntValue( 42 );
        myObject.setStringValue( "Foo!" );
        MyObject myObj = service.echoMyObject( myObject );
        System.out.println( myObj.getStringValue() );
    }
}

As you can infer from our code, wsimport has generated an EchoService class that represents our service. Service classes may contain multiple service groups, so in order to get our Echo interface, we ask for the Echo “port” with getEchoPort(). (Port is WSDL terminology for a service interface.)

Run the client, and it should bounce the values between the client and server and display them. And there we are! As we said in the introduction, the actual code required to implement and invoke our service is quite minimal and the fact that Java now bundles a simple web service container with the standard edition makes Java an ideal platform for working with web services.

Data Types

As you might guess, because the data for our service has to be expressed as XML in a standard way, there are some limitations to the type of objects that can be transferred. JAX-WS and WSDL support most of the common Java data types and many standard classes directly. Actually, it would be more appropriate to say that JAXB—the Java XML binding API—supports these Java types, as JAX-WS uses JAXB for this aspect. We’ll talk more about Java XML data binding and XML Schemas in Chapter 24.

JAX-WS and JAXB can also decompose JavaBeans-compliant data classes composed of these standard types so that you can use your own classes, as we saw with the MyObject argument in our Echo service.

Standard types

Table 15-1 summarizes the directly supported types (those types that map directly to W3C Schema types). See Chapter 24 for more on XML mapping of Java types.

Table 15-1. Standard types

Category                       Types
Primitives and their wrappers  boolean, Boolean, byte, Byte, short, Short, float, Float, int, Integer, long, Long, double, Double
Class types                    java.lang.String, java.math.BigDecimal, java.math.BigInteger, java.util.Calendar, java.util.Date, java.util.UUID, java.net.URI, java.awt.Image (as byte [])
Collections                    Array types, List types, Set types

Maps and other complex collection types are not currently supported. To maintain the widest compatibility for cross-platform web services, it’s best to stick with objects composed of simple data types and arrays or lists of those types.
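One way to follow this advice when your data is naturally a map is to flatten it into a list of simple key/value beans before handing it to the service. The following is only a sketch of that idea: the MapAsListSketch and KeyValue names are hypothetical, and the code uses nothing but core Java classes, so it does not show how JAX-WS itself would marshal the list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapAsListSketch {

    // Hypothetical bean holding one map entry, built only from simple types.
    public static class KeyValue {
        private String key;
        private int value;

        public KeyValue() { }

        public String getKey() { return key; }
        public void setKey( String key ) { this.key = key; }
        public int getValue() { return value; }
        public void setValue( int value ) { this.value = value; }
    }

    // Flatten a map into a list of simple beans, preserving iteration order.
    // Assumes the map contains no null values (they would fail to unbox).
    public static List<KeyValue> toEntries( Map<String,Integer> map ) {
        List<KeyValue> entries = new ArrayList<>();
        for ( Map.Entry<String,Integer> e : map.entrySet() ) {
            KeyValue kv = new KeyValue();
            kv.setKey( e.getKey() );
            kv.setValue( e.getValue() );
            entries.add( kv );
        }
        return entries;
    }

    // Rebuild the map on the receiving side.
    public static Map<String,Integer> toMap( List<KeyValue> entries ) {
        Map<String,Integer> map = new LinkedHashMap<>();
        for ( KeyValue kv : entries ) {
            map.put( kv.getKey(), kv.getValue() );
        }
        return map;
    }

    public static void main( String [] args ) {
        Map<String,Integer> scores = new LinkedHashMap<>();
        scores.put( "alice", 3 );
        scores.put( "bob", 5 );
        List<KeyValue> wireForm = toEntries( scores );
        System.out.println( toMap( wireForm ).equals( scores ) );
    }
}
```

A service method would then declare List<KeyValue> rather than Map<String,Integer> as its argument or return type, leaving the conversion to the caller and the implementation.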

Value data objects

As we said, JAX-WS can also work with our own object types, although there are several requirements and a caveat to mention. First, to be able to be marshaled, our objects must contain only fields that are supported data types (or further compositions of those). Next, our classes must follow two JavaBeans design patterns: each must have a public, no-args constructor and, if it contains any nonpublic fields, those fields must have “getter” and “setter” accessor methods. Chapter 22 provides more details about these issues.
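To make these requirements concrete, here is a minimal sketch of a data class shaped like the MyObject used by our Echo client. The property names are inferred from the setIntValue() and setStringValue() calls in the client code; the enclosing ValueObjectSketch class and the hasPublicNoArgsConstructor() helper are hypothetical additions for illustration and are not part of JAX-WS or JAXB.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

public class ValueObjectSketch {

    // Only supported field types, a public no-args constructor, and a
    // getter/setter pair for each nonpublic field.
    public static class MyObject {
        private int intValue;
        private String stringValue;

        public MyObject() { }

        public int getIntValue() { return intValue; }
        public void setIntValue( int intValue ) { this.intValue = intValue; }

        public String getStringValue() { return stringValue; }
        public void setStringValue( String stringValue ) {
            this.stringValue = stringValue;
        }
    }

    // Hypothetical helper: reports whether a class declares a public
    // no-args constructor, the first of the two patterns described above.
    public static boolean hasPublicNoArgsConstructor( Class<?> type ) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            return Modifier.isPublic( ctor.getModifiers() );
        } catch ( NoSuchMethodException e ) {
            return false;
        }
    }

    public static void main( String [] args ) {
        MyObject myObject = new MyObject();
        myObject.setIntValue( 42 );
        myObject.setStringValue( "Foo!" );
        System.out.println( hasPublicNoArgsConstructor( MyObject.class ) );
        System.out.println( myObject.getStringValue() );
    }
}
```

A class such as java.lang.Integer, which offers no no-args constructor, fails the same check; it is supported only because it appears among the standard types, not because it qualifies as a value data object.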

Finally, unlike Java RMI, web services do not support the “behavior” or the real identity of our domain objects from end to end. When a Java client uses our WSDL document to generate implementation classes, it gets simple data-holder replicas of the classes we specify. These “value objects” pass along all of the data content of our objects, but are not related to the originals in any other way. Our server-side implementation will, of course, receive the data in the form of our own “real” domain objects. That is why our classes need accessible constructors: the server-side framework must be able to create and populate the objects for us to consume.
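The difference between a domain object and its value-object replica can be sketched in plain Java. Everything in this example is hypothetical: Account stands in for a server-side domain class, AccountValue stands in for the kind of data-holder class a client-side tool would generate, and toValue() merely simulates the trip across the wire by copying property values. No web service classes are involved.

```java
public class ValueReplicaSketch {

    // Server-side domain class: data plus behavior.
    public static class Account {
        private String owner;
        private long balance;

        public Account() { }

        public String getOwner() { return owner; }
        public void setOwner( String owner ) { this.owner = owner; }
        public long getBalance() { return balance; }
        public void setBalance( long balance ) { this.balance = balance; }

        // Behavior: an XML data description carries no methods,
        // so this does not travel to the client.
        public void deposit( long amount ) { balance += amount; }
    }

    // Stand-in for a generated client-side class: properties only.
    public static class AccountValue {
        private String owner;
        private long balance;

        public AccountValue() { }

        public String getOwner() { return owner; }
        public void setOwner( String owner ) { this.owner = owner; }
        public long getBalance() { return balance; }
        public void setBalance( long balance ) { this.balance = balance; }
    }

    // Simulates marshaling: only the property values are copied across.
    public static AccountValue toValue( Account account ) {
        AccountValue value = new AccountValue();
        value.setOwner( account.getOwner() );
        value.setBalance( account.getBalance() );
        return value;
    }

    public static void main( String [] args ) {
        Account account = new Account();
        account.setOwner( "Pat" );
        account.deposit( 100 );

        AccountValue replica = toValue( account );
        System.out.println( replica.getOwner() + " " + replica.getBalance() );

        // Later changes to the original are not seen by the replica.
        account.deposit( 50 );
        System.out.println( replica.getBalance() == account.getBalance() );
    }
}
```

The replica has the same data at the moment of the copy, but it has no deposit() method and no ongoing connection to the original object.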

Conclusion

This chapter covers Java web applications and Java web services, one of the fastest-changing topics in this book. It is a big topic, and we could only really address it here in the context of the Java APIs that support it. We recommend that you supplement what you have learned here with additional reading, especially on the techniques for building applications using HTML5 and JavaScript that communicate with Java using servlets or web services.