EoD SQL Applied – Part 3 / 5 (JSP / Servlets)

Static Data Display

In a Swing application with a direct connection to a database, it makes sense to leave the database to do much of the heavy lifting. The Swing client can keep a minimum amount of data in memory, while holding a scrollable Result Set open on the database (fetching more of the data as required). In a web application on the other hand, things are a little more complex. For one thing, your database isn’t exposed to the any part of the network; for another: your clients have no ability to keep Result Sets open; and finally: you generally are working with much higher loads than a Swing application.

EoD SQL has many different approaches to loading data from the database. We’ve already looked at DataSet objects, which remain connected to the database and work well with scrollable, cursor backed, Result Sets. DataSets are (in a way) a leftover from the origional EoD API from Java6-beta (where they were the only way to fetch objects from the database). In this post we’ll take a look at an EoD SQL idea: DataIterators.

Data Iterator Introduction

A DataIterator as the name implies as an Iterator which is connected directly to the database (like a DataSet). Here are the most important features of a DataIterator:

  • DataIterators are forward-only (no going backwards)
  • They are always connected to the database (no notion of a “disconnected” DataIterator)
  • They are not backed by any sort of cache (ie: DataSetCache)
    • There is no need for one, since you can’t go backwards
  • They also implement Iterable (their “iterator()” method returns “this”) for convenience
    • So they work easily with the Java-5 enhanced for loop
  • They close themselves when their last object has been read (ie: “hasNext()” returns false)
    • If you do not work through all of the DataIterator: you will need to invoke it’s “close” method by hand
  • You cannot modify a DataIterator (remove() will throw an UnsupportedOperationException)
    • The following is an error (InvalidQueryException)
      @Select(sql="SELECT * FROM users",readonly=false)
      DataIterator<User> selectUsers();

DataIterators are a great fit whenever you only need to work through the ResultSet once and in a single direction.  The database can stream the ResultSet to the JDBC driver, since there is no need for a scrollable cursor. This forward-only, streaming nature is a very good fit to web-applications. Web applications work with a user that is on the other side of an HTTP connection. So scrollable DataSets that hold onto their database Connections are of limited use, since a web application generally holds onto a database Connection for as short-a-time as possible (while DataSets are best used with long lived Connections).

Rubberstamping

Rubberstamping is the key feature we will be looking at in this post.  Rubberstamping is only available to a DataIterator, and is setup in the @Select annotation:

@Select(sql="SELECT * FROM users",rubberstamp=true)
DataIterator<User> selectUsers();

So what is rubberstamping? Take a look at the following code and you’ll see.

DataIterator<User> users = userQuery.selectUsers();
User user1 = users.next();
User user2 = users.next();

if(user1 == user2) {
  // this is the big deal, this will always be true
}

Rubberstamping only ever creates one “row” object, in this case a single User object is created. Each time you invoke next(), instead of creating a new User object (and so forcing the GC to clean the old one up): it pushes the new values into the exsting User object and returns it. This saves memory and time (both allocation time and GC time later on). You could argue that the eden-space GC will take care of these short lived objects, but in a web environment the rubber-stamped row object will also be handles by the eden-gc and so the eden GC will trigger less often (since there will be fewer objects).

Rubberstamping and Web Applications

So why is rubber-stamping so nice for web applications in particular. A web-application is connected (fairly directly) to a socket that the browser sits on the other end of. Objects being processed for display as HTML are almost never needed (within the same request) after the HTML has been written. In this case: rubber-stamping is a perfect fit, since the object is re-used instead of being discarded.

I find the best way to work with a DataIterator in a web-application is with the JSTL “forEach” tag which supports DataIterator as a Iterator instance. Generally you should try and avoid using the begin and end attributes of the tag, opting instead for a “LIMIT” (or similar) clause in your SQL statement. Because the for each tag uses the DataIterator as a plain Iterator object it fully supports the rubber-stamping ability, and because you are outputting directly to a JspWriter you almost certainly won’t need to keep references to objects already outputted.

Comparison to the JSP SQL tags

So why not just use the JSP SQL tags then? There are several reasons to rather make use of the DataIterator class.

  • The JSP remains agnostic to the type being passed in, any object supported by the “forEach” tag can be used, if a DataIterator becomes unsuitable in the future you can swap it without modification to the JSP.
  • The SQL code remains in an EoD SQL BaseQuery interface which is (or should be) executed by normal Java code and passed into the JSP using a request attribute (or similar mechanism).
  • The SQL query (and thus the DataIterator) is centralized, and so usable by the Java code that supports the JSP. So you avoid code duplication (and wondering why the JSP is still showing the wrong data).
  • You’re working with your objects and not just raw SQL data.

Caching Data from the Database

A major concern now is database performance. With a DataIterator you will be accessing the database every time you want to read some data. Many web-application make use of a caching system (such as memcache, terracotta, eh-cache or similar), how can you have a DataIterator interact with the cache? As I already said in the previous section: the “forEach” tag doesn’t care if the input type changes between 2 invocations.

I generally put my JSP pages “behind” a Servlet which invokes the JSP using a RequestDispatcher.forward call. Here’s a simple example of a Servlet that invokes a JSP, and plays nicely with an imaginary “cache” object.

public class TopicListServlet extends HttpServlet {
  private final Cache<Long, Collection<Topic>> cache = ...;
  private final TopicQuery query = QueryTool.getQuery(TopicQuery.class);

  private Iterator<Topic> getTopicsFromCache(final Long id) {
    final Collection<Topic> topics = cache.get(id);
    return topics != null
      ? topics.iterator()
      : null;
  }

  protected void doGet(
      final HttpServletRequest request,
      final HttpServletResponse response)
      throws ServletException,
      IOException {

    final Long id = Long.parseLong(request.getParameter("id"));
    Iterator<Topic> topics = getTopicsFromCache(id);

    if(topics == null) {
      topics = new CachingIterator<Long, Topic>(id, cache, query.selectTopics(id));
    }

    request.setAttribute("topics", topics);
    request.getRequestDispatcher("/topics.jsp").forward(request, response);
  }

  private static class CachingIterator<K, V> implements Iterator<V> {
    private final Iterator<V> delegate;
    private final Cache<K, Collection<V>> cache;
    private final K key;

    private final Collection<V> values = new ArrayList<V>();

    public CachingIterator(
        final K key,
        final Cache<K, Collection<V>> cache,
        final Iterator<V> iterator) {

      // ...
    }

    public boolean hasNext() {
      if(!delegate.hasNext()) {
        cache.add(key, values);
        return false;
      }

      return true;
    }

     public V next() {
       final V value = delegate.next();
       values.add(value);

       return value;
     }

     public void remove() {
       throw new UnsupportedOperationException();
     }
  }

  public static interface TopicQuery extends BaseQuery {
    @Select("SELECT * FROM topics WHERE parent_id = ?1 ORDER BY post_date DESC")
    DataIterator<Topic> selectTopics(long parentId);
  }
}

As you can see in the above code, the need for caching has a potential side-effect: you loose the ability to rubber-stamp the objects. However this is not the end of the world, since rubber-stamping is particularly useful with potentially very large data-sets (where caching is often a bad idea). Another, often easier option than was used here (although slower for your request) is to have EoD SQL return a Collection<Topic> or List<Topic> instead of a DataIterator. This mechanism however is much slower, since EoD SQL will read in the entire ResultSet  before returning from the select query. We’ll explore using Collections as return types in the next two posts.

Conclusion

We’ve played with rubber-stamping, perfect for those days when you need lots of data with minimal memory usage. We’ve also looked at how to get the objects from a DataIterator into a cache for later use. DataIterators are a generally good mix with web applications, since you have a socket open, so you really want to pre-buffer as little data as you possibly can. In the next post we’ll take a look at using EoD SQL with JavaScript applications. While they are web-applications as well, they have different communication requirements to a typical web-application. So we’ll look at applying much the same techniques to AJAX applications.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: