Improving Perceived Performance on Android

Android is one of the strictest platform I can think of when it comes to threading. Everything seems to have a rule around whether or not it is allowed on the main thread. All user interface code (code that touches the widgets on the screen) must be run on the main thread (since it also serves as the event loop) in order to avoid concurrency issues; but at the same time networking cannot happen on the main thread. For a good reason too: networking is always slow relative to a single event (just think about NodeJS or Vert.x and the fact that all networking is offloaded away from the event loops).

Most Android applications take things one step further and try to do any local database work (typically with SQLite) on a worker thread. The most common Android method of doing this is with AsyncTask (and friends such as AsyncTaskLoader). However: these systems are often heavy-weight to implement, requiring lots of boiler-plate code. The problem there is: many things that should be on background threads are left on the main thread because it’s just too much effort to write another AsyncTask. This is not a good situation.

After a few years of developing Android, and writing a book on the subject I found myself reusing a simple pattern that at first I called a BackgroundForegroundTask. At first a massive simplification of AsyncTask (which is totally over-engineered for most purposes). Instances of BackgroundForegroundTask would just:

  1. Call onBackground on a worker thread in the background
  2. Call onForeground on the main thread with the result of the onBackground method
  3. Call onError on the main thread instead of onForeground if onBackground threw an Exception

I used this class for almost every event handler. The difference to applications was massive. Everything became silky smooth, because the main thread is almost entirely left alone to process input like that user scrolling, tapping, typing, etc.

Just forward to the present day, and I’ve modified the pattern to allow these BackgroundForegroundTask (now called ActionCommand) objects to be chained together. This enables me to write commands that only do one thing, and with no state of their own. They can be chained together, the chains can be kept and reused over and over to perform defined sets of actions:

  1. update a local user object
  2. send it to the server
  3. update the user interface

Three jobs, three ActionCommand classes:

private final ActionCommand.Chain<User, User> saveUser =
        new UpdateUserCommand(database)
            .then(new UpdateUser(serverConnector))
            .then(onForeground(user -> {

In the third case above I’ve added a lambda that will be wrapped in an ActionCommand object. You can also run commands by only composing such lambdas together:

onBackground((name) -> "Hello <b>" + name + "</b>")
    .then(onBackground((msg) -> Html.fromHtml(msg)))
    .then(onForeground((msg) -> {
        Toast.makeText(this, msg, Toast.LENGTH_LONG).show();

Will offload the concatenation and parsing of the HTML onto a background thread, only returning to the foreground thread to create and display the Toast object. The actual execution pattern for the above code is a bit special, it actually runs like this:

  1. Run Html.fromHtml("Hello " + name + "") on background thread
  2. Run Toast.makeText(this, msg, Toast.LENGTH_LONG).show() on foreground thread

The Chain class automatically optimises the number of tasks and thread-hops when then is called with an ActionCommand returned by onBackground or onForeground. Check the code out on GitHub if you’re interested.

A CompactHashSet Implementation

There are two commonly used Set implementations in Java by default: HashSet and CopyOnWriteArraySet. The main attributes or each implementation are:


  • Not Thread-Safe
  • Seemingly random ordering of items
  • Very very fast
  • Sparse by default (much more space used than the number of elements)
  • Implemented on top of HashMap


  • Thread-Safe
  • Items are ordered according to their insertion order
  • Relatively fast for small data-sets
  • Compact (it’s size is exactly the number of elements)
  • Implemented on top of CopyOnWriteArrayList

What I wanted was an implementation with a mixture of these attributes. Basically a CopyOnWriteArraySet with the same Thread-Safety and size considerations, but that is nice and fast for medium and large data-sets. What I came up with I call the CompactHashMap. It’s a relatively simple implementation that stores it’s elements ordered by their hash-codes. This allows the implementation to binary-search the array and jump to where the element is likely to be before doing a normal Object.equals() search.

The implementation is below:

Read the rest of this entry »

Know you contention ratios.

After blogging about how Micro-Benchmarks are often misleading and focused on proving a point, I went ahead and wrote a little micro-benchmark to curb my curiosity. What I wanted to know is this:

In the face of massive contention, how do the following compare to each other:

  • Synchronized Block
  • Unfair ReentrantLock
  • Fair ReentrantLock
  • An Atomic Object

To do this test I wrote 3 different implementations that all did that same thing:

  1. Increment a long (or AtomicLong)
  2. Thread.yield()
  3. return

The results are not at all surprising, an atomic structure comes out much faster than the others, while Fair Locks are by far the slowest. Now I already said that Micro Benchmarks are misleading, so why write this? The fact is that most systems have very low contention ratios. The interesting thing about a contention is that: the more threads trying to access a resource, the worse the bottle-neck becomes. A locked structure must always enter and exit the lock, even when there is no contention. An atomic structure will never have to enter a lock, but may have to spin a few times. If you are concerned about atomic starvation (spinning forever), you can put a limit on how many times your code is willing to spin, and then enter a lock (this is in fact how a Java monitor works).

In short: make sure you really know your contention ratio before writing loads of complex code. You may find that in a continuously high load system, an unfair lock (or synchronized block) is the best way to go.

What does volatile actually do in Java?

Volatile to many that code in Java is a mystery keyword. Most keywords are really simple to understand, but volatile to many is defined by a lot of mathematical mumbo-jumbo. Volatile is a very very simple structure, if explained right:

Writes happen-before reads.

Still a little confused? There are three basic problems in the Java (or any other) multi-threaded environment:

  1. Each CPU (real or virtual) may have it’s own small cache of memory data
  2. Any compiler along the way (including Javac, or one of the Hotspot compilers) may change the order of instructions to improve performance
  3. You have no real idea of which instructions in another thread will run before or after those in any other thread

As a result of these problems your update to a field may not be seen by another thread for some arbitrary amount of time, in fact it may exit before seeing your changes. Enter the volatile modifier. It’s job is to guarantee “happens-before” behavior. This means three things:

  1. Write operations are executed before read operations.
  2. Compilers are not allowed to re-order accesses to the field.
  3. Caches must be flushed before reading the field

The combination of these guarantees means essentially that a “read” of the field will always give the “latest” state of the field. When a new value is written, any subsequent reads (including those scheduled to “happen at the same time”) will see the new value.

GWT RPC is (?:called) Aynchronous for a reason

GWT RPC is totally asynchronous. You have no option to implement a synchronous call to the server. Many people find passing a callback to every remote method in order to receive the return value frustrating, and the fact that it returns immediately decidedly strange.

For those new to GWT, here’s a description of GWT RPC in pure Java terms. Read the code carefully and things will make a lot more sense.

First the definition of the service. Think of this like an RMI service interface.

public interface MyService extends RemoteService {
 public String getText();

This is the interface the GWT client (Javascript) side of things will be using. The reason GWT makes you use this interface in because on the client side we need to call the method, and then receive the response some-time in the future. You can think of an AsyncCallback as an EventListener, when the server sends the response back, you get an event containing the success or failure data.

public interface MyServiceAsync {
 public void getText(AsyncCallback<String> callback);

Now our Servlet is the implementation of the MyServer interface. You can think of this like the implementation of an RMI service or an EJB. The reason you extend RemoveServiceServlet is two-fold: (1) You need an HTTP path that the client can send data to. (2) Rather than forcing you to decode GWT’s flavour of Serialization and invoke the methods by hand, RemoteServiceServlet does it all for you (so all you do is implement the actual methods).

An important note here. This code runs on the server, under a real Java VM. It’s not compiled by GWT, it’s not even looked at in fact. You can use any classes here (surprisingly, this is something that catches a lot of people out).

public class MyServiceImpl extends RemoveServiceServlet implements MyService {
 public String getText() {
  return "Hello World";

Now for our implementation on the client side. This is not how you would code this method call in GWT, this is a normal Java representation of what happens.

public void onModuleLoad() {
 // This is a purely local representation of what
 // GWT.create(MyService.class) would do for you
 MyServiceAsync async = new MyServiceAsync() {
  // Our pretend implementation. In real GWT,
  // this object would be on the other side of the network
  MyServiceImpl impl = new MyServiceImpl();

  public void getText(final AsyncCallback<String> callback) {
   // When this method gets called, we spawn a
   // Thread to make the call to the server.
   // In JavaScript the call is often put in a queue,
   // by the browser and executed in a pool.
   // However, whichever way things happen the
   // method call returns immediately and does
   // not wait for the server to respond.

   Thread runner = new Thread() {
    public void run() {
     try {
      // Once we have the content, pass it
      // to the AsynCallback we were given.

     } catch(Exception error) {
      // If an Exception occurs (unlikely in our
      // little example here), we pass it to the
      // AsyncCallback to deal with.


   // Start our Thread and return.

 final Label label = new Label("Foo");
 asyn.getText(new AsyncCallback<String>() {
  public void onSuccess(String message) {

  public void onFailure(Throwable error) {


So you can see from the example above that “Bar” may appear on the label, but it’s not likely. Far more likely is “Hello World” coming from our “server”.

There are a good reasons why GWT only allows for this sort of call.

  1. JavaScript has no threading model in place. It’s impossible to Object.wait() for something to Object.notify() you, which would be exactly how you would implement this sort of invocation in normal Java (if only under the hood)
  2. Anyone who has used Swing extensively will know that doing lots of work in the event-dispatch-thread is a disaster. It stops repaints from happening, the application is basically unusable until you’re “event” in complete.In order to get around race-conditions and such multi-threading problems, JavaScript is all executed from within the browser event-queue. So if we sent a request to the server synchronously, the user wouldn’t even be able to open the file menu until the server gave us a response. “But I’m in a LAN” I’ve heard some say. GWT’s RPC mechanism is built for general consumption. Lazy Developers + Synchronous Calls + Open Internet is a recipe for disaster (and a lot of complaints on the mailing-lists), and the GWT devs know it.

Asynchronous RPC with callbacks can be considered a small price to pay for an amazing amount of power. Personally I see it as even more power, as is breaks your code into smaller modules. I often have a single AsyncCallback class handling many different invocations from the server. Using this technique helps make your code smaller to deploy, and easier to maintain.

ReadWriteLock’s on Collections

Many people don’t realize that Collections.synchronized* causes an exclusive monitor lock to be held in every method called on the specified Collection object. For example:

final List list = Collections.synchronizedList(new ArrayList());

Thread th1 = new Thread(new Runnable() {
public void run() {
for(int i = 0; i < Integer.MAX_VALUE; i++) { list.add(Integer.valueOf(i)); } } }); Runnable reader = new Runnable() { public void run() { for(int i = 0; i < Integer.MAX_VALUE; i++) { list.get(i); } } }; Thread th2 = new Thread(reader); Thread th3 = new Thread(reader); Thread th4 = new Thread(reader); th1.start(); th2.start(); th3.start(); th4.start();[/sourcecode]I am well aware that the code for reading is very bad, but it serves the point. Four Threads, one writing to a synchronized Collection, the others reading from it. This as such is not a problem: writing to a Collection should only ever be done by one thread at a time, while no other threads are reading. However, any number of threads can read from the Collection at the same time. Collections.synchronized* acquires an exclusive lock for both read and write methods, so: only one thread will be allowed to access the Collection at a time.

The solution: Use a ReentrantReadWriteLock, which allows any number of Read locks at a time, but only one Write lock (and no Read locks while a Write lock is held).

There is an even more efficient way to make a Collection thread safe: use an inherently thread-safe collection. Just take a look in java.util.concurrent, and you’ll see that Java comes packed with plenty of them. You will need to decide which method is preferred for what you are doing. Take into account:

  • How many reads vs. how many writes
  • How large is the collection
  • How fast does the code need to execute
  • How often will the collection actually be accessed