Is Nashorn (JVM) faster than Node (V8)?

The answer to the question “Is Nashorn (JVM) faster than Node (V8)?” is in most people’s minds: a foregone conclusion, looking something like this expression.

no-way-9yykt8

It certainly was in my mind, until I actually ran a very simple benchmark that computes the Fibonacci sequence a few times (recursively). It’s a common enough benchmark, one frequently used to test method dispatching, recursion and maths in various languages / virtual machines. For disclosure, the code is listed below:

function fib(n) {
  if (n < 2)
    return 1;
  return fib(n - 2) + fib(n - 1);
}

try {
  print = console.log;
} catch(e) {
}

for(var n = 30; n <= 50; n++) {
  var startTime = Date.now();
  var returned = fib(n);
  var endTime = Date.now();
  print(n + " = " + returned + " in " + (endTime - startTime));
}

The results are, lets just say: “shocking”. The table below is both runs along side each other, the last numbers are the important ones: how many milliseconds each run took.

jason@Bender ~ $ node fib.js 
30 = 1346269 in 12
31 = 2178309 in 17
32 = 3524578 in 27
33 = 5702887 in 44
34 = 9227465 in 69
35 = 14930352 in 113
36 = 24157817 in 181
37 = 39088169 in 294
38 = 63245986 in 474
39 = 102334155 in 766
40 = 165580141 in 1229
41 = 267914296 in 2009
42 = 433494437 in 3241
43 = 701408733 in 5302
44 = 1134903170 in 8671
45 = 1836311903 in 13626
46 = 2971215073 in 22066
47 = 4807526976 in 45589
48 = 7778742049 in 74346
49 = 12586269025 in 120254
50 = 20365011074 in 199417
jason@Bender ~ $ jjs -ot fib.js 
30 = 1346269 in 70
31 = 2178309 in 16
32 = 3524578 in 19
33 = 5702887 in 30
34 = 9227465 in 48
35 = 14930352 in 76
36 = 24157817 in 123
37 = 39088169 in 197
38 = 63245986 in 318
39 = 102334155 in 517
40 = 165580141 in 835
41 = 267914296 in 1351
42 = 433494437 in 2185
43 = 701408733 in 3549
44 = 1134903170 in 5718
45 = 1836311903 in 9306
46 = 2971215073 in 15031
47 = 4807526976 in 34294
48 = 7778742049 in 38446
49 = 12586269025 in 61751
50 = 20365011074 in 100343

So what on earth is going on here? How can it be that the last result is 99 seconds faster on the Java VM than on V8, one of the fastest JavaScript engines on the planet?

The answer is actually hidden at the top of the tables: the -ot option being passed to JJS (Java JavaScript) is the key to this magic sauce. OT stands for Optimistic Typing, which aggressively assumes that variables are all compatible with the Java int type, and then falls back to more permissive types (like Object) when runtime errors happen. Because the code above only ever produces int values, this is a very good assumption to make and allows the Java VM to get on and optimise the code like crazy, where V8 continues to chug along with JavaScript’s Number type (actually itย is more intelligent than that, but it’s just not showing in this little benchmark).

The point of this post is: don’t believe everything you read in benchmarks. They are very dependant on the code being run, and micro-benchmarks are especially dangerous things. The work that has been done on Optimistic Typing in Nashorn is amazing, and makes it a very attractive JavaScript environment. But you shouldn’t believe that just because these benchmarks show such a wide gap in performance that your Express application would run faster in Nashorn than under V8… In fact you’d be lucky to get it to run at all.

Feel free to replicate this experiment on your machine. It’s amazing to watch, it surprises me every time I re-run it.

Advertisements

Writing Express Middleware to Modify the Response

I recently had a need for some Express middleware that would track any change to a variable, and send the details of any changes in the response, but only if the response is JSON data. This turned out to be rather more interesting delve into ExpressJS than I expected, and I thought the details of it were worth a post detailing how to do it, and my findings.

The first thing you notice when writing Express middleware functions is that they take three arguments (I’m skipping error-handling middleware deliberately), the request, the response and a next() function that triggers the next step in the processing chain. The next function doesn’t typically take any arguments, instead you’re expected to modify the request and response objects to expose or capture any changes you want.

This post will show you how to capture any changes to a variable (we’re going to pretend all our users are playing a game and we have a leader-board), we want to track any changes that happen during the request and report both the changes and the current state of the variable in every response. I’m going to assume some other middleware has injected a user property onto the request. First off, this is what the code will look like:

const leaderboard = require('./leaderboard');
function leaderboardTracking(req, resp, next) {
  const user = req.user;
  const startingPosition = leaderboard.getUserPosition(user);
  const json_ = resp.json; // capture the default resp.json implementation

  resp.json = function(object) {
    const endPosition = leaderboard.getUserPosition(user);
    object['leaderboard_info'] = {
      'delta':    endPosition - startPosition,
      'position': endPosition,
      'score':    user.score
    };

    json_.call(resp, object);
  };

  next();
}

So in this example we swap out the json function with our own delegate implementation, but you’ll notice we leave the send function alone. The send function will delegate to the json function if it’s given a plain JavaScript object, and json in-turn uses send after stringifying the object.

You’ll also notice that the code doesn’t just invoke the captured json_ function, it uses the call function and specifies the response as this. The json function in Express expects to be invoked within the context of the response object, so we need to keep it that way.

That’s it really, it’s not a terribly complicated pattern, but it can be extremely powerful because it bounds the request / response cycle end-to-end.

EoD SQL Applied โ€“ Part 5 / 5 (GWT Applications)

The Dreaded “Not Serializable” Problem

GWT turns all of your Java code into JavaScript, but it also obfuscates everything. With this in mind, it makes serialization a bit more complex than usual. GWT generates a serializer and deserializer for each class that could be transported across the wire to an RPC service. The difficulty comes in knowing which types had code generated, and which didn’t. GWT solves this problem for itself with the Serialization Policy file, where the compiler lists all of the classes that the client code will know how to deserialize.

This however leads to another problem: what happens when something unexpected gets in the way. Hibernate is the most used, and thus most complained about when it comes to Serialization problems. Hibernate turns any Collection object into a special “lazy” implementation that will only load your data when you ask for it. All very fine and well for a server bound application, but a GWT application needs all that data up front for serialization. When GWT comes to sending that List<Message> to the client, it chokes. It knows how to serialize an ArrayList, and LinkedList (since you worked with them on the client side), but a Hibernate List or PersistentCollection is a totally unknown type, so it’s not in the Serialization Policy, so the server side throws an Exception at you.

So how does EoD SQL help with these problems? Read on to find out! ๐Ÿ˜‰ Read the rest of this entry »

EoD SQL Applied – Part 4 / 5 (JavaScript)

JavaScript vs. Web Applications

So far in this series we’ve discussed using DataSets for Swing applications and DataIterators for web-applications. Why would I now bring in JavaScript as something outside of “web-application”? JavaScript applications have very different requirements to a normal web-application. Where a normal web-application has little ability to do things like preload data (like the next page), a JavaScript application may (for example) download the entire data-set and then display it in pages. This next section is about binding to JSON for JavaScript applications.

First thing to remember here is that an EoD SQL DataSet is a List and thus compatible with the Collections API. For this example we’re going to be working with the outstanding GSON API from our friends at Google. Our objective here is to minimize the amount of time spent between the Database and pushing the data to the client. Because GSON doesn’t appear to support Iterable object out-of-the-box, we’re going to start off using a DataSet.
Read the rest of this entry »

Eod SQL Applied – Part 1/5 (Introduction)

Introduction

This is the first part of a series of posts, dedicated to how best to apply EoD SQL in different types of applications. Each post in this series will cover a specific type of application, and the different parts of EoD SQL that generally work best within those applications. The articles may dive fairly deep into the workings of the API, but at the end you’ll have a much better idea of where things fit in, and how they fit together.

EoD SQL well understands that (a) people like to do things differently and (b) one solution is not right for everything. While under the hood: any one part of EoD SQL works much the same way as any other part, as you climb the structural ladder, things begin to behave very differently. These behaviors have a massive performance and memory impact on your application and how you will be treating the underlying database drivers.

What we’ll be covering

  1. Swing Applications
  2. JSP / Servlets and “Related Technologies”
  3. JavaScript / JSON Applications
  4. GWT Applications

Although you could go with the “one size fits all” route (and EoD SQL would still perform wonderfully), the objective of these posts is to expose you to the different flavors of EoD SQL data structures and get you thinking about how you can mix and match them in your application to produce different results.

Why TraceMonkey is maybe the wrong way

With Firefox 3.5 finally out the door and in use I thought it was time to write something about it that’s been bugging me for a while. TraceMonkey (the new FF JavaScript engine) employs a semi-Hotspot approach to JavaScript compilation. Although the approach is quite different under-the-hood, the basic principal is the same: interpret the code until you know what to optimize. This makes abundant sense to me: why go through the expense of native compiling a method that only ever gets run once? However: there is a hidden expense to doing things this way: your compilation unit is often much smaller. Smaller compilation-units means more compiled code that needs to go back and forth to interpreted code. It also means you can’t optimize as heavily (although that optimization has it’s own massive expense).

Where am I heading with this? I think the TraceMonkey approach adds huge value to a language like Java where an application consists of hundreds and / or thousands of classes (both those in the application and all of it’s dependent APIs), in this case: you’re loading time would be abysmal if you pre-loaded all the possibly used code and native compiled it on start-up. However in the JavaScript world we are limited by the connection, and JavaScript applications (even really huge ones like those written with GWT) are relatively small. In this case it makes more sense to me to either compile everything up front; or run everything interpreted and run a massive compilation in the background (hot-swapping the implementations when you’re finished the compile).

It’s just a thought. Maybe FF should adopt v8 and see what happens?

GWT’s new Event Model – Handlers in GWT 1.6

Theres a lot of concern and worry around the “new event model” in GWT 1.6: throwing out Listeners (a well know and used Java construct) and replacing them with Handlers (a GWT team invention). Let me put your mind at rest, Handlers will make your GWT life much easier.

So what are the differences between Listeners and Handlers?

Listeners Handlers
Have one Listener per event source (ie: “mouse”, “keyboard”, etc) Have one Handler per event type (ie: “mouse press”, “mouse down”, “mouse up”)
Are uncategorized, and the Widget is responsible for sinking / un-sinking the DOM events Are categorized into 2 types: DOM and Logical, adding a DOM Handler to any Widget deferrers to Widget.addDomHandler which sinks events as required
Each Widget uses a ListenerCollection for each class of Listener added All Handlers are stored with a HandlerManager which also manages dispatching any type of event (including new ones you create yourself)
Listeners are removed with the pattern Button .removeClickListener(listener); Adding a Handler returns a HandlerRegistration object, removing a Handler is done by calling HandlerRegistration .removeHandler();
GWT Listeners accept a parameter for each bit of relative event information (ie: sender, mouse x, mouse y) Handlers (more like normal Java Event Listeners) are passed an Event object with all the Event details as their only parameter

So what do Handlers actually buy you in code?

Possibly the most important thing to notice is they take up a lot less space when compiled to GWT. This in mainly due to the fact that their management is centralized in HandlerManager instead of requiring a separate ListenerCollection for each Listener type, and that an implementation only needs a single method overridden (instead of having to listen for events you’re not interested in).

Another point to take note of is that all DOM related Event objects extend from DomEvent. Why is this important? Because DomEvent includes the methods:

  • preventDefault()
  • stopPropagation()
  • getNativeEvent()

I find the last (getNativeEvent()) to be the most useful, it works much the same as Event.getCurrentEvent() and returns a NativeEvent (the new parent class to Event). The advantage of this mechanism over Event.getCurrentEvent() is that you have all the access to the underlying event information (mouse button, modifiers, etc) but attached directly to the dispatched event.

Summary

Handlers are a good step forward for GWT, and make abundant sense in a Web environment where code size is very important. They also make it much easier to add you own events since you no longer need to manage your own Listener registration / deregistration cycle or firing of the events to all of the registered Listeners (HandlerManager does it all for you). I think there will be some interesting new patterns emerging with regards to managing events on generated structures (built based on data fetched from the server for example).

I’m looking forward to seeing what comes of these changes.