Fun with String.intern()

The intern() method in the String class is one of the lesser-known gems of the Java world. It’s quite subtle, brilliantly powerful, and potentially very dangerous. That having been said, it has long been one of my favourite bits of core Java API, mostly because it’s incredibly flexible in what it allows you to do.

What is intern()?

What does String.intern() actually do? It turns strings that look the same (i.e. have the same content) into the same String object. For a more in-depth understanding, it’s worth going back to one of the most basic Java lessons: “Don’t use == on a String, use .equals”.

Console cmd = System.console();
String input = cmd.readLine("Enter Your Name: ");

if(input == "Henry") {
    // This will never ever be true
}

if(input.equals("Joe")) {
    // Well, this might be true
}

if(input.intern() == "Jeff") {
    // Surprise, this might also be true... just like .equals
}

Okay, so what’s going on here? The answer is that the == operator compares values directly. For primitives (int, long, boolean, double and friends) the value is the data itself, so == does what you expect. For objects the value is a reference, so what you are testing is “is this object the same object as another object” rather than “does this object have the same data and fields as another object”. Put another way: “are these two references pointing to the same memory location?”

Referencing Basics

The String.intern() method allows you to make use of the string constant-pool in the JVM. Every time you intern a String, it’s checked against the VM’s pool of Strings and you get a reference back that will always be the same for any given bit of contents. “Hello” will always == “Hello”, and so on. This internal pool of Strings is what causes this behaviour:

String string1 = "Hello";
String string2 = "Hello";
String string3 = new String("Hello");
String string4 = string3.intern();

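assert string1 == string2; // same object: both literals come from the pool
assert string2 != string3; // different objects: new always allocates
assert string1 == string4; // same object: intern() returns the pooled instance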

When the Java compiler and VM encounter string1 and string2 they see that they have exactly the same content, and since String is an immutable type they’re turned into the same object under the hood. string3 is different: we used the new operator, effectively forcing the VM to create a new object for us. string4, however, asks the VM to use its string-pool by invoking the intern() method.
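The same thing holds for strings that only come into existence at run time. Here’s a minimal sketch (the InternDemo class and its buildAtRuntime helper are names I’ve made up for illustration) that builds two equal strings from different pieces and shows that they only become the same object once they’ve been interned:

```java
class InternDemo {
    // Hypothetical helper: assembles the string at run time so the compiler
    // cannot fold the pieces into a single literal.
    static String buildAtRuntime(String left, String right) {
        return new StringBuilder().append(left).append(right).toString();
    }

    public static void main(String[] args) {
        String first = buildAtRuntime("Hel", "lo");
        String second = buildAtRuntime("He", "llo");

        System.out.println(first.equals(second));              // prints true
        System.out.println(first == second);                   // prints false
        System.out.println(first.intern() == second.intern()); // prints true
        System.out.println(first.intern() == "Hello");         // prints true
    }
}
```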

Getting Clever

You can use interned String references to speed up configuration key lookup. If you declare the possible configuration properties as constant String objects in a class or interface, you can intern() the keys when they are loaded externally. This allows you to use an IdentityHashMap instead of a traditional HashMap, so keys are compared by reference rather than by calling equals.

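import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Properties;

public class Config {
    public static final String REMOTE_SERVER_ADDRESS = "remoteServer.address";

    // Keys are compared by reference (==), so every key must be interned
    private final Map<String, String> config = new IdentityHashMap<>();

    public void load(Properties properties) {
        // Intern each key so it shares a reference with the matching constant
        properties.stringPropertyNames()
                  .forEach(k -> config.put(k.intern(), properties.getProperty(k)));
    }

    public String get(final String key) {
        return config.get(key);
    }
}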

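The above snippet will only ever work when the Config.get method is called with constants (static final fields initialised from literals) or other Strings that are interned. Any other key, even one with identical content, will come back as null.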
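To make that caveat concrete, here’s a small self-contained sketch. IdentityConfigDemo is a hypothetical stand-in for the Config class above, and the property value example.invalid is made up; the point is simply to show a lookup with the constant, with an equal-but-not-interned key, and with that same key after interning:

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for the Config class, kept small for illustration.
class IdentityConfigDemo {
    static final String REMOTE_SERVER_ADDRESS = "remoteServer.address";

    private final Map<String, String> values = new IdentityHashMap<>();

    void load(Properties properties) {
        // Intern every key so it shares a reference with any matching constant.
        for (String key : properties.stringPropertyNames()) {
            values.put(key.intern(), properties.getProperty(key));
        }
    }

    String get(String key) {
        return values.get(key);
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty("remoteServer.address", "example.invalid");

        IdentityConfigDemo config = new IdentityConfigDemo();
        config.load(properties);

        // Equal content to the constant, but a different object.
        String runtimeKey = new StringBuilder("remoteServer").append(".address").toString();

        System.out.println(config.get(REMOTE_SERVER_ADDRESS)); // prints example.invalid
        System.out.println(config.get(runtimeKey));            // prints null
        System.out.println(config.get(runtimeKey.intern()));   // prints example.invalid
    }
}
```

The silent null in the middle case is exactly the kind of surprise that makes this trick risky to expose beyond code you control.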

Even more fun

Every field, method and class in Java has a name. Its name (or identifier) is also stored in the same string-pool that the intern() method accesses, which means that some very strange things start to happen with reflection:

public class Main {
    public static void main(String[] args) throws Exception {
        // this is true!
        boolean sameReference = Main.class.getMethod("main", String[].class).getName() == "main";
        System.out.println(sameReference);
    }
}

Warnings

  1. Using String.intern might cause unexpected behaviour (i.e. it can violate the principle of least astonishment) and make your code more confusing to your team-mates. Most of us don’t expect to see identity-equality checks with Strings. If you are tempted to use it you should check with your team before doing so.
  2. Invoking intern is expensive in its own right, making it something to use with care. It’s fine when loading a configuration file on startup; it’s not something you want to do for every equality check.
  3. The string-pool is a precious resource, and strings within it might not be garbage collected. For this reason you should never use intern on untrusted input.
