BigDecimal and your Money

I often see Java developers using the BigDecimal class to store monetary values in financial applications. It seems like a great idea when you start out, but it almost always comes back as a flesh-eating zombie intent on devouring all of your time later on. The BigDecimal class is designed around accuracy and has an almost unlimited size. However, this choice will almost always come back to bite you, or someone else trying to find a minor bug later on. It’s an especially bad idea in banking applications. So why is BigDecimal not well suited to storing monetary values? For one thing, it doesn’t behave in a manner that is practical for financial systems. This statement flies in the face of everything you’ve probably been taught about this class, but it’s true. Read on and I’ll tell you why.

There are more numbers in there than you think!

A BigDecimal is not a standard floating point number. Instead, it’s an arbitrary-precision integer (the unscaled value) paired with a scale that says how many of its digits fall after the decimal point. Because equals compares the scale as well as the value, 0.0 != 0.00. While this doesn’t seem like a problem at first, I’ve seen it cause no end of strange little bugs. The only way to accurately determine whether two BigDecimal objects have an equal value is by using the compareTo method. Try these two little unit tests; the first fails and the second passes:

@Test
public void testScaleFactor() {
    final BigDecimal zero = new BigDecimal("0.0");
    final BigDecimal zerozero = new BigDecimal("0.00");

    // Fails: equals() compares the scale as well as the numeric value
    assertEquals(zero, zerozero);
}

@Test
public void testScaleFactorCompare() {
    final BigDecimal zero = new BigDecimal("0.0");
    final BigDecimal zerozero = new BigDecimal("0.00");

    // Passes: compareTo() ignores the scale and compares only the numeric value
    assertTrue(zero.compareTo(zerozero) == 0);
}

This technique works when you’re in control of the data and the comparison, but it breaks when you want to put a BigDecimal object into most other Java data structures. I’ve actually seen someone use a BigDecimal as a key to a HashMap, which of course didn’t work: HashMap relies on equals and hashCode, so 0.0 and 0.00 are treated as different keys. The solution in this case was simple: swap the HashMap for a TreeMap, which orders its keys with compareTo, and things were happy. However, it won’t always be this simple.
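
To make the HashMap versus TreeMap difference concrete, here is a minimal sketch. The class and method names (BigDecimalKeys, distinctKeys, normalise) are made up for this illustration; only the java.math and java.util calls are real. It also shows one possible workaround, assuming you can normalise every key before it goes into the map.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class, not part of any library.
final class BigDecimalKeys {

    /** Inserts 0.0 and 0.00 into the given map and reports how many keys it ends up with. */
    static int distinctKeys(Map<BigDecimal, String> map) {
        map.put(new BigDecimal("0.0"), "first");
        map.put(new BigDecimal("0.00"), "second");
        return map.size();
    }

    /** Strips trailing zeros so that numerically equal values also satisfy equals(). */
    static BigDecimal normalise(BigDecimal value) {
        return value.stripTrailingZeros();
    }

    public static void main(String[] args) {
        // HashMap uses equals()/hashCode(), which take the scale into account.
        System.out.println("HashMap keys: " + distinctKeys(new HashMap<>()));
        // TreeMap uses compareTo(), which ignores the scale.
        System.out.println("TreeMap keys: " + distinctKeys(new TreeMap<>()));
        // Normalised keys compare equal even with equals().
        System.out.println(normalise(new BigDecimal("2.50")).equals(normalise(new BigDecimal("2.5"))));
    }
}
```

Normalising only helps if every single code path remembers to do it, which is exactly the kind of rule that gets forgotten.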

They’re true high precision structures.

This doesn’t just mean that they are precise, it also means that they won’t run any calculation that wouldn’t result in a representable answer. Take the following code snippet as an example:

@Test
public void testArithmetic() {
    BigDecimal value = new BigDecimal(1);
    // Throws ArithmeticException: 1/3 has no exact decimal representation
    value = value.divide(new BigDecimal(3));
}

Primitive floating-point types would just swallow this and represent the recurring 0.333… as best they could, while a BigDecimal throws an ArithmeticException instead of attempting to represent a recurring number. In some cases getting an error will be desirable, but I’ve actually seen someone resolve the ArithmeticException like this:

try {
    return decimal1.divide(decimal2);
} catch(ArithmeticException ae) {
    // The "hack": falls back to double arithmetic and silently loses the precision
    return new BigDecimal(decimal1.doubleValue() / decimal2.doubleValue());
}

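Yes folks, unfortunately I’m quite serious here. This is the sort of bug that gets introduced when an error occurs, computations stop running, and someone adds a “hack” to just “make it work quickly and we’ll fix it later”. The fallback routes the calculation through double, so the result carries exactly the binary floating-point error that BigDecimal was chosen to avoid. It’s a total disaster, but I see it far too often.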
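
For contrast, a less destructive way to get past the exception is to tell BigDecimal how many decimal places you want and how to round. This is only a sketch: the SafeDivision class is invented for the example, and the scale of 4 and the HALF_EVEN rounding mode are assumptions, so substitute whatever your business rules actually require.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper class for illustration only.
final class SafeDivision {

    /** Divides with an explicit scale and rounding mode, so recurring results no longer throw. */
    static BigDecimal divide(BigDecimal dividend, BigDecimal divisor, int scale) {
        return dividend.divide(divisor, scale, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        BigDecimal one = BigDecimal.ONE;
        BigDecimal three = new BigDecimal(3);

        // Explicit rounding: the result has exactly four decimal places.
        System.out.println(divide(one, three, 4));

        // The "hack" from above: the exact binary value of the double, digits and all.
        System.out.println(new BigDecimal(one.doubleValue() / three.doubleValue()));
    }
}
```

The first line printed is a value you chose on purpose; the second is whatever the double happened to hold.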

They don’t play nice with Databases.

According to the JDBC spec, database drivers implement getBigDecimal, setBigDecimal and updateBigDecimal methods. They seem like a great idea, until you consider that your database may not have a suitable storage type for these values. When storing a BigDecimal in a database, it’s common to type the column as a DECIMAL or REAL SQL type. REAL is a standard floating-point type, with all the rounding errors that implies. DECIMAL is exact, but only up to the fixed precision and scale declared on the column, so anything beyond that is rounded, truncated or rejected, depending on the database. Both are also limited in capacity and will often overflow or cause a SQLException when attempting to store very large BigDecimal values.

The only practical solution which will keep all the BigDecimal functionality and accuracy in a database is to type the amounts as BLOB columns. Try to imagine the following table structure if you will:

CREATE TABLE transactions (
    initial_date DATETIME NOT NULL,
    effective_date DATETIME NOT NULL,
    description VARCHAR(30) NOT NULL,
    source_id BIGINT NOT NULL,
    destination_id BIGINT NOT NULL,
    in_amount BLOB NOT NULL,
    in_amount_currency CHAR(3) NOT NULL,
    effective_amount BLOB NOT NULL,
    effective_amount_currency CHAR(3) NOT NULL,
    charge_amount BLOB NOT NULL,
    tax_amount BLOB NOT NULL
);

That requires four different BLOB columns, each of which will typically be stored outside of the table space. BLOB objects are very expensive both to store and to work with. Each one often uses its own database resources (much like an internal cursor) to read or write the value. This translates to much more time and network usage between your application and its database. To add to the misery, a BLOB is generally not readable by a SQL tool, and one of the major reasons for sticking with a SQL database is that it can be managed from outside of your application.

Performance.

This is often raised as an issue, but ignored in favor of “accuracy”. The performance of BigDecimal is often considered “good enough” for general computing, and it’s fine if you want to add tax to an item every once in a while, but consider the number of interest calculations per month a moderate-sized bank does. This may seem like an extreme case, but if your application runs a simple shipping and tax calculation for items on an online store in a JSP, you’ve got effectively the same problem. In a very simple multiplication test, BigDecimal performed over 2300 times slower than a simple long value. While this may only be milliseconds per mutation, a performance factor of this size very quickly adds up to more computational time than is actually available to the system.

Also remember that BigDecimal (like most Number subclasses) is immutable. That means every calculation allocates a new BigDecimal rather than updating the existing one. These short-lived objects are generally cleaned away by the eden-space collector (and G1 is very good at handling them), but when you put such a system into production it leads to a massive change in your heap requirements. Your BigDecimal objects must be allocated in such a way that a minimum number of them survive a garbage collection, and the memory requirement of such a space quickly spirals out of control.
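
A tiny sketch of what immutability means in practice. The ImmutableArithmetic class and its total method are invented for this example; the point is simply that add hands back a new object on every call and leaves the original untouched.

```java
import java.math.BigDecimal;

// Hypothetical demo class for illustration only.
final class ImmutableArithmetic {

    /** Sums the amounts; every pass through the loop allocates a fresh BigDecimal. */
    static BigDecimal total(BigDecimal[] amounts) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal amount : amounts) {
            sum = sum.add(amount); // the previous sum becomes garbage
        }
        return sum;
    }

    public static void main(String[] args) {
        BigDecimal price = new BigDecimal("19.99");

        // The result is discarded, so price is unchanged: a classic beginner bug.
        price.add(new BigDecimal("0.01"));
        System.out.println(price);

        System.out.println(total(new BigDecimal[] { price, new BigDecimal("0.01") }));
    }
}
```

Summing a million amounts this way creates a million intermediate objects, which is where the heap pressure comes from.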

To add to the performance argument: the compareTo method is quite a bit slower than the equals method, and gets significantly slower as the size of the BigDecimal increases.

A Cure to BigDecimal Woes:

A standard long value can store the current value of the United States national debt (as cents, not dollars) 6477 times without any overflow. What’s more, it’s an integer type, not a floating point. This makes it easier and more accurate to work with, and its behavior is guaranteed. You’ll notice that several different behaviors in BigDecimal are either not well defined, or have multiple implementations. That said, depending on your application you may need to store the values as hundredths or even thousandths of cents. However, this is highly dependent on your application, and there’s almost always someone who can tell you exactly what unit the business works in. Bear in mind also that there are often de-facto (or even mandated) standards between businesses about what unit of money they deal in, and using more or less precision can lead to some serious problems when interfacing with suppliers or clients.
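
Here is roughly what working in cents looks like. Treat it as a sketch under a few assumptions of mine: the Cents class is hypothetical, rates are expressed in basis points (1/100 of a percent), and ties are rounded up. Math.multiplyExact, Math.addExact and Math.floorDiv are the standard Java 8 methods and throw or round as documented.

```java
// Hypothetical helper class for illustration only.
final class Cents {

    private static final long BASIS_POINTS_PER_WHOLE = 10_000; // 10,000 basis points = 100%

    /**
     * Applies a rate given in basis points to an amount in cents,
     * rounding to the nearest cent with ties going toward positive infinity.
     * Throws ArithmeticException rather than silently overflowing.
     */
    static long percentOf(long amountInCents, long basisPoints) {
        long product = Math.multiplyExact(amountInCents, basisPoints);
        long rounded = Math.addExact(product, BASIS_POINTS_PER_WHOLE / 2);
        return Math.floorDiv(rounded, BASIS_POINTS_PER_WHOLE);
    }

    public static void main(String[] args) {
        long price = 1_999;                  // 19.99 stored as cents
        long tax = percentOf(price, 1_400);  // an assumed 14% rate
        long total = Math.addExact(price, tax);
        System.out.println(tax + " + " + price + " = " + total);
    }
}
```

The rounding rule is stated once, in one place, and overflow is an exception instead of a wrong number.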

The mechanism I generally try to use is a custom-built MoneyAmount class (each application has different requirements) to store both the actual value and its Currency. Building your own implementation opens the opportunity to use factory methods instead of a constructor. This allows you to decide on the actual data type at runtime, even during arithmetic operations. 99% of the time an int or long value will suffice; when they don’t, the implementation can change to using a BigInteger. The MoneyAmount class also enables you to define your own rounding schemes, and how you wish to handle recurring decimal places. I’ve seen systems that required several different rounding mechanisms depending on the context of the operation (currency pairs, country of operation and even time of day). For an example of this kind of factory decision, take a look at the source code for the java.util.EnumSet class. Two different implementations exist: the RegularEnumSet class uses a single long to store a bit-set of all the selected constants, while JumboEnumSet falls back to an array of longs for larger enums. Given that very few enum types have more than 64 constants, the first implementation will cover most cases, just like a long will cover most requirements in a financial system.
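
To give a feel for the idea, here is a stripped-down sketch of such a class. It is not the implementation from any of my projects: the method names, the nested LongAmount and BigAmount classes and the choice to store minor units (cents) are all assumptions made for the example, and rounding is left out entirely.

```java
import java.math.BigInteger;
import java.util.Currency;

// Hypothetical sketch: a value plus its currency, with the storage type chosen by a factory.
abstract class MoneyAmount {

    private static final BigInteger LONG_MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);

    private final Currency currency;

    MoneyAmount(Currency currency) {
        this.currency = currency;
    }

    /** Factory for the common case: the amount fits in a long. */
    static MoneyAmount of(long minorUnits, Currency currency) {
        return new LongAmount(minorUnits, currency);
    }

    /** Factory that picks the cheapest representation able to hold the value. */
    static MoneyAmount of(BigInteger minorUnits, Currency currency) {
        boolean fitsInLong = minorUnits.compareTo(LONG_MIN) >= 0 && minorUnits.compareTo(LONG_MAX) <= 0;
        return fitsInLong
                ? new LongAmount(minorUnits.longValue(), currency)
                : new BigAmount(minorUnits, currency);
    }

    Currency currency() {
        return currency;
    }

    /** The amount in the currency's smallest unit, e.g. cents. */
    abstract BigInteger minorUnits();

    /**
     * Adds two amounts of the same currency. For brevity this always goes through
     * BigInteger; a real implementation would want a fast path for two long values.
     */
    MoneyAmount add(MoneyAmount other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("Currency mismatch");
        }
        return of(minorUnits().add(other.minorUnits()), currency);
    }

    @Override
    public String toString() {
        return minorUnits() + " " + currency.getCurrencyCode();
    }

    private static final class LongAmount extends MoneyAmount {
        private final long value;

        LongAmount(long value, Currency currency) {
            super(currency);
            this.value = value;
        }

        @Override
        BigInteger minorUnits() {
            return BigInteger.valueOf(value);
        }
    }

    private static final class BigAmount extends MoneyAmount {
        private final BigInteger value;

        BigAmount(BigInteger value, Currency currency) {
            super(currency);
            this.value = value;
        }

        @Override
        BigInteger minorUnits() {
            return value;
        }
    }

    public static void main(String[] args) {
        Currency usd = Currency.getInstance("USD");
        System.out.println(MoneyAmount.of(1_999, usd).add(MoneyAmount.of(280, usd)));
        // Overflowing a long transparently switches to the BigInteger-backed class.
        System.out.println(MoneyAmount.of(Long.MAX_VALUE, usd).add(MoneyAmount.of(1, usd)));
    }
}
```

Callers only ever see MoneyAmount, so the representation can change underneath them without any of their code noticing.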

Summary

This post is to warn people who are busy writing (or about to start writing) a system that will run financial calculations and are tempted to use BigDecimal. While it’s probably the most common type used for this purpose in the “enterprise” world, I’ve seen it backfire more times than I care to recount. My advice here is really to consider your options carefully. Taking shortcuts in implementation almost always leads to pain in the long run (just look at the java.util.Properties class as an example of this).

On Netbeans; memory and speed

It’s about the most common complaint about Netbeans: It needs too much memory! *whine whine whine*
That, and: it’s so slow *cry cry cry*

You know something, there are a few simple ways around this “problem”:

  1. If you are running Windos, switch to a real operating system! I recommend:
    1. Kubuntu for people with permanent internet or
    2. SUSE for those who are on dial-up
  2. If you are actually writing code, you need a minimum of 1GB (preferably 1.5GB) of RAM
    1. This applies to all software developers, regardless of your language or target platform (I include embedded developers in this statement).
    2. Software developers simply tend to have more software running at any given time
  3. A dual-core CPU does make an enormous difference to IDE performance

I do 100% of my development on a 1.6GHz Centrino laptop with 1.5GB of RAM (533MHz if you must know) and a standard laptop hard-drive. With this simple configuration I am amazed at the speed when I run:

  • Netbeans 5.5
  • Netbeans 6 beta-2
  • JBoss 4
  • Thunderbird
  • Firefox
  • Kopete
  • Konqueror
  • Amarok
  • Kerry-Beagle
  • Many terminals
  • MySQL

These are spread out over 4 KDE desktops, and they all perform as if they were running by themselves. If you have the opportunity, switch to Netbeans 6, as it’s a much faster system than any previous version.

Open Office to Netbeans for my Wedding

I’ve just written the first paragraph of my wedding speech (getting married next weekend, 2007-09-15) using Open Office. It’s generally a lovely writing tool, except I just discovered that it doesn’t have a “Select Current ATOM” shortcut. So, I’m switching to Netbeans to write the rest of the speech. What kind of IDE doesn’t have an “ALT-J” combination to select the current ATOM!

Do “Hardware People” actually use computers?

This is a bit of a rant. Being someone who just spent R1350 (~ €142) on a new stick of RAM for my laptop, I have to wonder at a quote for a desktop machine I saw yesterday. It’s for a kid who’s really into his gaming. His power-supply fried (most likely because of all the mods he’s got in the machine with a really rubbish 350 watt power-supply). The place he took the machine to gave him the following explanation:

Something strange happened on the power-lines, and fried the power-supply. It took your motherboard with it.

Now since his motherboard currently has a 939 socket for his Athlon 2600+ CPU, and the FSB speed is not great (the whole thing is over 2 years old), they quoted him for the following:

  • Dual-Core AMD 3600+ (64bit)
  • AM2 Socket Motherboard with 1GHz FSB
  • 512MB Memory (833MHz)
  • 400 watt PSU

Do I see something wrong here? First off, he’s running 32bit XP… he doesn’t have a 64bit copy of Windows, so why isn’t it in the quote?!?! Why is there only 512MB of memory, and why the hell is it 800MHz when the motherboard’s FSB is 1GHz?!? Why? Because 1GHz memory modules would be more expensive than the motherboard.

This seems to be a bad case of selling things for the highest cut.

I know the difference some extra memory can make; it’s far greater than the difference a slightly faster CPU or faster memory will make. With a CPU like the one in the quote, there should be 2GB of RAM in the quote. Make the whole system 833MHz FSB, and take the CPU clock-speed down a few notches.

Bah! Enough ranting 😉
Just beware of what’s in the box when you buy a computer!

Breaking the spam chain

Chain letters are a form of spam. It’s really as simple as that. I know so many people who forward these emails on, and all they’re doing is helping spread spam. We jail spammers, fine them, and yet people continue to send these emails. Most people think that spam is just some sort of by-product of the internet, just like viruses are. Let me tell you that neither viruses nor spam just come out of nowhere. They are created and sent by human beings (or in most cases software written by human beings). These are not “glitches” in code; they are systems designed specifically for one purpose: to take your money away from you. It’s robbery, plain and simple, and by forwarding chain letters, you are contributing to this mess.

One of the most popular phrases I read in these things is, “Microsoft is tracking this email and will donate….” (Microsoft is sometimes AOL, or some other Big Computer Company™). The fact is: email is untraceable. No one will donate the money. It’s spam, or worse. In a growing number of cases, the images and/or hidden code within the email carry a computer virus. So by forwarding this email to 10 of your closest friends, you’re sending them a virus for their computers as well.

Don’t forward chain email, emails warning you about gangs, hijackings, or computer viruses. Not from your closest friends, and certainly not to them.
You are not helping them!

Linux is ready for mainstream use, get over it!

Instead of hearing it less and less, I seem to be hearing it more: “Linux is too hard”, “Linux is not desktop friendly”. It’s probably all the talk of Vista, and all those trying it out and comparing their 10+ years of Windows use to a week with Linux (sometimes no more than a half-baked attempt at installing a distro). Quite frankly guys, Linux is desktop ready. Not only my parents, but my parents-in-law and grand-parents run Linux. I know schools that have installed Linux, and their kids find Windows more difficult to work with. It’s a question of what you’re used to. If you don’t know that distros like SUSE and Mandriva have wonderful hardware config tools, and you wind up re-compiling your kernel to install some weird wireless card that is supported through NdisWrapper, then shame on you.

I’m not of the opinion that Linux will one day be on every desktop on the planet. Nor am I the type that starts frothing at the mouth when they hear the word “Microsoft” (though I do know a few like that :P). What I’m saying here is simply: don’t dismiss Linux because you tried to install Ubuntu. Ubuntu is a heavily purist version of Linux (very VERY GPL based). Rather try something like SUSE or Mandriva that includes things like your NVidia drivers and Flash.

Crippled Passwords are Among us!

I love my secure passwords. Lovely passwords like “fr!bBl3~98_m0nT4=” (no, this is not a password I use, just an example of the type). This kind of mangling makes a password more difficult for password-breaking programmes to guess. No, it doesn’t make them “unbreakable”, and nothing in this digital world of ours is truly “unbreakable”, but many things are “very very difficult to break”, and that’s kinda what I aim for.

When I signed up for Internet banking with my bank, I thought “nice secure password”, and typed a really horribly convoluted password. The response was “invalid password”. I picked up the phone (something I don’t do often) and called their help line, only to get told “you can only use letters and numbers in your password sir”. The frightening thing is: there are many web sites I’ve signed up to recently that have this “no symbols or spaces” password policy. What on earth is wrong with you people!?!?! You tell me to select a secure password, but then tell me the one I gave is too secure??? I see absolutely no technical (or non-technical for that matter) reason why you cannot store my horrible password.

Surely you don’t store my password as plain text in your database, do you? This is a massive potential security problem. If someone breaks into a site’s database, they own your account. One of the first things I do is click the “forgot my password” link on the site. If they send me the password I typed in, I change my password and get rid of my account, for one simple reason: they’re storing my password somewhere. If they reset my password, or send me a random one, it’s a good indication that they are storing hashed passwords, and so my data is a bit more secure.
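
For anyone wondering why a hashed password can happily contain symbols and spaces, here is a minimal sketch of salted hashing using PBKDF2 from the standard Java crypto API. The PasswordHasher class, the iteration count and the key length are assumptions picked for the example, not a recommendation for your system, and I have no idea what any particular site actually runs.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper class for illustration only.
final class PasswordHasher {

    private static final int ITERATIONS = 100_000;   // assumed work factor
    private static final int KEY_LENGTH_BITS = 256;  // assumed hash length
    private static final int SALT_LENGTH_BYTES = 16;

    /** Generates a random salt; it is stored next to the hash, not kept secret. */
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_LENGTH_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Derives a fixed-length hash from a password of any length and any characters. */
    static byte[] hash(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_LENGTH_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available on this JVM", e);
        } finally {
            spec.clearPassword(); // wipes the spec's internal copy of the password
        }
    }

    /** Re-hashes the attempt and compares it with the stored hash. */
    static boolean verify(char[] attempt, byte[] salt, byte[] storedHash) {
        return MessageDigest.isEqual(hash(attempt, salt), storedHash);
    }

    public static void main(String[] args) {
        byte[] salt = newSalt();
        byte[] stored = hash("fr!bBl3~98_m0nT4=".toCharArray(), salt);
        System.out.println(verify("fr!bBl3~98_m0nT4=".toCharArray(), salt, stored));
        System.out.println(verify("letters0nly".toCharArray(), salt, stored));
    }
}
```

Whatever goes in, the stored value is always the same number of bytes, so the database has no reason to care which characters the password contains.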

Be careful which sites you sign up with; how secure their data is directly affects you.