Improving Java JSON speed

TL;DR: People don’t really care about Java JSON speed

The curse of the best

In Javaland there is one “standard” JSON framework/library – Jackson. It’s been around forever and has plenty of features, great performance and an abundance of documentation. As such, it’s no wonder that it’s the default choice for any Java JSON problem.

And when you have such a reputation, you are constantly compared with various alternatives in poorly constructed benchmarks, since that’s easy benchmarketing. That is understandable, since it’s the fast route to being famous and getting the girls… err, internet points.

Fastest gun in West

Even the Jackson authors find benchmarketing tiresome, to the point of building their own reference benchmark.

Benchmarking game

So, in a world of benchmarketing noise, how do you actually differentiate yourself?

  1. if you are from a big/known company you can just claim to have invented faster serialization to get people’s attention
  2. if you are well networked, you can spam and spam on various sites until people start noticing
  3. try to generate some distinct benchmarking noise
  4. submit to various “popular” benchmarks

If you look around or try those things yourself, you may end up concluding that:

  1. obviously works and you don’t even have to provide benchmarks, but can claim that benchmarks are misleading and you should benchmark for your use-case
  2. can work, but you will probably earn the reputation that goes with it in the process
  3. might work if the stars are correctly aligned and you have truly created something distinct
  4. will get you ignored (except in narrow circles)

Benchmarking is very hard to get right. People fall into all kinds of traps. Correctly interpreting the results is often coordinately omitted. That’s why we should strive to improve community-driven benchmarks, at least in the open source world.

The story of DSL-JSON

Microsoft Bond, Google Protobuf, FlatBuffers and various other interface description languages (IDLs) are either liked by those who appreciate the structure they impose, for the sake of improved portability and performance, or disliked by those satisfied with their one true language.

What they have in common is that they all end up serializing JSON, although at a much lower speed than Jackson.

DSL Platform wasn’t designed to be an IDL or to generate JSON serialization code. As such, when the DSL-JSON library for DSL Platform was built, it was not really interesting to those who just wanted to serialize some JSON and not dabble with some unknown domain-specific language.

But the performance was there, and it was so good that it was comparable with the fastest JVM binary codecs. Fortunately, there is a go-to benchmark for JVM serialization where it could spark some interest.

Results

  • Github repository stars: 0
  • Positive feedback: 0
  • Negative feedback on benchmarketing: all

If you try to come up with reasons for it, you could arrive at the conclusion that:

  • You suck at marketing
  • People are not really interested in IDLs

To stay sane, you ignore the negative feedback and just continue on your lonesome path. Today DSL-JSON can be used in an idiomatic Java way through annotations; in the background the IDL is written on the fly based on code metadata (meaning you don’t even know the IDL exists).
This way it doesn’t feel strange to those repelled by IDLs, and it is difficult to tell apart from reflection-based databinders.

Every benchmark has its own set of issues, sometimes at the will of the authors (so they can game the results), sometimes due to messy evolution. JVM serializers is no exception, and it’s kind of a shame that it’s mostly dormant. One problem with that benchmark is that it creates a lot of garbage, which is not really something a benchmark should do. If it didn’t, DSL-JSON could be at the top instead of merely near it.
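To make the garbage point concrete, here is a minimal sketch of the kind of serialization path that avoids per-message allocations. The ReusableJsonWriter class and its writePoint method are hypothetical names made up for this illustration; this is not the DSL-JSON API and not code from the JVM serializers benchmark. It assumes ASCII-only field names and no string escaping.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical writer: one byte[] is grown on demand and reused across messages. */
final class ReusableJsonWriter {
    private byte[] buffer = new byte[64];
    private int position;

    /** Rewinds to the start so the same buffer can hold the next message. */
    void reset() { position = 0; }

    int size() { return position; }

    /** Copies the written bytes out; a real hot path would hand the buffer to the socket instead. */
    byte[] toByteArray() { return Arrays.copyOf(buffer, position); }

    private void ensure(int extra) {
        if (position + extra > buffer.length) {
            buffer = Arrays.copyOf(buffer, Math.max(buffer.length * 2, position + extra));
        }
    }

    /** Assumption: the input is plain ASCII, so no escaping or multi-byte encoding is needed. */
    void writeAscii(String s) {
        ensure(s.length());
        for (int i = 0; i < s.length(); i++) buffer[position++] = (byte) s.charAt(i);
    }

    /** Writes the digits straight into the buffer instead of allocating via Integer.toString. */
    void writeInt(int value) {
        ensure(11); // longest int, "-2147483648", is 11 characters
        if (value == 0) { buffer[position++] = '0'; return; }
        long v = value;
        if (v < 0) { buffer[position++] = '-'; v = -v; }
        int start = position;
        while (v > 0) { buffer[position++] = (byte) ('0' + v % 10); v /= 10; }
        // Digits were produced least significant first, so reverse them in place.
        for (int i = start, j = position - 1; i < j; i++, j--) {
            byte t = buffer[i]; buffer[i] = buffer[j]; buffer[j] = t;
        }
    }

    /** Serializes a hypothetical {x, y} message without creating intermediate objects. */
    void writePoint(int x, int y) {
        writeAscii("{\"x\":");
        writeInt(x);
        writeAscii(",\"y\":");
        writeInt(y);
        writeAscii("}");
    }

    public static void main(String[] args) {
        ReusableJsonWriter writer = new ReusableJsonWriter();
        for (int i = 0; i < 3; i++) {
            writer.reset(); // reuse the same buffer for every message
            writer.writePoint(i, -i);
            System.out.println(new String(writer.toByteArray(), StandardCharsets.UTF_8));
        }
    }
}
```

Once the buffer has grown to fit the largest message, the write path itself allocates nothing; only the toByteArray copy in the demo does. A benchmark harness that builds a fresh stream, string or byte array for every iteration ends up measuring the allocator and garbage collector alongside the codec.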

JSON vs binary

Maybe even Simple Binary Encoding wouldn’t have objections to it.

Everything is a popularity contest

The most popular benchmark on GitHub is the TechEmpower benchmark (followed closely by JVM serializers).
While the TechEmpower benchmark has all kinds of issues of its own, it is certainly valuable, to the point of forcing Microsoft’s hand into improving ASP.NET’s abysmal performance. Nobody likes to look bad on that benchmark, and it’s funny when authors pull their framework if it’s not at the top of the charts.

Anyway, thanks to that framework benchmark we can see how changing the JSON serialization library can give 80% better performance at the top of the chart. If you look at the round 12 JSON results of their benchmark, you can spot something very strange: a servlet beating Netty and Undertow at throughput.

Netty, the Java performance poster child, beaten by Servlet v2? “That can’t be, it must be some issue with the configuration” is the default reasoning of those who even notice those numbers.

A Java servlet beating C at lower loads? Nobody notices that.

Techempower

And that’s not the worst part. The worst is when you talk to someone in person about that stuff; at best it ends with a pat on the back, but mostly it ends with “sure you did, buddy”.

The best thing one can do about it is to shelve it under: “First they ignore you…” hoping it won’t turn into “… then they ignore you some more” 😉

But you move on, ignore programming pop-culture, err… meritocracy; and take the next performance consultancy.
