FlatBuffers performance in Java

TL;DR; FlatBuffers is not the cure for performance issues in Java

“Always right” attitude

Developers like to think of themselves as rational, methodical and driven by facts instead of beliefs. That gave rise to the popular myth of a software field driven by meritocracy instead of popularity/fashion/tribalism. This way we can feel better about ourselves and look down on the fashion industry. Instead, we’ll jump on the next most popular (err. best) tool which came from our popular designers (err. smartest developers), solving our world problems (err. first world problems).

Using the most popular tool

Everything new is better

FlatBuffers is a nice serialization protocol from Google. While the idea is nothing new and people have been doing such stuff for a long time, it tries to go a step further and be cross-platform compatible. In doing so, several optimizations which work in some unmanaged languages stop working in most managed ones.

This gave rise to the myth of how to solve serialization problems in Java: “Just replace your JSON library with FlatBuffers”. Years before it was: “Just replace your JSON library with Protocol Buffers”.
Even Facebook solves their performance problems on Android the same way.

When someone actually digs into the code and tries to understand/explain how such a protocol works, they sound like a mad prophet.

Furious at the world

Benchmarking FlatBuffers

Google was kind enough to release public benchmarks along with its library. Conveniently, they only benchmarked the C++ implementation. What was not nice is the impression that other implementations behave the same. At least they did provide benchmarks, which is a great improvement over the previous attitude that benchmarks are misleading and won’t tell you anything. That attitude carried over to the CapnProto library, which shares the same underlying problems in managed languages as FlatBuffers.

The most popular benchmark for the JVM includes various other libraries. While it’s not perfect, it’s the go-to benchmark for Java. Unfortunately, updates to the repository have stalled, but it still provides an excellent starting point for comparing various libraries. The latest run includes FlatBuffers and CapnProto.

If you are expecting those two near the top, you will be surprised. Since the CapnProto Java implementation is not official, it gets a pass. FlatBuffers, on the other hand, is maintained by Google. Since they advertise FlatBuffers as a cure for cancer, you would expect it to do better when put to the test.

Benchmarking FlatBuffers

There is a lot of room for improvement in the Java FlatBuffers implementation. If Google wanted, they could improve the implementation (or hire someone outside) so that it is at least not behind JSON. But when you are popular/big you can push an invalid point such as: “When serializing data from statically typed languages, however, JSON not only has the obvious drawback of runtime inefficiency, but also forces you to write more code to access data (counterintuitively) due to its dynamic-typing serialization system” and people will accept it without critical thinking.

Understanding FlatBuffers

To better understand pros and cons of FlatBuffers, first let’s analyze some of the popular arguments:

While JSON has a “schema” embedded with the data, most of the time JSON is just a transport layer for some other schema. In the most trivial cases this is done by serializing objects as strings in JSON, while they are actually some other type, e.g. UUID, LocalDate,…
While “runtime inefficiencies” for converting objects into “string” exist, unless your payload consists mostly of primitives and collections of primitives, those “inefficiencies” are minuscule.
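The point about JSON carrying another schema can be sketched in a few lines of Java. This is only an illustration: the JsonPost class and its field names are hypothetical, and the JSON is assembled by hand so the example does not depend on any particular library.

```java
import java.time.LocalDate;
import java.util.UUID;

// Hypothetical value object; the class and field names are illustrative only.
final class JsonPost {
    final UUID id;
    final LocalDate created;

    JsonPost(UUID id, LocalDate created) {
        this.id = id;
        this.created = created;
    }

    // On the wire both fields are plain JSON strings.
    String toJson() {
        return "{\"id\":\"" + id + "\",\"created\":\"" + created + "\"}";
    }

    // The real schema lives in the application: the strings become UUID and LocalDate again.
    static JsonPost fromStrings(String id, String created) {
        return new JsonPost(UUID.fromString(id), LocalDate.parse(created));
    }

    public static void main(String[] args) {
        JsonPost post = fromStrings("f4d84c89-c179-4ae4-991a-e2e6bc12d879", "2015-03-12");
        System.out.println(post.toJson());
    }
}
```

As far as JSON is concerned both values are just strings; the conversion cost is the parsing done in fromStrings, which is small compared to the rest of a typical payload.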

Runtime databinding is greatly preferred over rigid generated code. If an application has its own POJOs and just wants to use some library for serialization, writing code for conversion from a POJO into a generated POJO-like monster or into a series of offsets in byte[] (as in the case of FlatBuffers) leads to much more code for data access. The only thing better than runtime databinding is compile time databinding 😀

If we are sending the whole object over the wire, it’s not really useful to read only parts of it. It’s much better to only send interesting parts anyway. While this might lead to more code, it’s only because people are used to reusing existing POJOs everywhere.

FlatBuffers doesn’t allow nesting, which causes you to serialize everything as a flat sequence. This prevents it from being used in a streaming environment and solving actual problems developers have (such as streaming lists of objects).

While schemas are nice and most apps would benefit from writing less POO and more models, schema for the sake of serialization only is missing the bigger picture.
But that’s another story.

Improving Java JSON speed

TL;DR; People don’t really care about Java JSON speed

The curse of the best

In Javaland there is one “standard” JSON framework/library – Jackson. It’s been around forever, has various features, great performance and an abundance of documentation. As such, it’s no wonder that it’s the default choice for any Java JSON problem.

And when you have such reputation you are constantly compared with various alternatives in poorly constructed benchmarks, since that’s easy benchmarketing; which is understandable, since that’s the fast route to being famous and getting the girls… err, internet points.

Fastest gun in West

Even Jackson authors find benchmarketing tiresome to the point of building their own reference benchmark.

Benchmarking game

So, in a world of benchmarketing noise, how do you actually differentiate yourself?

  1. if you are from a big/known company you can just claim to have invented faster serialization to get people’s attention
  2. if you are well networked, you can spam and spam on various sites until people start noticing
  3. try to generate some distinct benchmarking noise
  4. submit to various “popular” benchmarks

If you look around/try those things you can end up with the conclusion that:

  1. obviously works and you don’t even have to provide benchmarks, but can claim that benchmarks are misleading and you should benchmark for your use-case
  2. can work, but you will probably get appropriate reputation in the process
  3. might work if stars are correctly aligned and you have truly created something distinct
  4. will get you ignored (except in narrow circles)

Benchmarking is very hard to get right. People fall into all kinds of traps. Correctly interpreting the results is often coordinately omitted. That’s why we should strive to improve community-driven benchmarks, at least in the open source world.

The story of DSL-JSON

Microsoft Bond, Google Protobuf, FlatBuf and various other interface description languages are either liked by those who appreciate the structure they impose for improved portability and performance, or disliked by those satisfied with their true language.

What’s common about them is they all end up serializing JSON, although at much lower speed than Jackson.

DSL Platform wasn’t designed to be an IDL or to generate JSON serialization code. As such, when the DSL-JSON library for DSL Platform was built, it was not really interesting to those who just wanted to serialize some JSON and not dabble with some unknown domain specific language.

But performance was there, and it was so good that it was comparable with the fastest JVM binary codecs. Fortunately, there is a go-to benchmark for JVM serialization where it could spark some interest.


  • Github repository stars: 0
  • Positive feedback: 0
  • Negative feedback on benchmarketing: all

If you try to come up with reasons for it, you could arrive at the conclusion that:

  • You suck at marketing
  • People are not really interested in IDLs

To stay sane, you ignore negative feedback and just continue on your lonesome path. Today DSL-JSON can be used in an idiomatic Java way through annotations; in the background the IDL will be written on the fly based on code metadata (meaning you don’t even know the IDL exists).
This way it doesn’t feel strange to those repelled by IDLs and is difficult to tell apart from reflection-based databinders.

Every benchmark has its own set of issues, sometimes at the will of the authors (so they can game the results), sometimes due to messy evolution. JVM serializers is no exception, and it’s kind of a shame it’s mostly dormant. One problem with that benchmark is that it creates a lot of garbage, which is not really something a benchmark should do. If it didn’t, then instead of being near the top, DSL-JSON could be at the top.

JSON vs binary

Maybe even Simple Binary Encoding wouldn’t have objections to it.

Everything is popularity contest

The most popular benchmark on GitHub is the TechEmpower benchmark (followed closely by JVM serializers).
While the TechEmpower benchmark has all kinds of its own issues, it is certainly valuable – up to the point of forcing Microsoft’s hand into improving ASP.NET’s abysmal performance. Nobody likes to look bad on that benchmark, and it’s funny when authors pull their framework if it’s not at the top of the charts.

Anyway, thanks to that framework benchmark we can see how changing the JSON serialization library can give 80% better performance at the top of the chart. If you look at the round 12 JSON results of their benchmark, you could spot something very strange: servlet beating Netty and Undertow at throughput.

Netty, the Java performance poster child, beaten by Servlet v2? “That can’t be, must be some issue with configuration” is the default reasoning of those who even notice those numbers.

Java Servlet beating C at lower loads? Nobody notices that.


And that’s not the worst part. The worst is when you talk to someone in person about that stuff; at best it ends with a pat on the back, but mostly it ends with “sure you did, buddy”.

The best thing one can do about it is to shelve it under: “First they ignore you…” hoping it won’t turn into “… then they ignore you some more” 😉

But you move on, ignore programming pop-culture, err… meritocracy; and take the next performance consultancy.

Fast Postgres from .NET

It's often said that abstractions slow down your program, since every layer they add has a cost.
While this is generally correct, it's not always true.
Performance can be improved somewhat by removing layers, but the best way to improve performance is to change algorithms.

So let's see how we can beat performance of writing SQL and doing object materialization by hand, as it is (wrongly) common knowledge that this is the fastest way to talk to the database.

First use case, simple single table access

  created DATE NOT NULL

So – the standard pattern to access such a table would be:

SELECT * FROM Post

ignoring (for now) that it would probably be a better style to explicitly name columns. Alternatively, in Postgres we can also do:

SELECT p FROM Post p

which would return a tuple for each row in the table.

For the first query, without going too deep into the actual Postgres protocol, we would get three "columns" with length and content. Parsing such a response would look something like this:

IDataReader dr = ...
return new Post {
    id = dr.GetGuid(0),
    title = dr.GetString(1),
    created = dr.GetDateTime(2)
};

The second query, on the other hand, has only one "column" with length and content. Parsing such a response requires knowledge of Postgres rules for tuple assembly and is similar to parsing JSON. The code would look like this:

IDataReader dr = ...
return PostgresDriverLibrary.Parse<Post>(dr.GetValue(0));

In the TEXT protocol, the response from Postgres would look like this:

(f4d84c89-c179-4ae4-991a-e2e6bc12d879,"some text",2015-03-12)
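To make the "similar to parsing JSON" remark concrete, here is a minimal Java sketch of splitting such a tuple into its fields. It is not the code behind PostgresDriverLibrary; RecordTextParser is a hypothetical name, and the sketch only assumes the documented composite text rules: fields are comma-separated inside parentheses, a quoted field may contain doubled quotes or backslash escapes, and an empty unquoted field means NULL.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any Postgres driver.
final class RecordTextParser {
    // Splits one record in composite text format into its fields; SQL NULL becomes null.
    static List<String> parse(String record) {
        int end = record.length() - 1; // index of the closing ')'
        if (end < 1 || record.charAt(0) != '(' || record.charAt(end) != ')') {
            throw new IllegalArgumentException("not a record: " + record);
        }
        List<String> fields = new ArrayList<>();
        int i = 1;
        while (true) {
            StringBuilder value = new StringBuilder();
            boolean sawQuotes = false; // distinguishes "" (empty string) from NULL
            boolean inQuotes = false;
            while (i < end) {
                char c = record.charAt(i);
                if (c == '\\' && i + 1 < end) {
                    value.append(record.charAt(i + 1)); // backslash escapes the next char
                    i += 2;
                } else if (c == '"') {
                    if (inQuotes && i + 1 < end && record.charAt(i + 1) == '"') {
                        value.append('"'); // doubled quote inside a quoted field
                        i += 2;
                    } else {
                        inQuotes = !inQuotes;
                        sawQuotes = true;
                        i++;
                    }
                } else if (c == ',' && !inQuotes) {
                    break; // end of the current field
                } else {
                    value.append(c);
                    i++;
                }
            }
            if (inQuotes) {
                throw new IllegalArgumentException("unterminated quote: " + record);
            }
            fields.add(value.length() == 0 && !sawQuotes ? null : value.toString());
            if (i >= end) {
                break;
            }
            i++; // skip the comma
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints [f4d84c89-c179-4ae4-991a-e2e6bc12d879, some text, 2015-03-12]
        System.out.println(parse("(f4d84c89-c179-4ae4-991a-e2e6bc12d879,\"some text\",2015-03-12)"));
    }
}
```

A production parser would convert each field straight into its target type (Guid, string, date) instead of allocating intermediate strings, which is where most of the performance is won or lost.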

So, now we can raise a couple of questions:

  • is it faster or slower for Postgres to return the second version?
  • can we parse the second response faster than the first response on the client side?

To make things more interesting, let's investigate how it would compare if we talked to Postgres using the BINARY protocol in the first case and the TEXT protocol in the second. Common knowledge tells us that binary protocols are much faster than textual ones, but this also isn’t always true:


(DSL Platform – serialization benchmark)

Verdict: for such a simple table, performance of both approaches is similar

(DSL Platform DAL benchmark – single table)

Second use case, master-detail table access

A common pattern in DB access is reading two tables to reconstruct an object on the client side. While we could use several approaches, let's use the "standard" one, which first reads from one table and then from the second one. This can sometimes lead to reading inconsistent data, unless we change the isolation level.

For this example, let's use Invoice and Item tables:

  dueDate DATE NOT NULL,
  canceled BOOL NOT NULL,
  version BIGINT NOT NULL,
  tax NUMERIC(15,2) NOT NULL,
  reference VARCHAR(15),
  invoiceNumber VARCHAR(20) REFERENCES Invoice,
  _index INT,
  PRIMARY KEY (invoiceNumber, _index),
  product VARCHAR(100) NOT NULL,
  quantity INT NOT NULL,
  taxGroup NUMERIC(4, 1) NOT NULL,
  discount NUMERIC(6, 2) NOT NULL

To make things more interesting we'll also investigate how performance would compare if we used a type instead of a table for the items property. In that case we don't need a join or two queries to reconstruct the whole object.

So let's say that we want to read several invoices and their details. We would usually write something along the lines of:

SELECT * FROM Invoice
WHERE number IN ('invoice-1', 'invoice-2', ...)
SELECT * FROM Item
WHERE invoiceNumber IN ('invoice-1', 'invoice-2', ...)

and if we wanted to simplify materialization we could add ordering:

SELECT * FROM Invoice
WHERE number IN ('invoice-1', 'invoice-2', ...)
SELECT * FROM Item
WHERE invoiceNumber IN ('invoice-1', 'invoice-2', ...) ORDER BY invoiceNumber, _index

While this is slightly more taxing on the database, if we did a more complicated search, it would be much easier to process stuff in order via the second version.

On the other hand, by combining records into one big object directly on the database, we can load it in a single query:

SELECT inv, ARRAY(
  SELECT it FROM Item it
  WHERE it.invoiceNumber = inv.number
  ORDER BY it._index) AS items
FROM Invoice inv
WHERE inv.number IN ('invoice-1', 'invoice-2', ...)

The above query actually returns two columns, but it could be changed to return only one column.

Materialization of such objects on the client for the first version would look like this:

IDataReader master = ...
IDataReader detail = ...
var memory = new Dictionary<string, Invoice>();
while (master.Read())
{
  var head = new Invoice {
    number = master.GetString(0),
    dueDate = master.GetDateTime(1), ...
  };
  // remember each invoice by its number so details can find their parent
  memory.Add(head.number, head);
}
while (detail.Read())
{
  var invoice = memory[detail.GetString(0)];
  var item = new Item {
    product = detail.GetString(2),
    cost = detail.GetDecimal(3) ...
  };
  invoice.items.Add(item);
}

The Postgres native format would be materialized as in the first example, along the lines of:

IDataReader dr = ...
return PostgresDriverLibrary.Parse<Invoice>(dr.GetValue(0));

Postgres response in TEXT protocol would start to suffer from nesting and escaping, and would look something like:

(invoice-1,2015-03-16,"{""(invoice-1,1,""""product name"""",...)...}",...)

With each nesting layer more and more space would be spent on escaping. By developing optimized parsers for this specific Postgres TEXT response we can parse such a response very quickly.
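The growth in escaping can be shown with a small Java sketch. It covers nested composite values only (arrays use backslash escaping instead of doubling and are left out), assumes no NULL fields, and uses hypothetical names such as RecordEscaping and customer-1. The quoting rule it applies is the documented composite output rule: a field is quoted if it is empty or contains parentheses, commas, quotes, backslashes or whitespace, and embedded quotes and backslashes are doubled.

```java
// Hypothetical helper illustrating composite text output; not taken from any driver.
final class RecordEscaping {
    // Quotes a single non-NULL field the way composite text output does.
    static String quoteField(String value) {
        boolean needsQuotes = value.isEmpty();
        for (int i = 0; i < value.length() && !needsQuotes; i++) {
            char c = value.charAt(i);
            needsQuotes = c == '(' || c == ')' || c == ',' || c == '"' || c == '\\'
                    || Character.isWhitespace(c);
        }
        if (!needsQuotes) {
            return value;
        }
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append(c); // embedded quotes and backslashes are doubled
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Builds the text form of a record from already-serialized field values.
    static String record(String... fields) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quoteField(fields[i]));
        }
        return sb.append(')').toString();
    }

    static long countQuotes(String s) {
        return s.chars().filter(c -> c == '"').count();
    }

    public static void main(String[] args) {
        String item = record("1", "product name");
        String invoice = record("invoice-1", item);
        String customer = record("customer-1", invoice);
        System.out.println(countQuotes(item) + " " + item);         // 2 quote characters
        System.out.println(countQuotes(invoice) + " " + invoice);   // 6 quote characters
        System.out.println(countQuotes(customer) + " " + customer); // 14 quote characters
    }
}
```

Each level doubles the quotes already present and adds two more, so a single quoted string three levels down costs 14 quote characters. A parser for the innermost value spends most of its time skipping them.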

Verdict: manual coding of SQL and materialization has become non-trivial. Joins introduce noticeable performance difference. Manual approach is losing ground.

(DSL Platform DAL benchmark – parent/child)

Third use case, master-child-detail table access

Sometimes we have nesting two levels deep. Since Postgres has rich type support, this is something we can leverage. So, what would our object-oriented modeling approach look like if we had to store bank account data in a database?

CREATE TYPE Currency AS ENUM ('EUR','USD','Other');
  description VARCHAR(200),
  currency Currency,
  amount NUMERIC(15,2)
  balance NUMERIC(15,2),
  name VARCHAR(100),
  notes VARCHAR(800),
  transactions TRANSACTION[]
  website VARCHAR(1024) NOT NULL,
  externalId VARCHAR(50),
  ranking INT NOT NULL,
  tags VARCHAR(10)[] NOT NULL,
  accounts Account[] NOT NULL

Our SQL queries and materialization code will look similar to before (although complexity will have increased drastically). The escaping issue is even worse than before, and while reading transactions we are mostly skipping escaped chars. Not to mention that, due to LOH (Large Object Heap) issues in .NET, we can’t just process a string; it must be done using a TextReader.

Verdict: manual coding of SQL and materialization is really complex. Joins introduce a noticeable performance difference. Manual approach is not comparable on any test:

(DSL Platform DAL benchmark – parent/child/child)


  • Although we have looked into simple reading scenarios here, insert/update performance is maybe even more interesting.
  • The approach taken by Revenj and its backing compiler is not something which can realistically be reproduced by manual coding.
  • Postgres is suffering from parsing complex tuples – but with smart optimizations that can yield a net win. There are also a few "interesting" behaviors of Postgres which required various workarounds.
  • It would be interesting to compare BINARY and TEXT protocol on deeply nested aggregates.
  • JSON might have similar performance to Postgres native format, but it's probably more taxing on Postgres.

We’re Doing It Wrong – ANTLR

This is part of the WDIW series – where we reflect on our misuse of a specific technology, which results in all kinds of weird edge cases. In the end we always ask ourselves, is all software just buggy or are we doing something wrong?

Use of ANTLR in DSL Platform

The grammar for DSL Platform is currently defined in ANTLR, or, better to say, in an unmanageable 7k+ lines of ANTLR. And it doesn’t include any other grammar like C#, Java or SQL, which it should. Of course we don’t actually work with 7k+ lines of code, but instead work with small snippets which are aggregated into that grammar behemoth.

So you could say that we have a small working set of ANTLR grammar (100 lines or so) for defining a DSL on top of ANTLR grammar. In it we build a snippet for each concept used in the DSL, and as a result get a fully defined grammar which we send to ANTLR.

In practice this looks something like:

keyword mixin;

rule mixin_rule [IToken<ModuleConcept> Module]
  scope [IToken<MixinConcept> current]
  <# mixin n=ident { $mixin_rule::current = Parse<MixinConcept>($Module, n.Text); } #>
  extends [module_rule [current]]

… where we define the “keyword” mixin, rule mixin_rule with its arguments, its scope variables, actual grammar and the conversion to our tokens. Keyword is not really a keyword in the sense that it’s reserved. A valid DSL Platform input is:

module module {
  mixin mixin;
}

From which you get a module named “module” and a mixin named “mixin”. We’ve even gone so extreme that you can define a domain such as:

module public {
  entity class {
    int int;
    long[] for;
  }
}

But, whether this will actually compile is up to the target languages (C# and Scala support this).
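The idea of a non-reserved keyword can be illustrated with a toy Java sketch. This is not how the ANTLR grammar above is implemented; ContextualKeywordSketch and parseDeclaration are hypothetical names, and the sketch only handles the single form "<keyword> <name>;". The word is treated as a keyword purely because of its position, so the same text is accepted as a name.

```java
// Toy illustration of position-based (contextual) keywords; names are hypothetical.
final class ContextualKeywordSketch {
    // Parses "<keyword> <name>;" and returns the declared name.
    static String parseDeclaration(String keyword, String input) {
        String[] tokens = input.trim().replace(";", " ;").split("\\s+");
        if (tokens.length != 3 || !tokens[0].equals(keyword) || !tokens[2].equals(";")) {
            throw new IllegalArgumentException("expected: " + keyword + " <name>;");
        }
        // Any identifier is allowed as a name, including the keyword text itself.
        if (!tokens[1].matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("invalid name: " + tokens[1]);
        }
        return tokens[1];
    }

    public static void main(String[] args) {
        System.out.println(parseDeclaration("mixin", "mixin mixin;")); // prints mixin
    }
}
```

A lexer that classifies tokens before the parser sees them cannot do this, which is why context-sensitive keywords clash with tooling that highlights keywords at the lexer level.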

Many many years ago

Before the age of DSL Platform, while doing research and looking at language workbenches such as MPS, prototyping ended with a simple parser and grammar which works OK and is still used in protoduction today, but we have since evolved and moved past it. The premise was that it must be super easy to add new ASTs. The grammar ended up being bound to the AST, without an explicit grammar definition. So while modeling, you were writing the AST directly, but without MPS’s strong type checking. In terms of code, that prototype AST looked like:

public class MixinConcept : IConcept
{
  public ModuleConcept Module { get; set; }
  public string Name { get; set; }
}

where you can pick up Name from the definition and identity from the [Key] attribute. By having a ModuleConcept property you’re implicitly extending its grammar. You can also use the interface to extend multiple concepts at once. By adding something other than a text editor on top of it, you can get to work directly on the AST, just like MPS does. Of course, you also have external constraints such as how each rule ends, starts or is extended.
By moving away from that approach, we got a lot more flexibility in the grammar, which still closely resembles the AST, but it’s not a 1-1 mapping anymore. Unfortunately, now we have to actually define the grammar instead of having it implicitly defined by the AST.

One could argue that the previous approach is superior extensibility-wise (and extensibility is very high on our priorities), since you can just plug in a new AST type and the parser automatically picks it up (as long as no ambiguities or similar issues are created). But in practice, you don’t change the grammar that often, and even when you do, you usually just add a new snippet which is included automatically into the aggregated grammar and passed through ANTLR. I guess dynamism at the grammar definition level didn’t pan out to be that important, since all you need is a recompile. Often you can have a mix of both, with dynamism at a few important places. For example, in DSL Platform we can add a new simple type (such as int/float/decimal) without grammar changes, as long as a few rules are satisfied. This makes the DSL easily extendable at important places.

ANTLR issues

Funny how our newfound love for ANTLR turned out, since today we constantly feel the need to remove ANTLR from our system, due to it not coping well with the grammar. We are using ANTLR3 with infinite lookahead (not that we need it, but otherwise ANTLR produces incorrect grammars), since we get all the benefits of grammar validation, ambiguity warnings and a rather fast parser using pre-built DFA.

During the initial phases of grammar definition we would end up with a lot of strange errors, for example ANTLR missing arguments to a rule, since it moved execution of the rule somewhere else. But I guess there are workarounds for that, since there are a couple of ways to use context arguments in rules. What we can’t work around using vanilla ANTLR is its DFA explosion (at least without extensive code changes to ANTLR). Some rules create such a big DFA that it takes 60+ seconds just to process a single rule. But even that is not a big deal, since we don’t rebuild the grammar all the time. The bigger problem is that you can’t compile the output in languages such as Java when ANTLR targets Java. Java has a 64KB limit on the bytecode of a single method (static initializers included), and the generated code exceeds it. Some of it can be fixed by moving variable initializations out of the class into other classes, but methods with giant switch statements and dozens of if statements cannot be fixed that easily.
What’s worse, the latest version of ANTLR3 doesn’t even manage to build our grammar. Strangely enough, the latest .NET ANTLR3 port works fine.

So, when we started integrating tooling support in various IDEs, we were expecting that it would be a breeze, since everyone is telling you: oh, you have an ANTLR grammar, yeah just plug it in and it works. And don’t get me started on keywords and context-sensitivity. Since it’s much easier to have keywords in your grammar, every ANTLR tutorial highlights how easy it is to get those keywords highlighted. But of course, only if those keywords are not context-sensitive. If they are context-sensitive, they are not really keywords, right? 😉

One can say that, again, WDIW, since you will almost always hear advice such as: don’t build context-sensitive grammars, since they don’t play well with existing tooling. But if you want to have something which can be easily read by both a programmer and a non-technical person, you don’t really have a choice.

Well, of course you do, you can build a visual editor and let the programmer work with that 😀

Everything new is better

At one time we even considered moving to ANTLR4, since it seemed that it should cope better with such a grammar. Due to our abstract factory factory around the ANTLR grammar, it wasn’t really hard to translate it to ANTLR4, but I guess we didn’t really like the dynamic nature of ANTLR4. Since we would no longer get ambiguity warnings and don’t really use many of ANTLR’s features, it didn’t really make sense.

So considering that we don’t use ANTLR for anything else besides parsing to our token representation in a single pass, it seems like technical debt to use ANTLR at all.
ANTLR4 prefers that you use hooks instead of injecting code snippets, since code snippets can’t be translated easily to other languages (except when you know how to translate it to other languages and have an abstract abstract factory which enables you to do just that).

I guess ANTLR4 was trying to solve some other problems and it doesn’t fit our requirements nicely. What’s worse, when you look more deeply into its relationship with other parsers/communities, you will find out that parser writers almost always suggest writing your own parser to take full control of the process. So it looks to me like ANTLR gave up on trying to serve their needs.

So, we’ll be staying with an older version of ANTLR3, at least until we decide it’s time to drastically improve IDE support.

Query user defined types over database link

Querying tables over a database link is very common today. But what happens when we have a table that depends on one or more user defined types? Oracle needs to know the structure of our table and its columns when it receives it over the DB link. However, it cannot retrieve that information from the remote server (although it would be nice to have this in the future), and that’s why it raises an error: ORA-22804: remote operations not permitted on object tables or user-defined type columns.

However, if we are able to tell Oracle on our side of the database link what the types look like, it would be possible for it to interpret the data. We can do this by creating all used UDTs on our side, taking the following into consideration:

  1. Types need to have exactly the same names as the ones on the remote server (although they don’t need to be in the same schema)
  2. Types need to have exactly the same OID as the ones on the remote server.
  3. Types don’t need to have member functions implemented as on the remote server. You can either omit them completely if you don’t use or need them, or implement them in your own custom way.

Here is a rough example of how it works. We’ll start on the remote server and create the needed objects (type and table).

  phoneNumber varchar2(50),
  address     varchar2(500),
  mail        varchar2(100)
  id       NUMBER(10) NOT NULL,
  username varchar2(20),
  contact  ContactInfo
-- insert one record:
VALUES (1,'uuser',ContactInfo('+385123456789','Somewhere, Atlantis 21314', 'uuser@atlantis.com'));
-- We are going to need OID from remote server later, so let's get it right away.
SELECT type_name, type_oid FROM dba_types WHERE type_name='CONTACTINFO';
-------------------- -------------------------------------
CONTACTINFO          582FAF525C684D7DB094F959FC667063

Now we’ll switch to our server. Let’s assume database link REMOTEDB is already created and goes straight to our remote user.

-- first let's try to query our remote table:
SQL Error: ORA-22804: remote operations not permitted on object tables or user-defined type columns
22804. 00000 -  "remote operations not permitted on object tables or user-defined type columns"
*Cause:    An attempt was made to perform queries or DML operations on
           remote object tables or on remote table columns whose type is one of object,
           REF, nested table or VARRAY.
-- now let's try to tell Oracle what our UDT looks like
CREATE OR REPLACE TYPE ContactInfo oid '582FAF525C684D7DB094F959FC667063' AS OBJECT (
  phoneNumber varchar2(50),
  address     varchar2(500),
  mail        varchar2(100)
);
-- and query the table again:
------- ------------ ------------------------------------------------------------------------------------
      1 uuser        CONTACTINFO('+385123456789','Somewhere, Atlantis 21314','uuser@atlantis.com')

That’s it. 🙂

Factory relationship type in IoC

Relationship types are a great way to express dependencies and their relationships. In Autofac there are a bunch of implicit relationship types. Still, there are some issues, since some relationships require types which don’t exist in the BCL, such as Owned<> and IIndex<,>. But a great one which cuts down on boilerplate is the Func<> relationship.

Autofac supports Func<B> and Func<A1...AN, B> factory methods. This is great when you want to repeatedly create a new instance in your service. For example:

class MyInstanceService : IDisposable {
  public void DoSomething() { ... }
  public void Dispose() { ... }
}
class SomeOtherService {
  private readonly Func<MyInstanceService> InstanceFactory;
  public SomeOtherService(Func<MyInstanceService> instanceFactory) {
    this.InstanceFactory = instanceFactory;
  }
  public void RunMe(string arg) {
    using(var inst = InstanceFactory()) {
      ... //use a new instance
    }
  }
}

To set up Autofac we register services into the container.

var builder = new ContainerBuilder();
var container = builder.Build();

This all works nicely, but what should happen if we registered our service as instance per context/singleton?

var builder = new ContainerBuilder();
var container = builder.Build();

How should the container behave now? Currently Autofac will return the same instance and, ouch, our Dispose method will be called multiple times. To make it more obvious that this is the wrong behavior, we can use a factory with parameters and add a disposed check.

class MyInstanceServiceWithArgs : IDisposable {
  private readonly string Args;
  private bool disposed;
  public MyInstanceServiceWithArgs(string args) {
    this.Args = args;
  }
  public void DoSomething() {
    Console.WriteLine(Args);
  }
  public void Dispose() {
    if (disposed)
      throw new ObjectDisposedException("instance");
    disposed = true;
  }
}
class SomeOtherService {
  private readonly Func<string, MyInstanceServiceWithArgs> InstanceFactory;
  public SomeOtherService(Func<string, MyInstanceServiceWithArgs> instanceFactory) {
    this.InstanceFactory = instanceFactory;
  }
  public void RunMe(string arg) {
    using(var inst = InstanceFactory(arg)) {
      ... //use a new instance
    }
  }
}

It’s obvious now that it’s wrong to reuse the previous instance, since it will have a different Args field value (and will write the wrong value to the console) and, worse, it will throw ObjectDisposedException.
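The difference between a transient and a shared factory can be reproduced outside of any container. The following Java sketch is an analogy only and uses no Autofac API: FactoryLifetimeSketch, Service, transientFactory and sharedFactory are hypothetical names, with Function standing in for Func<string, T> and try-with-resources standing in for the using block.

```java
import java.util.function.Function;

// Container-free analogy of the factory relationship; all names are hypothetical.
final class FactoryLifetimeSketch {
    static final class Service implements AutoCloseable {
        final String args;
        private boolean closed;

        Service(String args) {
            this.args = args;
        }

        @Override
        public void close() {
            if (closed) {
                throw new IllegalStateException("already disposed");
            }
            closed = true;
        }
    }

    // Mirrors RunMe: asks the factory for an instance, uses it and disposes it.
    static String runMe(Function<String, Service> factory, String arg) {
        try (Service service = factory.apply(arg)) {
            return service.args;
        }
    }

    // Transient behavior: every call creates a new instance.
    static Function<String, Service> transientFactory() {
        return Service::new;
    }

    // Shared behavior: the first instance is cached and handed out again.
    static Function<String, Service> sharedFactory() {
        Service[] shared = new Service[1];
        return arg -> {
            if (shared[0] == null) {
                shared[0] = new Service(arg);
            }
            return shared[0];
        };
    }

    public static void main(String[] args) {
        Function<String, Service> fresh = transientFactory();
        System.out.println(runMe(fresh, "first") + ", " + runMe(fresh, "second"));

        Function<String, Service> shared = sharedFactory();
        System.out.println(runMe(shared, "first"));
        try {
            runMe(shared, "second"); // same instance: stale args and a second dispose
        } catch (IllegalStateException e) {
            System.out.println("shared instance failed: " + e.getMessage());
        }
    }
}
```

With the shared factory the second call gets the instance created for "first", and disposing it again fails, which is the same pair of problems described above.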

What are the consequences if we fix this? Well, we are using a customized version of Autofac (with some optimizations and bug fixes) and were hit with a strange error shortly after we “fixed” this by disallowing shared instances for factory resolutions. Let’s use the same services again:

class MyInstanceService : IDisposable {
  public void DoSomething() { ... }
  public void Dispose() { ... }
}
class SomeOtherService {
  public SomeOtherService(Func<MyInstanceService> factory) { ... }
  public void RunMe(string arg) { ... }
}

And register it as an instance:

var builder = new ContainerBuilder();
var container = builder.Build();

When we try to resolve SomeOtherService we’ll get an exception saying that MyInstanceService is not registered. Why is that?
Well, we want to create a new instance of MyInstanceService, but the registration which would allow us to do that doesn’t actually exist. We need to register stuff in the container with transient scope

var builder = new ContainerBuilder();
var container = builder.Build();

which will allow new instances of MyInstanceService. It looks kind of strange that we are unable to resolve an instance type which we registered, but when you consider the relationship type, it becomes clearer why it behaves like that.
So, in the end, Autofac will behave the same way with this “correct” registration, while the “fixed” version will not support resolving the same instance from a factory, and will throw resolution errors instead.

Multiple result sets alternatives in Postgres

While Postgres’s type system is second to none, Postgres still lacks some features here and there.
Stored procedures with their own transaction management are high on that list.
But, besides transaction management, SPs usually come with a cool feature which can cut down chatter with the database server to a minimum.
Those familiar with MS SQL Server have seen multiple selects coming from SPs:

CREATE PROCEDURE GetInvoiceAndDetails @id INT
AS
SELECT * FROM Invoice WHERE ID = @id
SELECT * FROM LineItem WHERE InvoiceID = @id

with which you can collect the whole Invoice aggregate at once.
Postgres has a better way of solving this particular example, but what about when you want to select two unrelated aggregates in a single query? For example, if your web page runs a dozen queries, you can combine them in a single SP and fetch all the data with a single call to the database.
In Postgres you can use refcursors to implement such a feature:

CREATE FUNCTION load_page(_session INT) RETURNS setof refcursor AS
$$
DECLARE c_top_items refcursor;
DECLARE c_shopping_cart refcursor;
BEGIN
    OPEN c_top_items FOR
        SELECT t.name, t.description
        FROM top_item t
        ORDER BY t.popularity DESC
        LIMIT 10;
    RETURN NEXT c_top_items;
    OPEN c_shopping_cart FOR
        SELECT c.product_id, c.product_name, c.quantity
        FROM shopping_cart c
        WHERE c.session_id = _session
        ORDER BY c.id;
    RETURN NEXT c_shopping_cart;
END;
$$ LANGUAGE plpgsql;

Then you can call it with something like:

BEGIN;
SELECT load_page(mySession);
FETCH ALL IN "<server cursor 1>";
FETCH ALL IN "<server cursor 2>";
COMMIT;

Since this works in Hot Standby mode, an explicit transaction is not really an issue.

What are the alternatives in doing this kind of query and how far can we take it?

Utilizing Postgres type system and some boilerplate we can end up with something like this:

CREATE VIEW sorted_top_items AS
SELECT t.name, t.description
FROM top_item t
ORDER BY t.popularity DESC;

CREATE TYPE shopping_cart_session AS (
    id INT,
    name VARCHAR,
    quantity NUMERIC
);

CREATE FUNCTION load_page_types(
    IN _session INT,
    OUT top_items sorted_top_items[],
    OUT cart_items shopping_cart_session[]
) AS $$
BEGIN
    -- LIMIT must be applied before aggregating, otherwise it limits the single aggregated row
    SELECT array_agg(ti.*::sorted_top_items) INTO top_items
    FROM (SELECT * FROM sorted_top_items LIMIT 10) ti;
    SELECT array_agg(sq.*::shopping_cart_session) INTO cart_items
    FROM (SELECT c.product_id, c.product_name, c.quantity
          FROM shopping_cart c
          WHERE c.session_id = _session
          ORDER BY c.id) sq;
END;
$$ LANGUAGE plpgsql;

Since Postgres supports arrays we can just shove results in array columns and have a more OOP-like result.

Let’s take it a step further and see if we can remove some of the boilerplate.

Maintaining this function becomes cumbersome if the sorted_top_items view needs to be modified. Postgres dependency tracking will complain that load_page_types depends on sorted_top_items and needs to be dropped before the view can be altered.
While this is a good thing, if you don’t have a setup which automates object rebuilds, it’s very annoying to do by hand.
Let’s use a less restrictive type, but maintain all the features of that function:

    SELECT array_agg(sq.*) AS arr INTO r1
    FROM (SELECT t.name, t.description 
          FROM top_item t
          ORDER BY t.popularity DESC 
          LIMIT 10) sq;
    SELECT array_agg(sq.*) AS arr INTO r2
    FROM (SELECT c.product_id, c.product_name, c.quantity 
          FROM shopping_cart c 
          WHERE c.session_id = _session 
          ORDER BY c.id) sq;
    --RETURN ROW(r1,r2) -- only in 9.3
    SELECT r1.arr, r2.arr INTO RESULT;
$$ LANGUAGE plpgsql;

All of this is nice, but what to do when we don’t want to use server side functions?

For example, how can we gather results for

SELECT t.name, t.description
FROM top_item t
ORDER BY t.popularity DESC
LIMIT 10;
SELECT c.product_id, c.product_name, c.quantity
FROM shopping_cart c
WHERE c.session_id = @sessionID
ORDER BY c.id;

using single call?

By combining multiple selects into a single one:

SELECT
    (SELECT array_agg(sq.*)
     FROM (SELECT t.name, t.description
           FROM top_item t
           ORDER BY t.popularity DESC
           LIMIT 10) sq
    ) AS top_items,
    (SELECT array_agg(sq.*)
     FROM (SELECT c.product_id, c.product_name, c.quantity
           FROM shopping_cart c
           WHERE c.session_id = @sessionID
           ORDER BY c.id) sq
    ) AS shopping_cart;
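If the page queries live in application code, the wrapping can be automated. The following Python sketch is only an illustration of that idea: `combine_selects` is a hypothetical helper, the aliases are assumed to be trusted identifiers (they are not escaped), and the `%(session)s` placeholder assumes a driver with named parameters.

```python
def combine_selects(queries):
    """Wrap each named query in an array_agg subselect and join them into one SELECT.

    `queries` maps a result column alias to the SQL text of one query.
    """
    if not queries:
        raise ValueError("at least one query is required")
    columns = []
    for alias, sql in queries.items():
        body = sql.strip().rstrip(";")   # a subselect must not end with a semicolon
        columns.append(f"  (SELECT array_agg(sq.*) FROM ({body}) sq) AS {alias}")
    return "SELECT\n" + ",\n".join(columns)


page_sql = combine_selects({
    "top_items": "SELECT t.name, t.description FROM top_item t "
                 "ORDER BY t.popularity DESC LIMIT 10",
    "shopping_cart": "SELECT c.product_id, c.product_name, c.quantity "
                     "FROM shopping_cart c WHERE c.session_id = %(session)s "
                     "ORDER BY c.id;",
})
print(page_sql)
```

Each original query becomes one array column of the single result row, so the application still has to unpack the arrays of records on its side.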

Hiding input from console in PHP

Unfortunately, PHP does not come with a built-in way to control a console or a terminal. Many PHP CLI applications suffer from this, as they often require users to enter sensitive data which should not be visible as they type it. The most obvious example is entering a password – imagine doing a presentation and having to type your password so that everyone can see it. This guide will show you various ways to hide input from the console, discussing the strengths and weaknesses of each.

Using ANSI escape sequences

This method is really simple and easy. Unfortunately, it does not work on Windows.

Every terminal and every decent terminal emulator has the ability to interpret escape codes in the text written to it and change its behavior based on them. For example, if you put this in your console:

php -r 'echo "strawberry is \033[31m red \033[0m \n";'

You’ll see that string “red” is actually colored red!
The escape sequences start with the ESCAPE char (ASCII 27: octal 33, hex 1b) followed by a ‘[‘. This 2 char sequence is known as CSI (Control Sequence Introducer) and it tells the terminal that a command it should execute is coming. The last char of a sequence is ‘m‘, meaning it will change the way text is displayed based on the numbers between the CSI and ‘m‘. You can provide multiple numeric commands separated with semicolons. Commands from 30 to 37 change the foreground color, while commands from 40 to 47 change the background color. Command 0 resets the display mode back to default.
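The rules above can be sketched in a few lines of Python. The helper names `sgr` and `colorize` are made up for this illustration, and only the color commands mentioned above are covered.

```python
ESC = "\033"        # the ESCAPE char, octal 33
CSI = ESC + "["     # Control Sequence Introducer


def sgr(*codes):
    """Build a display-mode sequence: CSI, numeric commands joined by ';', then 'm'."""
    return CSI + ";".join(str(code) for code in codes) + "m"


def colorize(text, foreground):
    """Wrap text in a foreground color command (30-37) and reset the mode afterwards."""
    if not 30 <= foreground <= 37:
        raise ValueError("foreground color commands range from 30 to 37")
    return sgr(foreground) + text + sgr(0)


print("strawberry is " + colorize("red", 31))
```

Calling `sgr(30, 40)` produces the sequence that sets both the foreground and the background to black.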

For example, to hide text we will simply change the background and foreground color to the same value:

php -r 'echo "\033[30;40m invisible text \033[0m is invisible";'

The full list of sequences can be found here. This is also the way the popular ncurses library works and the reason why it doesn’t work on Windows.

Now let’s write some PHP:

function hide_term() {
    if (strtoupper(substr(PHP_OS, 0, 3)) !== 'WIN') {
        echo "\033[30;40m";
    }
}
function restore_term() {
    if (strtoupper(substr(PHP_OS, 0, 3)) !== 'WIN') {
        echo "\033[0m";
    }
}
echo 'Enter password: ';
hide_term();
$password = rtrim(fgets(STDIN), PHP_EOL);
restore_term();
echo "You entered '$password'", PHP_EOL;

The interesting fact is that it used to work in MS DOS and Windows 95, but unfortunately Microsoft dropped support for it.

Using stty

stty is an external program which comes with almost any Unix based OS. This program changes terminal parameters and can make typed text invisible with a command as simple as this:

stty -echo

You can test it out yourself with this command:

stty -echo; read password; stty echo; echo "You entered '$password'";

To use it in PHP we can just slightly modify our last PHP program:

function hide_term() {
    if (strtoupper(substr(PHP_OS, 0, 3)) !== 'WIN')
        system('stty -echo');
}
function restore_term() {
    if (strtoupper(substr(PHP_OS, 0, 3)) !== 'WIN')
        system('stty echo');
}
echo 'Enter password: ';
hide_term();
$password = rtrim(fgets(STDIN), PHP_EOL);
restore_term();
echo PHP_EOL;
echo "You entered '$password'", PHP_EOL;
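What stty -echo does under the hood is clear the ECHO bit in the terminal's local-mode flags. As a rough illustration of that mechanism (not of the PHP solution itself), here is a Python sketch using the standard termios module, which exists only on Unix-like systems. The function names `without_echo` and `read_hidden` are made up for this example.

```python
import sys
import termios

LFLAG = 3  # index of the local-mode flags in the list returned by tcgetattr


def without_echo(attrs):
    """Return a copy of the terminal attributes with the ECHO bit cleared."""
    new_attrs = list(attrs)
    new_attrs[LFLAG] = attrs[LFLAG] & ~termios.ECHO
    return new_attrs


def read_hidden(prompt="Enter password: "):
    """Read one line from the terminal without echoing it (Unix only)."""
    fd = sys.stdin.fileno()
    saved = termios.tcgetattr(fd)      # remember the current settings
    sys.stdout.write(prompt)
    sys.stdout.flush()
    try:
        termios.tcsetattr(fd, termios.TCSADRAIN, without_echo(saved))
        line = sys.stdin.readline()
    finally:
        # Restore the saved settings even if the read is interrupted,
        # which is the equivalent of running `stty echo` afterwards.
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)
        sys.stdout.write("\n")
    return line.rstrip("\n")
```

The try/finally block is the part worth copying into any implementation: if the script dies between disabling and restoring echo, the user is left with a terminal that no longer shows what they type.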

Solution for Windows

I believe the best solution for Windows is to write a Win32 C program.

If you already have a headache because I mentioned C, don’t worry – the program is really simple. The reason I chose C is that, by using the Win32 API, the program will work on every Windows from 95 to 8 and on both x86 and x86_64 architectures without requiring any dependencies. It will also be very small and it will Just Work. Surely you can do the same in Java, Python or C#, but in that case users must have them installed.

The C code is as follows:

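#include <stdio.h>
#include <wtypes.h>
#include <wincon.h>

HANDLE hconin = INVALID_HANDLE_VALUE;
DWORD cmode;

/* Restore the console mode saved by disable_echo(). */
void restore_term(void) {
	if (hconin == INVALID_HANDLE_VALUE)
		return;
	SetConsoleMode(hconin, cmode);
	CloseHandle(hconin);
	hconin = INVALID_HANDLE_VALUE;
}

/* Open the console input and switch off echoing of typed characters. */
int disable_echo(void) {
	hconin = CreateFile("CONIN$", GENERIC_READ | GENERIC_WRITE,
		FILE_SHARE_READ, NULL, OPEN_EXISTING,
		FILE_ATTRIBUTE_NORMAL, NULL);
	if (hconin == INVALID_HANDLE_VALUE)
		return -1;
	GetConsoleMode(hconin, &cmode);
	if (!SetConsoleMode(hconin, cmode & (~ENABLE_ECHO_INPUT))) {
		CloseHandle(hconin);
		hconin = INVALID_HANDLE_VALUE;
		return -1;
	}
	return 0;
}

int main(void) {
	char psw[100];
	disable_echo();
	fgets(psw, 100, stdin);
	restore_term();
	printf("%s", psw);
	return 0;
}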

This code is a modification of this Git source code. The reason I used it instead of writing my own (which would have far fewer lines of code) is that it is extremely well tested on almost every possible Windows version.

You can compile this code in Visual Studio Command Prompt:

cl /Os /Ox input.c

Or you can use MinGW’s GCC like this:

gcc -Os -O3 -m32 -march=i586 input.c -o input.exe

You can even compile it from Linux for Windows. Just use mingw-gcc instead of gcc.

Notice that the resulting .exe file is only 24K.

Now let’s use it in PHP:

function getPasswd($string = '') {
    echo $string;
    $psw = `input.exe`;
    echo PHP_EOL;
    return rtrim($psw, PHP_EOL);
}
$password = getPasswd("Please enter your password: ");
echo "You entered '$password'", PHP_EOL;

The only scenario where this won’t work is on Windows 8 for ARM architecture. But how many users will compile PHP on their mobile Windows 8 and use command line in it?

Putting it all together
My personal recommendation is to use the C program if the platform is Windows and the stty method otherwise. That way your program will Just Work in almost any scenario.

function getPasswd($string = '') {
    echo $string;
    if (strtoupper(substr(PHP_OS, 0, 3)) !== 'WIN') {
        system('stty -echo');
        $psw = fgets(STDIN);
        system('stty echo');
    } else {
        $psw = `input.exe`;
    }
    echo PHP_EOL;
    return rtrim($psw, PHP_EOL);
}
$password = getPasswd("Please enter your password: ");
echo "You entered '$password'", PHP_EOL;

I’ll now discuss some other methods I found on the Internet that I don’t agree with.

Bad: Using a PHP extension

I strongly recommend against this.

For example, you can use the ncurses PECL extension. The problem with using PHP extensions is that it requires your users to install them. Generally, forcing your users into compiling and installing a 3rd party library is a great way to lose them. A workaround is to ship the extension as a compiled dynamic library file and use a trick like this to load it into the application:

if (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN')
if (extension_loaded('myext'))

Another problem is that you can’t load a dynamic library built for a different architecture, and loading the same library on different OSs or different versions of an OS can fail. PHP on Windows is almost always built for i586 and Windows’ APIs have good backward compatibility, so no problem there, but on other OSs you’ll be lucky if it works. There might also be incompatibilities between PHP versions, so even if you compiled for the right architecture, loading the extension might fail because the user has a different PHP version.

It might be a possibility for Windows, but it’s harder to write and more prone to errors.

Bad: Using VisualBasic
10 years ago this would be the perfect Windows-only solution. However, this no longer works in Windows 7 and newer.

Bad: using the COM extension for PHP
It’s not enabled by default and even if it is, it works only on some Windows OSs.

Bad: changing the color of the Windows command prompt
You can only change the color of the whole Console and you can’t change foreground and background to the same color. So your only option to mask a password is to make your console extremely ugly.

To check what I mean, type this in your cmd:

color EF

Hacking in the JVM: Return type polymorphism in Scala

The first thing that springs to mind when thinking about method polymorphism is the use case of having an identically named method accepting different argument types.

Let’s observe the following trivial example:

    def mul(x: Int, y: Int) = x * y
    def mul(x: String, y: String) = x + "*" + y

Both Java and Scala support this type of polymorphism; we’ll use this example as an entry point into our hacking session:

    scala> mul(2, 3)
    res0: Int = 6
    scala> mul("2", "3")
    res1: String = 2*3

So, what is return type polymorphism (return type overloading)?
Basically this means creation of methods which differ only by their return types. Here is a simple example:

    def div(x: Int, y: Int): Int = x / y
    def div(x: Int, y: Int): String = x + "/" + y
    def div(x: Int, y: Int): BigDecimal = BigDecimal(x) / y

A keen observer would rightfully comment that this does not compile, not with vanilla Scala 2.10.x at least. Let’s give it a prod.


Jasmin is an assembler – a very low level programming language for the JVM. It’s about as low as you could comfortably get unless you want to write class files with a hex editor.

Let’s implement the three methods above in pure Jasmin … <rolls up sleeves>

    .class public io.jvm.poly.Tester
    .super java/lang/Object
    ; local_0 / local_1
    .method public static div(II)I
      .limit locals 2
      .limit stack 2
      iload_0
      iload_1
      idiv
      ireturn
    .end method
    ; local_0 + "/" + local_1
    .method public static div(II)Ljava/lang/String;
      .limit locals 2
      .limit stack 2
      iload_0
      invokestatic java/lang/String.valueOf(I)Ljava/lang/String;
      ldc "/"
      invokevirtual java/lang/String.concat(Ljava/lang/String;)Ljava/lang/String;
      iload_1
      invokestatic java/lang/String.valueOf(I)Ljava/lang/String;
      invokevirtual java/lang/String.concat(Ljava/lang/String;)Ljava/lang/String;
      areturn
    .end method
    ; BigDecimal(local_0) / BigDecimal(local_1)
    .method public static div(II)Lscala/math/BigDecimal;
      .limit locals 2
      .limit stack 2
      iload_0
      invokestatic scala/math/BigDecimal.int2bigDecimal(I)Lscala/math/BigDecimal;
      iload_1
      invokestatic scala/math/BigDecimal.int2bigDecimal(I)Lscala/math/BigDecimal;
      invokevirtual scala/math/BigDecimal.$div(Lscala/math/BigDecimal;)Lscala/math/BigDecimal;
      areturn
    .end method

Download the Jasmin source code “io.jvm.poly.Tester.j”
Download the latest Jasmin assembler from Sourceforge

In order to compile this class yourself using Jasmin, simply run the jasmin.jar against the source code:
“java -jar jasmin.jar io.jvm.poly.Tester.j”

All right – now we have a separate class file containing the overloaded methods.
Since they all have the same name, we can import them all at once via the following directive:

    scala> import io.jvm.poly.Tester.div
    import io.jvm.poly.Tester.div
    scala> div(7, 2): Int
    res0: Int = 3
    scala> div(7, 2): String
    res1: String = 7/2
    scala> div(7, 2): BigDecimal
    res2: BigDecimal = 3.5

And there you have it – although it’s impossible to compile such a class directly from Scala, once it’s compiled and added to the classpath it is very easy to invoke it.
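The reason this works is that the JVM identifies a method by its name together with its full descriptor, and the descriptor includes the return type. That is what the (II)I style strings in the Jasmin source are. As an illustration only, the Python sketch below builds such descriptors; the helper names `field_descriptor` and `method_descriptor` are made up, and only primitives, arrays and plain class names are handled.

```python
PRIMITIVES = {
    "void": "V", "boolean": "Z", "byte": "B", "char": "C", "short": "S",
    "int": "I", "long": "J", "float": "F", "double": "D",
}


def field_descriptor(type_name):
    """Translate a Java-style type name into its JVM descriptor."""
    if type_name.endswith("[]"):
        return "[" + field_descriptor(type_name[:-2])
    if type_name in PRIMITIVES:
        return PRIMITIVES[type_name]
    return "L" + type_name.replace(".", "/") + ";"


def method_descriptor(param_types, return_type):
    """Build a method descriptor: parameter descriptors in parentheses, then the return type."""
    params = "".join(field_descriptor(p) for p in param_types)
    return "(" + params + ")" + field_descriptor(return_type)


return_types = ["int", "java.lang.String", "scala.math.BigDecimal"]
descriptors = [method_descriptor(["int", "int"], r) for r in return_types]
for descriptor in descriptors:
    print("div" + descriptor)
```

The three div methods share the same name and parameters, yet their descriptors differ, so the class file contains three distinct methods. The Scala and Java compilers simply refuse to emit such a class from source.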

Of course, this doesn’t mean that you really need to start programming in Jasmin. If you want such functionality, simply create a proxy method in Jasmin which invokes your original Java/Scala/JProlog or whatever:


    package io.jvm.poly
    object Rnd {
      def makeInt() = scala.util.Random.nextInt
      def makeString() = Array.fill(10)(scala.util.Random.nextPrintableChar).mkString
    }


    .class public io.jvm.poly.RndProxy
    .super java/lang/Object
    .method public static makeRnd()I
      .limit locals 0
      .limit stack 1
      invokestatic io/jvm/poly/Rnd.makeInt()I
      ireturn
    .end method
    .method public static makeRnd()Ljava/lang/String;
      .limit locals 0
      .limit stack 1
      invokestatic io/jvm/poly/Rnd.makeString()Ljava/lang/String;
      areturn
    .end method

You will now be able to invoke the Rnd object methods by using the Jasmin compiled polymorphic method:

    scala> io.jvm.poly.RndProxy.makeRnd: Int
    res0: Int = 1958204046
    scala> io.jvm.poly.RndProxy.makeRnd: String
    res1: String = qwOSsm6D9

In essence, return type overloading is a bit of a gimmick. The only prevalent use case where return type
overloading is consistently used is in Java obfuscators/optimizers (Proguard) where they make classfiles
smaller and more confusing by having a dozen different methods all named a().

One example of such overloading which actually made it into production can be seen here:

    scala> import org.pgscala.converters.PGNullableConverter
    import org.pgscala.converters.PGNullableConverter
    scala> PGNullableConverter.fromPGString("t"): Boolean
    res0: Boolean = true
    scala> PGNullableConverter.fromPGString("3"): Double
    res1: Double = 3.0
    scala> PGNullableConverter.fromPGString(""""a"=>"b", "c"=>"d""""): Map[String, String]
    res2: Map[String,String] = Map(a -> b, c -> d)

Let’s say that you want to create a User case class …

  case class User(name: String, age: Int, born: LocalDate)

… from a ResultSet variable rs …

  { rs => User(rs.getString("name"), rs.getInt("age"), rs.getLocalDate("born")) }

By using return type overloading you would be able to make a tiny wrapper around the original ResultSet and write this instead:

  { rs => User(rs.get("name"), rs.get("age"), rs.get("born")) }

Basically this allows you to perform easy model migrations:

-  case class User(name: String, age: Int, born: LocalDate)
+  case class User(name: String, age: Int, born: DateTime)

Without changing the actual code – the compiler is going to inject the proper method by looking at the required return type:

  { rs => User(rs.get("name"), rs.get("age"), rs.get("born")) }

If you feel clever, you may also try overloading the apply method on your class 🙂

There are a couple other examples I could list, but this kind of an approach is losing value recently as Scala macros are stepping out of their experimental phase and are becoming more relevant.

The Vietnam of Computer Science Should Have Never Been Lost

The impedance mismatch problem is widely known, with many tools trying to solve it the wrong way. It’s so famous, with so many failed attempts, that it got recognized as the Vietnam War of Computer Science.

Before ORM tools became popular, applications were usually built by first modeling the database, and then creating queries and mapping those queries to objects. Needing multiple queries per object was immediately recognized as a problem without an adequate solution. While this was a real problem for some databases, more advanced ones could work around it. Even today, these workarounds are not mainstream knowledge and the multiple queries problem is often used as an excuse for SQL alternatives.

As ORM tools matured, developers started building applications by modeling objects first and then modeling the database to match those objects. Further improvements to ORM tooling also allowed creation of database tables and sometimes even simple migrations.

Unfortunately, ORM tools promised database independence; this restricted them to features that could be implemented in all databases, which could not cover non-trivial scenarios. At the same time, more advanced databases (such as Postgres) which focused on features instead of popularity were not fully utilized and looked bad on trivial benchmarks.

ORM tools created all kinds of problems by trying to map objects to databases in this way, but let’s not reiterate those problems here.

So what went wrong and how to actually solve the Impedance mismatch problem?
To cross the chasm, objects must be modeled in the database too. The problem is that this is not available in databases such as MSSQL or MySQL. This makes it a no go for most ORM tools, since they want to support all databases.
But can this work in real world as well? Well, since this is advanced database territory, it’s not really without issues. For example, managing objects in Oracle is a real pain. And behaviors are not always obvious – let’s take a look at this flat out wrong behavior in Oracle:

SELECT VALUE(main), VALUE(optional)
FROM source1 main
LEFT JOIN source2 optional ON main.ID = optional.mainID

Guess what VALUE(optional) is if the left join fails to find a match? Well, in Oracle it’s an instance of object source2 with all fields null. We can work around this issue with CASE WHEN, but unfortunately it’s not the only one. Fortunately, Postgres behaves as expected and doesn’t suffer from this particular issue.

Probably the biggest problem is schema evolution. Since object oriented features are not very popular, managing objects has all kinds of issues. Changing a type is not a simple ALTER TYPE operation. Dependency issues are probably the most annoying ones. When you start using objects in a table column you can’t just change them as easily as before.

If managing objects in the database is such a pain, is it really worth it?
Well, it’s worth as much as you want to minimize the mismatch.

Applications are not about classes or tables, they are about models. The domain model is the most important thing in an application. Relations and classes are just representations of that model in the technology being used.

So, how to solve this problem?
By moving up the ladder of abstraction. The same way as we started to use C instead of assembly, we should start abstracting the model and going down the ladder of abstraction only when necessary. If the database and classes can be created from some model, this solves all kinds of issues. NoSQL proponents argue for improved developer productivity, since developers can reason about their model more easily and have less code to maintain.

This, while hard, is a solvable problem – but it can’t be solved in a mainstream way. Mainstream ORMs which offer similar features do it through templates. Classes generated from the model should never, ever, be modified by hand. Unfortunately, nontrivial applications can’t be easily expressed in a template.
What about databases, since ORMs are known for horrible SQL queries?
The problem with horrible SQL queries is a programmer/framework one. A bad SQL query is usually the result of a missing feature. Extensible frameworks allow you to solve this easily: just plug in this use case until the root cause has been fixed.

Criticism of widely accepted viewpoints

1) The object-to-table mapping problem
While it’s not widely known that any object can be mapped to a row in a table and NoSQL solutions are exactly about that, this is usually not the best approach to every mapping. Sometimes a single object should be mapped to a single row, but most of the time it should span several tables and this distinction depends entirely on the model. The solution is to use SQL databases which support this feature.

2) The Schema-Ownership Conflict
This is not an actual technical issue, and since the model should be shared across all “departments” it should reflect both the developers’ and the DBAs’ viewpoints. Also, this is actually a framework/tooling problem. If inadequate tooling is creating problems for some department, sometimes they will take measures as drastic as forbidding tooling entirely. The solution to this is to stop using bad tooling. If some framework doesn’t support even the most basic database features such as bulk operations, stop using it.

3) The Dual-Schema Problem
Once we move up an abstraction level, this is not a problem anymore. Both the database and the classes are representations of a single model. Freezing the model in code or in the database is a result of inadequate abstraction. If renaming a field is risky because it may break something, it’s obvious that there is something wrong with the development process.

4) Entity Identity issues
Concurrency is hard, deal with it. The best way to deal with it is to rely on databases for transactions. MVCC doesn’t suffer from many problems found in non-MVCC databases. Cache invalidation is one of the hardest computer science problems. LISTEN/NOTIFY and Advanced Queuing go a long way to help with cache invalidation. Again, this is an inadequate database and/or a bad framework problem.

5) The Data Retrieval Mechanism Concern
Oracle and Postgres support collections of objects. This allows them to provide complex aggregates in a single query.
LINQ drastically simplifies interaction with the database from code. Type safety and code familiarity are powerful tools. Manual SQL is always an option, but the main goal is mapping to objects.
Languages which do not support LINQ can, in best case, fall back to method chaining but that leaves a lot to be desired.

6) The Partial-Object Problem and the Load-Time Paradox
Domain-Driven Design provides an excellent modeling tool. Understanding the model provides insight about interactions with objects. When the cache understands the model, all kinds of issues are mitigated. The problem is that for the cache to be able to understand the model, either the developer has to specify a lot of information or there must be a higher abstraction which understands the model and knows when to invalidate the cache.

E.g. When views are expressed in a model they are there for a business reason. Unfortunately, views are rarely used in ORM tooling since they are really complicated to manage.


Object-relational databases offer ways to bridge the Impedance mismatch problem. But they alone are not enough.
Domain-Driven Design provides language for better modeling. But it alone is not enough.
A combination of a DSL for describing the domain model using DDD concepts and compilers which can maintain application components is a viable solution to the Impedance mismatch problem.