

Why I hate Java

Most Java programmers, and the brainwashing hype they spread, are wrong. While their claims may in some cases have a rational basis, the arguments they use to defend these claims are often simply wrong or irrelevant.

Here I present my arguments against Java, and against the wrong arguments used to defend it:

Garbage collection

While garbage collection is handy for some purposes, all kinds of wrong arguments are presented defending it.

No memory leaks?

The most common is the "no memory leaks" argument.

Actually, quite the opposite is true: with a language supporting garbage collection, such as Java, you are leaking memory all the time.

Memory is constantly being allocated dynamically (often more than would really be necessary) and then "forgotten" (the language actually forces you to forget about it). That is, the coder leaves it to the garbage collection engine to clean up his memory allocation mess.

Why does this matter? It matters for exactly the same reason memory leaks are bad in general: they consume more memory than necessary. What the garbage collection scheme causes is that the process consumes more memory than it really needs (in pathological cases it can consume even hundreds of times more memory than it is actually using). This memory is taken away from the other processes running on the same system. Unless the operating system has a way to tell the garbage collection engine "hey, free up some unneeded memory", it will hog everything it can, and other processes may suffer from a shortage of free memory.

The fact is that garbage collection is a selfish algorithm. It only takes care of the program itself not running out of memory due to a memory leak. Unless it has a really close relation to the OS, it completely disregards the memory usage of other processes.

Destructors and modularity

Garbage collection in Java forces developers to break good modularity principles and object-oriented design.

Two of the many important principles in OO programming are that you should not break module boundaries with resource handling responsibilities and that a module should have a very well-defined behaviour during its lifetime.

Breaking module boundaries with resource handling means, in other words, that one module allocates a resource and leaves it to another module to handle its freeing (and trusts that some other module really does so). This is against good object-oriented principles: if a module reserves some resource, it's the same module which should take care of freeing that resource as well (moreover, it should take the proper steps to actually forbid any other module from freeing the resource without permission).

With memory management this is not such a big problem when we have garbage collection. However, memory is not the only resource you can reserve (which is something most Java advocates will happily ignore). Does the garbage collection scheme in Java help the coder to make modular programs?

It does not. There are many other resources, including system resources and more high-level resources provided by the program itself, which a module can reserve, use and free (examples of system resources include file handles and network sockets). For example, it may be very important for an object to be certain that its resource (eg. a file handle) is freed when the object is not needed anymore (ie. goes out of scope). In some cases not doing so may even result in severe malfunction of the program.

A careful programmer will take extreme care not to "allocate and forget" these important resources, but there's only a tiny step from "allocate memory and forget" to "allocate any resource and forget". The language itself even encourages this step, because you can't trust that the destructor (the finalizer, in Java terms) of the object is called immediately when the object goes out of scope (in fact, you can't trust that it is called at all!); you must simply trust that some other module will call the destruction mechanism of the object properly (thus breaking module boundaries). This certainly does not aid good OO programming. Quite the contrary.

In other words: Due to the idiotic destructor calling mechanism in Java, the programmer is forced to break module boundaries and transfer the responsibility of freeing a resource outside the module.

And it's not like Java could not call the destructor of an object when the last reference pointing to it goes out of scope: this could be done with a simple reference counting engine. (I'm not saying Java should free the memory taken by the object at that point, I'm just saying that Java should call the destructor function of the object if it has one.)
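
A minimal Java sketch of the problem (the class is hypothetical, written only for illustration): the module that acquires the file handle cannot guarantee its own cleanup, because its finalizer may run late or never, so every user of the module must remember to call close() explicitly.

    import java.io.FileWriter;
    import java.io.IOException;

    // Hypothetical logging module which owns a file handle.
    class FileLogger {
        private FileWriter out;

        public FileLogger(String fileName) throws IOException {
            out = new FileWriter(fileName);   // resource acquired here...
        }

        public void log(String message) throws IOException {
            out.write(message + "\n");
        }

        // ...but the module itself cannot guarantee it is ever released:
        // finalize() may run much later, or not at all, so relying on it
        // can exhaust the available file handles.
        protected void finalize() throws Throwable {
            out.close();
            super.finalize();
        }

        // The only safe option is to push the responsibility onto every
        // caller, which is exactly the boundary-breaking described above.
        public void close() throws IOException {
            out.close();
        }
    }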

Sharing a resource

An argument which is sometimes presented is that garbage collection makes it easier to share data between several objects: if two or more objects share a reference to some data, then since none of them needs to take care of freeing that data, the programmer doesn't have to decide which of the objects is responsible for the freeing.

However, if you have this kind of code in your program, you have a serious flaw in your OO design.

In good OO design one object performs one task, and the responsibilities of each object are crystal-clear. If you have an ambiguity in responsibilities, then your OO design is crappy.

Once again: Memory is not the only resource which can be allocated and freed. What if the shared resource is, for example, a file handle? Which one of the objects should be responsible for freeing it?

If the problem is crappy OO design, the right solution is not to encourage it by patching over the problem with a language feature.

GC efficiency

Some garbage collection engines make memory deallocation faster than it would be eg. in C++ with explicit deallocations. This is because they can optimize by merging several deallocations into one using complex algorithms. A program which is constantly allocating and deallocating large amounts of small chunks of memory can sometimes see a speed benefit from such GC.

Of course GC advocates use this fact to boast about GC being faster than explicit memory management. However, this is a bad generalization. Yes, there are cases where it makes the program faster. However, there are many other cases where it makes the program, and worse, the entire OS much slower.

One problem with garbage collection is that it needs to see the entire memory used by the program, even if the program is not using big parts of it for anything. When garbage collection is triggered, it will sweep through the entire address space used by the program.

This is absolutely detrimental with respect to virtual memory and swapping. There are many programs which are designed to stay mostly idle in the background, only being run from time to time to do some small task. When such a program is idle long enough, and especially if other programs are actively using most of the available memory, the memory of the idle program will be swapped to disk.

Now, if this program does not use GC and it's activated to do some task which requires only a small part of the memory allocated by the program, the swapping engine will bring back only those parts of the memory the program actually needs. In the best case this can be only a few percent of the total amount of memory allocated by the program.

However, if the program uses GC and the GC engine happens to be triggered at that point, the entire memory used by the program will be retrieved from the swap, usually for no good reason (after all, it's unlikely that the program suddenly freed some of that memory if it has been keeping hold of it for so long).

If the system was running low on memory when this happened, the end result can be quite devastating, bringing the system almost to a halt while the poor disk is swapping like mad. And all of this was completely unnecessary, caused only by the GC needlessly reading through the entire memory.

As I wrote earlier, GC is a selfish algorithm. It only cares about itself. It doesn't care about the other programs in the system. This can sometimes be quite devastating in a system with many programs running.

Only dynamic objects

Memory consumption

By its very design Java is a memory hog. You simply can't make a memory-efficient program which handles enormous amounts of data while still preserving good object-oriented abstraction in your program.

In Java each object is allocated dynamically and all classes are dynamically bound. This means that the size of each object is at least the size of a pointer (which is needed for dynamic binding to work; a pointer typically takes 4 bytes) plus the size required by the memory allocation engine to manage an allocated block of memory (in typical systems this is 8 to 32 bytes, but we could even think very optimistically that it takes only 4 bytes).

That is, each object takes at least 8 bytes of memory even if it has nothing inside it. However, since objects can only be handled through references, the reference itself adds at least the size of a pointer (typically 4 bytes) to the minimum memory consumption of an object. The contents of the object are naturally added to this size.

Thus, for example, an object containing one (32-bit) integer will take at least 16 bytes of memory (and this is extremely optimistic).

Now, suppose that you want to make a vector containing 100 million objects of this type. In Java objects are put into vectors by storing references to the objects, thus each object in a vector takes at least the abovementioned 16 bytes of memory. Thus the total memory consumption of this vector will be at least 1600 million bytes, ie. about 1.5 gigabytes.

Compare that to a vector in C++ containing 100 million objects (each containing one integer): the memory consumption of this vector will be a few bytes more than 400 million bytes, that is, roughly 400 megabytes. So the memory requirement of 1.5 gigabytes in Java is reduced to 400 megabytes in C++. (And I was extremely optimistic about the memory consumption of objects in Java. They probably consume a lot more than this.)
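
A minimal Java sketch of the difference described above (the per-element figures in the comments are the optimistic estimates used here, not measured values, and running this as-is would of course need gigabytes of heap):

    // Hypothetical element type: a single 32-bit integer wrapped in a class.
    class Sample {
        int value;
    }

    class MemoryDemo {
        public static void main(String[] args) {
            final int N = 100 * 1000 * 1000;  // 100 million elements

            // Each slot holds a reference (about 4 bytes) to a separately
            // allocated object (about 12 bytes or more including overhead),
            // so the total is well over a gigabyte.
            Sample[] objects = new Sample[N];
            for (int i = 0; i < N; ++i)
                objects[i] = new Sample();

            // A plain int array stores the values contiguously:
            // roughly 4 bytes per element, about 400 megabytes in total.
            int[] plain = new int[N];
        }
    }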

This extra memory consumption does not matter if you are making a tic-tac-toe game, but it begins to matter a lot if you want to make eg. video editing software which should handle gigabytes of data in almost realtime (just imagine what happens if you define one pixel as a class).

Only references

Java coders will tell you that Java has no pointers. However, in Java everything is a pointer (they are just called references). Granted, these "pointers" are safer than C++ pointers, but they are still just "handles" to the object, and you can't get the object itself.

Why does this matter? Well, you can't easily, for example, pass an object to a function by value, swap the values of two objects, etc. While it's rather rare to need these kinds of things, when you do, it can be a real nuisance.
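
A minimal Java sketch of the swap problem: because references are passed to methods by value, a swap method has no effect on the caller's variables.

    class SwapDemo {
        // This only swaps the method's local copies of the references;
        // the caller's variables are left untouched.
        static void swap(StringBuffer a, StringBuffer b) {
            StringBuffer tmp = a;
            a = b;
            b = tmp;
        }

        public static void main(String[] args) {
            StringBuffer x = new StringBuffer("first");
            StringBuffer y = new StringBuffer("second");
            swap(x, y);
            // Still prints "first second": the swap had no effect.
            System.out.println(x + " " + y);
        }
    }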

However, this has more serious repercussions with regard to object-oriented design:

Java supports several data containers like vectors and lists. However, due to its very design, these data containers can only contain references of type Object.

Now, in good object-oriented design downcasting (ie. casting a reference of a base class type to a derived class type) is seldom needed and is in fact often a sign of a design flaw. If the program is properly designed, downcasting should usually not be necessary.

The problem with downcasting (a problem which the opposite, ie. upcasting, doesn't have) is that it can fail: what if the object behind the reference is not of the type you are casting to? If this happens unexpectedly, your program will malfunction, and this is a clear sign of an error in the program (whose origins lie right in the design).

Now, in Java, if you use a data container provided by the language, you are forced to downcast all the time.

What happens if you make a mistake and downcast to the wrong type? Java will issue an error at runtime. This error might be in a place in the code which is run very rarely, and it might only surface when the client who bought the software is using it (because the error was not found in testing). Fixing the code at that point can be quite expensive.

In C++, if you use its data containers properly, the compiler will give you an error at compile time. It's simply impossible to use a data container believing it contains items of a type it doesn't. (You can, of course, emulate what Java is doing, ie. store base-class pointers to dynamically allocated objects and then handle them by downcasting, but there's seldom any need for this. In Java you are forced to do this every time, even when there is no need.)
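
A minimal Java sketch of this failure mode, using the pre-generics Vector class: the mistaken insertion compiles cleanly and only blows up at runtime, possibly at the client's site.

    import java.util.Vector;

    class ContainerDemo {
        public static void main(String[] args) {
            Vector names = new Vector();       // holds references of type Object
            names.addElement("Alice");
            names.addElement(new Integer(42)); // wrong type, but it compiles

            for (int i = 0; i < names.size(); ++i) {
                // The compiler cannot check this downcast; the second
                // element throws a ClassCastException at runtime.
                String name = (String) names.elementAt(i);
                System.out.println(name);
            }
        }
    }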

Abstraction of types

Abstraction is one of the key elements of object-oriented programming. With abstraction programs can be made more flexible, maintainable and upgradeable. (The infamous year-2000 problem was caused precisely by a lack of abstraction in the affected programs.)

Thus a good object-oriented language should give strong tools for abstraction for the programmer to use. Unfortunately Java is not such a language.

No easy way of abstracting an internal type

Assume that you need, for example, integer values for a certain task. Java's internal type int is more than sufficient for the time being. However, suppose that you have the feeling that some time in the future int might not be good enough and that it might become necessary to change it to a double. Who knows, perhaps it may even become necessary to use a string instead?

If you were coding in C++, what you would do in this case is abstract the int away with a typedef and only use the abstract version created this way wherever this particular type of integer is needed. If at some point in the future it became necessary to change it to another type, you would then just change that one single line and recompile. (This can even make the code easier to read, because you can have differently-named integer types for different tasks, and seeing which type is used in a particular place makes the code easier to follow.)

Not so in Java. In their infinite wisdom the developers of Java decided that typedef (or anything similar) is not needed. Thus they left coders with no easy way of abstracting away a type.

What this causes in practice is that people will simply use int and hope that it will suffice forever, and if at some point it does not, they will have to hunt down every single occurrence (a task made more difficult because there's no way of differentiating this kind of int from the other ints used in the program). This is not good object-oriented programming.

No easy way of changing an internal type to a class

In their infinite wisdom the developers of Java decided that operator overloading is evil and must not be included in the language. Their argument is that operator overloading can be used for the wrong purposes and thus programmers should be shepherded away from using it (by force, of course).

This argument is ridiculous in itself. Yes, you can use operators in the wrong way. However, you can also use member functions in the wrong way (what stops me from naming a member function "multiply()" and making it perform a division?). That doesn't mean member functions should not be supported. Variables can be named in the wrong way; does that mean variable names should not be supported? Give me a break.

What this causes is that if, in the example above, you want to be prepared for the possibility that no internal type suffices for a future requirement, and you want to be able to change the int to a class (one which eg. manages 128-bit integers or whatever), you can't use int and its easy-to-use operators; you are forced to write a class from the start. Since a class cannot overload operators, you are then forced to write every single operation you perform with it as a member function call.

Naturally, having to perform every operation you would normally do with integers through member function calls makes your code uglier, longer and harder to read.
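
A small Java illustration, using the standard java.math.BigInteger class merely as an example of an integer-like class without operators:

    import java.math.BigInteger;

    class OperatorDemo {
        public static void main(String[] args) {
            // With a built-in int (or a C++ class with overloaded operators)
            // the expression reads naturally:
            int a = 5, b = 7, c = 3, d = 10;
            int result = (a + b) * c - d;

            // With an integer-like class in Java the same expression
            // turns into a chain of member function calls:
            BigInteger ba = BigInteger.valueOf(5);
            BigInteger bb = BigInteger.valueOf(7);
            BigInteger bc = BigInteger.valueOf(3);
            BigInteger bd = BigInteger.valueOf(10);
            BigInteger bigResult = ba.add(bb).multiply(bc).subtract(bd);

            System.out.println(result + " " + bigResult);
        }
    }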

Then there's the efficiency issue: due to the fact that every class in Java is dynamically bound, unless the Java compiler is really smart (which current compilers might actually be, I don't really know), every member function call is a real function call (vs. a simple integer operation if you had used an int). Due to dynamic binding this function call is most probably an indirect one, adding even more overhead to such simple operations as adding two integers together.

In C++ you just abstract your int away with typedef and if at any point in the future you find that you really must change it to a class, then you simply change it to a class, implement every integer operator and you are done. No need to change existing code.

Inconsistency between internal types and abstract types

Since internal types are completely different beasts from abstract types, it's impossible to make, for example, a generic container which can contain values of any type, be it int, String or a user-defined type (ie. a class).

This means that if you want to support any type, you have to make specializations for each type manually. Is this object-oriented programming?

One form of abstraction is that you can handle things without the need to know what those things are. In Java this is not possible.
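
A minimal Java sketch (assuming the pre-generics containers and no automatic boxing): the same container accepts any class instance directly, but a plain int must be wrapped and unwrapped by hand.

    import java.util.Vector;

    class WrapperDemo {
        public static void main(String[] args) {
            Vector items = new Vector();

            items.addElement("a string");         // any object goes in as-is
            items.addElement(new StringBuffer());

            // items.addElement(42);              // an int is not an Object
            items.addElement(new Integer(42));    // it must be wrapped manually...

            // ...and unwrapped manually when taken out again.
            int value = ((Integer) items.elementAt(2)).intValue();
            System.out.println(value);
        }
    }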

The template mechanism in C++ might not be a purely OO feature, but it's a very strong tool for abstraction. Java coders have always seen templates as evil monsters and fought against them to the death. The sweet irony is that templates are being added to Java as well, because they really are handy (naturally, since they can't call them "templates", because that's evil, they had to invent another name: "generics").

Multiple inheritance

If all the other things described above are seen as evil monsters by Java developers and programmers, this one is the big boss of all monsters.

As I have said earlier, the "it can be used in the wrong way" argument is ridiculous and obsolete. Anything can be used in the wrong way, including regular inheritance (if I inherit Car from Wheel, that's badly misusing inheritance). However, that's no reason to forbid inheritance.

Multiple inheritance can be very useful in some cases. Because they could not deny this fact, the Java developers implemented it in a very limited way: with so-called interfaces.

The problem with interfaces is that member functions can't have default implementations, thus forcing the programmer to implement each function again and again in every class which inherits from the interface. This forces programmers to commit one of the biggest sins in programming (not just in OO programming but in all programming): code repetition.

I once had to write a program in Java which read XML, and I used an existing XML parser. This parser was implemented as an interface which had more than 15 functions. Of these functions I only needed 3 for what I wanted to do. However, I was forced to implement all of the functions for no good reason, simply because Java made me. These functions were rather specific and there was no good reason why they could not logically have had a default implementation. I was glad I had to implement this interface just once.
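
A hedged sketch of the situation (the interface below is invented for illustration and is not the actual parser API): a handler that only cares about text content still has to stub out every other callback by hand.

    // Hypothetical parser callback interface; in Java an interface
    // cannot provide a default implementation for any of these.
    interface XmlHandler {
        void startDocument();
        void endDocument();
        void startElement(String name);
        void endElement(String name);
        void characters(String text);
        void processingInstruction(String target, String data);
        void comment(String text);
        // ...and in the real case, over 15 of these.
    }

    // This handler needs only one callback, yet must repeat
    // empty implementations of all the others.
    class TextCollector implements XmlHandler {
        private StringBuffer text = new StringBuffer();

        public void characters(String t) { text.append(t); }

        public void startDocument() {}
        public void endDocument() {}
        public void startElement(String name) {}
        public void endElement(String name) {}
        public void processingInstruction(String target, String data) {}
        public void comment(String text) {}

        public String getText() { return text.toString(); }
    }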

The only reason the Java developers feared multiple inheritance is the so-called diamond inheritance problem. What I can't grasp is why they did not simply forbid diamond inheritance. Forbidding only diamond inheritance would not take away anything currently available in Java, but it would add a new useful feature: "interfaces" could have default implementations and even member variables, greatly enhancing the object-oriented design possibilities.

Conclusion

To sum up: garbage collection that disregards the rest of the system, no deterministic destruction, objects which can only be handled through references, no tools for abstracting types, and a crippled form of multiple inheritance all work against good object-oriented design rather than for it. Java may be convenient for small programs, but the usual arguments made in its defense simply don't hold up.

Further reading

The Java Hall of Shame.

Java is taking us in the wrong direction.

Java: Slow, ugly and irrelevant.

java sucks.

Analysts see Java EE dying in an SOA world.

When will Sun make the Java GUI fast and consistent?

Making Sense of Java. (Perhaps a bit outdated on the speed issue, though.)

