Microsoft compares programming languages

There's a page at http://msdn.microsoft.com which compares (or rather, claims to compare) C# and Java, but seems to be more of a "why C++ sucks and C# rules" page instead. You can find the original article behind this link (or behind a more direct link here).

The writer of the article presents himself as a "Senior Software Engineering Consultant". I suppose the "consultant" part is the one which tells the most about his true knowledge of programming. I think this quote from the article sums up quite well what it's all about:

This paper will not deal with writing "unsafe" C# code not only because it is discouraged, but also because I have very little experience using it;

In context he is talking about using pointers in C#, but from the rest of the article it's quite clear that he doesn't have much experience in programming in general. It's commendable that he has the courage to admit that he has little experience in some things, but it does not inspire confidence in the writer. If he doesn't have much experience, then why not hire someone who does to write the article?

I'm not going to go through the entire article, but I'm going to point out some of the things he gets wrong. I'm not saying that C++ is better than C# or Java, nor that C++ has no defects, nor that C# and Java have no virtues. I'm not even saying that his basic claims are untrue. However, more often than not he uses all the wrong arguments to defend his claims.

Garbage collection

Garbage collection has its advantages and disadvantages. I'm not saying it's good or bad. However, the pro-garbage-collection arguments the writer presents are, in my opinion, flawed. For instance:

1. No memory leaks.

Actually this is completely the opposite: With a language supporting garbage collection, such as Java and C#, you are leaking memory all the time.

Memory is constantly being allocated dynamically (often more than would really be necessary) and then "forgotten" (as he says, the language actually forces you to forget about it). That is, the coder leaves it to the garbage collection engine to clean up his memory allocation mess.

Why does this matter? It matters for the exact same reason why memory leaks are bad in general: They consume more memory than necessary. What the garbage collection scheme causes is that the process will consume more memory than it really needs (in pathological cases it can consume even hundreds of times more memory than it is really using). This memory is taken away from other processes running in the same system. Unless the operating system has a way to tell the garbage collection engine "hey, free up some unneeded memory", it will hog everything it can and other processes may suffer from a shortage of free memory.

The fact is that garbage collection is a selfish algorithm. It only takes care of the program itself not running out of memory due to a memory leak. Unless it has a really close relation to the OS, it completely disregards the memory usage of other processes.

2. It rewards and encourages developers to write more object-oriented code.

Now, this made me really ROTFL. That is really a laughable claim. On the contrary: Garbage collection encourages developers to break good modularity principles and object-oriented design.

Two of the many important principles in OO programming are that you should not break module boundaries with resource handling responsibilities and that a module should have a very well defined behaviour during its lifetime.

Breaking module boundaries with resource handling means in other words that one module allocates a resource and leaves it to another module to handle its freeing (and trusts that some other module really does that). This is against good object-oriented principles: If a module reserves some resources, it's the same module which should take care of freeing the resources as well (moreover, it should take the proper steps to actually forbid any other module from freeing the resource without permission).

With memory management this is not such a big problem when we have garbage collection. However, the point here was not whether it would be problematic with memory allocation or not. The point here was whether this encourages the developer to make good object-oriented design.

It does not. Memory is not the only resource that can be reserved and freed. There are many other resources, including system resources and more high-level resources provided by the program itself, which the module can reserve, use and free (examples of system resources include file handles and internet sockets). For example, it may be very important for an object to be certain that its resource (e.g. a file handle) is freed when the object is not needed anymore (i.e. goes out of scope). In some cases not doing so may even result in severe malfunction of the program.

A careful programmer will take extreme care not to "allocate and forget" these important resources, but there's only a tiny step from "allocate memory and forget" to "allocate any resource and forget". The language itself even encourages that step: you can't trust that the destructor of the object is called immediately when the object goes out of scope, so you must simply trust that another module will call the destruction mechanism of the object properly (thus breaking module boundaries). This certainly does not encourage good OO programming. Quite the contrary.
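To show what "freed when the object goes out of scope" looks like in practice, here's a minimal C++ sketch. The ScopedFile class and the write_greeting function are names I made up for illustration; the only real library calls are std::fopen, std::fputs and std::fclose. The point is simply that the module which opens the file is also the one which closes it, and that this happens at a known moment.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: the object that acquires the file handle
// is the same object that releases it.
class ScopedFile {
public:
    ScopedFile(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (!handle_) throw std::runtime_error("cannot open " + path);
    }

    // Runs as soon as the object goes out of scope.
    ~ScopedFile() { std::fclose(handle_); }

    // Copying is forbidden, so no other object can end up
    // closing the same handle a second time.
    ScopedFile(const ScopedFile&) = delete;
    ScopedFile& operator=(const ScopedFile&) = delete;

    void write(const std::string& text) { std::fputs(text.c_str(), handle_); }

private:
    std::FILE* handle_;
};

// Writes one line to the given path. The file is closed when the
// function returns, whether normally or through an exception.
void write_greeting(const std::string& path) {
    ScopedFile file(path, "w");
    file.write("hello\n");
}   // ~ScopedFile closes the handle here
```

After write_greeting("demo.txt") returns, the file is already flushed and closed, so another part of the program can open it right away without relying on anybody else to clean up.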

3. Garbage collection makes data sharing easier. Applications are sometimes built which require object sharing. Problems can arise in defining responsibilities for clean-up: if object A and object B share pointer C, should A delete C, or should B?

If you have this kind of code in your program, you have a serious flaw in your OO design.

In good OO design one object performs one task, and the responsibilities of each object are crystal-clear. If you have an ambiguity in responsibilities, then your OO design is crappy.

Once again: Memory is not the only resource which can be allocated and freed. Change the above idea to "if object A and object B share the file handle C, should A close C, or should B?" Get the idea?

If the problem is in crappy OO design, the right solution is not to encourage doing it by fixing the problem with a compiler feature.
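For what it's worth, here's one way the "should A or B release C?" question can be settled in the design itself. This is only a sketch: Log, Owner, Borrower and run_demo are hypothetical names, and std::unique_ptr is just one of several ways to express "exactly one owner" in C++. A owns C and is the only one who releases it; B merely borrows it.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical shared resource "C". It records its own release so
// the order of events can be inspected afterwards.
class Log {
public:
    explicit Log(std::vector<std::string>& events) : events_(events) {}
    ~Log() { events_.push_back("log closed"); }
    void write(const std::string& line) { events_.push_back(line); }

private:
    std::vector<std::string>& events_;
};

// "A": creates the resource and is the only one who destroys it.
class Owner {
public:
    explicit Owner(std::vector<std::string>& events) : log_(new Log(events)) {}
    Log& log() { return *log_; }

private:
    std::unique_ptr<Log> log_;
};

// "B": only borrows the resource and must not outlive the owner.
class Borrower {
public:
    explicit Borrower(Log& log) : log_(log) {}
    void work() { log_.write("borrower worked"); }

private:
    Log& log_;
};

// Runs the scenario and returns the recorded events in order.
std::vector<std::string> run_demo() {
    std::vector<std::string> events;
    {
        Owner a(events);
        Borrower b(a.log());
        b.work();
    }   // b is destroyed first, then a, which releases the log
    return events;
}
```

run_demo() returns "borrower worked" followed by "log closed". There is never any doubt about who closes the log, and the same pattern works unchanged if the resource is a file handle or a socket instead of memory.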

Note that I'm not saying garbage collection is bad per se. What I'm saying is that it's not true that garbage collection somehow encourages good programming principles, because it doesn't.

4. Programs should automatically become more "correct."

Good thing he used quotation marks there.

A program does not become magically more "correct" if the compiler and/or interpreter fixes the programmer's mess (which is caused by a poor OO design).

5. I am quite sure that there are other advantages that I can't even dream up right now.

Yes, we see by now that he really can't think of any real advantages of garbage collection (which it has; they just aren't those he listed).

Dynamic vs static objects

The writer goes on trying to advocate the fact that every object in C# is allocated dynamically and that you can't have static instances. He does this by claiming that this is a good thing because you can for example create generic containers (which in his mind are better than the generic template containers in C++). He lists all the wrong reasons as arguments. For example:

The template definition is found, then an actual int stack class definition and implementation code are created implicitly behind-the-scenes, using that template. This naturally adds more code to the executable file. OK; so maybe it's not a big deal, really. Memory and hard drive space is cheap nowadays. But it could be an issue if a developer uses many different template types.

Hint: If you are going to present arguments in favor of only-dynamic-objects, memory consumption is not one of them.

Having a different template instantiation for each type you use the container with is a negligible increase in memory consumption compared to how much memory it saves. The size of the binary will increase by a marginal amount for each type you use with the container (and usually you don't really use many tens of them), but the memory consumption of the container will be enormously less than that of an equivalent "generic" container using dynamic objects.

He uses a stack of ints as an example. So let's go with that example:

The code for the stack will take a fixed amount of memory, perhaps some kilobytes. There will also be a small overhead directly related to the number of items in the stack (by default std::stack uses std::deque as its container). We could pessimistically estimate that for each kilobyte of data the container uses about 32 bytes of ancillary data (it probably uses less, but this suffices as a rough estimate). So now suppose you have 50 million integers in your stack. How much memory will it take? Since an integer typically takes 4 bytes, even with the additional overhead it will take a bit less than 200 megabytes of memory.

Now, let's consider a container which he considers "good": Internally each item in the container will be a pointer to a dynamically allocated integer type. A pointer typically takes 4 bytes and the dynamically allocated integer takes at least 4 bytes itself as well (we can disregard any additional data the integer object type may require due to technical implementation details of the language). A dynamically allocated object will need some additional data around itself for the memory management system (including the garbage collection engine) to be able to handle it. Typically this additional data takes from 16 to 32 bytes. Let's take the optimistic 16 bytes.

So now, how much memory do the 50 million integers require in this "better" container? A bit less than 1.2 gigabytes.
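The arithmetic behind those two figures can be written out as a small C++ sketch. All the constants are the rough estimates used above (4-byte integers and pointers, 16 bytes of bookkeeping per allocation, 32 bytes of container overhead per kilobyte), not measurements, and the function names are my own. I also ignore the pointer container's own bookkeeping, which only works in its favor.

```cpp
#include <cstdint>

// Estimates taken from the text, assuming a 32-bit platform.
constexpr std::uint64_t kItems = 50000000;          // 50 million integers
constexpr std::uint64_t kIntBytes = 4;              // size of one integer
constexpr std::uint64_t kPointerBytes = 4;          // size of one pointer
constexpr std::uint64_t kAllocHeaderBytes = 16;     // optimistic per-allocation data
constexpr std::uint64_t kOverheadPerKilobyte = 32;  // pessimistic container overhead

// Integers stored directly in the container, plus container overhead.
constexpr std::uint64_t by_value_bytes(std::uint64_t items) {
    return items * kIntBytes
         + (items * kIntBytes / 1024) * kOverheadPerKilobyte;
}

// One pointer per item, each pointing at a separately allocated integer.
constexpr std::uint64_t by_pointer_bytes(std::uint64_t items) {
    return items * (kPointerBytes + kIntBytes + kAllocHeaderBytes);
}
```

With these numbers by_value_bytes(kItems) comes to 206,249,984 bytes (about 197 if you count a megabyte as 1024 * 1024 bytes), while by_pointer_bytes(kItems) comes to 1,200,000,000 bytes (about 1.12 if you count a gigabyte as 1024^3 bytes). Per item that is 24 bytes against 4.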

So once again: You don't want to use memory consumption as an argument in favor of dynamic-only objects.

He continues on and says for example:

Standard C++ would have been a much better language if only one addition had been made: make every class implicitly an object. That's it! Garbage collection is very nice, but not completely necessary. If every class were implicitly an object in C++, then a virtual destructor method could exist on the object base, and then generic C++ containers could have been built for any class where the container handles memory management internally. For example, a stack class could be built, which holds only object pointers.

Not only does he use the wrong terminology (a class is the definition of an object; an object is an instance of a class; a class is unique, but there can be several instances of that class type), but he (naturally) fails to realize that if you force every class to be automatically inherited from a common base class containing a virtual destructor (which is what he means by that), you are increasing the memory consumption of each object by a pointer. This can cause a drastic increase in memory consumption in some cases (e.g. if your class has only one 4-byte member variable, making the class virtual will double its size, thus doubling the amount of memory needed to hold it; if you have millions of instances, your memory requirements will be doubled).
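This is easy to check for yourself. The sketch below uses two made-up structs, PlainValue and VirtualValue, which differ only in the virtual destructor. The C++ standard does not dictate object layout, so the exact numbers depend on the compiler and platform; the hidden pointer is simply how the mainstream compilers I know of implement virtual functions.

```cpp
#include <cstddef>

// A class with a single 4-byte member and nothing else.
struct PlainValue {
    int value;
};

// The same class, but with a virtual destructor, as it would have if
// every class were forced to inherit from a common polymorphic base.
struct VirtualValue {
    virtual ~VirtualValue() {}
    int value;
};

// Size of one instance of each variant, in bytes.
constexpr std::size_t plain_size() { return sizeof(PlainValue); }
constexpr std::size_t virtual_size() { return sizeof(VirtualValue); }

// Extra bytes every single instance pays for being polymorphic.
constexpr std::size_t extra_bytes_per_object() {
    return virtual_size() - plain_size();
}
```

On a typical 32-bit compiler I would expect 4 and 8 bytes, which is the doubling mentioned above. On a typical 64-bit compiler the pointer is 8 bytes and alignment padding comes on top, so the difference tends to be even larger.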

He goes on to claim that the stack container would somehow be better if it contained pointers to dynamic objects instead of the objects themselves. Well, if you don't mind your stack requiring about 6 times more memory for the same task, then why not... All for the sake of saving some hundreds of bytes in program size... Pardon me if I fully disagree with his views.

He even has the audacity to brag about how he has made a "better STL":

It may be worth your while to look into managed C++ code. I have played with it a little, and discuss at least what I know briefly below. Managed C++ code uses "my" recommendation (make every class instance an object) but also includes memory management. It's actually pretty good.

Perhaps he should start actually measuring the efficiency of his "better" library instead of just claiming that it's better and more efficient. Ease of use (compared to the STL container), speed and memory consumption are good measurements.

Conclusion

The article goes on and on, and I could continue much longer, but I'll just stop here. At least to me it's quite clear that the writer of this article does not have the proper programming experience, even though he may have some experience. He concentrates on the wrong things and presents irrelevant or plain wrong arguments in favor of certain features (even when there are other truly good arguments in favor of them).

He would do well to learn good OO programming principles and also how compilers really work. He should also stop believing all the hype and get the right information.

And then you wonder why it is so hard for Microsoft to make high-quality programs?