A response to Linus Torvalds on C++

Note (added 31 Oct 2010): In a recent programming-related online thread discussing C vs. C++ someone linked to this article and someone responded along the lines of "I stopped reading after the line that says that Linus opposes any programming paradigms which are not possible or awkward to use in C, etc."

Please note that the list at the beginning of the article contains typical symptoms of people whom I consider to be suffering from the "C-hacker syndrome". "Typical symptoms" does not mean that every single such person suffers from all of those symptoms. In other words, I am not saying that Linus Torvalds has all the opinions in the list. All I'm saying is that Torvalds seems to show many of these typical symptoms.

While Linus Torvalds has performed an outstanding job of creating the Linux kernel and leading its development, he seems to suffer from what I call the "C-hacker syndrome". Typical symptoms of this syndrome include:

A strong aversion towards C++, usually based on prejudices instead of facts.
An aversion towards using C++ standard library functions and especially data containers because the C-hacker feels that he is not "in control".
The opinion that C is a much better language than C++.
Opposition to any programming paradigms and concepts related to those paradigms which are not possible or very awkward to use in C. These include things like object-oriented design, abstraction, etc. Basically anything not directly supported by C is bad.
Strong resistance to change (in programming practices, programming paradigms, etc.) Often this resistance to change extends even to changing programming practices when coding in C itself. (Basically, "I have always done it like this, and I'm not going to change.")
Favoring the use of so-called "hacker optimization", even to the point where it becomes completely counter-productive.

The typical C-hacker will fail to give any rational and logical argument about why C++ is a bad language and why C is better.

I will quote here some text written by Linus Torvalds, posted to a programming forum, where he expresses his aversion towards C++. He presents all the typical symptoms of the C-hacker syndrome. Of all his arguments, only one has even a little bit of rational basis.

C++ is a horrible language. It's made more horrible by the fact that a lot of substandard programmers use it, to the point where it's much much easier to generate total and utter crap with it. Quite frankly, even if the choice of C were to do *nothing* but keep the C++ programmers out, that in itself would be a huge reason to use C.

Just this paragraph alone has so many logical fallacies in it that it's hard to even decide where to start.

"It's made more horrible by the fact that a lot of substandard programmers use it" is a complete null statement. A lot of "substandard programmers" use all kinds of languages, and C is in no way an exception. That statement is just true for all languages, not C++ in particular, which makes the statement completely null in significance.

Perhaps what Torvalds wanted to imply is that more "substandard programmers" use C++ than C. Besides this being completely impossible to corroborate (which is why it's such an easy claim to make, as nobody can prove it false), even if it were true, so what? Does the number of "substandard programmers" using a specific language make the language itself "horrible"? Of course not. The more likely explanation is that the language is simply more popular. But why would that make C++ a "more horrible" language?

Of course the other implication of this sentence is that either there aren't many "substandard programmers" using C (which is certainly not true), or that C is not "made more horrible" by the fact that there are. Why C++ is made more horrible but C isn't is left to anybody's guess.

That sentence is immediately followed by a weird conclusion: "... to the point where it's much much easier to generate total and utter crap with it".

Note that this clause is part of the same sentence as the "it's made more horrible by substandard programmers" claim, separated from it only by a comma. If we follow Torvalds' logic here, he is saying that the fact that there are lots of substandard programmers using C++ makes it much easier to generate "total and utter crap with it".

In other words, the more substandard programmers use a certain programming language, the easier it is to make bad code with that language. Do the properties of a language change according to how many bad programmers use it? Does C++ somehow become easier to use wrongly if a lot of programmers use it wrongly?

Torvalds probably meant something else with that sentence, but he should really put some thought into the logical structure and wording of his posts. They just don't make any sense. One could ironically say that knowing C well doesn't seem to help you write logical sentences in English.

The next sentence doesn't make too much sense either. It feels more like trolling than anything else: "Quite frankly, even if the choice of C were to do *nothing* but keep the C++ programmers out, that in itself would be a huge reason to use C."

This implies that: 1) Most C++ programmers are substandard, and that 2) most C programmers aren't, and for this reason it's good to use C because then only the competent programmers will join the project.

Of course this is a load of bullcrap. Claiming that most C programmers are competent is simply not true. Of course this is also a very easy claim to make because there's no way to disprove it.

In other words: the choice of C is the only sane choice. I know Miles Bader jokingly said "to piss you off", but it's actually true. I've come to the conclusion that any programmer that would prefer the project to be in C++ over C is likely a programmer that I really *would* prefer to piss off, so that he doesn't come and screw up any project I'm involved with.

In the first paragraph Torvalds presents absolutely no rational argument or reason why C would be a better choice than C++, but he still starts this paragraph as if he had. The first sentence implies that "due to the clear advantages presented above C is a much better choice". Yet Torvalds did not present any arguments at all above. They were all just null statements with no relevant meaning.

The rest of the paragraph is just pure trolling, with no arguments. This is so typical of the C-hacker syndrome.

C++ leads to really really bad design choices. You invariably start using the "nice" library features of the language like STL and Boost and other total and utter crap, that may "help" you program, but causes:

- infinite amounts of pain when they don't work (and anybody who tells me that STL and especially Boost are stable and portable is just so full of BS that it's not even funny)

- inefficient abstracted programming models where two years down the road you notice that some abstraction wasn't very efficient, but now all your code depends on all the nice object models around it, and you cannot fix it without rewriting your app.

Again Torvalds succeeds in making a claim and then presenting arguments which at first sight support that claim but actually have nothing to do with it.

I won't argue whether the first sentence, "C++ leads to really really bad design choices" is true or not (personally I disagree with it, but I won't pursue that further here), but I'd like to point out what Torvalds is implying with that sentence: That C, unlike C++, leads to good design choices. Because that's basically what Torvalds is saying here. He is presenting arguments why C++ is a bad choice and C is a much better one. Thus if he argues that C++ leads to bad design choices, it automatically implies that C is much better in this category, and thus a better choice of language.

Of course this is, once again, a claim which is easy to make because it's so hard to disprove. However, having seen quite a lot of C and C++ code I have to completely disagree. There is, of course, quite a lot of bad C++ code out there, but there is also quite a lot of bad C code out there. From what I have seen, the C side has the larger share of it.

There simply isn't anything in C which would "lead to good design choices" any more than there is in C++. On the contrary, the limitations in C often lead to very bad, inefficient and unsafe design choices. Sure, there is a lot of bad, inefficient and unsafe C++ out there, but C code out there is no better. In fact, I would claim that it's much worse.

By its very nature C encourages making unsafe and often inefficient code. A prime example of the first is the standard C library function gets(). This function is completely unsafe and it must never be used for anything. It reads characters from standard input into a fixed buffer, and it only stops when a newline character is found. If the buffer is not large enough, bad luck, you have a buffer overflow. There's absolutely no way of using gets() safely even if you wanted to (you have to use some other function).
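As a minimal sketch of what "some other function" can look like: in C the usual bounded alternative is fgets(), which takes the buffer size as a parameter, and in C++ it is std::getline(), which grows the string as needed. The helper name read_line and the self-check function below are mine, for illustration only.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read one line of any length from a stream.
// std::getline() grows the string as needed, so there is no
// fixed-size buffer that a long input line could overflow.
std::string read_line(std::istream& in)
{
    std::string line;
    std::getline(in, line);
    return line;
}

// Small self-check: a 10000-character line is read in full,
// and the following line is read separately.
void read_line_example()
{
    std::istringstream input(std::string(10000, 'x') + "\nsecond line\n");
    assert(read_line(input).size() == 10000);
    assert(read_line(input) == "second line");
}
```

The same helper works unchanged with std::cin, since it only asks for a std::istream.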

Sure, the solution is not to use gets(). However, this is a prime example of what kind of code C encourages creating, by its very nature. It had encouraged this so much that such a function got into the C standard itself. If the guys at the C standard committee (who should know quite a bit about programming in C) were encouraged to include this kind of monstrosity, how much more likely is the average C programmer to create similar code?

This, of course, doesn't mean that C++ would not encourage this. However, I'm not arguing against that. My point is that it's simply not true that C somehow leads to better code. Claiming that is just utterly false. There's nothing in C which would encourage making better code than C++.

As for efficiency, I dare to claim there are cases where the opposite is true (i.e. C++ encourages more efficiency than C). This is very typical code you will see made by the average C programmer:

for(i = 0; i < strlen(line); i++)
{
    ...
    doSomethingWith(line);
    ...
}

Doing it like that is very inefficient, because strlen has to scan the whole string again on every iteration. The compiler probably can't even move the strlen call outside the loop, especially if the doSomethingWith function takes a non-const pointer (because the compiler can't know if that function might change the string).

Sure, the answer is "don't do it like that". However, C just encourages the newbie programmer to code like that. It's just a natural way of thinking.
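For reference, here is a minimal sketch of the "don't do it like that" fix when working with C-style strings: compute the length once, before the loop. The function names and the space-counting task are made up for illustration; the loop body in the original snippet is elided, so this only shows the shape of the fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Asks for strlen() on every iteration. Unless the compiler can prove
// that the string is unchanged, the whole string is rescanned each time.
std::size_t count_spaces_rescanning(const char* line)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < std::strlen(line); i++)
    {
        if (line[i] == ' ') ++count;
    }
    return count;
}

// Computes the length once. This is only valid when the loop body
// does not change the length of the string.
std::size_t count_spaces_hoisted(const char* line)
{
    std::size_t count = 0;
    const std::size_t length = std::strlen(line);
    for (std::size_t i = 0; i < length; i++)
    {
        if (line[i] == ' ') ++count;
    }
    return count;
}
```

Both functions return the same result; the difference is only in how often the string is measured.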

Now, compare to how the same thing is done in C++, by the natural way of thinking in C++:

for(i = 0; i < line.length(); i++)
{
    ...
    doSomethingWith(line);
    ...
}

This looks a lot like the C version, but is much more efficient (because std::string::length() simply returns the value of a member variable and doesn't need to count the size of the string each time it's called). Note that it's efficient even if the string is modified in the body of the loop.
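To illustrate the last point, here is a small sketch in which the string really is modified inside the loop. The function double_dashes is a made-up example, not something from the posts being discussed. It relies on length() being a constant-time call (which the standard requires since C++11, and which the common implementations I know of have always provided), so re-reading it on every iteration costs next to nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical example: double every '-' in place. The string grows
// inside the loop, and line.length() is re-read on each iteration,
// so the loop bound always matches the current length.
void double_dashes(std::string& line)
{
    for (std::size_t i = 0; i < line.length(); i++)
    {
        if (line[i] == '-')
        {
            line.insert(i, 1, '-'); // the length grows by one
            ++i;                    // skip the character just inserted
        }
    }
}
```

Note that the insert itself is not free (it shifts the tail of the string); the point here is only that the loop condition stays cheap and correct.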

In both cases the natural thinking created by the language was followed. In the C case a very inefficient implementation was the result, while in the C++ case a much more efficient one.

As for the quality of the code, in order to see what kind of code C leads to it's simply enough to take almost any open source C program out there and examine how it has been done. Just as an example let's take the mplayer software, which is the most popular media player in Linux. Let's examine, for example, its libavcodec/h264.c file:

This one single file has a whopping 8306 lines of code. As an example of a function, the fill_caches() function in this file is a whopping 462 lines long, and looks like this. (The few comments in that code are worth reading.) Another example is the decode_mb_cavlc() function, which is 492 lines long and looks like this.

Sure, that code could be made a lot better, even in C. However, that's not my point. My point is that you see that kind of code a lot in C programs out there. This just goes to prove that C does not somehow automatically lead to good code, and that there are plenty of incompetent C programmers out there. In fact, due to the amount of that kind of C code out there I can only make the deduction that C by nature makes people create code like that. I just can't see C++ as being any worse in this regard (quite the contrary, in fact).

As a very typical C-hacker syndrome symptom, Torvalds strongly opposes the use of STL containers. The most typical reason for a C-hacker to oppose the STL containers is that they feel they are not "in control" when an abstract "black box" is doing all the things behind the scenes. Thus they have the strong urge to always re-implement everything by hand each time they need a data container. Because they like to be in control. They prefer writing 1000 lines of code instead of using an existing library, just to be in control. (Never mind that in many cases the C-hacker implementation will be less efficient and a lot less safe than the equivalent STL data container.)

Ok, it would be unfair to claim that this applies to Torvalds because he doesn't say that. However, I wouldn't be surprised if this were one big reason for his aversion.

He actually presents one argument here which has even a slight rationality to it: If you use the STL data containers, chances are that some old C++ compiler out there won't support something. This may indeed be relevant if you are writing code which should be compilable on some 20-year-old mainframe with a 30-year-old operating system which hasn't been upgraded since. Sure, I admit that. (Yes, that is an exaggeration. It's intentional. Don't split hairs with it.)

However, Torvalds writes as if using STL would always lead to portability nightmares. He expresses this with such a strong expression as "infinite amounts of pain". This just demonstrates the strong resistance to change so typical of the C-hacker syndrome. Perhaps Torvalds hasn't been following the modern C++ scene so closely. If he had, he would know that basically any modern C++ compiler has great support for the C++ standard libraries. But resistance to change means to a C-hacker that we must assume that we live in the 1970s. Anything we write today must be compilable in a system created at that time.

Sure, if we are talking about the Linux kernel, then I have nothing to say against that. The Linux kernel, being so portable, should be made in C. I agree with that. However, Torvalds writes in a more generic way than simply related to the Linux kernel. He writes like it's always a bad idea to use C++, no matter what you are writing.

His claim that using the STL causes "inefficient abstracted programming models" is simply not true. Using the STL correctly causes the program to be more efficient, a lot safer and much easier to understand and maintain. It makes programs shorter and less error-prone (because, after all, the STL implementations in compilers have been tested to death against bugs and errors). Sure, you can use the STL in the wrong way, leading to inefficient code, but the STL gives you the tools to easily create efficient and safe code, very much unlike C which offers you absolutely nothing of the sort. Moreover, as already seen above, C more probably leads to messy and unmaintainable spaghetti code which is probably also less efficient.
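As a small sketch of what "shorter and less error-prone" means in practice, here is a made-up helper (sorted_unique is my name, not a library function) that sorts a list of words and removes the duplicates. There is no manual allocation, resizing or freeing anywhere in it; std::vector, std::sort and std::unique take care of all of that.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: return the distinct words in sorted order.
// The vector is taken by value, so the caller's data is left untouched.
std::vector<std::string> sorted_unique(std::vector<std::string> words)
{
    std::sort(words.begin(), words.end());
    // std::unique moves the duplicates to the end; erase() drops them.
    words.erase(std::unique(words.begin(), words.end()), words.end());
    return words;
}
```

Doing the same in C typically means qsort() with a hand-written comparison function plus manual memory management for every string.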

In other words, the only way to do good, efficient, and system-level and portable C++ ends up to limit yourself to all the things that are basically available in C.

This is just utterly false. Even if we refrained from using the STL, C++ still offers many strong tools to make your code better, more efficient and especially safer, tools which C completely lacks. Torvalds is basically claiming here (since it immediately follows the STL bashing above) that "since you can't use STL, everything that is left is C". This is just completely and utterly false.

And limiting your project to C means that people don't screw that up, and also means that you get a lot of programmers that do actually understand low-level issues and don't screw things up with any idiotic "object model" crap.

The bullshit continues. Torvalds still maintains that using C somehow automatically leads to better code. This simply goes against all logic and actual observations.

He also claims that C++ coders do not "understand low-level issues". This is another typical way of thinking for someone with the C-hacker syndrome: Since you can create high-level abstract code with C++, that automatically means that you can't create very low-level code. As if doing abstract code somehow removed the understanding of low-level issues. This is simply not true. There's nothing in C++ which would stop you from understanding or using low-level features. In fact, many experienced C++ programmers do understand these issues much better than the average C programmer, and often even better than experienced C programmers.

And again, the other thing Torvalds implies here is that programming in C somehow automatically makes you more aware of the "low-level issues" and thus automatically leads you to create more efficient code. I don't know in which fantasy world Torvalds lives, but if you examine real C code out there you will see that this just isn't true. In fact, in many cases the opposite is true: C has led to more inefficient implementations, just because of how C makes people think. (Not to mention how unsafe and hard to understand those inefficient C implementations are...)

The C-hacker syndrome symptom of aversion towards different programming paradigms also shows clearly in the above sentence, as Torvalds shows a strong aversion against object-oriented design, with absolutely no good reasons given.

In the vast majority of cases when a C-hacker hears the term "object-oriented" or "class", what he actually hears is "virtual functions", and his prejudiced brain immediately makes the connection "virtual functions → slow → bad", and he immediately becomes completely deaf to anything else. C-hackers seem to think that using classes means that you absolutely must use inheritance (not true), you must use virtual functions (not true) and that virtual functions are extremely slow (not true).
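A minimal sketch of the first two "not true" claims: the class below (Point and PlainPoint are made-up names) has a constructor, accessors and a member function, but no inheritance and no virtual functions. The static_asserts check, at compile time, that it is not polymorphic and that it occupies the same amount of memory as the equivalent plain struct. The size check is what I would expect on mainstream compilers; I'm treating it as an assumption rather than a guarantee for every platform.

```cpp
#include <cassert>
#include <type_traits>

// The plain C-style equivalent, for comparison.
struct PlainPoint
{
    int x;
    int y;
};

// A class with encapsulation and member functions, but no inheritance
// and no virtual functions.
class Point
{
 public:
    Point(int x, int y): x_(x), y_(y) {}
    int x() const { return x_; }
    int y() const { return y_; }
    int manhattan_length() const
    {
        return (x_ < 0 ? -x_ : x_) + (y_ < 0 ? -y_ : y_);
    }

 private:
    int x_;
    int y_;
};

static_assert(!std::is_polymorphic<Point>::value,
              "no virtual functions, so no hidden vtable pointer");
static_assert(sizeof(Point) == sizeof(PlainPoint),
              "same storage as the plain struct");
```

The member functions are ordinary non-virtual calls, which a compiler is free to inline just like a C helper function.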

So I'm sorry, but for something like git, where efficiency was a primary objective, the "advantages" of C++ is just a huge mistake. The fact that we also piss off people who cannot see that is just a big additional advantage.

Again, Torvalds seems to imply that you can't make efficient (yet still abstract) code in C++. This is simply not true.

(I'm sure Torvalds himself knows this. However, his C-hacker syndrome doesn't allow him to admit it after such a heated ranting.)

The next sentence is again pure trolling. I really don't understand why Torvalds likes to troll so much. It almost feels like he enjoys making people mad at him. One wouldn't expect such an attitude from such a well-known person.

If you want a VCS that is written in C++, go play with Monotone. Really. They use a "real database". They use "nice object-oriented libraries". They use "nice C++ abstractions". And quite frankly, as a result of all these design decisions that sound so appealing to some CS people, the end result is a horrible and unmaintainable mess.

I don't know if Monotone is really a "horrible and unmaintainable mess" (which I'm tempted to doubt due to Torvalds' completely prejudiced attitude), but, once again, he implies two things:

That C++ somehow automatically leads to an "horrible and unmaintainable mess", by citing one example. This, of course, is just no proof.
More subtly, that if they had done it in C it would not be such a horrible mess. Again, without any concrete proof. And practical examination of C programs out there gives a rather different impression (the mplayer example given earlier is just one example of many).

Note, again, that I'm not saying that C++ would automatically lead to better programs either. What I am saying is that Torvalds' claim that C does is simply and utterly false, and that practice has shown that it may even be worse than C++ in this regard.

After a response to his post, he writes:

On Thu, 6 Sep 2007, Dmitry Kakurin wrote:
>
> As dinosaurs (who code exclusively in C) are becoming extinct, you
> will soon find yourself alone with attitude like this.

Unlike you, I actually gave reasons for my dislike of C++, and pointed to
examples of the kinds of failures that it leads to.

It's sad that Torvalds himself can't see that the "reasons" he wrote in the previous post do not actually say anything, do not prove anything and are simply false. Basically his arguments were "C++ leads to worse programs than C because I say so", without even the first hint of evidence.

Curious that he says "[I] pointed to examples", in plural, even though he gave one single example of a C++ program which, in his opinion, is a "failure" (without actually explaining why it's a failure).

The fact is, git is better than the other SCM's. And good taste (and C) is one of the reasons for that.

Torvalds surely likes to maintain that C leads to better programs. Of course explaining why this is so, using rational and logical arguments, seems to be quite difficult for him (as for anyone with the C-hacker syndrome). He makes lots of claims about C and how it affects the programming behavior of people (as well as how C++ affects it negatively), but giving proof or even a logical explanation of any kind is too much to ask. It's basically "it just does, because I say so."

In another post Torvalds writes:

The things that actually *matter* for core git code is things like writing your own object allocator to make the footprint be as small as possible in order to be able to keep track of object flags for a million objects efficiently. It's writing a parser for the tree objects that is basically fairly optimal, because there *is* no abstraction. Absolutely all of it is at the raw memory byte level.

Can those kinds of things be written in other languages than C? Sure. But they can *not* be written by people who think the "high-level" capabilities of C++ string handling somehow matter.

This text again shows his strong aversion towards other programming paradigms as well as programming principles commonly agreed as good. His text has, once again, completely false and mistaken implications and claims.

He claims (he even emphasizes it with asterisks) that the reason why in this specific software the memory allocation and the parser are so efficient is precisely because of the lack of abstraction (in the program and in C in general) and because all the code is at the "raw memory byte level".

This kind of thinking is just so wrong on so many levels that it shows a complete lack of understanding of such programming concepts as abstraction.

Basically he is saying that low-level, hard-coded programs which lack abstraction are automatically more efficient. That's exactly what he is saying, because he writes "because there is no abstraction". In other words, lack of abstraction automatically leads to optimal code, and thus that using abstraction leads to inefficient code.

This is just patently false. There's nothing in low-level non-abstract coding which would somehow automatically make it more efficient. There's nothing in coding "at the raw memory byte level" that would make it more efficient. It's not only perfectly possible to make horribly inefficient code like that, but you can actually see it in a lot of C programs out there. There isn't a magical silver bullet in that kind of programming which would automatically make the program more efficient. Practice has shown the contrary: Often this kind of code is quite inefficient.

Likewise, there's nothing in abstraction which would cause the program to be more inefficient. Usually abstraction (at least in C++) exists only at the language level. The abstraction is basically completely removed by the compiler when it creates the executable binary. The end result may be completely similar to the equivalent C implementation. There's nothing in abstraction which would stop you from making an equally efficient implementation as you would do in C.
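Here is a minimal sketch of an abstraction which exists only at the language level, assuming a C++11 (or later) compiler. ByteCount and kilobytes are made-up names. The class wraps a raw size so that it can't be accidentally mixed up with other integers, yet the static_asserts show that it takes exactly as much memory as the raw value, that it can be copied like raw bytes, and that the compiler can evaluate the whole expression at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical strong type wrapping a raw byte count.
class ByteCount
{
 public:
    constexpr explicit ByteCount(std::size_t n): value_(n) {}
    constexpr std::size_t value() const { return value_; }
    constexpr ByteCount operator+(ByteCount other) const
    {
        return ByteCount(value_ + other.value_);
    }

 private:
    std::size_t value_;
};

constexpr ByteCount kilobytes(std::size_t n)
{
    return ByteCount(n * 1024);
}

static_assert(sizeof(ByteCount) == sizeof(std::size_t),
              "the wrapper adds no storage");
static_assert(std::is_trivially_copyable<ByteCount>::value,
              "it can be copied like a raw integer");
static_assert((kilobytes(4) + ByteCount(16)).value() == 4112,
              "the whole expression is evaluated by the compiler");
```

This doesn't prove anything about the generated machine code for every program, of course, but it shows that nothing in the abstraction itself forces any runtime cost.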

In his previous posts he claimed that the C++-way of thinking (which includes abstraction) leads to an "unmaintainable mess". It's rather ironic that he supports writing non-abstract low-level code here, even though it has been known from the dawn of programming languages what kind of unmaintainability that produces. I can't believe Torvalds would argue against all the respected computer scientists on this subject.

And it's not like this would be a tradeoff between maintainability and efficiency. It's perfectly possible to create very abstract, very maintainable C++ code which is still exactly as efficient as the C implementation.

He concedes this a bit by saying: "Can those kinds of things be written in other languages than C? Sure." However, he immediately follows with his prejudiced C-hacker syndrome: "But they can *not* be written by people who think the "high-level" capabilities of C++ string handling somehow matter."

That is, of course, just an opinion. However, that opinion is just false.

Favoring C++ strings even in places where it doesn't really matter what you use is in no way a symptom that you are unable to understand and create efficient (yet abstract) code. He is saying that anyone who would like to use C++ strings everywhere is a person who doesn't understand low-level optimization and is unable to write such code. This is, of course, a completely prejudiced (and wrong) attitude.

One very good reason for someone to prefer using C++ strings is safety. It's very easy to use C++ strings safely, without the danger of memory leaks and such. Many, if not most, C++ programmers, including those who are perfectly capable of writing very efficient low-level code, prefer using C++ strings because of their ease of use and safety. Preferring C++ strings is in no way a symptom of not understanding or not being able to create efficient low-level code.
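A small sketch of what that ease of use and safety look like. The helper join_path is made up for illustration. There is no buffer size to calculate, no terminating null character to remember and nothing to free: the string grows as needed and releases its memory automatically when it goes out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: join path components with '/'.
std::string join_path(const std::vector<std::string>& parts)
{
    std::string result;
    for (std::size_t i = 0; i < parts.size(); i++)
    {
        if (i > 0) result += '/';
        result += parts[i];
    }
    return result;
}
```

The equivalent C code has to add up the lengths first, allocate that many bytes plus one, copy each piece to the right offset, and rely on the caller to free the result.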

It's only people suffering from the C-hacker syndrome who abhor things like the C++ string, and they usually don't have any good or logical reason for this, but only prejudice, resistance to change and the feeling that they are not "in control".

The fact is, that is *exactly* the kinds of things that C excels at. Not just as a language, but as a required *mentality*. One of the great strengths of C is that it doesn't make you think of your program as anything high-level. It's what makes you apparently prefer other languages, but the thing is, from a git standpoint, "high level" is exactly the wrong thing.

It's rather ironic that, in a way, Torvalds is contradicting himself here.

Unless countless very respected computer scientists and programmers around the world are completely wrong, we just cannot deny that non-abstract, low-level code tends to be quite unmaintainable. This is all the more true the larger the program is.

Torvalds not only admits here that C, by its very nature, leads you to create non-abstract, low-level code, that C gives you the mentality required to create such non-abstract code, but he actually claims that it's a good thing that this is so.

Thus, if we accept what all respected people in this field say, C automatically leads the programmer to create unmaintainable code.

Of course this is the exact opposite of what Torvalds claimed earlier, where he basically said that C leads to maintainable code, unlike C++, which leads to a "horrible and unmaintainable mess".

Of course it may be that Torvalds is also saying that C leads to making low-level non-abstract code and maintainable code at the same time. If he is claiming this, then he disagrees with basically every single respectable professional in this field.

But this is just another symptom of the C-hacker syndrome. You don't have to present logical and consistent arguments to defend C as the best language. It just is. It doesn't matter what others say.

In yet another post he writes:

And if you want a fancier language, C++ is absolutely the worst one to choose. If you want real high-level, pick one that has true high-level features like garbage collection or a good system integration, rather than something that lacks both the sparseness and straightforwardness of C, *and* doesn't even have the high-level bindings to important concepts.

IOW, C++ is in that inconvenient spot where it doesn't help make things simple enough to be truly usable for prototyping or simple GUI programming, and yet isn't the lean system programming language that C is that actively encourags you to use simple and direct constructs.

The amount of bullshit he writes, once again, is just amazing. Once again, there are so many things wrong in this short quote that it's hard to decide where to start.

Firstly, he seems to think that the only reason one would want to use C++ is to have "high-level" features in order to create temporary prototypes and such, and that such a person should instead use a "true high-level" language which is good for prototyping but cannot be used for creating low-level programs. If someone wants to create low-level programs, he should use C.

It seems that it's impossible for him to understand that someone could actually use the higher-level features of C++ to his advantage and create an efficient program at the same time. Dear Torvalds, this is not impossible at all. It's very possible and it can be done cleanly and well. Just because Torvalds has such a strong prejudice against C++ doesn't mean that it's not possible to do what he claims cannot be done.

He says that C has "sparseness and straightforwardness", and that it "actively encourags you to use simple and direct constructs".

This is a prime example of C-hacker preaching. If we put those terms in a slightly different light, they mean:

"Sparseness": The language is extremely simple and offers basically no tools to do anything. It doesn't offer tools eg. for safety (which easily leads to things like memory leaks and segmentation faults), for doing things automatically, nor basically anything. Everything has to be done "by hand".
"Straightforwardness": You have to write 100 lines of code for things which could perfectly be written in 10. In other words, you have to explicitly and very meticulously write each single step each single time you want to do it.
"Simple and direct constructs": Memory allocation hacks which are absolutely horrible and unmaintainable. Any data container which is more complicated than an array (let's say, for example, a Kd-tree) requires tons of messy and unsafe code.

It's rather ironic that encouraging "simple and direct constructs" often actually encourages making very inefficient data containers, although Torvalds claims that C naturally leads programmers to create efficient code.

The fact is: Look at almost any complicated data structure made in C, and it will almost invariably be a horrible mess which is very error-prone and extremely hard to understand and maintain.

I'm not saying that many such data containers made in C++ aren't a mess. However, C++ at least offers you the tools to make it cleaner. C doesn't offer you anything (after all, it's a "sparse" and "straightforward" language) and thus C constructs tend to look like a huge mess.
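As a rough sketch of the kind of tools I mean, assuming a C++11 (or later) compiler: below is a tiny unbalanced binary search tree of integers (IntSet is a made-up name, and this is not meant as a replacement for std::set). Each node owns its children through std::unique_ptr, so there is no hand-written cleanup code at all. When the set goes out of scope, every node is freed automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical unbalanced binary search tree of ints.
class IntSet
{
 public:
    // Returns true if the value was inserted, false if already present.
    bool insert(int value)
    {
        std::unique_ptr<Node>* slot = &root_;
        while (*slot)
        {
            if (value == (*slot)->value) return false;
            slot = value < (*slot)->value ? &(*slot)->left
                                          : &(*slot)->right;
        }
        slot->reset(new Node(value));
        ++size_;
        return true;
    }

    bool contains(int value) const
    {
        const Node* node = root_.get();
        while (node)
        {
            if (value == node->value) return true;
            node = value < node->value ? node->left.get()
                                       : node->right.get();
        }
        return false;
    }

    std::size_t size() const { return size_; }

 private:
    struct Node
    {
        explicit Node(int v): value(v) {}
        int value;
        std::unique_ptr<Node> left, right; // children are owned by the node
    };

    std::unique_ptr<Node> root_;
    std::size_t size_ = 0;
};
```

One caveat: the automatic cleanup is recursive, so a very deep (degenerate) tree would need an explicit iterative destructor. A production container would also balance itself; this sketch only shows how ownership can be expressed in the types instead of in hand-written free() calls.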

Claiming that C somehow encourages people to make better, cleaner and more efficient code is simply false.