What makes the Java compiler so fast?

I was wondering about what makes the primary Java compiler (javac by sun) so fast at compilation?

..as well as the C# .NET compiler from Microsoft.

I am comparing them with C++ compilers (such as G++), so maybe my question should have been, what makes C++ compilers so slow :)


That question was nicely answered in this one: Why does C++ compilation take so long? (as jalf pointed out in the comments section)

Basically it's the missing modules concept of C++, and the aggressive optimization done by the compiler.


I think the most difficult part is not the need to compile the header files (unless they are really big, but you can use precompiled headers in that case). The worst part is always the fact that C++'s grammar is too wildly context-sensitive. Despite the fact I like C++, I feel sorry for anybody who has to write a C++ parser.


There are a couple of things that make the C++ compiler slower than those of Java/C#. The grammar is much more complex, generic programming support is much more powerfull in C++, but at the same time it is more expensive to compile. Inclusion of files work in a different way than importing modules.

Inclussion of header files

First, whenever you include a file in C++ the contents of the file (.h usually) are injected in the current compilation unit (include guards avoid reinjecting the same header twice), and this is transitive. That is, if you include header ah, that in turns includes bh, your compilation unit will include all code in ah and all code in bh

Java (or C#, I will talk about Java, but they are similar in this) don't have include files, they depend on the binaries from the compilation of the used classes. This means that whenever you compile a.java that uses an object B defined in b.java, it just checks the binary b.class, it does not need to go deeper to check the dependencies of B, so it can cut the process earlier (with just one level of checking).

At the same time, including files only includes the language definitions, and processing it requires time. When the Java/C# compiler reads a binary it has the same information but already processed by the compilation step that generated it.

So at the end, in C/C++ more files are included and at the same time processing of those includes is more expensive than processing of binary modules.

Templates

Templates are special in their own way. They can be precompiled, but they are usually not (for a good set of reasons). This means that in all compilation units that use std::vector the whole set of vector methods used (unused template methods don't get compiled) is processed and the binary code generated by the compiler. At a later step, during linking, redundant definitions of the same method will get dropped, but during compilation they must be processed.

Support in Java for generics is more limited in many ways. At the end, for example, there is only one Vector class binary, and whenever the compiler sees Vector in java what it does is generating type checking code before delegating to the real Vector implementation (that stores plain Object) and that is not generic. The compiler does provide the type warranties, but does not compile Vector for each type.

In C# it is, once again, different. C# support for generics is more complex than that of Java, and at the end generic classes are different than plain classes but anyway they get compiled only once as the binary format has all required information.

链接地址: http://www.djcxy.com/p/14978.html

上一篇: 我如何知道为什么g ++在特定文件上花费很长时间?

下一篇: 是什么让Java编译器如此之快?