Here's one example. C is really fast, because among other things, the compiler can inline function calls. It's gonna use some heuristics, so it doesn't get too much code bloat, but if it sees a function call, it can inline it: it patches it right in, ok, because it knows the address at link time.
C++ — you've got your virtual method dispatch, which is what C++ you know, sort of evangelists, that's the first thing they go after, like in an interview, "tell me how a virtual method table works!" Right? Out of all the features in C++, they care a lot about that one, because it's the one they have to pay for at run time, and it drives them nuts! It drives them nuts because the compiler doesn't know, at run time, the receiver's type.
If you call foo.bar(), foo could be some class that C++ knows about, or it could be some class that got loaded in afterwards. And so it winds up — this polymorphism winds up meaning the compiler can compile both the caller and the callee, but it can't compile them together. So you get all the overhead of a function call. Plus, you know, the method lookup. Which is more than just the instructions involved. You're also blowing your instruction cache, and you're messing with all these, potentially, code optimizations that could be happening if it were one basic-block fall-through.
[...]
In particular, let me come back to my inlining example. Java inlines polymorphic methods! Now the simplest way to do it was actually invented here at Stanford by Googler Urs Hoelzle, who's, you know, like VP and Fellow there, and it's called, it's now called Polymorphic Inline Caching. He called it, uh, type-feedback compilation, I believe is what he called it. Great paper. And it scared everybody, apparently. The rumors on the mailing lists were that people were terrified of it, I mean it seems too hard. And if you look at it now, you're like, dang, that was a good idea.
All it is, I mean, I told you the compiler doesn't know the receiver type, right? But the thing is, in computing, I mean, heuristics work pretty well. The whole 80/20 rule and the Power Law apply pretty much unilaterally across the board. So you can make assumptions like: the first time through a loop, if a particular variable is a specific instance of a type, then it's probably going to be [the same type] on the remaining iterations of the loop. OK?
So what he [Urs] does, is he has these counters at hot spots in the code, in the VM. And they come in and they check the types of the arguments [or operands]. And they say, all right, it looks like a bunch of them appear to be class B, where we thought it might be class A.
So what we're gonna do is generate this fall-through code that says, all right, if it's a B – so they have to put the guard instruction in there; it has to be correct: it has to handle the case where they're wrong, OK? But they can make the guard instruction very, very fast, effectively one instruction, depending on how you do it. You can compare the address of the intended method, or you can maybe do a type-tag comparison. There are different ways to do it, but it's fast, and more importantly, if it's right, which it is 80-90% of the time, it falls through \[i.e., inlines the method for that type - Ed.], which means you maintain your processor pipeline and all that stuff.
So it means they have predicted the type of the receiver. They've successfully inlined that. I mean, you can do a whole bunch of branching, and they actually found out through some experimentation that you only need to do 2 to 4 of these, right, before the gain completely tails off. So you don't have to generate too much of this. And they've expanded on this idea now, for the last ten years.
Dit citaat geeft een specifiek voorbeeld hoe een JIT aan de hand van runtime informatie een methode-aanroep beter kan compileren.