Accessing local field vs object field. Is doc wrong?
The documentation seems to contradict itself. Could someone tell me which statement is true?
The Performance Myths section says:
On devices without a JIT, caching field accesses is about 20% faster than repeatedly accessing the field. With a JIT, field access costs about the same as local access.
The Avoid Internal Getters/Setters section says:
Without a JIT, direct field access is about 3x faster than invoking a trivial getter. With the JIT (where direct field access is as cheap as accessing a local), direct field access is about 7x faster than invoking a trivial getter.
It's clear that without a JIT, local access is faster. It's also clear that accessing a field directly is faster than going through a getter.
But why is performance 20% better in the first case and 133% better in the second, when the reason is the same in both, namely the JIT optimizing access to an object field?
I think you're comparing apples and oranges. The Performance Myths reference discusses the advantage of a JIT for field access, while the second reference discusses the advantage of a JIT for method access.
As I understand it, an analogy for direct field access vs. local access (not local field access as you wrote in your post - there is no such thing as a local field) is the following:
class foo {
    int bar = 42;
    int doStuff(){
        int x = 1;
        x += bar;
        x += bar;
        x += bar;
        return x;
    }
}
Each reference to bar has an associated performance cost. A good compiler will recognize the opportunity for optimization and 'rewrite' the code as such:
int doStuff(){
    int x = 1;
    int local_bar = bar; // read the field once and reuse the local copy
    x += local_bar;
    x += local_bar;
    x += local_bar;
    return x;
}
Without a JIT, this is a handy optimization, which gets you a 20% bump in performance.
With a JIT, the optimization is unnecessary, as the JIT removes the performance hit from the access to bar in the first place.
The second reference describes the following scenario:
class foo {
    int bar = 42;
    int getBar() { return bar; }
    int doStuff(){
        int x = 1;
        x += getBar();
        x += getBar();
        x += getBar();
        return x;
    }
}
Each function call has an associated performance penalty. A compiler can NOT cache the multiple getBar() method calls (as it cached the multiple direct field accesses to bar in the previous example), because getBar() might return a completely different number each time it is called (i.e. if it had a random or time-based component to its return value). Therefore, it must execute three method calls.
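To make the point about caching method calls concrete, here is a minimal sketch. The Counter class and both method names are hypothetical, made up for illustration; the getter deliberately has a side effect, so collapsing three calls into one changes the result:

```java
public class GetterCachingDemo {
    // Hypothetical class whose getter returns a different value on every call.
    static class Counter {
        private int next = 0;
        int getNext() { return next++; }
    }

    // Calls the getter three times, like doStuff() above.
    static int sumWithThreeCalls(Counter c) {
        int x = 1;
        x += c.getNext();
        x += c.getNext();
        x += c.getNext();
        return x;
    }

    // Calls the getter once and reuses the result, as if the call had been cached.
    static int sumWithCachedCall(Counter c) {
        int x = 1;
        int cached = c.getNext();
        x += cached;
        x += cached;
        x += cached;
        return x;
    }

    public static void main(String[] args) {
        System.out.println(sumWithThreeCalls(new Counter())); // 1 + 0 + 1 + 2 = 4
        System.out.println(sumWithCachedCall(new Counter())); // 1 + 0 + 0 + 0 = 1
    }
}
```

The two methods disagree, which is why the calls cannot simply be merged unless the compiler can see inside the getter and prove it has no side effects.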
It is vital to understand that the above function would execute at approximately the same speed with or without a JIT.
If you were to manually replace getBar() in the above function with simply bar, you would achieve a performance boost. On a machine without a JIT, that performance boost is roughly 3x, because field access is still somewhat slow, so replacing the very slow methods with somewhat slow field accesses only yields a moderate boost. With a JIT, however, field access is fast, so replacing the very slow methods with fast field access yields a much greater (7x) boost.
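The arithmetic behind the 3x and 7x figures can be sketched with made-up cost units. The numbers below are not measurements; they are chosen only so that the ratios match the documentation, and they bake in this answer's assumption that a getter call costs about the same with or without a JIT:

```java
public class RelativeCostDemo {
    // Illustrative cost units only (assumptions, not benchmark results).
    static final double METHOD_CALL = 21.0;      // assumed equal with and without a JIT
    static final double FIELD_INTERPRETED = 7.0; // field access without a JIT
    static final double FIELD_JIT = 3.0;         // field access with a JIT

    // How many times faster the 'fast' operation is than the 'slow' one.
    static double speedup(double slow, double fast) {
        return slow / fast;
    }

    public static void main(String[] args) {
        System.out.println(speedup(METHOD_CALL, FIELD_INTERPRETED)); // getter vs field, no JIT
        System.out.println(speedup(METHOD_CALL, FIELD_JIT));         // getter vs field, JIT
        System.out.println(speedup(FIELD_INTERPRETED, FIELD_JIT));   // field access, no JIT vs JIT
    }
}
```

Under these assumed numbers the getter-to-field ratio is 3 without a JIT and 7 with one, while field access itself gets 7/3 (about 2.33) times faster, which is where the 133% in the question comes from. The 20% figure compares a different pair of operations (local vs. field access without a JIT), so the two numbers do not conflict.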
I hope that makes sense!
I think you may be comparing apples vs. oranges. In the first quote:
caching field accesses is about 20% faster than repeatedly accessing the field
implies that, without JIT compilation, caching a field's value in a local variable improves performance over accessing the field directly each time. In other words:
int a = this.field;
if (a == 1)
...
if (a == 7) // etc.
yields better performance than
if (this.field == 1)
....
if (this.field == 7) //etc.
The quote suggests you'll take a performance hit by repeatedly referencing the field rather than storing it locally.
The second quote suggests that, without a JIT, using a trivial getter/setter is slower than direct field access, e.g.:
if (this.getField()) // etc.
is slower than:
if (this.field) // etc.
I don't think the documentation is wrong or that one statement undermines the other.
This is just an educated guess; I have no idea about Dalvik internals. But note that in the first case, the performance of local access is compared to field access, while in the second case, field access is compared to a trivial method call. Also note that the x% speedups aren't x% less time taken for the same code after adding a JIT; we're talking about relative performance: (a) interpreted local access is 20% faster than interpreted field access, and (b) JIT'd local access is as fast as JIT'd field access, together do not imply (c) interpreted local access is as fast as JIT'd local/field access. It's most likely slower, in fact.
Reading a local in an interpreter is, with most VM architectures, a memory access, not a register access (we're talking about machine registers, not Dalvik registers). Reading a field is even slower -- I cannot say for sure why (my guess would be the second lookup, reading both register and object field), but in any case it's more complex. The JIT on the other hand can put both fields and locals into registers (that's what I have to assume to explain the performance equality, and in fact there are JITs which do this -- I just don't know if it applies here) and removes much of the overhead.
For method calls, assuming Dalvik JITs don't inline methods (which is implied), you have quite some overhead on top of the actual call, which makes calls expensive even when JIT'd: registers must be saved to the stack and restored afterwards, and the compiler cannot optimize as much because not all code is visible. A call is relatively more expensive than call-less code under a JIT because the call-less alternative is so blazing fast, not because the interpreter does better at doing calls (it doesn't, it is just also slow at doing everything else). In the interpreter, for example, no optimizations are being prevented by the call, because there are no optimizations.
Link: http://www.djcxy.com/p/23678.html