How long does a microbenchmark need to run?

First of all, this is not about the usefulness of microbenchmarks. I'm well aware of their purpose: indicating and comparing performance characteristics in one very specific case, to highlight a single aspect. Whether or not the results should have any implications for your work is a different story.

A few years ago, someone (I think Heinz Kabutz?) noted that any benchmark whose results are worth looking at has to run for at least a few minutes and needs to be run at least 3 times, with the first run always discarded. That was to account for warming up the JVM as well as inconsistencies in the environment (background processes, network traffic, ...) and measurement inaccuracies. That made sense to me, and my personal experience suggested something similar, so I have always followed this strategy.

However, I have noticed that many people (for instance Jeff) write benchmarks that run for only a couple of milliseconds (!) and are run only once. I know that the accuracy of short-running benchmarks has improved in recent years, but it still strikes me as odd. Shouldn't every microbenchmark run for at least a second and be run at least 3 times to produce somewhat useful output? Or is that rule obsolete nowadays?


In my experience you need to:

  • run multiple times (and discard the first result - VM warm-up and other one-off effects)
  • take the minimum time if you're looking at compute-intensive code
  • run for long enough that the cost of the loop and of the timing calls is negligible compared with the work being measured
  • ideally run either within one OS time slice (typically 10 ms) or for much longer than one time slice, e.g. run for ~5 ms or ~500 ms, so that the measurement is either not interrupted at all or the interruptions average out

I tend to work only with compute-intensive code. If your code has a different profile (e.g. memory-intensive, or lots of I/O), the timing strategy may need to be different.
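As a rough illustration of the points above, here is a minimal hand-rolled sketch in Java. The class and method names (`MicroBench`, `timeBatch`, `calibrate`, `minBatchNanos`), the sum-of-squares workload, the 5 runs and the ~500 ms target are all assumptions made for this example, not something from the question. It grows the batch size until one batch runs well past a time slice, discards the first timed run, and reports the minimum of the rest. It is not a substitute for a dedicated harness such as JMH.

```java
import java.util.function.LongSupplier;

final class MicroBench {
    // Results are written here so the JIT cannot treat the measured work as dead code.
    static volatile long sink;

    /** Times one batch of `iterations` calls and returns the elapsed nanoseconds. */
    public static long timeBatch(LongSupplier work, int iterations) {
        long acc = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acc += work.getAsLong();
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return elapsed;
    }

    /**
     * Doubles the batch size until one batch takes at least `targetNanos`,
     * so loop and timer overhead become small relative to the work.
     * The calibration runs also serve as extra warm-up.
     */
    public static int calibrate(LongSupplier work, long targetNanos) {
        int iterations = 1;
        while (iterations < (1 << 30) && timeBatch(work, iterations) < targetNanos) {
            iterations *= 2;
        }
        return iterations;
    }

    /** Runs `runs` batches, discards the first, and returns the fastest remaining batch time. */
    public static long minBatchNanos(LongSupplier work, int runs, int iterations) {
        if (runs < 2) {
            throw new IllegalArgumentException("need at least 2 runs: the first is discarded");
        }
        timeBatch(work, iterations); // first result is thrown away (warm-up effects)
        long best = Long.MAX_VALUE;
        for (int r = 1; r < runs; r++) {
            best = Math.min(best, timeBatch(work, iterations));
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical compute-intensive workload: sum of squares 1..1000.
        LongSupplier work = () -> {
            long sum = 0;
            for (int i = 1; i <= 1_000; i++) {
                sum += (long) i * i;
            }
            return sum;
        };

        int iterations = calibrate(work, 500_000_000L); // aim for ~500 ms per batch
        long best = minBatchNanos(work, 5, iterations);
        System.out.printf("%d iterations per batch, best batch %.1f ms, %.1f ns per call%n",
                iterations, best / 1e6, (double) best / iterations);
    }
}
```

Taking the minimum rather than the mean reflects the assumption that, for compute-bound code, noise from the environment only ever makes a run slower, so the fastest run is the closest to the undisturbed cost.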
