Comparing Complexity of Two Algorithms

We have two algorithms that are implemented in Visual C++ 2010 and work fine. We know that the complexity of one of them is n*log(n), and the other is n^2. But how can I actually "measure" the time required for running each of them? The problem is that they run really fast, in a few microseconds. Can I measure with that precision, or can I, for example, count the CPU cycles required for each? Would adding a delay in each loop of them be correct?


Well, if your input is small, asymptotic measurement of the run time means squat, since the constant factor might not be negligible and must be taken into account.

Big O notation is useful and correctly predicts "which algorithm is better" only for large input sizes (for all input sizes n > N, for some constant N that depends on the pair of algorithms).
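
For example, with made-up constant factors, an algorithm that costs 100*n*log2(n) operations loses to one that costs n^2 operations until n grows past roughly 1000. A minimal sketch showing the crossover (the constant 100 is invented purely for illustration; std::log2 needs a C++11 <cmath>):

#include <cmath>
#include <cstdio>

// Hypothetical cost models: 100*n*log2(n) operations vs. n*n operations.
int main()
{
    for (unsigned n = 2; n <= 4096; n *= 2)
    {
        double nlogn = 100.0 * n * std::log2((double)n);
        double nsq   = (double)n * n;
        std::printf("n=%5u  100*n*log2(n)=%10.0f  n^2=%10.0f  -> %s\n",
                    n, nlogn, nsq,
                    nlogn < nsq ? "n*log(n) cheaper" : "n^2 cheaper");
    }
    return 0;
}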


To measure which of the two algorithms is better, you should take an empirical and statistical approach:
Generate thousands (or more) of different test cases (automatically), and run the algorithms on them. Don't forget to warm up the system before starting the benchmark.

Record the time (in nanoseconds) each algorithm took per test case, and compare the two using statistical measures, such as the mean time.

You should also run a statistical test, such as the Wilcoxon signed-rank test, to find out whether the difference between the run times is statistically significant.

Important: Note that the result might vary across machines or input distributions - the test gives you confidence only for the specific machine and test-case distribution you used.
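
A minimal sketch of such a benchmark, assuming a C++11 compiler (newer than the VC++ 2010 mentioned in the question); algoA and algoB are stand-ins for your two algorithms, and the random-input generation is just an illustration:

#include <algorithm>
#include <chrono>
#include <cstdio>
#include <random>
#include <vector>

// Placeholders for the two algorithms under test -- substitute your own.
static void algoA(std::vector<int>& v) { std::sort(v.begin(), v.end()); }
static void algoB(std::vector<int>& v) { std::stable_sort(v.begin(), v.end()); }

int main()
{
    std::mt19937 rng(42);                               // fixed seed: reproducible cases
    std::uniform_int_distribution<int> dist(0, 1000000);

    const int cases = 1000;                             // number of random test cases
    const int size  = 1000;                             // input size per case

    long long totalA = 0, totalB = 0;
    for (int c = 0; c < cases; ++c)
    {
        std::vector<int> input(size);
        for (int i = 0; i < size; ++i) input[i] = dist(rng);

        std::vector<int> a = input, b = input;          // both algorithms see the same data

        auto t0 = std::chrono::steady_clock::now();
        algoA(a);
        auto t1 = std::chrono::steady_clock::now();
        algoB(b);
        auto t2 = std::chrono::steady_clock::now();

        totalA += std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
        totalB += std::chrono::duration_cast<std::chrono::nanoseconds>(t2 - t1).count();
    }

    std::printf("mean A: %lld ns  mean B: %lld ns\n", totalA / cases, totalB / cases);
    return 0;
}

Storing the per-case time pairs instead of only the totals would let you feed them into a Wilcoxon signed-rank test afterwards.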


A typical "testbed" (inherited from C) looks as follows:

#include <cstdio>   // printf
#include <ctime>    // clock, clock_t, CLOCKS_PER_SEC

#define n 20            // input size per run
#define itera 10000000  // repetitions, so the total time is measurable

int main(int argc, char* argv[])
{
    clock_t started;
    clock_t stopped;
    double duration;

    started = clock();

    for (unsigned long i = 0; i < itera; i++)
    {
        for (unsigned k = 0; k < n; k++)
        {
            /* ... the algorithm under test ... */
        }
    }

    stopped = clock();

    // clock() counts CPU ticks; divide by CLOCKS_PER_SEC to get seconds.
    duration = (double)(stopped - started) / CLOCKS_PER_SEC;
    printf("%f seconds\n", duration);

    return 0;
}

Frequent pitfalls:

The compiler might optimize your code in such a way that the results are misleading (e.g., if you assign to a variable and never use it afterwards, the calculation and the assignment might be omitted entirely); a countermeasure is shown in the sketch after this list.

Your system might be busy with other things while performing the test. Repeat the test often enough.

Caching effects might influence the speed. This is especially true if disk access times play a role or if a lot of memory is involved.

The performance of the algorithm often depends on the test data.

The outer loop of your testbed might take more time than the actual algorithm. Measure the empty loop to take this effect into account.
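
A minimal sketch combining two of the countermeasures above: a volatile sink to keep the compiler from deleting the loop body, and a measurement of the empty loop so its overhead can be subtracted. Here work() is a hypothetical stand-in for the algorithm under test:

#include <cstdio>
#include <ctime>

// Hypothetical function under test; replace with your algorithm.
static unsigned work(unsigned k) { return k * k; }

// Writing results to a volatile variable prevents the compiler
// from removing the loop body as dead code.
volatile unsigned sink = 0;

int main()
{
    const unsigned long itera = 10000000;

    // 1) Measure the (nearly) empty loop to estimate loop overhead.
    //    The sink assignment keeps the loop from being optimized away.
    clock_t t0 = clock();
    for (unsigned long i = 0; i < itera; i++)
    {
        sink = (unsigned)i;
    }
    clock_t t1 = clock();

    // 2) Measure the loop with the actual work.
    for (unsigned long i = 0; i < itera; i++)
    {
        sink = work((unsigned)i);
    }
    clock_t t2 = clock();

    double overhead = (double)(t1 - t0) / CLOCKS_PER_SEC;
    double total    = (double)(t2 - t1) / CLOCKS_PER_SEC;
    printf("empty loop: %f s, work loop: %f s, work only: %f s\n",
           overhead, total, total - overhead);
    return 0;
}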
