gprof和（unix）时间之间的差异; gprof报告运行时间较短

2018-06-14 04:11:27

这个问题在这里已经有了答案：

时间抽样问题与gprof 1答案

gprof不知道它没有调试信息访问的功能，即标准库。如果你想得到准确的经过时间，仍然可以得到一个callgraph，你可以使用perf 。

作为一个例子，我编写了一个循环10000次的程序。在这个循环中，我用随机值填充一个向量然后对其进行排序。对于gprof ，我执行以下步骤：

g++ -std=c++11 -O2 -pg -g
./a.out
gprof -b ./a.out

如果gmon.out不存在，则创建gmon.out ;如果没有，则覆盖gmon.out ，如果不指定要使用的文件， gprof将自动使用它。 -b抑制文本blurbs。

以下是输出示例：

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ts/call  Ts/call  name    
100.52      4.94     4.94                             frame_dummy
  0.00      4.94     0.00       26     0.00     0.00  void std::__adjust_heap<__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, long, double, __gnu_cxx::__ops::_Iter_less_iter>(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, long, long, double, __gnu_cxx::__ops::_Iter_less_iter)
  0.00      4.94     0.00        1     0.00     0.00  _GLOBAL__sub_I_main

如你所见，它只记录向量堆实现的时间，并且根本不知道排序（或其他）。现在让我们试试perf ：

perf record -g ./a.out
perf report --call-graph --stdio

# Total Lost Samples: 0
#
# Samples: 32K of event 'cycles'
# Event count (approx.): 31899806183
#
# Children      Self  Command  Shared Object        Symbol                                                                            
# ........  ........  .......  ...................  ..................................................................................
#
    99.98%    34.46%  a.out    a.out                [.] main                                                                          
              |          
              |--65.52%-- main
              |          |          
              |          |--65.29%-- std::__introsort_loop<__gnu_cxx::__normal_iterator<double*, std::vector<double

[其余部分省略]

正如你所看到的， perf捕获了排序功能。如果我运行perf stat ，我也会得到准确的运行时间。

如果您使用的是GCC，则可以传递-D_GLIBCXX_DEBUG以使其使用调试库实现。这会让你的代码运行慢很多，但为了让gprof能够看到这些函数，这是必须的。一个例子：

g++ -std=c++11 -O2 test.cpp -D_GLIBCXX_DEBUG -pg -g
./a.out
gprof -b ./a.out

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 88.26      0.15     0.15   102875     1.46     1.46  __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > > std::__unguarded_partition<__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_cxx::__ops::_Iter_less_iter>(__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_cxx::__ops::_Iter_less_iter)
 11.77      0.17     0.02   996280     0.02     0.02  void std::__unguarded_linear_insert<__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_cxx::__ops::_Val_less_iter>(__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_cxx::__ops::_Val_less_iter)
  0.00      0.17     0.00   417220     0.00     0.00  frame_dummy
  0.00      0.17     0.00   102875     0.00     0.00  void std::__move_median_to_first<__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_cxx::__ops::_Iter_less_iter>(__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_cxx::__ops::_Iter_less_iter)
  0.00      0.17     0.00     1000     0.00     0.25  void std::__insertion_sort<__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_cxx::__ops::_Iter_less_iter>(__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<double*, std::__cxx1998::vector<double, std::allocator<double> > >, std::__debug::vector<double, std::allocator<double> > >, __gnu_cxx::__ops::_Iter_less_iter)
  0.00      0.17     0.00        1     0.00     0.00  _GLOBAL__sub_I_main

我故意减少了迭代次数，以便在合理的时间内完成执行，但是您会发现gprof现在显示了它之前不计算的函数。

链接地址: http://www.djcxy.com/p/40353.html

上一篇: Discrepancy between gprof and (unix) time; gprof reports lower runtimes

下一篇: Is GNU gprof buggy?