Debugging the CPU Caches
I'm currently trying to optimize my software for better CPU cache usage. There are some posts on SO which suggest that it's sometimes hard to guess what the CPU cache is doing and why there are some performance drops in certain cases. For example:
So in order to get a clue where the cache misses happen, I can run perf to get a count of cache misses and where they occur, as well as valgrind --tool=cachegrind to simulate the caches (at least an L1 and a last-level cache).
It's really nice to know where cache misses happen, but I'd like to know why they happen (for example, cache thrashing). Is there a way to explicitly pause the program and see what's inside the caches (maybe with the program running in valgrind with vgdb attached)?
In my experience, you'll need to disassemble your binary and look at where the program is using the cache. In particular, look for where prefetch or other cache-control instructions are issued. That will give you the where and the why of it. Unfortunately, it's a painful process.