C++ 2D array access speed changes based on [a][b] order?

Possible Duplicate: Why is my program slow when looping over exactly 8192 elements? I have been tinkering with a program that simply sums the elements of a 2D array. A typo led to what seem, to me at least, to be some very strange results. When dealing with an array, matrix[SIZE][SIZE]: for(int row = 0; row < SIZE; ++row) for(int col = 0; col < SIZE; ++col)

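C++ 2D array access speed changes based on [a][b] order?

Possible Duplicate: Why is my program slow when looping over exactly 8192 elements? I have been tinkering with a program that simply sums the elements of a 2D array. A typo led to what seem, to me at least, to be some very strange results. When dealing with an array, matrix[SIZE][SIZE]: for(int row = 0; row < SIZE; ++row) for(int col = 0; col < SIZE; ++col) sum1 += matrix[row][col]; runs very fast, but when the line above, sum1 ..., is changed to: sum2 += matrix[col][row] as I did by accident, without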
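The two loop bodies in the question differ only in which index varies fastest. A minimal sketch of both traversals follows, assuming a flat row-major buffer and a hypothetical size `kSize` (the excerpt does not give the original `SIZE`). It shows the access patterns only and makes no timing claim.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical size for illustration; not taken from the question.
constexpr std::size_t kSize = 512;

// matrix[row][col] order: the inner loop touches adjacent elements.
long long sum_row_major(const std::vector<int>& m) {
    assert(m.size() == kSize * kSize);
    long long sum = 0;
    for (std::size_t row = 0; row < kSize; ++row)
        for (std::size_t col = 0; col < kSize; ++col)
            sum += m[row * kSize + col];
    return sum;
}

// matrix[col][row] order: the inner loop jumps kSize elements per step.
long long sum_column_major(const std::vector<int>& m) {
    assert(m.size() == kSize * kSize);
    long long sum = 0;
    for (std::size_t row = 0; row < kSize; ++row)
        for (std::size_t col = 0; col < kSize; ++col)
            sum += m[col * kSize + row];
    return sum;
}
```

Both functions return the same total; only the stride of the inner loop differs, and that stride is what the cache sees.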

Tracing call stack in disassembled code

I am trying to debug a tricky core dump (from an -O2 optimized binary). // Caller Function void caller(Container* c) { std::list < Message*> msgs; if(!decoder.called(c->buf_, msgs)) { .... ..... } // Called Function bool Decoder::called(Buffer* buf, list < Message*>& msgs) { add_data(buf); // Inlined code to append buf to decoders buf chain while(m_data_in

Tracing call stack in disassembled code

I am trying to debug a tricky core dump (from an -O2 optimized binary). // Caller Function void caller(Container* c) { std::list < Message*> msgs; if(!decoder.called(c->buf_, msgs)) { .... ..... } // Called Function bool Decoder::called(Buffer* buf, list < Message*>& msgs) { add_data(buf); // Inlined code to append buf to decoders buf chain while(m_data_in && m_data

Why does the enhanced GCC 6 optimizer break practical C++ code?

GCC 6 has a new optimizer feature: it assumes that this is never null and optimizes based on that. Value range propagation now assumes that the this pointer of C++ member functions is non-null. This eliminates common null pointer checks but also breaks some non-conforming code-bases (such as Qt-5, Chromium, KDevelop). As a temporary work-around -fno-delete-null-pointer-checks can be use

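Why does the enhanced GCC 6 optimizer break practical C++ code?

GCC 6 has a new optimizer feature: it assumes that this is never null and optimizes based on that. Value range propagation now assumes that the this pointer of C++ member functions is non-null. This eliminates common null pointer checks but also breaks some non-conforming code-bases (such as Qt-5, Chromium, KDevelop). As a temporary work-around, -fno-delete-null-pointer-checks can be used. Wrong code can be identified by using -fsanitize=undefined. The change document clearly calls this out as dangerous because it breaks a surprising amount of frequently used code. Why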
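The kind of check that this assumption removes can be sketched as follows. The `Node` type and its functions are hypothetical and are not taken from Qt, Chromium, or KDevelop. The non-conforming form is left as a comment because calling a member function through a null pointer is undefined behavior. The static function shows one conforming way to keep the null test.

```cpp
#include <cassert>

// Hypothetical type used only for illustration.
struct Node {
    int value;

    // Non-conforming pattern (not compiled here): when called as
    // ptr->value_or_zero_member() with ptr == nullptr, the behavior is
    // undefined, so the optimizer may treat `this` as non-null and
    // delete the check.
    //   int value_or_zero_member() { return this ? value : 0; }

    // Conforming alternative: test the pointer before any member access.
    static int value_or_zero(const Node* n) { return n ? n->value : 0; }
};

// Small self-check that can be called from main().
void check_value_or_zero() {
    Node n{42};
    assert(Node::value_or_zero(&n) == 42);
    assert(Node::value_or_zero(nullptr) == 0);
}
```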

How to find out if an item is present in a std::vector?

All I want to do is check whether an element exists in the vector, so that I can deal with each case. if ( item_present ) do_this(); else do_that(); You can use std::find from <algorithm> : std::find(vector.begin(), vector.end(), item) != vector.end() std::find itself returns an iterator; the comparison with vector.end() yields a bool ( true if present, false otherwise). With your example: #include <algorithm> if ( std::find(vector.begin(), vector.end(), item) != vector.end() )

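How to find out if an item is present in a std::vector?

All I want to do is check whether an element exists in the vector, so that I can deal with each case. if ( item_present ) do_this(); else do_that(); You can use std::find from <algorithm> : std::find(vector.begin(), vector.end(), item) != vector.end() std::find itself returns an iterator; the comparison with vector.end() yields a bool ( true if present, false otherwise). With your example: #include <algorithm> if ( std::find(vector.begin(), vector.end(), item) != vector.end() ) do_this()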
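The std::find test from the answer can be wrapped in a small helper. The name `contains` and the choice of `int` elements are assumptions made for this sketch; the search is linear, as with any std::find over an unsorted vector.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns true when `item` occurs in `values`. std::find returns an
// iterator, so the comparison against end() produces the bool.
bool contains(const std::vector<int>& values, int item) {
    return std::find(values.begin(), values.end(), item) != values.end();
}

// Small self-check that can be called from main().
void check_contains() {
    const std::vector<int> values{1, 2, 3};
    assert(contains(values, 2));
    assert(!contains(values, 5));
}
```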

Template instantiation details of GCC and MS compilers

Could anyone provide a comparison or specific details of how template instantiation is handled at compile and/or link time in GCC and MS compilers? Is this process different in the context of static libraries, shared libraries and executables? I found this doc about how GCC handles it but I'm not sure if the information still reflects the current state of things. Should I use the fl

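Template instantiation details of GCC and MS compilers

Could anyone provide a comparison or specific details of how template instantiation is handled at compile and/or link time in GCC and MS compilers? Is this process different in the context of static libraries, shared libraries and executables? I found this doc about how GCC handles it, but I'm not sure if the information still reflects the current state of things. Should I use the flags they suggest when compiling my libraries, e.g. -fno-implicit-templates? What I know (which might not necessarily be correct) is: templates will be instantiated when actually used; templates will be instantiated as a result of explicit instantiations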
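The two instantiation routes listed in the question can be shown in one translation unit. The `square` template and the `area` function are hypothetical names for this sketch, which illustrates only the language-level forms and not how GCC or MSVC place the resulting symbols at link time.

```cpp
#include <cassert>

// Template definition; in a real project this would usually live in a header.
template <typename T>
T square(T x) { return x * x; }

// Implicit instantiation: square<double> is generated because it is used here.
double area(double side) { return square(side); }

// Explicit instantiation definition: square<int> is emitted in this
// translation unit even though nothing above calls it.
template int square<int>(int);

// Small self-check that can be called from main().
void check_square() {
    assert(square<int>(4) == 16);
    assert(area(1.5) == 2.25);
}
```

Since C++11, another translation unit can declare `extern template int square<int>(int);` to suppress its own implicit instantiation and rely on the one emitted here.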

What are the differences between -std=c++11 and -std=gnu++11?

What are the differences between -std=c++11 and -std=gnu++11 as compilation parameter for gcc and clang? Same question with c99 and gnu99 ? I know about C++ and C standards, it's the differences in the parameters that interest me. I've read somewhere that it has to do with some extensions but it is not clear to me which ones and how to choose between one or the other for a new project

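What are the differences between -std=c++11 and -std=gnu++11?

What are the differences between -std=c++11 and -std=gnu++11 as compilation parameters for gcc and clang? Same question with c99 and gnu99? I know about the C++ and C standards; it's the differences in the parameters that interest me. I've read somewhere that it has to do with some extensions, but it is not clear to me which ones and how to choose between one or the other for a new project. As you have found out yourself, the difference between the two options is whether GNU extensions that violate the C++ standard are enabled or not. The GNU extensions are described here. Note that some extensions can still be in effect when using -std=c++11, as long as they do not contradict the standard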
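One observable difference between the two modes is the `__STRICT_ANSI__` macro, which GCC and Clang define for the strict `-std=c++NN` options (and `-ansi`) but not for `-std=gnu++NN`. The sketch below reports which case applies. The function name `dialect_kind` is hypothetical, and on compilers that never define the macro the result says nothing about the dialect.

```cpp
#include <cassert>
#include <string>

// Reports whether this translation unit was built in a strict ISO mode
// (as with -std=c++11) or with GNU extensions allowed (as with -std=gnu++11).
std::string dialect_kind() {
#ifdef __STRICT_ANSI__
    return "strict";
#else
    return "gnu";
#endif
}

// Small self-check that can be called from main().
void check_dialect_kind() {
    const std::string kind = dialect_kind();
    assert(kind == "strict" || kind == "gnu");
}
```

Building the same file once with each flag and printing the result is a quick way to confirm which mode a build system actually passes.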

Difference between CC, gcc and g++?

What are the differences between the 3 compilers CC, gcc, g++ when compiling C and C++ code, in terms of assembly code generation, available libraries, language features, etc.? The answer to this is platform-specific; what happens on Linux is different from what happens on Solaris, for example. The easy part (because it is not platform-specific) is the separation of 'gcc' and 'g++': gcc is the GNU C Compiler from the GCC (GNU Compiler Collection). g++ i

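Difference between CC, gcc and g++?

What are the differences between the 3 compilers CC, gcc, g++ when compiling C and C++ code, in terms of assembly code generation, available libraries, language features, etc.? The answer to this is platform-specific; what happens on Linux is different from what happens on Solaris, for example. The easy part (because it is not platform-specific) is the separation of 'gcc' and 'g++': gcc is the GNU C Compiler from the GCC (GNU Compiler Collection). g++ is the GNU C++ Compiler from the GCC. The hard part, because it is platform-specific, is the meaning of 'CC' (and 'cc').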

Why does the order in which libraries are linked sometimes cause errors in GCC?

Why does the order in which libraries are linked sometimes cause errors in GCC? (See the history on this answer to get the more elaborate text, but I now think it's easier for the reader to see real command lines). Common files shared by all the commands below $ cat a.cpp extern int a; int main() { return a; } $ cat b.cpp extern int b; int a = b; $ cat d.cpp int b; Linking to static libraries $ g++ -c b.cpp -o b.o $ ar cr li

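Why does the order in which libraries are linked sometimes cause errors in GCC?

Why does the order in which libraries are linked sometimes cause errors in GCC? (See the history on this answer to get the more elaborate text, but I now think it's easier for the reader to see real command lines). Common files shared by all the commands below $ cat a.cpp extern int a; int main() { return a; } $ cat b.cpp extern int b; int a = b; $ cat d.cpp int b; Linking to static libraries $ g++ -c b.cpp -o b.o $ ar cr libb.a b.o $ g++ -c d.cpp -o d.o $ ar cr libd.a d.o $ g++ -L. -ld -lb a.cpp #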

How could I optimize this calculation? (x^a + y^a + z^a)^(1/a)

As the title shows, I need to do a lot of calculations like this: re = (x^a + y^a + z^a)^(1/a), where {x, y, z} >= 0. More specifically, a is a positive floating point constant, and x, y, z are floating point numbers. The ^ is an exponentiation operator. Currently, I'd prefer not to use SIMD, but hope for some other trick to speed it up. static void heavy_load(void) { static struct

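How could I optimize this calculation? (x^a + y^a + z^a)^(1/a)

As the title shows, I need to do a lot of calculations like this: re = (x^a + y^a + z^a)^(1/a), where {x, y, z} >= 0. More specifically, a is a positive floating point constant, and x, y, z are floating point numbers. The ^ is an exponentiation operator. Currently, I'd prefer not to use SIMD, but hope for some other trick to speed it up. static void heavy_load(void) { static struct xyz_t { float x,y,z; }; struct xyz_t xyzs[10000]; float re[10000] = {.0f}; const float a = 0.2; /* here fill xyzs using so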
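Before trying any trick, it helps to have a reference implementation of the formula to compare against. The sketch below gives the direct form with std::pow and an algebraically equivalent form using the identity x^a = exp(a * log(x)). Both function names are hypothetical, and nothing here claims that the second form is faster; that would have to be measured on the target machine.

```cpp
#include <cassert>
#include <cmath>

// Direct form of (x^a + y^a + z^a)^(1/a) for x, y, z >= 0 and a > 0.
float power_sum_pow(float x, float y, float z, float a) {
    assert(a > 0.0f);
    const float s = std::pow(x, a) + std::pow(y, a) + std::pow(z, a);
    return std::pow(s, 1.0f / a);
}

// Same value through exp/log. With IEEE arithmetic, log(0) is -infinity
// and exp(-infinity) is 0, so zero inputs still contribute 0 when a > 0.
float power_sum_exp_log(float x, float y, float z, float a) {
    assert(a > 0.0f);
    const float s = std::exp(a * std::log(x)) + std::exp(a * std::log(y)) +
                    std::exp(a * std::log(z));
    return std::exp(std::log(s) / a);
}
```

With a = 2 and inputs (3, 4, 0) both functions should give a value close to 5, which makes a convenient sanity check for any faster variant.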

1,000,000,000 calculations per microsecond?

OK, I've been talking to a friend about compilers and optimisation of programs, and he suggested that n * 0.5 is faster than n / 2 . I said that compilers do that kind of optimisation automatically, so I wrote a small program to see if there was a difference between n / 2 and n * 0.5 : Division: #include <stdio.h> #include <time.h> int main(int argc, const char * argv[]) {

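1,000,000,000 calculations per microsecond?

OK, I've been talking to a friend about compilers and optimisation of programs, and he suggested that n * 0.5 is faster than n / 2. I said that compilers do that kind of optimisation automatically, so I wrote a small program to see if there was a difference between n / 2 and n * 0.5: Division: #include <stdio.h> #include <time.h> int main(int argc, const char * argv[]) { int i, m; float n, s; clock_t t; m = 1000000000; t = clock(); for(i = 0; i < m; i++) { n = i / 2;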
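Two details of the excerpted loop are worth checking before comparing timings. Since `i` is an `int`, `i / 2` is integer division, so it does not compute the same value as `i * 0.5`. The loop also stores to `n` without ever reading it, so an optimizing compiler may delete the work entirely, which is one plausible (unconfirmed) reason for an implausibly fast result. The sketch below illustrates both points; the function names and the `volatile` sink are additions for illustration and are not from the original program.

```cpp
#include <cassert>

// Integer division first, then conversion to float: 3 / 2 gives 1.
float half_by_division(int i) { return static_cast<float>(i / 2); }

// Floating-point multiplication: 3 * 0.5 gives 1.5.
float half_by_multiplication(int i) { return static_cast<float>(i * 0.5); }

// Hypothetical loop body for a benchmark: the volatile store must be
// performed on every iteration, so the loop cannot be optimized away.
float run_division_loop(int iterations) {
    assert(iterations > 0);
    volatile float sink = 0.0f;
    for (int i = 0; i < iterations; ++i) {
        sink = static_cast<float>(i / 2);
    }
    return sink;  // value from the last iteration
}
```

A fair comparison would use the same operand types in both loops and keep the result observable, as the volatile store does here.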