A fast method to round a double to a 32-bit int

When reading Lua's source code, I noticed that Lua uses a macro to round a double to a 32-bit int. I extracted the macro, and it looks like this: union i_cast {double d; int i[2];}; #define double2int(i, d, t) {volatile union i_cast u; u.d = (d) + 6755399441055744.0; (i) = (t)u.i[ENDIANLOC];} Here ENDIANLOC is defined according to endianness: 0 for little endian, 1 for big endian. Lua handles endianness carefully, and t stands for the integer type, such as int or unsigned int. I did a little research, and there is …
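A minimal, self-contained sketch of the trick quoted above, assuming a little-endian host so that ENDIANLOC is 0: the magic constant 6755399441055744.0 is 2^52 + 2^51, which pushes the rounded integer into the low 32 bits of the double's mantissa.

/* Sketch of the magic-number rounding trick quoted above.
   Assumption: little-endian host (e.g. x86), so ENDIANLOC is 0. */
#include <stdio.h>

union i_cast { double d; int i[2]; };

#define ENDIANLOC 0
#define double2int(i, d, t) \
    { volatile union i_cast u; u.d = (d) + 6755399441055744.0; (i) = (t)u.i[ENDIANLOC]; }

int main(void) {
    int out;
    double2int(out, 3.7, int);   /* rounds to nearest: prints 4 */
    printf("%d\n", out);
    double2int(out, -2.5, int);  /* ties round to even under the default FP mode: prints -2 */
    printf("%d\n", out);
    return 0;
}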

"Isolate" specific Row/Column/Diagonal from a 64

OK, let's consider a 64-bit number, with its bits forming an 8x8 table. E.g.

0 1 1 0 1 0 1 0
0 1 1 0 1 0 1 1
0 1 1 1 1 0 1 0
0 1 1 0 1 0 1 0
1 1 1 0 1 0 1 0
0 1 1 0 1 0 1 0
0 1 1 0 1 1 1 0
0 1 1 0 1 0 1 0

written as

a b c d e f g h
----------------
0 1 1 0 1 0 1 0
0 1 1 0 1 0 1 1
0 1 1 1 1 0 1 0
0 1 1 0 1 0 1 0
1 1 1 0 1 0 1 0
0 1 1 0 1 0 1 0
0 1 1 0 1 1 1 0
0 1 1 0 1 0 1 0

Now, what if we want to isolate JUST, for example, …
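A small sketch of one way to do the isolation, under the assumption that row 0 of the table above sits in the most significant byte and column a in the most significant bit of each byte (adjust the shifts if your packing differs); get_row and get_col are illustrative names, not from the question.

#include <stdint.h>
#include <stdio.h>

/* Row r (0..7) of the 8x8 table, returned as an 8-bit value.
   Assumes row 0 is stored in the most significant byte. */
static uint8_t get_row(uint64_t board, int r) {
    return (uint8_t)(board >> (8 * (7 - r)));
}

/* Column c (0 = 'a' .. 7 = 'h'), gathered into an 8-bit value, one bit per row.
   Assumes column 'a' is the most significant bit of each byte. */
static uint8_t get_col(uint64_t board, int c) {
    uint8_t out = 0;
    for (int r = 0; r < 8; ++r)
        out = (uint8_t)((out << 1) | ((board >> (8 * (7 - r) + (7 - c))) & 1u));
    return out;
}

int main(void) {
    uint64_t board = 0x6A6B7A6AEA6A6E6AULL; /* the example table, row 0 in the top byte */
    printf("row 2: 0x%02X\n", get_row(board, 2)); /* 0x7A = 01111010, the third row */
    printf("col d: 0x%02X\n", get_col(board, 3)); /* 0x20: only row 2 has a 1 in column d */
    return 0;
}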

Explanation of a method of counting set bits in a 32-bit integer

This question already has an answer here: How to count the number of set bits in a 32-bit integer? It works because you can count the total number of set bits by dividing the number into two halves, counting the set bits in both halves, and then adding them up, also known as the Divide and Conquer paradigm. Let's get into detail. v = v - ((v >> 1) & 0x55555555); The count of set bits in a two-bit number can be 0b00, 0b01, or 0b10. Let's try to work this out …
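For context, the line quoted above is the first step of the full SWAR popcount from the linked question; here is a compilable sketch of the whole routine, using the standard well-known constants rather than code from this excerpt.

#include <stdint.h>
#include <stdio.h>

static unsigned popcount32(uint32_t v) {
    v = v - ((v >> 1) & 0x55555555u);                 /* per-pair counts */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* per-nibble counts */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 /* per-byte counts */
    return (v * 0x01010101u) >> 24;                   /* sum the four bytes */
}

int main(void) {
    printf("%u\n", popcount32(0xF0F01234u)); /* prints 13 */
    return 0;
}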

C++ how to get length of bits of a variable?

This question already has an answer here: How to count the number of set bits in a 32-bit integer? Warning: math ahead. If you are squeamish, skip ahead to the TL;DR. What you are really looking for is the highest bit that is set. Let's write out what the binary number 10001 11010111 actually means: x = 1 * 2^(12) + 0 * 2^(11) + 0 * 2^(10) + ... + 1 * 2^1 + 1 * 2^0, where * denotes multiplication and ^ is exponentiation. You can write this as 2^12 * (1 + a), where 0 < a < 1 (to be precise, a = 0/2 + 0/2^2 + ... + 1/2^12) …
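As a small sketch of what the answer is building toward: the bit length is just 1 plus the index of the highest set bit. A portable loop and the GCC/Clang builtin (an assumption that one of those compilers is in use) both give 13 for the value discussed above.

#include <stdint.h>
#include <stdio.h>

static unsigned bit_length(uint32_t x) {
    unsigned n = 0;
    while (x) { ++n; x >>= 1; }  /* count how many shifts empty the value */
    return n;                     /* 0 for x == 0 */
}

int main(void) {
    uint32_t x = 0x11D7;  /* 1000111010111 in binary, the value discussed above */
    printf("loop:    %u\n", bit_length(x));                 /* 13 */
#if defined(__GNUC__)
    printf("builtin: %u\n", x ? 32 - __builtin_clz(x) : 0); /* 13 */
#endif
    return 0;
}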

Compare two binary numbers and get the different bits

Possible Duplicate: Best algorithm to count the number of set bits in a 32-bit integer? I want to write a program that compares two numbers and gets the number of bits that differ, i.e. compares the bits of the two numbers to find where the binary representations differ in their 1s and 0s; in other words, an Exclusive OR (XOR) relationship. For example, take 22 (which is 10110 in binary) and compare it with 15 (which is 01111 in binary): first 10110, second 01111, result 11001. That is 25, but what I want to get is 3, since three of the 1s and 0s are different. Hrmmm, the first non-recursive idea that comes to mind: int a = …
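A compilable sketch of the XOR relationship described above: XOR leaves a 1 exactly where the two numbers disagree, and counting those 1s (here with a standard SWAR popcount, not code from the question) gives the 3 the asker is after.

#include <stdint.h>
#include <stdio.h>

static unsigned popcount32(uint32_t v) {
    v = v - ((v >> 1) & 0x55555555u);
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;
    return (v * 0x01010101u) >> 24;
}

static unsigned diff_bits(uint32_t a, uint32_t b) {
    return popcount32(a ^ b);  /* 1s mark the positions where a and b differ */
}

int main(void) {
    printf("%u\n", diff_bits(22u, 15u)); /* 10110 ^ 01111 = 11001 -> prints 3 */
    return 0;
}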

Mac OSX minimum supported SSE version

What is the minimum supported SSE flag that can be enabled on OSX? Most of the hardware I use supports SSE2 these days. On Windows and Linux I have some code to test SSE support, and I have read that OSX has supported SSE for a long time, but I don't know which is the minimum version that can be enabled. The final binary will be copied to other OSX platforms, so I cannot use -march=native as with GCC. If it is enabled by default in all builds, do I still have to pass the -msse or -msse2 flag when building my code? Here is the compiler version: Apple LLVM version …
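One quick way to see which SSE level a given compiler invocation enables by default is to check the predefined feature macros; this is a sketch rather than anything from the question. Build it once with no -m flags and once with -msse2 and compare the output (clang -dM -E -x c /dev/null | grep SSE gives the same information from the command line).

#include <stdio.h>

int main(void) {
#if defined(__SSE__)
    puts("__SSE__ defined");
#endif
#if defined(__SSE2__)
    puts("__SSE2__ defined");
#endif
#if defined(__SSE3__)
    puts("__SSE3__ defined");
#endif
#if defined(__SSSE3__)
    puts("__SSSE3__ defined");
#endif
    return 0;
}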

c++

I am writing a program to compute Groebner bases using the library FGb. While it has a C interface, I am calling the library from C++ code compiled with g++ on Ubuntu. Compiling with the -g option and using x/i $pc in gdb, the illegal instruction is as follows. 0x421c39 FGb_xmalloc_spec+985: vcvtsi2sd %rbx,%xmm0,%xmm0 The line above has angle brackets around FGb_xmalloc_spec+985. As far as I know, my processor does not support this instruction, and I am trying to figure out why the program uses it. It looks like the instruction comes from the library code. However, …
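The VEX-encoded vcvtsi2sd shown in the gdb output is an AVX instruction, so one quick sanity check (a sketch assuming GCC or Clang on x86-64, not anything from the question) is to ask the compiler's runtime CPU detection whether the processor actually reports AVX:

#include <stdio.h>

int main(void) {
    __builtin_cpu_init();  /* required before __builtin_cpu_supports on GCC */
    printf("avx:  %s\n", __builtin_cpu_supports("avx")  ? "yes" : "no");
    printf("avx2: %s\n", __builtin_cpu_supports("avx2") ? "yes" : "no");
    return 0;
}

If avx prints no, any library routine that was built with AVX enabled will raise SIGILL exactly as described.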

c++

I was using tiny-dnn recently; however, when I try to use AVX/AVX2 instructions to train example_minist_train, e.g. set(USE_AVX ON), I get an Illegal instruction (core dumped) error. Trying to debug in the CLion IDE, I get this error in gdb: Program received signal SIGILL, Illegal instruction. 0x0000000000448b66 in tiny_dnn::weight_init::xavier::fill(std::vector >*, unsigned long, unsigned long) () I am using Ubuntu 14.04 LTS 64-bit, and my CPU's information is as follows: …
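Since the crash lands inside code built with USE_AVX ON, the first thing to confirm is whether this CPU reports the avx/avx2 feature flags at all; here is a small Linux-only sketch (not part of tiny-dnn) that reads them from /proc/cpuinfo:

#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (!f) { perror("fopen"); return 1; }
    char line[4096];
    int has_avx = 0, has_avx2 = 0;
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "flags", 5) != 0)
            continue;                          /* only the feature-flag line matters */
        for (char *tok = strtok(line, " \t\n"); tok; tok = strtok(NULL, " \t\n")) {
            if (strcmp(tok, "avx") == 0)  has_avx = 1;
            if (strcmp(tok, "avx2") == 0) has_avx2 = 1;
        }
        break;
    }
    fclose(f);
    printf("avx:  %s\navx2: %s\n", has_avx ? "yes" : "no", has_avx2 ? "yes" : "no");
    return 0;
}

If the flags are missing, a binary built with AVX/AVX2 enabled will fault exactly as shown above.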

Deoptimizing a program for the pipeline in Intel Sandybridge-family CPUs

I've been racking my brain for a week trying to complete this assignment and I'm hoping someone here can lead me toward the right path. Let me start with the instructor's instructions: Your assignment is the opposite of our first lab assignment, which was to optimize a prime number program. Your purpose in this assignment is to pessimize the program, i.e. make it run slower. Both of these are CPU-intensive programs. They take a few seconds to run on our lab PCs. You may not change the algorithm. To de-optimize the program, use your knowledge of how the Intel i7 pipeline operates. Imagine ways to …
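Purely as an illustration of the kind of pipeline-aware pessimization the assignment is asking for (a sketch, not the assignment's prime-number program): forcing a hot loop's variables through memory with volatile replaces register reuse with a store plus a store-forwarding reload on the loop-carried dependency chain, which slows the loop down without changing the algorithm.

#include <stdio.h>

long long sum_fast(long long n) {
    long long s = 0;
    for (long long i = 0; i < n; ++i)  /* counter and accumulator stay in registers */
        s += i;
    return s;
}

long long sum_slow(long long n) {
    volatile long long s = 0;          /* every += becomes a load, add, store */
    for (volatile long long i = 0; i < n; ++i)
        s += i;
    return s;
}

int main(void) {
    long long n = 100000000LL;
    printf("%lld\n%lld\n", sum_fast(n), sum_slow(n));  /* same result, very different speed */
    return 0;
}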

Simple Assembly Language doubts

I had worked out some code for my assignment and something tells me that I'm not doing it correctly. Hope someone can take a look at it. Thank you!

        AREA Reset, CODE, READONLY
        ENTRY
        LDR r1, =0x13579BA0
        MOV r3, #0
        MOV r4, #0
        MOV r2, #8
Loop    CMP r2, #0
        BGE DONE
        LDR r5, [r1, r4]
        AND r5, r5, #0x00000000
        ADD r3, r3, r5
        ADD r4, r4, #4
        SUB r2, r2, #1
        B Loop
        LDR r0, [r3]
DONE    B DONE
        END

Write an ARM assembly …
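For reference, a plain C sketch of what the posted loop appears to be trying to do, assuming (since the assignment text is cut off) that the goal is to sum the eight 32-bit words starting at address 0x13579BA0; the function and parameter names are illustrative only.

#include <stdint.h>

/* sum corresponds to r3, the index to r2/r4, and base to r1 in the posted code */
uint32_t sum_words(const uint32_t *base, int count) {
    uint32_t sum = 0;
    for (int i = 0; i < count; ++i)
        sum += base[i];           /* LDR r5, [r1, r4]; ADD r3, r3, r5 */
    return sum;
}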