Maximum SIMD integer multiplications on Ivy Bridge using SSE/AVX?

Would somebody be able to advise me how I can work out the maximum number of 32-bit unsigned integer multiplications I would be able to do concurrently on an Ivy Bridge CPU using SIMD via SSE/AVX?

I understand that AVX did have 256-bit registers for multiplication, but this was for floating point (AVX2 introduced 256-bit integer registers). Therefore I am not sure whether it would be better to use floating-point registers for integer multiplication (if that's even possible)?

In addition, I am unsure whether only the number of registers matters, or whether I also need to look at the execution ports of the CPU. It looks like port 0 and port 5 can handle SSE integer ALU operations?


You can do one pmulld per clock, which is 4 multiplications per clock.
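As a rough illustration of what one pmulld does, here is a minimal sketch in C using the SSE4.1 intrinsic `_mm_mullo_epi32`, which compilers normally emit as pmulld. The function name `mul_u32_sse41` is a hypothetical helper, not something from the question, and the `target` attribute assumes GCC or Clang. Each product keeps only its low 32 bits, which is the same result for signed and unsigned inputs.

```c
#include <immintrin.h>  /* SSE2 loads/stores and SSE4.1 _mm_mullo_epi32 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: out[i] = low 32 bits of a[i] * b[i].
 * The target attribute (GCC/Clang) enables SSE4.1 for this function only,
 * so no extra compiler flag is assumed. */
__attribute__((target("sse4.1")))
void mul_u32_sse41(const uint32_t *a, const uint32_t *b,
                   uint32_t *out, size_t n)
{
    size_t i = 0;

    /* Main loop: four 32-bit multiplications per _mm_mullo_epi32. */
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(out + i), _mm_mullo_epi32(va, vb));
    }

    /* Scalar tail for the remaining 0..3 elements. */
    for (; i < n; ++i)
        out[i] = a[i] * b[i];
}
```

Whether a loop like this actually sustains one pmulld per clock also depends on load/store bandwidth and on how far the compiler unrolls it, so treat the figure as an upper bound to measure against.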

> Therefore I am not overly sure whether it would be better to use floating-point registers for integer multiplication (if that's even possible)?

Nothing like that is possible. You can put 8 integers in a ymm register, of course, but then you're stuck. The instruction you'd need to do something useful with them (the 256-bit form of pmulld) is in AVX2, which Ivy Bridge does not support.


As you can see in these related questions:

  • Can long integer routines benefit from SSE?
  • SSE multiplication of 2 64-bit integers

there is currently no solution for improving multiplication of long integers with SSE or AVX.
