c - How to update an array in vectorized assembly (AVX)?
inline void addition(double * x, const double * vx, uint32_t size){
    /*for (uint32_t i=0;i<size;++i){
        x[i] = x[i] + vx[i];
    }*/
    __asm__ __volatile__ (
        "1: \n\t"
        "vmovupd -32(%0), %%ymm1\n\t"
        "vmovupd (%0), %%ymm0\n\t"
        "vaddpd -32(%1), %%ymm0, %%ymm0\n\t"
        "vaddpd (%1), %%ymm1, %%ymm1\n\t"
        "vmovupd %%ymm0, -32(%0)\n\t"
        "vmovupd %%ymm1, (%0)\n\t"
        "addq $128, %0\n\t"
        "addq $128, %1\n\t"
        "addl $-8, %2\n\t"
        "jne 1b"
        :
        : "r" (x),"r"(vx),"r"(size)
        : "ymm0", "ymm1"
    );
}
I am practicing assembly (AVX instructions) right now, and I wrote the above piece of code in inline assembly to replace the C code in the original function (which is commented out). Compilation succeeds, but when I try to run the program, an error happens: Bus error: 10
Any thoughts on this bug? I don't know what's wrong here. The compiler version is clang 602.0.53. Thank you!
Inline assembly is a complicated beast. If you just want to practice AVX assembly, use a separate asm file so you don't have to deal with the compiler. In exchange, you will need to observe the calling convention though.
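You have issues with the constraints. For example, you change the input registers without telling the compiler, and that can cause all sorts of weird problems elsewhere in the compiler-generated code. You also need to specify the memory clobber, because the asm writes to the array behind the compiler's back.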
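To illustrate the constraint issue on its own, here is a minimal sketch of a loop written in inline assembly that modifies its operands and writes to memory. The helper name `increment_all` and its behavior are hypothetical and not part of the original post; it assumes GCC or clang with AT&T syntax on x86-64, and falls back to plain C on other targets. The pointer and the counter are declared as read-write operands with "+r" because the asm changes both registers, and "memory" is listed as a clobber because the asm stores through the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original post): adds 1 to each of the
   n 64-bit integers starting at p. */
static void increment_all(int64_t *p, size_t n)
{
    assert(p != NULL || n == 0);
    if (n == 0)
        return;               /* the asm loop below requires n >= 1 */
#if defined(__x86_64__)
    __asm__ __volatile__ (
        "1:\n\t"
        "addq $1, (%0)\n\t"   /* *p += 1: a store the compiler cannot see */
        "addq $8, %0\n\t"     /* advance p by one int64_t: modifies operand 0 */
        "subq $1, %1\n\t"     /* n -= 1: modifies operand 1 and sets the flags */
        "jne 1b"
        : "+r" (p), "+r" (n)  /* read-write: both registers are changed */
        :                     /* no input-only operands */
        : "memory", "cc"      /* memory is written, flags are clobbered */
    );
#else
    for (size_t i = 0; i < n; ++i)   /* portable fallback for other targets */
        p[i] += 1;
#endif
}
```

If `p` and `n` were listed as plain "r" inputs instead, the compiler would be free to assume the registers still hold their original values after the asm statement, which is the kind of mismatch described above.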
Also, learn to use a debugger so you can find the exact cause of problems and fix your own code.
Failing that, at least comment your code so we can figure out your intentions. In this case, I am particularly puzzled why you use a -32 offset, which addresses memory before the start of the array. I think you wanted +32 there. Since you are using 2 AVX registers at 32 bytes each, you of course need to advance the pointers by 64, not 128. You also have ymm0 and ymm1 swapped in the initial load.
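The offset and stride arithmetic above can be sketched in plain C. This is only a scalar reference model of what one pass of the assembly loop is supposed to touch, not the AVX code itself; the names `addition_ref`, `YMM_BYTES`, `REGS_PER_ITER`, `STRIDE_BYTES` and `DOUBLES_PER_ITER` are made up for illustration, and the sketch assumes `sizeof(double)` is 8 and that `size` is a positive multiple of 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define YMM_BYTES        32                               /* one 256-bit register */
#define REGS_PER_ITER    2                                /* ymm0 and ymm1 */
#define STRIDE_BYTES     (REGS_PER_ITER * YMM_BYTES)      /* 64, not 128 */
#define DOUBLES_PER_ITER (STRIDE_BYTES / sizeof(double))  /* 8 doubles */

/* Scalar model of the loop: each iteration handles two 32-byte blocks. */
static void addition_ref(double *x, const double *vx, uint32_t size)
{
    assert(size > 0 && size % DOUBLES_PER_ITER == 0);
    while (size != 0) {
        /* block at byte offset 0: doubles 0..3 (the register loaded from (%0)) */
        for (size_t i = 0; i < 4; ++i)
            x[i] += vx[i];
        /* block at byte offset +32: doubles 4..7 (the register loaded from 32(%0)) */
        for (size_t i = 4; i < 8; ++i)
            x[i] += vx[i];
        x    += DOUBLES_PER_ITER;   /* same distance as addq $64 on the pointer */
        vx   += DOUBLES_PER_ITER;
        size -= DOUBLES_PER_ITER;   /* matches addl $-8 on the counter */
    }
}
```

An offset of -32 in the first iteration would correspond to doubles -4..-1, which lie before the array, and a stride of 128 bytes would skip every other group of 8 doubles while the counter only accounts for 8 per pass.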
This code seems to work fine for me:
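#include <stdio.h>
#include <stdint.h>

/* static inline so the function also links in C99 mode when it is not inlined.
   size must be a positive multiple of 8 (8 doubles are handled per pass). */
static inline void addition(double * x, const double * vx, uint32_t size){
    /*for (uint32_t i=0;i<size;++i){
        x[i] = x[i] + vx[i];
    }*/
    __asm__ __volatile__ (
        "1: \n\t"
        "vmovupd 32(%0), %%ymm0\n\t"        /* ymm0 = x[4..7] */
        "vmovupd (%0), %%ymm1\n\t"          /* ymm1 = x[0..3] */
        "vaddpd 32(%1), %%ymm0, %%ymm0\n\t" /* ymm0 += vx[4..7] */
        "vaddpd (%1), %%ymm1, %%ymm1\n\t"   /* ymm1 += vx[0..3] */
        "vmovupd %%ymm0, 32(%0)\n\t"        /* store x[4..7] */
        "vmovupd %%ymm1, (%0)\n\t"          /* store x[0..3] */
        "addq $64, %0\n\t"                  /* advance x by 2 * 32 bytes */
        "addq $64, %1\n\t"                  /* advance vx by 2 * 32 bytes */
        "addl $-8, %2\n\t"                  /* 8 doubles done */
        "jne 1b"
        : "+r" (x),"+r"(vx),"+r"(size)      /* all three are modified */
        :
        : "ymm0", "ymm1", "memory"
    );
}

int main()
{
    double x[] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    double vx[] = { 9, 10, 11, 12, 13, 14, 15, 16 };
    int i;
    addition(x, vx, 8);
    for(i = 0; i < 8; i++) printf("%g ", x[i]);
    putchar('\n');
    return 0;
}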