public inbox for gcc-bugs@sourceware.org
* [Bug c/67167] New: cilkplus vectorization problems
@ 2015-08-10  7:56 marcin.krotkiewski at gmail dot com
  2015-08-11  8:57 ` [Bug c/67167] " rguenth at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: marcin.krotkiewski at gmail dot com @ 2015-08-10  7:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67167

            Bug ID: 67167
           Summary: cilkplus vectorization problems
           Product: gcc
           Version: 5.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marcin.krotkiewski at gmail dot com
  Target Milestone: ---

I think there is a problem with vectorization of arithmetic operations in the
cilkplus implementation in gcc. I inspected the assembly generated for the
following two implementations of vector addition (a = a + b). The code is
compiled with
'gcc -O3 -mavx -ftree-vectorize -fopt-info-vec -fcilkplus test.c'.


// ICC compatibility - alignment hint
#ifdef __GNUC__

#define __assume_aligned(lvalueptr, align) lvalueptr = __builtin_assume_aligned(lvalueptr, align)

#endif
#define RESTRICT __restrict__

typedef double Double;

void test(Double * RESTRICT a, Double * RESTRICT b, int size)
{
  int i;

  __assume_aligned(a, 64);
  __assume_aligned(b, 64);

  for(i=0; i<size; i++)
    a[i] = a[i] + b[i];

}


void test_cilkplus1(Double * RESTRICT a, Double * RESTRICT b, int size)
{

  __assume_aligned(a, 64);
  __assume_aligned(b, 64);

  a[0:size] = a[0:size] + b[0:size];

}


The first function (test) is vectorized as expected. Here is the assembly
generated for its loop:

.L4:
        vmovapd (%rdi,%r8), %ymm0
        addl    $1, %r9d
        vaddpd  (%rsi,%r8), %ymm0, %ymm0
        vmovapd %ymm0, (%rdi,%r8)
        addq    $32, %r8
        cmpl    %r9d, %ecx
        ja      .L4


In contrast, the loop of the second function (test_cilkplus1) is not vectorized:

.L21:
        vmovsd  (%rdi,%rax), %xmm0
        movl    %ecx, %r8d
        addl    $1, %ecx
        vaddsd  (%rsi,%rax), %xmm0, %xmm0
        vmovsd  %xmm0, (%rdi,%rax)
        addq    $8, %rax
        cmpl    %r8d, %edx
        jg      .L21


I have made sure that the compiler knows there is no aliasing (restrict) and
that the arrays are aligned in memory. This is clearly enough for the plain
loop to be vectorized, but not for the CilkPlus array notation.
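
For anyone reproducing this, a minimal sketch of a correctness check for the
plain-loop variant follows. It is not part of the original test case: the names
add_loop and check_add are hypothetical, and the input values are arbitrary. It
assumes a C11 library that provides aligned_alloc, so that the 64-byte
alignment promised via __builtin_assume_aligned actually holds.

```c
#include <stdlib.h>

/* Plain-loop addition as in test() above; add_loop is an illustrative name. */
static void add_loop(double * __restrict__ a, double * __restrict__ b, int size)
{
  int i;

  /* Same alignment promise as the __assume_aligned macro in the report. */
  a = __builtin_assume_aligned(a, 64);
  b = __builtin_assume_aligned(b, 64);

  for(i=0; i<size; i++)
    a[i] = a[i] + b[i];
}

/* Hypothetical helper: returns 0 if a[i] == 3*i for all i after the
   addition, 1 on a mismatch or allocation failure. */
static int check_add(int n)
{
  /* aligned_alloc wants a size that is a multiple of the alignment,
     so round the byte count up to the next multiple of 64. */
  size_t bytes = ((size_t)n * sizeof(double) + 63) / 64 * 64;
  double *a = aligned_alloc(64, bytes);
  double *b = aligned_alloc(64, bytes);
  int i, bad = 0;

  if (!a || !b) {
    free(a);
    free(b);
    return 1;
  }

  /* Small integers are exactly representable, so == is safe here. */
  for (i = 0; i < n; i++) {
    a[i] = (double)i;
    b[i] = 2.0 * (double)i;
  }

  add_loop(a, b, n);

  for (i = 0; i < n; i++)
    if (a[i] != 3.0 * (double)i)
      bad = 1;

  free(a);
  free(b);
  return bad;
}
```

Calling check_add(1024) from main should return 0. The array-notation function
test_cilkplus1 could be checked the same way by swapping it in for add_loop,
given a compiler that still accepts -fcilkplus.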



end of thread, other threads:[~2022-01-03  9:50 UTC | newest]

Thread overview: 5+ messages
2015-08-10  7:56 [Bug c/67167] New: cilkplus vectorization problems marcin.krotkiewski at gmail dot com
2015-08-11  8:57 ` [Bug c/67167] " rguenth at gcc dot gnu.org
2015-08-13  7:06 ` rguenth at gcc dot gnu.org
2015-08-13  7:10 ` rguenth at gcc dot gnu.org
2022-01-03  9:50 ` rguenth at gcc dot gnu.org
