public inbox for gcc-bugs@sourceware.org
From: "andysem at mail dot ru" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/99971] New: GCC generates partially vectorized and scalar code at once
Date: Thu, 08 Apr 2021 14:41:49 +0000
Message-ID: <bug-99971-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99971

            Bug ID: 99971
           Summary: GCC generates partially vectorized and scalar code at
                    once
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andysem at mail dot ru
  Target Milestone: ---

Consider the following code sample:

struct A
{
    unsigned int a, b, c, d;

    A& operator+= (A const& that)
    {
        a += that.a;
        b += that.b;
        c += that.c;
        d += that.d;
        return *this;
    }

    A& operator-= (A const& that)
    {
        a -= that.a;
        b -= that.b;
        c -= that.c;
        d -= that.d;
        return *this;
    }
};

void test(A& x, A const& y1, A const& y2)
{
    x += y1;
    x -= y2;
}

When compiled with the options "-O3 -march=nehalem", this code generates:

test(A&, A const&, A const&):
        pushq   %rbp
        movdqu  (%rdi), %xmm1
        pushq   %rbx
        movl    4(%rsi), %r8d
        movdqu  (%rsi), %xmm0
        movl    (%rsi), %r9d
        paddd   %xmm1, %xmm0
        movl    8(%rsi), %ecx
        movl    12(%rsi), %eax
        movl    %r8d, %esi
        movl    (%rdi), %ebp
        movl    4(%rdi), %ebx
        movl    8(%rdi), %r11d
        movl    12(%rdi), %r10d
        movups  %xmm0, (%rdi)
        subl    (%rdx), %r9d
        subl    4(%rdx), %esi
        subl    8(%rdx), %ecx
        subl    12(%rdx), %eax
        addl    %ebp, %r9d
        addl    %ebx, %esi
        movl    %r9d, (%rdi)
        popq    %rbx
        addl    %r11d, %ecx
        popq    %rbp
        movl    %esi, 4(%rdi)
        addl    %r10d, %eax
        movl    %ecx, 8(%rdi)
        movl    %eax, 12(%rdi)
        ret

https://gcc.godbolt.org/z/Mzchj8bxG

Here the compiler has only partially vectorized the test function: it converted
"x += y1" to paddd, as expected, but failed to vectorize "x -= y2". At the same
time it also generated scalar code for both statements, including the already
vectorized "x += y1", so that addition is effectively computed twice.

Note that when either "x += y1" or "x -= y2" is commented out, the compiler is
able to vectorize the line that is left. It is also able to vectorize both lines
when the += and -= operators are applied to different objects instead of x.
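
For illustration, the "different objects" variant might look like the sketch
below. The exact code for that variant was not posted, so the function name
test_separate and its signature are assumptions; struct A is the same as above.

```cpp
// Same struct as in the report.
struct A
{
    unsigned int a, b, c, d;

    A& operator+= (A const& that)
    {
        a += that.a;
        b += that.b;
        c += that.c;
        d += that.d;
        return *this;
    }

    A& operator-= (A const& that)
    {
        a -= that.a;
        b -= that.b;
        c -= that.c;
        d -= that.d;
        return *this;
    }
};

// Hypothetical variant (name and signature are assumptions, not from the
// report): += and -= are applied to two different objects instead of both
// being applied to x.
void test_separate(A& x1, A& x2, A const& y1, A const& y2)
{
    x1 += y1;
    x2 -= y2;
}
```

Compiling this with the same options and comparing the output against test()
should show whether both statements are vectorized in this form.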

This is reproducible from gcc 8 up to and including 10.2. gcc 7 doesn't
vectorize this code. With the current trunk on godbolt, the generated code is
different:

test(A&, A const&, A const&):
        movdqu  (%rsi), %xmm0
        movdqu  (%rdi), %xmm1
        paddd   %xmm1, %xmm0
        movups  %xmm0, (%rdi)
        movd    %xmm0, %eax
        subl    (%rdx), %eax
        movl    %eax, (%rdi)
        pextrd  $1, %xmm0, %eax
        subl    4(%rdx), %eax
        movl    %eax, 4(%rdi)
        pextrd  $2, %xmm0, %eax
        subl    8(%rdx), %eax
        movl    %eax, 8(%rdi)
        pextrd  $3, %xmm0, %eax
        subl    12(%rdx), %eax
        movl    %eax, 12(%rdi)
        ret

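Here the compiler is able to vectorize "x += y1" but still not "x -= y2": the
subtraction is done one element at a time, extracting each lane of the paddd
result with movd/pextrd. At least it removed the duplicate scalar version of
"x += y1".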

Given that the compiler is able to vectorize each line in isolation, I would
expect it to be able to vectorize them combined. Generating duplicate versions
of code is certainly not expected.
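
As a rough idea of the fully vectorized form one might hope for, here is a
hand-written sketch using SSE2 intrinsics. It is not compiler output and not a
claim about what GCC should emit exactly; the name test_simd is an assumption,
and the sketch assumes an x86 target with SSE2 available.

```cpp
#include <emmintrin.h> // SSE2 intrinsics (x86 only)

struct A
{
    unsigned int a, b, c, d;
};

// Hypothetical hand-vectorized equivalent of test() (name is an assumption).
void test_simd(A& x, A const& y1, A const& y2)
{
    __m128i* px = reinterpret_cast<__m128i*>(&x);

    // x += y1 as a single 4 x 32-bit add (wraps like unsigned int).
    __m128i vx  = _mm_loadu_si128(px);
    __m128i vy1 = _mm_loadu_si128(reinterpret_cast<__m128i const*>(&y1));
    __m128i sum = _mm_add_epi32(vx, vy1);

    // Store before loading y2: y2 may refer to the same object as x, in
    // which case the scalar code would read the updated value.
    _mm_storeu_si128(px, sum);

    // x -= y2 as a single 4 x 32-bit subtract.
    __m128i vy2 = _mm_loadu_si128(reinterpret_cast<__m128i const*>(&y2));
    _mm_storeu_si128(px, _mm_sub_epi32(sum, vy2));
}
```

The intermediate store is kept so the sketch behaves like the original even
when y2 aliases x; when the objects are known to be distinct it is redundant.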

Thread overview: 12+ messages
2021-04-08 14:41 andysem at mail dot ru [this message]
2021-04-08 14:45 ` [Bug tree-optimization/99971] " andysem at mail dot ru
2021-04-09  7:05 ` rguenth at gcc dot gnu.org
2021-04-15  9:15 ` andysem at mail dot ru
2021-04-15 11:26 ` rguenth at gcc dot gnu.org
2021-04-15 11:30 ` rguenth at gcc dot gnu.org
2021-04-15 16:01 ` andysem at mail dot ru
2021-04-15 23:17 ` david.bolvansky at gmail dot com
2021-04-23  7:35 ` cvs-commit at gcc dot gnu.org
2021-04-23  7:37 ` rguenth at gcc dot gnu.org
2021-04-23  8:43 ` andysem at mail dot ru
2021-04-23  9:03 ` rguenther at suse dot de
