From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id CD28B3861893; Thu, 15 Apr 2021 07:35:50 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org CD28B3861893
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/100076] eembc/automotive/basefp01 has 30.3%
 regression compare -O2 -ftree-vectorize with -O2 on CLX/Znver3
Date: Thu, 15 Apr 2021 07:35:50 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: everconfirmed cf_reconfirmed_on bug_status
Message-ID: <bug-100076-4-G8wQ8R62pX@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-100076-4@http.gcc.gnu.org/bugzilla/>
References: <bug-100076-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Thu, 15 Apr 2021 07:35:50 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100076

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-04-15
             Status|UNCONFIRMED                 |NEW
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note that even when avoiding the store-to-load forwarding (STLF) hit, the
vectorized version is slower.
You can use -mtune-ctl=^sse_unaligned_load_optimal to force loading
the lower/upper half of vectors separately.

The reason is that without -ffast-math we are using an in-order reduction
which doesn't save us much but instead just combines dependence chains
here.  We do have a related bug for this somewhere.
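
To illustrate the difference, here is a minimal sketch in C. It is not the
basefp01 kernel and not GCC's actual output; the function names and the
choice of four partial sums are assumptions made for illustration only.
The first loop is an in-order reduction, which is what has to be preserved
without -ffast-math. The second shows the kind of reassociation that
-ffast-math permits, where independent partial sums break up the single
dependence chain.

```c
#include <stddef.h>

/* In-order (fold-left) reduction: each addition depends on the result of
   the previous one, so the adds form one serial dependence chain.  This
   is the evaluation order the source specifies. */
double sum_in_order(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Reassociated reduction (hypothetical hand-written equivalent of what
   reassociation allows): four independent partial sums that are only
   combined after the main loop.  The result may differ from the in-order
   sum in the last bits because floating-point addition is not
   associative. */
double sum_reassociated(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    double s = (s0 + s1) + (s2 + s3);

    /* Scalar tail for the remaining 0..3 elements. */
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

In the first form, vector loads do not shorten the chain of additions, which
is consistent with the in-order reduction not saving much here.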

With -ffast-math, the versions with and without
-mtune-ctl=^sse_unaligned_load_optimal
run at about the same speed, so STLF is a red herring here (on Zen2).

Still, not vectorizing is a lot faster.

Can you check if -mtune-ctl=^sse_unaligned_load_optimal helps on CLX?