From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-381928-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 9994 invoked by alias); 30 Jan 2012 23:17:41 -0000
Received: (qmail 9984 invoked by uid 22791); 30 Jan 2012 23:17:39 -0000
X-SWARE-Spam-Status: No, hits=-2.8 required=5.0	tests=ALL_TRUSTED,AWL,BAYES_00,TW_MX
X-Spam-Check-By: sourceware.org
Received: from localhost (HELO gcc.gnu.org) (127.0.0.1)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 30 Jan 2012 23:17:26 +0000
From: "jakub at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/52056] Code optimization sensitive to trivial changes
Date: Tue, 31 Jan 2012 01:04:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: minor
X-Bugzilla-Who: jakub at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields: CC
Message-ID: <bug-52056-4-av5VVhLoQf@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-52056-4@http.gcc.gnu.org/bugzilla/>
References: <bug-52056-4@http.gcc.gnu.org/bugzilla/>
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2012-01/txt/msg03603.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52056

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |irar at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-01-30 23:16:03 UTC ---
The signed vs. unsigned long right shift is quite significant here: Intel
chips don't support signed quadword right shifts as vector instructions, only
unsigned quadword right shifts (and left shifts).  AMD chips with -mxop do
support the signed variant.
So, with the unsigned long right shift the loop is vectorized, while with the
signed long right shift it is not.  Clearly, in this case the vectorization
(only two elements at a time) isn't beneficial, but the cost model doesn't
figure that out.  The faster times are therefore the ones without
vectorization; you can get the same speed with -O3 -fno-tree-vectorize even
with the unsigned shift.
Even AVX can't process more than two elements at a time; only AVX2 will be
able to.  How fast that loop is on AVX2 capable chips compared to the
non-vectorized version remains to be seen.
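
To make the difference concrete, here is a minimal sketch of the two loop
shapes being discussed.  This is not the test case from the bug report; the
function names, signatures and the assumption that long is 64 bits wide
(LP64, as on x86-64 GNU/Linux) are purely illustrative.

```c
#include <stddef.h>

/* Hypothetical kernels for illustration only, assuming a 64-bit long. */

/* Signed (arithmetic) right shift of each element.  Without a vector
   instruction for signed quadword right shifts, this loop is the one
   expected to stay scalar. */
void shift_signed(long *dst, const long *src, size_t n, int count)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] >> count;
}

/* Unsigned (logical) right shift of each element.  This is the form
   the vectorizer can handle, two elements per 128-bit vector. */
void shift_unsigned(unsigned long *dst, const unsigned long *src,
                    size_t n, int count)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] >> count;
}
```

Compiling such a file with -O3 -S and again with -O3 -fno-tree-vectorize -S
and comparing the assembly for the two functions should show whether the
vectorizer kicked in; the exact output depends on the GCC version and target
flags.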