From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 1666 invoked by alias); 4 Sep 2014 08:37:01 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 1616 invoked by uid 48); 4 Sep 2014 08:36:58 -0000
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/63148] [4.8/4.9/5 Regression] r187042 causes auto-vectorization failure for X86 for -m32.
Date: Thu, 04 Sep 2014 08:37:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 4.8.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.8.4
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-09/txt/msg01016.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

--- Comment #5 from Richard Biener ---
The input to the vectorizer is already bogus:

  _12 = i.0_5 + 536870911;
  _13 = global_data.b[_12];

the issue seems to be that 'sizetype' is used to index the array:

  *(double * const) &global_data.a[(sizetype) i] = *(double * const) &global_data.a[(sizetype) i] + *(double * const) &global_data.c[(sizetype) i] * *(double * const) &global_data.d[(sizetype) i];

Ok, so it's one of the suspicious transforms in fold-const.c (I removed all
these sorts of transforms from GIMPLE already...).  try_move_mult_to_index.
Bah.

It's never correct (for later data-dependence) to "reconstruct" ARRAY_REFs
from pointer arithmetic.
Here we fold (ssizetype) (((sizetype) i + 536870911) * 8) to &global_data.b[(sizetype) i + 536870911]. But that's not the same: data-dependence analysis doesn't treat the array index as something that only ends up in the address computation, where it is multiplied by 8 again and thus correctly wraps around to i * 8 + -8U. That is, you can't simply strip an unsigned multiplication this way.

For 64bit we seem to be lucky and we retain (sizetype) ((long unsigned int) i * 8) + 18446744073709551608, so we didn't move the multiplication out. That is because fold_plusminus_mult_expr only handles signed HWI and 18446744073709551608 is too large for a signed HWI.

So maybe we can apply a not-so-invasive fix here by restricting both to signed values, or unsigned values with no sign bit set. Doing that fixes the testcase but also ends up with mixed pointer/array accesses, which dependence analysis cannot handle, so we get versioning for aliasing.

If OTOH we disable the offending transform to an array access we get only pointer-based accesses, data dependence analysis fails correctly, and we don't get anything vectorized here.