Date: Thu, 07 Apr 2016 21:56:00 -0000
From: Jakub Jelinek
To: Jeff Law, Bernd Schmidt
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Don't add REG_EQUAL notes in fwprop for paradoxical subregs (PR rtl-optimization/70574)
Message-ID: <20160407215633.GM19207@tucnak.redhat.com>

Hi!

The following testcase is miscompiled because we have the instruction

  (set (reg:SI ...) (subreg:SI (reg:QI ...) 0))

and fwprop attempts to propagate (const_int -1) into the reg:QI use in it,
but gives up because the costs don't say it is beneficial, and instead adds
a REG_EQUAL (const_int -1) note to the insn.

That is wrong, though.  It is fine to optimize this insn to

  (set (reg:SI ...) (const_int -1))

because the higher bits in the paradoxical subreg are undefined, but
precisely because they are undefined, those bits can be anything if the
subreg is kept.  If we say the subreg is equal to (const_int -1), then e.g.
CSE2 can replace other places that need SImode -1 with the SET_DEST of this
insn, and the result then depends on what bits actually end up in the
register.  If we are unlucky and it is e.g. spilled during LRA and reloaded
using QImode, the upper bits can be anything.

I'm not sure this patch catches everything, though; there could perhaps be
e.g.

  (set (reg:SI ...) (plus:SI (subreg:SI (reg:QI ...) 0) (const_int ...)))

and we'd still add a REG_EQUAL note.  So maybe we should instead walk the
*loc expression, look for paradoxical subregs, and for each of them, if
DF_REF_REG (use) is mentioned in its operand, clear set_reg_equal.  Of
course, if DF_REF_REG (use) itself is a paradoxical subreg, we can clear
set_reg_equal without any walking.

2016-04-07  Jakub Jelinek

	PR rtl-optimization/70574
	* fwprop.c (forward_propagate_and_simplify): Don't add REG_EQUAL note
	if DF_REF_REG (use) is a paradoxical subreg.

	* gcc.target/i386/avx2-pr70574.c: New test.
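To make the "upper bits can be anything" argument concrete, here is a minimal sketch in plain C.  It is not GCC code: both function names are made up for illustration, and the model simply assumes a 32-bit register whose low 8 bits hold the QImode value while the upper 24 bits hold whatever was left there.

```c
#include <stdint.h>

/* Hypothetical model, not GCC code: the SImode view of a register whose
   low 8 bits hold a QImode value.  GARBAGE stands for whatever happens to
   be in the upper 24 bits, e.g. after a spill and a QImode reload.  */
uint32_t
paradoxical_subreg_si (uint8_t qi_value, uint32_t garbage)
{
  return (garbage & 0xffffff00u) | qi_value;
}

/* Return 1 if the SImode view equals (const_int -1), 0 otherwise.  */
int
equals_si_minus_one (uint32_t si_value)
{
  return si_value == 0xffffffffu;
}
```

With qi_value 0xff (QImode -1), equals_si_minus_one holds only when the upper bits happen to be all ones; the low byte is 0xff either way.  That is the sense in which (const_int -1) is one possible value of the subreg rather than something a REG_EQUAL note can promise.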
--- gcc/fwprop.c.jj	2016-01-04 14:55:53.000000000 +0100
+++ gcc/fwprop.c	2016-04-07 18:01:42.953844357 +0200
@@ -1213,7 +1213,7 @@ forward_propagate_and_simplify (df_ref u
   rtx_insn *use_insn = DF_REF_INSN (use);
   rtx use_set = single_set (use_insn);
   rtx src, reg, new_rtx, *loc;
-  bool set_reg_equal;
+  bool set_reg_equal = true;
   machine_mode mode;
   int asm_use = -1;
 
@@ -1240,7 +1240,15 @@ forward_propagate_and_simplify (df_ref u
   /* Check if the use has a subreg, but the def had the whole reg.  Unlike the
      previous case, the optimization is possible and often useful indeed.  */
   else if (GET_CODE (reg) == SUBREG && REG_P (SET_DEST (def_set)))
-    reg = SUBREG_REG (reg);
+    {
+      /* If the use is a paradoxical subreg, make sure we don't add a
+         REG_EQUAL note for it, because it is not equivalent, it is one
+         possible value for it, but we can't rely on it holding that value.
+         See PR70574.  */
+      if (paradoxical_subreg_p (reg))
+        set_reg_equal = false;
+      reg = SUBREG_REG (reg);
+    }
 
   /* Make sure that we can treat REG as having the same mode as the
      source of DEF_SET.  */
@@ -1301,13 +1309,13 @@ forward_propagate_and_simplify (df_ref u
          otherwise.  We also don't want to install a note if we are merely
          propagating a pseudo since verifying that this pseudo isn't dead
          is a pain; moreover such a note won't help anything.  */
-      set_reg_equal = (note == NULL_RTX
-                       && REG_P (SET_DEST (use_set))
-                       && !REG_P (src)
-                       && !(GET_CODE (src) == SUBREG
-                            && REG_P (SUBREG_REG (src)))
-                       && !reg_mentioned_p (SET_DEST (use_set),
-                                            SET_SRC (use_set)));
+      set_reg_equal &= (note == NULL_RTX
+                        && REG_P (SET_DEST (use_set))
+                        && !REG_P (src)
+                        && !(GET_CODE (src) == SUBREG
+                             && REG_P (SUBREG_REG (src)))
+                        && !reg_mentioned_p (SET_DEST (use_set),
+                                             SET_SRC (use_set)));
     }
 
   if (GET_MODE (*loc) == VOIDmode)
--- gcc/testsuite/gcc.target/i386/avx2-pr70574.c.jj	2016-04-07 18:09:25.788519218 +0200
+++ gcc/testsuite/gcc.target/i386/avx2-pr70574.c	2016-04-07 18:09:21.825573327 +0200
@@ -0,0 +1,26 @@
+/* PR rtl-optimization/70574 */
+/* { dg-do run { target lp64 } } */
+/* { dg-require-effective-target avx2 } */
+/* { dg-options "-O -frerun-cse-after-loop -fno-tree-ccp -mcmodel=medium -mavx2" } */
+/* { dg-additional-options "-fPIC" { target fpic } } */
+
+#include "avx2-check.h"
+
+typedef char A __attribute__((vector_size (32)));
+typedef short B __attribute__((vector_size (32)));
+
+int
+foo (int x, __int128 y, __int128 z, A w)
+{
+  y <<= 64;
+  w *= (A) { 0, -1, z, 0, ~y };
+  return w[0] + ((B) { x, 0, y, 0, -1 } | 1)[4];
+}
+
+static void
+avx2_test ()
+{
+  int x = foo (0, 0, 0, (A) {});
+  if (x != -1)
+    __builtin_abort ();
+}

	Jakub