From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 8 Sep 2021 11:31:03 -0400
From: Michael Meissner
To: Segher Boessenkool
Cc: Michael Meissner, gcc-patches@gcc.gnu.org, David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt
Subject: Re: [PATCH] Fix SFmode subreg of DImode and TImode
Mail-Followup-To: Michael Meissner, Segher Boessenkool, gcc-patches@gcc.gnu.org, David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt
References: <20210907230730.GM1583@gate.crashing.org>
In-Reply-To: <20210907230730.GM1583@gate.crashing.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="Rw4F8Ozu2gBrB4Nr"
Content-Disposition: inline
List-Id: Gcc-patches mailing list

--Rw4F8Ozu2gBrB4Nr
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Sep 07, 2021 at 06:07:30PM -0500, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Sep 07, 2021 at 03:12:36AM -0400, Michael Meissner wrote:
> > [PATCH] Fix SFmode subreg of DImode and TImode
> >
> > This patch fixes the breakage in the PowerPC due to a recent change in SUBREG
> > behavior.
>
> But what was that change?  And was that intentional?  If so, why wasn't
> it documented, was the existing behaviour considered buggy?  But the
> documentation agrees with the previous behaviour afaics.

I haven't investigated which machine-independent change suddenly started
allowing:

    (subreg:SF (reg:TI ...))

This patch just allows these subregs to exist instead of getting an insn not
found message.  An alternative approach would be to allow wider subregs in the
insn matching and deal with getting the right physical register when splitting.

> > While it is arguable that the patch that caused the breakage should
> > be reverted, this patch should be a bandage to prevent these changes from
> > happening again.
>
> NAK.  This patch will likely cause us to generate worse code.  If that
> is not the case it will need a long, in-depth explanation of why not.

Trying to replicate the problem, I get exactly the same code with an older
compiler.  I have attached the test program.

> Sorry.
>
> > I first noticed it in building the Spec 2017 wrf_r and blender_r
> > benchmarks.  Once I applied this patch, I also noticed several of the
> > tests now pass.
> >
> > The core of the problem is we need to treat SUBREG's of SFmode and SImode
> > specially on the PowerPC.  This is due to the fact that SFmode values that are
> > in the vector and floating point registers are represented as DFmode.
> > When we want to do a direct move between the GPR registers and the vector
> > registers, we have to convert the value from the DFmode representation
> > to/from the SFmode representation.
>
> The core of the problem is that subreg of pseudos has three meanings:
>   -- Paradoxical subregs;
>   -- Actual subregs;
>   -- "bit_cast" thingies: treat the same bits as something else.  Like
>      looking at the bits of a float as its memory image.
>
> Ignoring paradoxical subregs (as well as subregs of mem, which should
> have disappeared by now), and subregs of hard registers as well (those
> have *different* semantics after all), the other two kinds can be mixed,
> and *have to* be mixed, because subregs of subregs are non-canonical.

This isn't a subreg of a subreg.  It is a subreg of a larger integer type: for
example, a register holding an __int128_t or a structure, where the code
accesses the bottom or top element.  Instead of doing an extract, the compiler
generates a subreg.

Previously, the machine-independent part of the compiler would not generate
these subregs; now it does.

> > Is there any reason why not to allow this kind of subreg?
>
> If we want to not allow mixing bit_cast with subregs, we should make it
> its own RTL code.
>
> > +  /* In case we are given a SUBREG for a larger type, reduce it to
> > +     SImode.  */
> > +  if (mode == SFmode && GET_MODE_SIZE (inner_mode) > 4)
> > +    {
> > +      rtx tmp = gen_reg_rtx (SImode);
> > +      emit_move_insn (tmp, gen_lowpart (SImode, source));
> > +      emit_insn (gen_movsf_from_si (dest, tmp));
> > +      return true;
> > +    }
>
> This makes it two separate insns.  Is that always optimised to code that
> is at least as good as before?

Even if it generates an extra move, the result will likely be better, because
the default would be to transfer the value via a store from one register bank
and a load into the other.
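
As an illustration only (this is not part of the patch), the two-step sequence
above can be mimicked in portable C.  The sketch below assumes IEEE 754
binary32/binary64 formats, and every function name in it is hypothetical; none
of them corresponds to a GCC internal.  low_word_as_float first narrows a wide
integer to its low 32 bits (roughly what gen_lowpart (SImode, ...) asks for)
and then reinterprets those bits as a float (the "bit_cast" step).
double_image shows why the second step is more than a plain register copy when
SFmode values are held in DFmode format: the same value has a different bit
pattern once widened.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret 32 bits as a float without changing them (the "bit_cast"
   meaning of a subreg).  Assumes float is IEEE 754 binary32.  */
static float
bits_to_float (uint32_t bits)
{
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

/* Return the bit pattern the same value has once widened to double.
   Assumes double is IEEE 754 binary64.  This models, loosely, a single
   precision value kept in double format inside a register.  */
static uint64_t
double_image (float f)
{
  double d = f;
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Two steps, mirroring the shape of the quoted hunk: narrow the wide
   integer to its low 32 bits, then reinterpret those bits as a float.  */
static float
low_word_as_float (uint64_t wide)
{
  uint32_t low = (uint32_t) wide;   /* integer narrowing */
  return bits_to_float (low);       /* bit cast */
}
```

With these definitions, low_word_as_float (0xdeadbeef3f800000) yields 1.0f,
since 0x3f800000 is the memory image of 1.0f, while double_image (1.0f) is
0x3ff0000000000000.  The two images differ, which is the conversion a direct
move between register banks has to account for.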

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

--Rw4F8Ozu2gBrB4Nr
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="foo-subreg.c"

union u_128 {
  __int128_t i128;
  long l[2];
  int i[4];
  double d[2];
  float f[4];
};

union u_64 {
  long l;
  int i[2];
  double d;
  float f[2];
};

float
ret_u128_f0 (long a, long b, long x, long y)
{
  union u_128 u;

  u.l[0] = a ^ x;
  u.l[1] = b ^ y;
  return u.f[0];
}

float
ret_u128_f1 (long a, long b, long x, long y)
{
  union u_128 u;

  u.l[0] = a ^ x;
  u.l[1] = b ^ y;
  return u.f[1];
}

float
ret_u128_f2 (long a, long b, long x, long y)
{
  union u_128 u;

  u.l[0] = a ^ x;
  u.l[1] = b ^ x;
  return u.f[2];
}

float
ret_u128_f3 (long a, long b, long x, long y)
{
  union u_128 u;

  u.l[0] = a ^ x;
  u.l[1] = b ^ x;
  return u.f[3];
}

float
ret_u64_f0 (long a, long x)
{
  union u_64 u;

  u.l = a ^ x;
  return u.f[0];
}

float
ret_u64_f1 (long a, long x)
{
  union u_64 u;

  u.l = a ^ x;
  return u.f[1];
}

--Rw4F8Ozu2gBrB4Nr--