From: Richard Earnshaw
Date: Fri, 17 Aug 2012 15:20:00 -0000
To: Andrew Stubbs
CC: "gcc-patches@gcc.gnu.org", Richard Sandiford
Subject: Re: [patch, tree-ssa] PR54295 Incorrect value extension in widening multiply-accumulate
Message-ID: <502E612E.60706@arm.com>
In-Reply-To: <502E5DE6.8090806@codesourcery.com>

On 17/08/12 16:06, Andrew Stubbs wrote:
> On 17/08/12 15:47, Richard Earnshaw wrote:
>> If we don't have a 16x16->64 mult operation then after step 1 we'll
>> still have a MULT_EXPR, not a WIDEN_MULT_EXPR, so when we reach step2
>> there's nothing to short circuit.
>>
>> Unless, of course, you're expecting us to get
>>
>>   step1 -> 16x16->32 widen mult
>>   step2 -> widen64(step1) + acc64
>
> No, given a u16xu16->u64 operation in the code, and that the arch
> doesn't have such an opcode, I'd expect to get
>
>   step1 -> (u32)u16 x (u32)u16 -> u64

Hmm, I would have thought that would be more costly than

  (u64)(u16 x u16 -> u32)

>
> Likewise, 8x8->32 might give (16)8x(16)8->32.
>
> The code can't see that the widening operation is non-optimal without
> looking beyond into its inputs.

Ok, in which case we have to give is_widening_mult_rhs_p enough smarts
to not strip (s32)u32 and return u32.

I'll have another think about it.

R.
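
For illustration only (this is not part of the patch, and not the PR54295 test case): a minimal C sketch of the two points discussed above. It assumes nothing about GCC internals, and all function names are hypothetical. The first pair of functions shows that (u64)(u16 x u16 -> u32) yields the same value as doing the multiply in 64 bits, since a u16 x u16 product always fits in 32 bits. The second pair shows why a sign-changing conversion such as (s32)u32 cannot simply be stripped: once bit 31 is set, extending the value as s32 and extending it as u32 give different 64-bit results.

```c
#include <assert.h>
#include <stdint.h>

/* (u64)(u16 x u16 -> u32): multiply in 32 bits, then zero-extend.
   The product of two u16 values is at most 0xFFFE0001, so it always
   fits in 32 bits.  */
uint64_t mul_then_widen(uint16_t a, uint16_t b)
{
    uint32_t product = (uint32_t)a * (uint32_t)b;
    return (uint64_t)product;
}

/* Reference result: the whole multiplication carried out in 64 bits.  */
uint64_t mul_in_u64(uint16_t a, uint16_t b)
{
    return (uint64_t)a * (uint64_t)b;
}

/* Extend a 32-bit pattern to 64 bits as an unsigned (u32) value.  */
uint64_t zero_extend32(uint32_t v)
{
    return (uint64_t)v;
}

/* Extend the same 32-bit pattern as if it were signed (s32).  Written
   with explicit bit operations so it does not rely on the
   implementation-defined u32 -> s32 conversion.  */
uint64_t sign_extend32(uint32_t v)
{
    uint64_t wide = (uint64_t)v;
    if (v & UINT32_C(0x80000000))
        wide |= UINT64_C(0xFFFFFFFF00000000);
    return wide;
}

/* Checks the properties described in the comments above.  */
void check_widening_examples(void)
{
    /* Both widening strategies agree on the largest u16 x u16 product.  */
    assert(mul_then_widen(0xFFFF, 0xFFFF) == mul_in_u64(0xFFFF, 0xFFFF));

    /* With bit 31 clear, the choice of extension does not matter.  */
    assert(sign_extend32(5u) == zero_extend32(5u));

    /* With bit 31 set, treating (s32)u32 as plain u32 changes the value.  */
    assert(sign_extend32(UINT32_C(0xFFFE0001))
           != zero_extend32(UINT32_C(0xFFFE0001)));
}
```

Calling check_widening_examples() from a small main() should return silently; the last assertion is the case where picking the wrong extension for the multiply operand would change the accumulated 64-bit result.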