From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x630.google.com (mail-ej1-x630.google.com [IPv6:2a00:1450:4864:20::630]) by sourceware.org (Postfix) with ESMTPS id D213E3839417 for ; Thu, 15 Jul 2021 11:59:03 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D213E3839417 Received: by mail-ej1-x630.google.com with SMTP id go30so8829413ejc.8 for ; Thu, 15 Jul 2021 04:59:03 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=152R4SFR7chP//3An2DmfM0CzVdy/9eq5ADf8OYwCFY=; b=C7oXivzNVeKCyMp2sLFy6II3SLQTWNIDJTK5vuMxLmkOB0gR7J9QfmEFxVUxFcNKA2 JBCvyO0ENann7y3eia5jMjjaM7RUu/UmBYGErF5OG5bRkEEnhAmILBydWPOYtV2//LIJ IDDNnrrNl9/HUd2Nwr0xcWWejkvGy0RL8N3dJqIIbWij6mmfsrwsq6iKkADBJYnO4/1w kZGwW3DUybUgnbN8JH6hfWa8ZZp6G7+X05jNK3HANJoozmDMYR1vQUU13Q3MbVUWZfiW c/HvR063m/wjE1ndhijjEWeiRBxYFQl1zYjycPvJ6a+HLkodAnV+JSQhpYBEv/c3Qsbo +iNg== X-Gm-Message-State: AOAM533h1AxyRkI5zdVc6vCbBzeqs4x7faw2yDCxDJLB/mg9up8WLscd hW9Cc2xLSlXgmHk3EpR+G35f4ZPqETYyBiYxN1E= X-Google-Smtp-Source: ABdhPJxvlR4bzyBw2A9ywnsJ5p212bOOmyZnth2wxMRcNw1cgIGrWrZlpbVSs+/6QuXQMdzqsyHubysJ5FvOstjJqpI= X-Received: by 2002:a17:906:e1a:: with SMTP id l26mr5169635eji.129.1626350342857; Thu, 15 Jul 2021 04:59:02 -0700 (PDT) MIME-Version: 1.0 References: <0b72fa77-a281-35e6-34e3-17cf26f18bc1@linux.ibm.com> <46838de4-3d92-a270-e71a-73fbe923d306@linux.ibm.com> <926b210b-4394-6410-548a-33ca0297955a@linux.ibm.com> In-Reply-To: <926b210b-4394-6410-548a-33ca0297955a@linux.ibm.com> From: Richard Biener Date: Thu, 15 Jul 2021 13:58:51 +0200 Message-ID: Subject: Re: [PATCH v3] vect: Recog mul_highpart pattern To: "Kewen.Lin" Cc: Richard Sandiford , Bill Schmidt , GCC Patches , Uros Bizjak , Segher Boessenkool Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Jul 2021 11:59:05 -0000 On Thu, Jul 15, 2021 at 10:41 AM Kewen.Lin wrote: > > on 2021/7/15 =E4=B8=8B=E5=8D=884:04, Kewen.Lin via Gcc-patches wrote: > > Hi Uros, > > > > on 2021/7/15 =E4=B8=8B=E5=8D=883:17, Uros Bizjak wrote: > >> On Thu, Jul 15, 2021 at 9:07 AM Kewen.Lin wrote: > >>> > >>> on 2021/7/14 =E4=B8=8B=E5=8D=883:45, Kewen.Lin via Gcc-patches wrote: > >>>> on 2021/7/14 =E4=B8=8B=E5=8D=882:38, Richard Biener wrote: > >>>>> On Tue, Jul 13, 2021 at 4:59 PM Kewen.Lin wro= te: > >>>>>> > >>>>>> on 2021/7/13 =E4=B8=8B=E5=8D=888:42, Richard Biener wrote: > >>>>>>> On Tue, Jul 13, 2021 at 12:25 PM Kewen.Lin = wrote: > >>>>> > >>>>>> I guess the proposed IFN would be directly mapped for [us]mul_high= part? > >>>>> > >>>>> Yes. > >>>>> > >>>> > >>>> Thanks for confirming! The related patch v2 is attached and the tes= ting > >>>> is ongoing. > >>>> > >>> > >>> It's bootstrapped & regtested on powerpc64le-linux-gnu P9 and > >>> aarch64-linux-gnu. But on x86_64-redhat-linux there are XPASSes as b= elow: > >>> > >>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhuw > >>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhuw > >>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhw > >>> XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhw > >> > >> These XFAILs should be removed after your patch. > >> > > I'm curious whether it's intentional not to specify -fno-vect-cost-mode= l > > for this test case. As noted above, this case is sensitive on how we > > cost mult_highpart. Without cost modeling, the XFAILs can be removed > > only with this mul_highpart pattern support, no matter how we model it > > (x86 part of this patch exists or not). > > > >> This is PR100696 [1], we want PMULH.W here, so x86 part of the patch > >> is actually not needed. > >> > > > > Thanks for the information! The justification for the x86 part is that= : > > the IFN_MULH essentially covers MULT_HIGHPART_EXPR with mul_highpart > > optab support, i386 port has already customized costing for > > MULT_HIGHPART_EXPR (should mean/involve the case with mul_highpart opta= b > > support), if we don't follow the same way for IFN_MULH, I'm worried tha= t > > we may cost the IFN_MULH wrongly. If taking IFN_MULH as normal stmt is > > a right thing (we shouldn't cost it specially), it at least means we > > have to adjust ix86_multiplication_cost for MULT_HIGHPART_EXPR when it > > has direct mul_highpart optab support, I think they should be costed > > consistently. Does it sound reasonable? > > > > Hi Richard(s), > > This possibly inconsistent handling problem seems like a counter example > better to use a new IFN rather than the existing tree_code, it seems hard > to maintain (should remember to keep consistent for its handlings). ;) > From this perspective, maybe it's better to move backward to use tree_cod= e > and guard it under can_mult_highpart_p =3D=3D 1 (just like IFN and avoid > costing issue Richi pointed out before)? > > What do you think? No, whenever we want to do code generation based on machine capabilities the canonical way to test for those is to look at optabs and then it's most natural to keep that 1:1 relation and emit internal function calls which directly map to supported optabs instead of going back to some tree codes. When targets "lie" and provide expanders for something they can only emulate then they have to compensate in their costing. But as I understand this isn't the case for x86 here. Now, in this case we already have the MULT_HIGHPART_EXPR tree, so yes, it might make sense to use that instead of introducing an alternate way via the direct internal function. Somebody decided that MULT_HIGHPART is generic enough to warrant this - but I see that expand_mult_highpart can fail unless can_mult_highpart_p and this is exactly one of the cases we want to avoid - either we can handle something generally in which case it can be a tree code or we can't, then it should be 1:1 tied to optabs at best (mult_highpart has scalar support only for the direct optab, vector support also for widen_mult). Richard. > > BR, > Kewen