From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yw1-x112b.google.com (mail-yw1-x112b.google.com [IPv6:2607:f8b0:4864:20::112b]) by sourceware.org (Postfix) with ESMTPS id B715839730FC for ; Thu, 17 Nov 2022 10:20:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org B715839730FC Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-yw1-x112b.google.com with SMTP id 00721157ae682-381662c78a9so14278547b3.7 for ; Thu, 17 Nov 2022 02:20:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=i9ciTlz99vB15MoUEBWOY5t6KYBE42YpO/ODNkC8UzY=; b=abEljy5RDhqUsbAuCq65GzYgl6EEWZCSgloqH+LQ2ZrSCdjppzw2nZKmT/fukX8UzM pFfzxSEZ/PrQEuH+AcnGoau0nwqqhMw3/Vwj+llesRMf5DKsj80b29m/owtAwNPHuM78 escBy96lUvOT3cBZGWFgg6TS1k+WnXRzlo/SuKBZoi9l+yc6e0wyoa5FlVk5JgpiXBOG Ow3UZw2tHlK911vmD9PIpRFA7RlypM+WQgsnEcyXV5hil0J03/mM2Ym8bfqF+G5Dd1EV xjj/V8GjoLTyigwRBftogwHPA04caZ5jdIukflVxaXYq++OY1QsvVdi+HfxZHixbcvU+ UxWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=i9ciTlz99vB15MoUEBWOY5t6KYBE42YpO/ODNkC8UzY=; b=aYRHZv+RnD43f7uLIMkWY1j+uw8Q++cxiw5UZZ2IAiqSwV1/iEK1suQrx6zX/lcFmN nGVL2AEgrS2mrduCkyRFEdFWlhBlVwEXhUnNyngU4HZt8GEbuqcz+EozPBCJv+EBX8oa 32fDz9n8emKb11xsqP4/0VQMO3zuyRh9Z+oCduyrA2IGFhYLzIwzPvVsFvioOznOxKPg WZSp4ntmsprhe4Bxg9bPLwlYUY77OZkXo1LPC/v0aQdUlZbWk46Y6YhLbKZrOTT0PbIH x2oFmvWGaz05pSh77tVANwZM7q0CXHMvu/uV7w2hZqu7J28RmkIF9SeqPrufczQhzn+1 h6EA== X-Gm-Message-State: ANoB5pnNBy2V7+x3Qs1U5pOsvCCXgvOTa3DYmSKaFmAziZcmEErz52Tj 4N+SQBPX+nUk+VYdE1CldySayXEuIqhvXTtIP9o= X-Google-Smtp-Source: AA0mqf4P+gLB2vAG4v+RAX/YrbkSAaX8EFSR+WhI3eCjvnjJU/OnCD2GXeK8LpVv0RoqkVjz4J+ZWtOjDVvW6aor6no= X-Received: by 2002:a81:552:0:b0:367:b4b3:3952 with SMTP id 79-20020a810552000000b00367b4b33952mr1283443ywf.508.1668680442108; Thu, 17 Nov 2022 02:20:42 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Hongtao Liu Date: Thu, 17 Nov 2022 18:20:29 +0800 Message-ID: Subject: Re: [PATCH 3/8]middle-end: Support extractions of subvectors from arbitrary element position inside a vector To: Hongtao Liu , Tamar Christina , Tamar Christina via Gcc-patches , nd , "rguenther@suse.de" , richard.sandiford@arm.com Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-8.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_MANYTO,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, Nov 17, 2022 at 5:39 PM Richard Sandiford wrote: > > Hongtao Liu writes: > > On Wed, Nov 16, 2022 at 1:39 AM Richard Sandiford > > wrote: > >> > >> Tamar Christina writes: > >> >> -----Original Message----- > >> >> From: Hongtao Liu > >> >> Sent: Tuesday, November 15, 2022 9:37 AM > >> >> To: Tamar Christina > >> >> Cc: Richard Sandiford ; Tamar Christina via > >> >> Gcc-patches ; nd ; > >> >> rguenther@suse.de > >> >> Subject: Re: [PATCH 3/8]middle-end: Support extractions of subvectors from > >> >> arbitrary element position inside a vector > >> >> > >> >> On Tue, Nov 15, 2022 at 4:51 PM Tamar Christina > >> >> wrote: > >> >> > > >> >> > > -----Original Message----- > >> >> > > From: Hongtao Liu > >> >> > > Sent: Tuesday, November 15, 2022 8:36 AM > >> >> > > To: Tamar Christina > >> >> > > Cc: Richard Sandiford ; Tamar Christina > >> >> > > via Gcc-patches ; nd ; > >> >> > > rguenther@suse.de > >> >> > > Subject: Re: [PATCH 3/8]middle-end: Support extractions of > >> >> > > subvectors from arbitrary element position inside a vector > >> >> > > > >> >> > > Hi: > >> >> > > I'm from https://gcc.gnu.org/pipermail/gcc-patches/2022- > >> >> > > November/606040.html. > >> >> > > > } > >> >> > > > > >> >> > > > /* See if we can get a better vector mode before extracting. > >> >> > > > */ diff --git a/gcc/optabs.cc b/gcc/optabs.cc index > >> >> > > > > >> >> > > > >> >> cff37ccb0dfc3dd79b97d0abfd872f340855dc96..f338df410265dfe55b68961600 > >> >> > > 9 > >> >> > > 0 > >> >> > > > a453cc6a28d9 100644 > >> >> > > > --- a/gcc/optabs.cc > >> >> > > > +++ b/gcc/optabs.cc > >> >> > > > @@ -6267,6 +6267,7 @@ expand_vec_perm_const (machine_mode > >> >> mode, > >> >> > > rtx v0, rtx v1, > >> >> > > > v0_qi = gen_lowpart (qimode, v0); > >> >> > > > v1_qi = gen_lowpart (qimode, v1); > >> >> > > > if (targetm.vectorize.vec_perm_const != NULL > >> >> > > > + && targetm.can_change_mode_class (mode, qimode, > >> >> > > > + ALL_REGS) > >> >> > > It looks like you want to guard gen_lowpart, shouldn't it be better > >> >> > > to use validate_subreg or (tmp = gen_lowpart_if_possible (mode, > >> >> target_qi)). > >> >> > > IMHO, targetm.can_change_mode_class is mostly used for RA, but not > >> >> > > to guard gen_lowpart. > >> >> > > >> >> > Hmm I don't think this is quite true, there are existing usages in > >> >> > expr.cc and rtanal.cc That do this and aren't part of RA. As I > >> >> > mentioned before for instance the canoncalization of vec_select to subreg > >> >> in rtlanal for instances uses this. > >> >> In theory, we need to iterate through all reg classes that can be assigned for > >> >> both qimode and mode, if any regclass returns true for > >> >> targetm.can_change_mode_class, the bitcast(validate_subreg) should be ok. > >> >> Here we just passed ALL_REGS. > >> > > >> > Yes, and most targets where this transformation is valid return true here. > >> > > >> > I've checked: > >> > * alpha > >> > * arm > >> > * aarch64 > >> > * rs6000 > >> > * s390 > >> > * sparc > >> > * pa > >> > * mips > >> > > >> > And even the default example that other targets use from the documentation > >> > would return true as the size of the modes are the same. > >> > > >> > X86 and RISCV are the only two targets that I found (but didn't check all) that > >> > blankly return a result based on just the register classes. > >> > > >> > That is to say, there are more targets that adhere to the interpretation that > >> > rclass here means "should be possible in some class in rclass" rather than > >> > "should be possible in ALL classes of rclass". > >> > >> Yeah, I agree. A query "can something stored in ALL_REGS change from > >> mode M1 to mode M2?" is meaningful if at least one register R in ALL_REGS > >> can hold both M1 and M2. It's then the target's job to answer > >> conservatively so that the result covers all such R. > >> > >> In principle it's OK for a target to err on the side of caution and forbid > >> things that are actually OK. But that's going to risk losing performance > >> in some cases, and sometimes that loss of performance will be unacceptable. > >> IMO that's what's happening here. The target is applying x87 rules to > >> things that (AIUI) are never stored in x87 registers, and so losing > > Yes, it can be optimized since some mode will never assigned to x87 registers. > >> performance as a result. > >> > >> Note that the RA also uses ALL_REGS for some things, so this usage > >> isn't specific to non-RA code. > > RA passes the minimal reg class(REGNO_REG_CLASS) which contains REGN > > to decide if can_change_mode_class, not ALL_REGS. > > 511/* Given a hard REGN a FROM mode and a TO mode, return true if > > 512 REGN can change from mode FROM to mode TO. */ > > 513#define REG_CAN_CHANGE_MODE_P(REGN, FROM, TO) \ > > 514 (targetm.can_change_mode_class (FROM, TO, REGNO_REG_CLASS (REGN))) > > 515 > > > > So I still think using can_change_mode_class outside of RA with > > ALL_REGS passed to decide whether it's ok to generate subreg is not a > > good idea. > > But if the argument is that the only valid uses of can_change_mode_class > are through this macro, the hook isn't describing a class property, > it's describing the property of individual registers. If we say that > querying individual registers is the only valid thing to do them > we should change the hook to take a register number rather than > a class enum. > > The reason we have a class-based rather than register-based interface > is because it is useful to query classes before you've picked a > specific register. For individual registers in the minimal reg class, we assume they are not different from each other, I guess that's why we have REGNO_REG_CLASS and class-based interfaces other than register-based interfaces. But for ALL_REGS, it's not the minimal reg class, it's the largest. Using it It's not that suitable. If the argument is if some r in rclass is ok for mode change, the hook would return true, then why would RA use REGNO_REG_CLASS other than ALL_REGS. Another spot is in validate_subreg, we're using the minimal reg class instead of ALL_REGS. 973 /* This is a normal subreg. Verify that the offset is representable. */ 974 975 /* For hard registers, we already have most of these rules collected in 976 subreg_offset_representable_p. */ 977 if (reg && REG_P (reg) && HARD_REGISTER_P (reg)) 978 { 979 unsigned int regno = REGNO (reg); 980 981 if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode)) 982 && GET_MODE_INNER (imode) == omode) 983 ; 984 else if (!REG_CAN_CHANGE_MODE_P (regno, imode, omode)) 985 return false; I think we do need some hook in the middle end to query things like if some r in rclass is ok for mode change? but not reusing can_change_mode_class. > > Thanks, > Richard > > >> IMO it's not the job of target-independent code to iterate through > >> individual classes and aggregate the result. One of the reasons for > >> having union classes is to avoid the need to do that. And ALL_REGS > >> is the ultimate union class. :-) > >> > >> The patch looks correct to me. > >> > >> Thanks, > >> Richard > >> > >> >> > > >> >> > So there are already existing precedence for this. And the > >> >> > documentation for the hook says: > >> >> > > >> >> > "This hook returns true if it is possible to bitcast values held in registers of > >> >> class rclass from mode from to mode to and if doing so preserves the low- > >> >> order bits that are common to both modes. The result is only meaningful if > >> >> rclass has registers that can hold both from and to. The default > >> >> implementation returns true" > >> >> > > >> >> > So it looks like it's use outside of RA is perfectly valid.. and the > >> >> > documentation also mentions in the example the use from the mid-end as > >> >> an example. > >> >> > > >> >> > But if the mid-end maintainers are happy I'll use something else. > >> >> > > >> >> > Tamar > >> >> > > >> >> > > I did similar things in > >> >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579296.html > >> >> > > (and ALL_REGS doesn't cover all cases for registers which are both > >> >> > > available for qimode and mode, ALL_REGS fail doesn't mean it can't > >> >> > > be subreg, it just means parts of ALL_REGS can't be subreg. but with > >> >> > > a subset of ALL_REGS, there could be a reg class which return true > >> >> > > for > >> >> > > targetm.can_change_mode_class) > >> >> > > > && targetm.vectorize.vec_perm_const (qimode, qimode, > >> >> > > > target_qi, > >> >> > > v0_qi, > >> >> > > > v1_qi, qimode_indices)) > >> >> > > > return gen_lowpart (mode, target_qi); @@ -6311,7 +6312,8 > >> >> > > > @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, > >> >> > > > } > >> >> > > > > >> >> > > > if (qimode != VOIDmode > >> >> > > > - && selector_fits_mode_p (qimode, qimode_indices)) > >> >> > > > + && selector_fits_mode_p (qimode, qimode_indices) > >> >> > > > + && targetm.can_change_mode_class (mode, qimode, ALL_REGS)) > >> >> > > > { > >> >> > > > icode = direct_optab_handler (vec_perm_optab, qimode); > >> >> > > > if (icode != CODE_FOR_nothing) diff --git > >> >> > > > a/gcc/testsuite/gcc.target/aarch64/ext_1.c > >> >> > > > b/gcc/testsuite/gcc.target/aarch64/ext_1.c > >> >> > > > new file mode 100644 > >> >> > > > index > >> >> > > > > >> >> > > > >> >> 0000000000000000000000000000000000000000..18a10a14f1161584267a8472e5 > >> >> > > 71 > >> >> > > > b3bc2ddf887a > >> >> > > > >> >> > > > >> >> > > > >> >> > > > >> >> > > -- > >> >> > > BR, > >> >> > > Hongtao > >> >> > >> >> > >> >> > >> >> -- > >> >> BR, > >> >> Hongtao -- BR, Hongtao