From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yw1-x112e.google.com (mail-yw1-x112e.google.com [IPv6:2607:f8b0:4864:20::112e]) by sourceware.org (Postfix) with ESMTPS id 95D933896C0D for ; Thu, 17 Nov 2022 08:04:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 95D933896C0D Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-yw1-x112e.google.com with SMTP id 00721157ae682-37063f855e5so11758597b3.3 for ; Thu, 17 Nov 2022 00:04:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=TqGqhN6exEPctJKM90Bb0TGDjTDomXrUvHUgZacMARY=; b=d2pyHIYi4Ex9fcZqFjVLJ/5aD9xffZHphPFrt/e5eGbrP0/w7BRQPFjeQQpWA1v4IS UQBFXugy4HIF7JMGsqqgPJTTnYGmDeo6LazDW7wL0OmmNHY92kbh8em0Lxj793Qmi6ti YPkdHbxyt1UxBbPCpNE3xhgS5bIx/wKwCWIR0Cer2oQLMMooIXawLXrpl8jCTE3eqHrm E6HKvhmM0VE+KUfdTbYvU+7lC+44DmH2zXhXUZZt/3EmbghfyikecRGcSajBfW8DNh6F o78co5Ji8tUL+Q/Ja+ZAP9TpAKJJ/dDqfbIa79LsugpEMYYe2bL1BSzxbeifsJa1PmzG gBbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=TqGqhN6exEPctJKM90Bb0TGDjTDomXrUvHUgZacMARY=; b=kJa+TK3VZZ/coZWBT4n/Oe66R5f3HVXGzfoM490loKIctIq3e1Kvnp2/QPJjQNhdm+ Cfj3U8Q3OZzB7lo4lxA9oscCRRXyVyH/uDqSTmdtzmc5t9w6TEagBADi1T0my/GwC1xP iVfQMWkhFUVdx904VOkcjEJ0B4fk9W6/WZRtuhjAs2fQVv5gpABavsaMHDV2k+tqLfwo QfATNvn7O3srwsJgnDxPPb8JIzZMV9hac0W1XpdwkzsW9oY7rqcnI9GgMkXLwvmBIJZX zU1pFXE/C5DwdYkZigI6MrAQMeaBhP31uzdWaQILA/PvTSg3KSlrTX/0L4opR3EQxWSH kZEQ== X-Gm-Message-State: ANoB5pmacNk3y+WT8DyU6MWH79enYe4XNnUVLJA8YPHPrSwnYBxZOFvF rUUH+ca/mc8C2LPrhtO7q9r5mFiIpbqa0UQoNIA= X-Google-Smtp-Source: AA0mqf5pmkPE5ut0YbahGcHc0znsSdPWuo6poXsM5J8aG8IFktSUAA9HrFuqzZSCwPpkNYA/DEVX/U3sQj1+bKjDBVA= X-Received: by 2002:a0d:ce86:0:b0:36b:999a:da62 with SMTP id q128-20020a0dce86000000b0036b999ada62mr1036554ywd.249.1668672297759; Thu, 17 Nov 2022 00:04:57 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Hongtao Liu Date: Thu, 17 Nov 2022 16:04:46 +0800 Message-ID: Subject: Re: [PATCH 3/8]middle-end: Support extractions of subvectors from arbitrary element position inside a vector To: Tamar Christina , Hongtao Liu , Tamar Christina via Gcc-patches , nd , "rguenther@suse.de" , richard.sandiford@arm.com Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-7.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_MANYTO,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Wed, Nov 16, 2022 at 1:39 AM Richard Sandiford wrote: > > Tamar Christina writes: > >> -----Original Message----- > >> From: Hongtao Liu > >> Sent: Tuesday, November 15, 2022 9:37 AM > >> To: Tamar Christina > >> Cc: Richard Sandiford ; Tamar Christina via > >> Gcc-patches ; nd ; > >> rguenther@suse.de > >> Subject: Re: [PATCH 3/8]middle-end: Support extractions of subvectors from > >> arbitrary element position inside a vector > >> > >> On Tue, Nov 15, 2022 at 4:51 PM Tamar Christina > >> wrote: > >> > > >> > > -----Original Message----- > >> > > From: Hongtao Liu > >> > > Sent: Tuesday, November 15, 2022 8:36 AM > >> > > To: Tamar Christina > >> > > Cc: Richard Sandiford ; Tamar Christina > >> > > via Gcc-patches ; nd ; > >> > > rguenther@suse.de > >> > > Subject: Re: [PATCH 3/8]middle-end: Support extractions of > >> > > subvectors from arbitrary element position inside a vector > >> > > > >> > > Hi: > >> > > I'm from https://gcc.gnu.org/pipermail/gcc-patches/2022- > >> > > November/606040.html. > >> > > > } > >> > > > > >> > > > /* See if we can get a better vector mode before extracting. > >> > > > */ diff --git a/gcc/optabs.cc b/gcc/optabs.cc index > >> > > > > >> > > > >> cff37ccb0dfc3dd79b97d0abfd872f340855dc96..f338df410265dfe55b68961600 > >> > > 9 > >> > > 0 > >> > > > a453cc6a28d9 100644 > >> > > > --- a/gcc/optabs.cc > >> > > > +++ b/gcc/optabs.cc > >> > > > @@ -6267,6 +6267,7 @@ expand_vec_perm_const (machine_mode > >> mode, > >> > > rtx v0, rtx v1, > >> > > > v0_qi = gen_lowpart (qimode, v0); > >> > > > v1_qi = gen_lowpart (qimode, v1); > >> > > > if (targetm.vectorize.vec_perm_const != NULL > >> > > > + && targetm.can_change_mode_class (mode, qimode, > >> > > > + ALL_REGS) > >> > > It looks like you want to guard gen_lowpart, shouldn't it be better > >> > > to use validate_subreg or (tmp = gen_lowpart_if_possible (mode, > >> target_qi)). > >> > > IMHO, targetm.can_change_mode_class is mostly used for RA, but not > >> > > to guard gen_lowpart. > >> > > >> > Hmm I don't think this is quite true, there are existing usages in > >> > expr.cc and rtanal.cc That do this and aren't part of RA. As I > >> > mentioned before for instance the canoncalization of vec_select to subreg > >> in rtlanal for instances uses this. > >> In theory, we need to iterate through all reg classes that can be assigned for > >> both qimode and mode, if any regclass returns true for > >> targetm.can_change_mode_class, the bitcast(validate_subreg) should be ok. > >> Here we just passed ALL_REGS. > > > > Yes, and most targets where this transformation is valid return true here. > > > > I've checked: > > * alpha > > * arm > > * aarch64 > > * rs6000 > > * s390 > > * sparc > > * pa > > * mips > > > > And even the default example that other targets use from the documentation > > would return true as the size of the modes are the same. > > > > X86 and RISCV are the only two targets that I found (but didn't check all) that > > blankly return a result based on just the register classes. > > > > That is to say, there are more targets that adhere to the interpretation that > > rclass here means "should be possible in some class in rclass" rather than > > "should be possible in ALL classes of rclass". > > Yeah, I agree. A query "can something stored in ALL_REGS change from > mode M1 to mode M2?" is meaningful if at least one register R in ALL_REGS > can hold both M1 and M2. It's then the target's job to answer > conservatively so that the result covers all such R. > > In principle it's OK for a target to err on the side of caution and forbid > things that are actually OK. But that's going to risk losing performance > in some cases, and sometimes that loss of performance will be unacceptable. > IMO that's what's happening here. The target is applying x87 rules to > things that (AIUI) are never stored in x87 registers, and so losing Yes, it can be optimized since some mode will never assigned to x87 registers. > performance as a result. > > Note that the RA also uses ALL_REGS for some things, so this usage > isn't specific to non-RA code. RA passes the minimal reg class(REGNO_REG_CLASS) which contains REGN to decide if can_change_mode_class, not ALL_REGS. 511/* Given a hard REGN a FROM mode and a TO mode, return true if 512 REGN can change from mode FROM to mode TO. */ 513#define REG_CAN_CHANGE_MODE_P(REGN, FROM, TO) \ 514 (targetm.can_change_mode_class (FROM, TO, REGNO_REG_CLASS (REGN))) 515 So I still think using can_change_mode_class outside of RA with ALL_REGS passed to decide whether it's ok to generate subreg is not a good idea. > > IMO it's not the job of target-independent code to iterate through > individual classes and aggregate the result. One of the reasons for > having union classes is to avoid the need to do that. And ALL_REGS > is the ultimate union class. :-) > > The patch looks correct to me. > > Thanks, > Richard > > >> > > >> > So there are already existing precedence for this. And the > >> > documentation for the hook says: > >> > > >> > "This hook returns true if it is possible to bitcast values held in registers of > >> class rclass from mode from to mode to and if doing so preserves the low- > >> order bits that are common to both modes. The result is only meaningful if > >> rclass has registers that can hold both from and to. The default > >> implementation returns true" > >> > > >> > So it looks like it's use outside of RA is perfectly valid.. and the > >> > documentation also mentions in the example the use from the mid-end as > >> an example. > >> > > >> > But if the mid-end maintainers are happy I'll use something else. > >> > > >> > Tamar > >> > > >> > > I did similar things in > >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579296.html > >> > > (and ALL_REGS doesn't cover all cases for registers which are both > >> > > available for qimode and mode, ALL_REGS fail doesn't mean it can't > >> > > be subreg, it just means parts of ALL_REGS can't be subreg. but with > >> > > a subset of ALL_REGS, there could be a reg class which return true > >> > > for > >> > > targetm.can_change_mode_class) > >> > > > && targetm.vectorize.vec_perm_const (qimode, qimode, > >> > > > target_qi, > >> > > v0_qi, > >> > > > v1_qi, qimode_indices)) > >> > > > return gen_lowpart (mode, target_qi); @@ -6311,7 +6312,8 > >> > > > @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, > >> > > > } > >> > > > > >> > > > if (qimode != VOIDmode > >> > > > - && selector_fits_mode_p (qimode, qimode_indices)) > >> > > > + && selector_fits_mode_p (qimode, qimode_indices) > >> > > > + && targetm.can_change_mode_class (mode, qimode, ALL_REGS)) > >> > > > { > >> > > > icode = direct_optab_handler (vec_perm_optab, qimode); > >> > > > if (icode != CODE_FOR_nothing) diff --git > >> > > > a/gcc/testsuite/gcc.target/aarch64/ext_1.c > >> > > > b/gcc/testsuite/gcc.target/aarch64/ext_1.c > >> > > > new file mode 100644 > >> > > > index > >> > > > > >> > > > >> 0000000000000000000000000000000000000000..18a10a14f1161584267a8472e5 > >> > > 71 > >> > > > b3bc2ddf887a > >> > > > >> > > > >> > > > >> > > > >> > > -- > >> > > BR, > >> > > Hongtao > >> > >> > >> > >> -- > >> BR, > >> Hongtao -- BR, Hongtao