From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 508303858D37 for ; Thu, 20 Jan 2022 09:43:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 508303858D37 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EF8331FB; Thu, 20 Jan 2022 01:43:35 -0800 (PST) Received: from [10.57.11.168] (unknown [10.57.11.168]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 22B783F766; Thu, 20 Jan 2022 01:43:34 -0800 (PST) Message-ID: Date: Thu, 20 Jan 2022 09:43:42 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.4.1 Subject: Re: [PATCH v3 04/15] arm: Add GENERAL_AND_VPR_REGS regclass Content-Language: en-US To: Christophe Lyon Cc: Christophe Lyon , GCC Patches , Richard Sandiford References: <20220113145645.4077141-1-christophe.lyon@foss.st.com> <20220113145645.4077141-5-christophe.lyon@foss.st.com> <49a12bbc-66b8-e625-c37e-d2a90b21e6a2@arm.com> From: "Andre Vieira (lists)" In-Reply-To: X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, HTML_MESSAGE, KAM_DMARC_STATUS, NICE_REPLY_A, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Jan 2022 09:43:38 -0000 On 20/01/2022 09:14, Christophe Lyon wrote: > > > On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches > wrote: > > Hi Christophe, > > On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote: > > At some point during the development of this patch series, it > appeared > > that in some cases the register allocator wants “VPR or general” > > rather than “VPR or general or FP” (which is the same thing as > > ALL_REGS).  The series does not seem to require this anymore, but it > > seems to be a good thing to do anyway, to give the register > allocator > > more freedom. > Not sure I fully understand this, but I guess it creates an extra > class > the register allocator can use to group things that can go into > VPR or > general reg? > > > > CLASS_MAX_NREGS and arm_hard_regno_nregs need adjustment to avoid a > > regression in gcc.dg/stack-usage-1.c when compiled with -mthumb > > -mfloat-abi=hard -march=armv8.1-m.main+mve.fp+fp.dp. > I have not looked into this failure, but ... > > > > 2022-01-13  Christophe Lyon  > > > >       gcc/ > >       * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS. > >       (REG_CLASS_NAMES): Likewise. > >       (REG_CLASS_CONTENTS): Likewise. > >       (CLASS_MAX_NREGS): Handle VPR. > >       * config/arm/arm.c (arm_hard_regno_nregs): Handle VPR. > > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > > index bb75921f32d..c3559ca8703 100644 > > --- a/gcc/config/arm/arm.c > > +++ b/gcc/config/arm/arm.c > > @@ -25287,6 +25287,9 @@ thumb2_asm_output_opcode (FILE * stream) > >   static unsigned int > >   arm_hard_regno_nregs (unsigned int regno, machine_mode mode) > >   { > > +  if (IS_VPR_REGNUM (regno)) > > +    return CEIL (GET_MODE_SIZE (mode), 2); > When do we ever want to use more than 1 register for VPR? > > > That was tricky. > Richard Sandiford helped me analyze the problem, I guess I can quote him: > > RS> I think the problem is a combination of a few things: > RS> > RS> (1) arm_hard_regno_mode_ok rejects SImode in VPR, so SImode moves > RS>     to or from the VPR_REG class get the maximum cost. > RS> > RS> (2) IRA thinks from CLASS_MAX_NREGS and arm_hard_regno_nregs that > RS>    VPR is big enough to hold SImode. > RS> > RS> (3) If a class C1 is a superset of a class C2, and if C2 is big enough > RS>     to hold a mode M, IRA ensures that move costs for M involving C1 > RS>     are >= move costs for M involving C2. > RS> > RS> (1) is correct but (2) isn't.  IMO (3) is dubious: the trigger should > RS> be whether C2 is actually allowed to hold M, not whether C2 is big > enough > RS> to hold M.  However, changing that is likely to cause problems > elsewhere, > RS> and could lead to classes like GENERAL_AND_FP_REGS being used when > RS> FP_REGS are disabled (which might be confusing). > RS> I understand everything up until here. > RS> “Fixing” (2) using: > RS> > RS>  CEIL (GET_MODE_SIZE (mode), 2) I was wondering why not just return '1' for VPR_REGNUM, rather than use the fact that the mode-size we use for VPR is 2 bytes, so diving it by 2 makes 1. Unless we ever decide to use a larger mode for VPR, maybe that's what this is trying to address? I can't imagine we would ever need to though since for MVE there is only one VPR register and it is always 16-bits. Just feels overly complicated to me. > RS> > RS> for VPR_REG & VPR_REGNUM seems to make the costs correct.  I don't > know > RS> if it would cause other problems though. > RS> > RS> I don't think CLASS_MAX_NREGS should do anything special for > superclasses > RS> of VPR_REG, even though that makes the definition non-obvious.  If an > RS> SImode is stored in GENERAL_AND_VPR_REGS, it will in reality be stored > RS> in the GENERAL_REGS subset, so the maximum count should come from > there > RS> rather than VPR_REG. > > Does that answer your question? I guess it end's up being correct, just don't understand the complexity that's all.