From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 20 Jul 2023 14:02:54 +0100 (BST)
From: "Maciej W. Rozycki"
To: Richard Biener
cc: Jiufu Guo, YunQiang Su, Rainer Orth, Mike Stump, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 2/3] testsuite: Require 128-bit vectors for bb-slp-pr95839.c

On Thu, 20 Jul 2023, Richard Biener wrote:

> > There's no such requirement in the psABI and I fail to see a plausible
> > justification. And direct GPR<->FPR move patterns are available in the
> > backend for the V2SF mode. Also there's no delay slot requirement even
> > for these move instructions for MIPS64r1+ ISA levels, which have this
> > paired-single FP format defined. It seems to me a plain bug (or missed
> > optimisation if you prefer).
>
> Definitely. OTOH parameter/return passing for V4SFmode while
> apparently being done in registers the backend(?)
> assigns BLKmode
> to the V4SFmode arguments so they get immediately spilled in the

 MIPS NewABI targets use registers to return data of small aggregate types
(effectively of up to the TImode size), so this seems reasonable to me. FP
scalars and aggregates made of up to two FP fields are returned in FPRs and
any other data is returned in GPRs:

"* Function results are returned in $2 (and $3 if needed), or $f0 (and $f2
   if needed), as appropriate for the type. Composite results (struct,
   union, or array) are returned in $2/$f0 and $3/$f2 according to the
   following rules:

"  - A struct with only one or two floating point fields is returned in
     $f0 (and $f2 if necessary). This is a generalization of the Fortran
     COMPLEX case.

"  - Any other struct or union results of at most 128 bits are returned in
     $2 (first 64 bits) and $3 (remainder, if necessary)."

 Given that V4SFmode data has more than two FP fields (it's effectively an
array of four) it is correctly returned in GPRs. The advantage of this
arrangement is questionable, but the NewABI predates the invention of the
paired-single FP format by a few years: the format was only introduced with
the MIPS V ISA, and actually implemented even later, with the MIPS64r1 ISA.

 A similar NewABI rule works here for the arguments. I suspect the relevant
part of the backend handles it correctly for other modes and was missed in
the update for V4SFmode, which was a change on its own. The only
sufficiently old version of GCC I have ready to use is 4.1.2 and it
produces the same code, so at least it does not seem to be a regression.

> code moving the incoming hardregisters to pseudos (or stack as in
> this case). It comes down to the issue that Jiufu Guo is eventually
> addressing with adding SRA-style heuristics to the code choosing
> the layout of that storage. Interestingly for the return value we get
> TImode.

 That may come from the use of the GPRs I suppose.
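
 To make the NewABI return rules quoted earlier in this message concrete,
here is a minimal sketch in C that models just those two rules. It is not
GCC backend code: `agg_desc', `ret_loc' and `classify_return' are names made
up for illustration, and anything the quoted text does not cover is simply
reported as RET_OTHER rather than guessed at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two psABI rules quoted above; not GCC code.  */
enum ret_loc
{
  RET_FPR,   /* $f0 (and $f2 if necessary) */
  RET_GPR,   /* $2 (and $3 if necessary) */
  RET_OTHER  /* not covered by the quoted rules */
};

/* Simplified description of a composite type (illustrative only).  */
struct agg_desc
{
  size_t n_fields;     /* total number of fields or elements */
  size_t n_fp_fields;  /* how many of them are floating point */
  size_t size_bits;    /* total size of the object in bits */
};

enum ret_loc
classify_return (const struct agg_desc *d)
{
  assert (d != NULL && d->n_fp_fields <= d->n_fields);

  /* Rule 1: only one or two fields, all of them floating point.  */
  if (d->n_fields >= 1 && d->n_fields <= 2
      && d->n_fp_fields == d->n_fields)
    return RET_FPR;

  /* Rule 2: any other composite of at most 128 bits.  */
  if (d->size_bits <= 128)
    return RET_GPR;

  return RET_OTHER;
}
```

 With this model a pair of floats such as { 2, 2, 64 } is classified as
RET_FPR, while a V4SF-like object described as { 4, 4, 128 } is classified
as RET_GPR, which matches the behaviour discussed above.
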
> Note we don't seem to be able to optimize
>
> (insn 6 21 8 2 (set (mem/c:DI (plus:DI (reg/f:DI 78 $frame)
>                 (const_int 24 [0x18])) [1 a+8 S8 A64])
>         (reg:DI 5 $5)) "t.c":4:1 322 {*movdi_64bit}
>      (expr_list:REG_DEAD (reg:DI 5 $5)
>         (nil)))
> ...
> (insn 40 7 41 2 (set (reg:V2SF 205 [ a+8 ])
>         (mem/c:V2SF (plus:DI (reg/f:DI 78 $frame)
>                 (const_int 24 [0x18])) [1 a+8 S8 A64])) "t.c":6:23 387 {*movv2sf}
>      (expr_list:REG_EQUIV (mem/c:V2SF (plus:DI (reg/f:DI 78 $frame)
>                 (const_int 24 [0x18])) [1 a+8 S8 A64])
>         (nil)))
>
> for some reason. Maybe we are afraid of the hardreg use in the store,

 I believe the reason is that the relevant constraints use the `*' modifier
so as not to spill FP values to GPRs or vice versa, so once we've spilled
to memory it won't be undone. ISTR a discussion as to why we should
prevent such spilling from happening and I don't remember the outcome, but
overall it seems reasonable to me. That doesn't mean we should refrain
from moving directly when data is there already in the "wrong" kind of
register.

> maybe it is because the store is in the prologue (before
> NOTE_INSN_FUNCTION_BEG). Also postreload isn't able to fix this:
>
> (insn 6 21 8 2 (set (mem/c:DI (plus:DI (reg/f:DI 29 $sp)
>                 (const_int 24 [0x18])) [1 a+8 S8 A64])
>         (reg:DI 5 $5)) "t.c":4:1 322 {*movdi_64bit}
>      (nil))
> ...
> (insn 40 7 41 2 (set (reg:V2SF 32 $f0 [orig:205 a+8 ] [205])
>         (mem/c:V2SF (plus:DI (reg/f:DI 29 $sp)
>                 (const_int 24 [0x18])) [1 a+8 S8 A64])) "t.c":6:23 387 {*movv2sf}
>      (expr_list:REG_EQUIV (mem/c:V2SF (plus:DI (reg/f:DI 78 $frame)
>                 (const_int 24 [0x18])) [1 a+8 S8 A64])
>         (nil)))
>
> so something is amiss in the backend as well if you say there should be
> direct moves available.

 There are, they're alternatives #5/#6 (`mtc'/`mfc') in `*movv2sf' and
they're handled correctly by `mips_output_move' AFAICT. Hardware has
always had these moves, so there's no ISA constraint here. But as I say,
I'm leaving it to the backend maintainer to sort out.

  Maciej