From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=GT99=3Z=kernel.crashing.org=segher@sourceware.org>
Received: from gate.crashing.org (gate.crashing.org [63.228.1.57])
	by sourceware.org (Postfix) with ESMTP id 20FE6384F8A5;
	Fri, 25 Nov 2022 15:47:05 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 20FE6384F8A5
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=kernel.crashing.org
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=kernel.crashing.org
Received: from gate.crashing.org (localhost.localdomain [127.0.0.1])
	by gate.crashing.org (8.14.1/8.14.1) with ESMTP id 2APFk4AQ028085;
	Fri, 25 Nov 2022 09:46:04 -0600
Received: (from segher@localhost)
	by gate.crashing.org (8.14.1/8.14.1/Submit) id 2APFk3XU028083;
	Fri, 25 Nov 2022 09:46:03 -0600
X-Authentication-Warning: gate.crashing.org: segher set sender to segher@kernel.crashing.org using -f
Date: Fri, 25 Nov 2022 09:46:03 -0600
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Jiufu Guo <guojiufu@linux.ibm.com>
Cc: "Kewen.Lin" <linkw@linux.ibm.com>, dje.gcc@gmail.com, linkw@gcc.gnu.org,
        gcc-patches@gcc.gnu.org
Subject: Re: [PATCH]rs6000: Load high and low part of 64bit constant independently
Message-ID: <20221125154603.GH25951@gate.crashing.org>
References: <20220915083052.74903-1-guojiufu@linux.ibm.com> <7b482d49-3928-552c-ccf5-d391684b7f2b@linux.ibm.com> <7ev8n3p3u6.fsf@pike.rch.stglabs.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <7ev8n3p3u6.fsf@pike.rch.stglabs.ibm.com>
User-Agent: Mutt/1.4.2.3i
X-Spam-Status: No, score=-3.0 required=5.0 tests=BAYES_00,JMQ_SPF_NEUTRAL,KAM_DMARC_STATUS,KAM_SHORT,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

Hi!

On Fri, Nov 25, 2022 at 09:21:21PM +0800, Jiufu Guo wrote:
> "Kewen.Lin" <linkw@linux.ibm.com> writes:
> > on 2022/9/15 16:30, Jiufu Guo wrote:
> >> For a complicate 64bit constant, blow is one instruction-sequence to
> >> build:
> >> 	lis 9,0x800a
> >> 	ori 9,9,0xabcd
> >> 	sldi 9,9,32
> >> 	oris 9,9,0xc167
> >> 	ori 9,9,0xfa16
> >> 
> >> while we can also use below sequence to build:
> >> 	lis 9,0xc167
> >> 	lis 10,0x800a
> >> 	ori 9,9,0xfa16
> >> 	ori 10,10,0xabcd
> >> 	rldimi 9,10,32,0
> >> This sequence is using 2 registers to build high and low part firstly,
> >> and then merge them.
> >> In parallel aspect, this sequence would be faster. (Ofcause, using 1 more
> >> register with potential register pressure).

And crucially this patch only uses two registers if can_create_pseudo_p.
Please mention that.

> >> 	* config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Update 64bit
> >> 	constant build.

If you don't give details of what this does, just say "Update." please.
But update to what?

"Generate more parallel code if can_create_pseudo_p." maybe?

> >> +	  rtx H = gen_reg_rtx (DImode);
> >> +	  rtx L = gen_reg_rtx (DImode);

Please don't use all-uppercase variable names, those are for macros.  In
fact, don't use uppercase in variable (and function etc.) names at all,
unless there is a really good reason to.

Just call it "high" and "low", or "hi" and "lo", or something?

> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
> >> @@ -0,0 +1,27 @@
> >> +/* { dg-do run } */
> >> +/* { dg-options "-O2 -mdejagnu-cpu=power7  -save-temps" } */
> >
> > Why do we need power7 here?
> power8/9 are also ok for this case.  Actually, O just want to
> avoid to use new p10 instruction, like "pli", and then selected
> an old arch option.

Why does it need _at least_ p7, is the question (as I understand it).

To prohibit pli etc. you can do -mno-prefixed (which works on all older
CPUs just as well), or skip the test if prefixed insns are enabled, or
scan for the then generated code as well.  The first option is by far
the simplest.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
> @@ -0,1 +1,27 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8  -save-temps" } */
> +/* { dg-require-effective-target has_arch_ppc64 } */

p8 here makes no sense, we also want good and correct code generated for
older CPUs, so we should not preevent testing on those.  For that reason
you shouldn't use -mcpu= when not needed.  Like here :-)