[RFC] propagation leap over memory copy for struct

public inbox for gcc-patches@gcc.gnu.org

From: Jiufu Guo <guojiufu@linux.ibm.com>
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org,
	guojiufu@linux.ibm.com, rguenth@gcc.gnu.org, pinskia@gcc.gnu.org
Subject: [RFC] propgation leap over memory copy for struct
Date: Mon, 31 Oct 2022 10:42:35 +0800
Message-ID: <20221031024235.110995-1-guojiufu@linux.ibm.com>

Hi,

We know that a struct variable assignment may be done with a memory copy,
and that a memcpy loads and stores as many bytes as possible at a time.
This may not be the best choice here:
1. The variable may be operated on before/after the struct assignment,
and it is hard for some optimizations to leap over the memcpy.  Some struct
operations may then be sub-optimal, like the issue in PR65421.
2. The size of a struct is mostly constant, so the memcpy would be expanded.
Using small loads/stores that execute in parallel may not be slower than
using large loads/stores (though more registers may be needed for the
smaller accesses).
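
To make the two copy strategies concrete, here is a minimal C sketch (plain
C, not GCC internals).  The names copy_block and copy_fields are hypothetical:
the first mirrors a struct assignment expanded as one block copy, the second
copies element by element so that each access keeps the type of the field.

```c
#include <string.h>

#define FN 4
typedef struct { double a[FN]; } A;

/* Block copy: roughly what "*dst = *src" may expand to.  The copy is
   free to use the widest loads/stores the target offers.  */
void copy_block (A *dst, const A *src)
{
  memcpy (dst, src, sizeof (A));
}

/* Field-wise copy: one double-sized load/store per array element.  */
void copy_fields (A *dst, const A *src)
{
  for (int i = 0; i < FN; i++)
    dst->a[i] = src->a[i];
}
```

Both functions leave the same bytes in the destination; they differ only in
the access sizes a later pass has to look through.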

In PR65421, for the source code below:
////////t.c
#define FN 4
typedef struct { double a[FN]; } A;

A foo (const A *a) { return *a; }
A bar (const A a) { return a; }
///////

If FN<=2, the size of "A" fits into TImode, and this code can be optimized
(by subreg/cse/fwprop/cprop) as:
-------
foo:
.LFB0:
        .cfi_startproc
        blr

bar:
.LFB1:
      	.cfi_startproc
	lfd 2,8(3)
	lfd 1,0(3)
	blr
--------
If the size of "A" is larger than any INT mode size, RTL insns would be 
generated as:
   13: r125:V2DI=[r112:DI+0x20]
   14: r126:V2DI=[r112:DI+0x30]
   15: [r112:DI]=r125:V2DI
   16: [r112:DI+0x10]=r126:V2DI  /// memcpy for assignment: D.3338 = arg;
   17: r127:DF=[r112:DI]
   18: r128:DF=[r112:DI+0x8]
   19: r129:DF=[r112:DI+0x10]
   20: r130:DF=[r112:DI+0x18]
------------
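
In C terms, the dump above behaves roughly like the hypothetical
read_via_copy below: the argument is first copied into a temporary in wide
pieces, and the fields are then reloaded from that temporary.  read_direct is
what the propagation passes would ideally reduce it to.  Both names are
illustrative only, the 16-byte chunks merely stand in for the V2DI accesses,
and the sketch assumes an 8-byte double.

```c
#include <string.h>

typedef struct { double a[4]; } A;

/* Copy in two 16-byte pieces, then reload each field from the copy.
   Assumption: sizeof (double) == 8, so sizeof (A) == 32.  */
void read_via_copy (const A *arg, double out[4])
{
  A tmp;
  /* First wide piece: compare insns 13 and 15.  */
  memcpy ((unsigned char *) &tmp, (const unsigned char *) arg, 16);
  /* Second wide piece: compare insns 14 and 16.  */
  memcpy ((unsigned char *) &tmp + 16, (const unsigned char *) arg + 16, 16);
  /* Field reloads from the temporary: compare insns 17-20.  */
  for (int i = 0; i < 4; i++)
    out[i] = tmp.a[i];
}

/* The desired result: read the fields straight from the source.  */
void read_direct (const A *arg, double out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = arg->a[i];
}
```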

I'm thinking about ways to improve this.
Method1: One way may be to expand the memory copy according to the type
of the struct, if the struct is not too big, and generate insns
like the below:
   13: r125:DF=[r112:DI+0x20]
   15: r126:DF=[r112:DI+0x28]
   17: r127:DF=[r112:DI+0x30]
   19: r128:DF=[r112:DI+0x38]
   14: [r112:DI]=r125:DF
   16: [r112:DI+0x8]=r126:DF
   18: [r112:DI+0x10]=r127:DF
   20: [r112:DI+0x18]=r128:DF
   21: r129:DF=[r112:DI]
   22: r130:DF=[r112:DI+0x8]
   23: r131:DF=[r112:DI+0x10]
   24: r132:DF=[r112:DI+0x18]

Then passes (cse, prop, dse...) could help to optimize the code.
Concerns with this method: we may not want to do this if the number of
fields is too large.  And the type/mode of each load/store may
depend on the platform and may not be the same as the types of the fields
of the struct.  For example, for "struct {double a[3]; long long l;}" on
ppc64le, DImode may be better for assignments on a parameter.
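
The concern about modes can be illustrated in plain C.  This is a minimal
sketch, assuming 8-byte double and long long with no padding (as on ppc64le);
B and copy_by_dwords are hypothetical names, and uint64_t only stands in for
DImode.  It copies the mixed-type struct in uniform 8-byte integer pieces
instead of following the field types.

```c
#include <stdint.h>
#include <string.h>

typedef struct { double a[3]; long long l; } B;

/* Assumption: B splits evenly into 8-byte pieces; checked at compile time.  */
_Static_assert (sizeof (B) % sizeof (uint64_t) == 0,
                "B must split into 8-byte pieces");

/* Copy B as a sequence of 8-byte integer pieces, regardless of whether
   a piece holds a double or a long long.  */
void copy_by_dwords (B *dst, const B *src)
{
  for (size_t off = 0; off < sizeof (B); off += sizeof (uint64_t))
    {
      uint64_t piece;
      memcpy (&piece, (const unsigned char *) src + off, sizeof piece);
      memcpy ((unsigned char *) dst + off, &piece, sizeof piece);
    }
}
```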

Method2: Another way may be enhancing CSE so that it can treat one large
memory slot as two (or more) combined slots:
   13: r125:V2DI#0=[r112:DI+0x20]
   13': r125:V2DI#8=[r112:DI+0x28]
   15: [r112:DI]#0=r125:V2DI#0
   15': [r112:DI]#8=r125:V2DI#8

This may seem more like a hack in CSE.
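
The equivalence Method2 relies on can be sketched in C.  copy_wide and
copy_split are hypothetical names, and slot16 only stands in for one
V2DI-sized slot; this is not how CSE represents anything internally.  A
16-byte copy and two 8-byte copies at offsets 0 and 8 produce the same memory
contents, so a later 8-byte load from the destination could be matched
against the corresponding half of the source.

```c
#include <string.h>

typedef struct { unsigned char b[16]; } slot16;  /* one 16-byte slot */

/* Copy one 16-byte slot in a single step (compare insns 13 and 15).  */
void copy_wide (unsigned char *dst, const unsigned char *src)
{
  slot16 t;
  memcpy (&t, src, sizeof t);
  memcpy (dst, &t, sizeof t);
}

/* The same copy viewed as two 8-byte sub-slots at offsets 0 and 8
   (compare insns 13/13' and 15/15').  */
void copy_split (unsigned char *dst, const unsigned char *src)
{
  for (int off = 0; off < 16; off += 8)
    {
      unsigned char half[8];
      memcpy (half, src + off, sizeof half);
      memcpy (dst + off, half, sizeof half);
    }
}
```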

Method3: For some record types, use "PARALLEL:BLK" instead of "MEM:BLK".
To do this, "moving" between "PARALLEL<->PARALLEL" and "PARALLEL<->MEM"
may need to be enhanced.  This method may require more effort to make
it work for corner/unknown cases.

I'm wondering which of these would be the most flexible way to handle this issue?
Thanks for any comments and suggestions!

BR,
Jeff(Jiufu)
