public inbox for gcc@gcc.gnu.org
* Tree-SSA and POST_INC address mode incompatible in GCC4?
From: Bingfeng Mei @ 2007-11-02 12:24 UTC
  To: gcc

Hello,

I looked at the following code to see how GCC4 and GCC3 differ in their
use of the POST_INC addressing mode (and other similar modes).

void tst(char * __restrict__ a, char * __restrict__ b){
  *a++ = *b++;
  *a++ = *b++;
  *a++ = *b++;
  *a++ = *b++;
  *a++ = *b++;
  *a++ = *b++;
  *a = *b;
}


With ARM as the target, GCC 4.2.2 generates the following assembly:
tst:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	mov	r2, r1
	ldrb	ip, [r2], #1	@ zero_extendqisi2
	mov	r3, r0
	strb	ip, [r3], #1
	ldrb	r1, [r1, #1]	@ zero_extendqisi2
	strb	r1, [r0, #1]
	ldrb	r1, [r2, #1]	@ zero_extendqisi2
	strb	r1, [r3, #1]
	add	r2, r2, #1
	ldrb	r1, [r2, #1]	@ zero_extendqisi2
	add	r3, r3, #1
	strb	r1, [r3, #1]
	add	r2, r2, #1
	ldrb	r1, [r2, #1]	@ zero_extendqisi2
	add	r3, r3, #1
	strb	r1, [r3, #1]
	add	r2, r2, #1
	ldrb	r1, [r2, #1]	@ zero_extendqisi2
	add	r3, r3, #1
	strb	r1, [r3, #1]
	ldrb	r2, [r2, #2]	@ zero_extendqisi2
	@ lr needed for prologue
	strb	r2, [r3, #2]
	bx	lr
	.size	tst, .-tst
	.ident	"GCC: (GNU) 4.2.2"

GCC 3.4.6 generates much better code by using the POST_INC addressing
mode extensively:

tst:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	ldrb	r3, [r1], #1	@ zero_extendqisi2
	strb	r3, [r0], #1
	ldrb	r3, [r1], #1	@ zero_extendqisi2
	strb	r3, [r0], #1
	ldrb	r3, [r1], #1	@ zero_extendqisi2
	strb	r3, [r0], #1
	ldrb	r3, [r1], #1	@ zero_extendqisi2
	strb	r3, [r0], #1
	ldrb	r3, [r1], #1	@ zero_extendqisi2
	strb	r3, [r0], #1
	ldrb	r3, [r1], #1	@ zero_extendqisi2
	strb	r3, [r0], #1
	ldrb	r3, [r1, #0]	@ zero_extendqisi2
	@ lr needed for prologue
	strb	r3, [r0, #0]
	mov	pc, lr
	.size	tst, .-tst
	.ident	"GCC: (GNU) 3.4.6"

I looked at the dump in tst.c.102t.final_cleanup:
tst (a, b)
{
  char * restrict a.54;
  char * restrict a.53;
  char * restrict a.52;
  char * restrict a.51;
  char * restrict a.50;
  char * restrict b.48;
  char * restrict b.47;
  char * restrict b.46;
  char * restrict b.45;
  char * restrict b.44;

<bb 2>:
  *a = *b;
  a.50 = a + 1B;
  b.44 = b + 1B;
  *a.50 = *b.44;
  a.51 = a.50 + 1B;
  b.45 = b.44 + 1B;
  *a.51 = *b.45;
  a.52 = a.51 + 1B;
  b.46 = b.45 + 1B;
  *a.52 = *b.46;
  a.53 = a.52 + 1B;
  b.47 = b.46 + 1B;
  *a.53 = *b.47;
  a.54 = a.53 + 1B;
  b.48 = b.47 + 1B;
  *a.54 = *b.48;
  *(a.54 + 1B) = *(b.48 + 1B);
  return;

}
I believe this is a fundamental issue with the Tree-SSA IR. The POST_INC
addressing mode requires a pattern in which the same variable is both
used and defined by the increment (USE and DEF), whereas SSA form
produces a different variable for each DEF. As a result, GCC4 cannot use
POST_INC and similar addressing modes efficiently. Is there any way to
overcome this problem? Any suggestions are greatly appreciated.
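
To make the contrast concrete, here is a minimal sketch of the two
shapes in C. The function names (copy7_single_var, copy7_ssa_style,
check_copy7_shapes) are made up for illustration and appear nowhere in
GCC. The first function keeps one pointer per stream, so each increment
uses and redefines the same variable. The second mirrors the
final_cleanup dump, with a fresh pointer for every increment. Both copy
the same seven bytes; the sketch shows only that the two forms are
equivalent at the source level and makes no claim about the code any
particular compiler emits for them.

```c
#include <assert.h>
#include <string.h>

/* One pointer per stream: every increment both uses and redefines the
   same variable, which is the shape a POST_INC pattern looks for. */
void copy7_single_var(char *restrict a, const char *restrict b)
{
    for (int i = 0; i < 6; i++)
        *a++ = *b++;
    *a = *b;
}

/* SSA-like shape: a fresh name for every increment, following the
   final_cleanup dump (a1..a5 stand in for a.50..a.54). */
void copy7_ssa_style(char *restrict a, const char *restrict b)
{
    *a = *b;
    char *a1 = a + 1;   const char *b1 = b + 1;   *a1 = *b1;
    char *a2 = a1 + 1;  const char *b2 = b1 + 1;  *a2 = *b2;
    char *a3 = a2 + 1;  const char *b3 = b2 + 1;  *a3 = *b3;
    char *a4 = a3 + 1;  const char *b4 = b3 + 1;  *a4 = *b4;
    char *a5 = a4 + 1;  const char *b5 = b4 + 1;  *a5 = *b5;
    *(a5 + 1) = *(b5 + 1);
}

/* Hypothetical self-check: both shapes must copy the same 7 bytes. */
void check_copy7_shapes(void)
{
    const char src[8] = "ABCDEFG";
    char d1[8] = {0}, d2[8] = {0};

    copy7_single_var(d1, src);
    copy7_ssa_style(d2, src);
    assert(memcmp(d1, src, 7) == 0);
    assert(memcmp(d1, d2, 7) == 0);
}
```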


Bingfeng Mei
Broadcom UK

* RE: Tree-SSA and POST_INC address mode incompatible in GCC4?
From: J.C. Pizarro @ 2007-11-03 14:47 UTC
  To: Kenneth Zadeck, gcc

2007/11/3, Kenneth Zadeck wrote:
> I believe that this is something new and is most likely fallout from
> Diego's reworking of the tree-to-RTL converter.
>
> To fix this will require a round of copy propagation, most likely in
> concert with some induction variable detection, since the most
> profitable place for this will be in loops.
>
> I wonder if any of this affects the RTL-level induction variable
> discovery?
>
> > Hi, Ramana,
> > I tried trunk with and without your patch. It still produces the
> > same code as GCC 4.2.2 does. In auto-inc-dec.c, the comments say:
> >
> >          *a
> >            ...
> >            a <- a + c
> >
> >         becomes
> >
> >            *(a += c) post
> >
> > But the problem is that after the Tree-SSA passes there is no
> >            a <- a + c
> > but rather something like
> >            a_1 <- a + c
> >
> > Unless auto-inc-dec.c can turn a_1 <- a + c back into a <- a + c, I
> > don't see how this transformation applies in most scenarios. Any
> > comments?
> >
> > Cheers,
> > Bingfeng

They need to add a post-SSA algorithm that makes the code reuse
variables, converting a_j <- phi(a_i,...) to a_k <- phi(a_k,...).

The POST_INC and POST_DEC algorithms are very specific, so a general
algorithm like the one above is sufficient.
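
As a rough sketch of what such a variable-reuse step could look like,
here is a toy version in C. It is not GCC's out-of-SSA or
auto-inc-dec.c code. The struct stmt representation and the names
used_later, coalesce_increments and check_coalesce are invented for
this example. It assumes straight-line code with no phi nodes, each
variable defined at most once, and no variable live after the last
statement. A statement is either "dst = src + inc" or, when dst is
negative, a pure use of src such as the load or store through *src.

```c
#include <assert.h>

/* Toy statement: dst = src + inc.  Variables are small integer ids.
   dst < 0 marks a pure use of src (for example the access *src). */
struct stmt { int dst, src, inc; };

/* Return 1 if variable v is read by any statement in s[from..n). */
int used_later(const struct stmt *s, int n, int from, int v)
{
    for (int i = from; i < n; i++)
        if (s[i].src == v)
            return 1;
    return 0;
}

/* For every "dst = src + inc" whose src is dead afterwards, rename
   dst to src here and in all later reads, giving "src = src + inc".
   Returns the number of statements rewritten this way. */
int coalesce_increments(struct stmt *s, int n)
{
    int rewritten = 0;

    for (int i = 0; i < n; i++) {
        int old = s[i].dst, keep = s[i].src;

        if (old < 0 || old == keep || used_later(s, n, i + 1, keep))
            continue;               /* pure use, already merged, or src still live */
        s[i].dst = keep;
        for (int j = i + 1; j < n; j++)
            if (s[j].src == old)
                s[j].src = keep;
        rewritten++;
    }
    return rewritten;
}

/* Hypothetical self-check modelled on the start of the dump:
   *a; a1 = a + 1; *a1; a2 = a1 + 1; *a2   (ids: a=0, a1=1, a2=2) */
void check_coalesce(void)
{
    struct stmt s[] = {
        { -1, 0, 0 }, { 1, 0, 1 }, { -1, 1, 0 }, { 2, 1, 1 }, { -1, 2, 0 }
    };

    assert(coalesce_increments(s, 5) == 2);
    assert(s[1].dst == 0 && s[1].src == 0);   /* a = a + 1 */
    assert(s[3].dst == 0 && s[3].src == 0);   /* a = a + 1 */
    assert(s[4].src == 0);                    /* *a */
}
```

After the rewrite each use of *a is followed by a <- a + 1 on the same
variable, which is the pattern the auto-inc-dec.c comment quoted above
describes. The toy refuses to merge when the old variable is still read
later, since renaming would then change which value that read sees.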

   J.C. Pizarro

Thread overview: 16+ messages
2007-11-02 12:24 Tree-SSA and POST_INC address mode incompatible in GCC4? Bingfeng Mei
2007-11-02 12:38 ` Ramana Radhakrishnan
2007-11-02 14:34   ` Bingfeng Mei
2007-11-03 13:52     ` Kenneth Zadeck
2007-11-03 14:25       ` Zdenek Dvorak
2007-11-03 14:52         ` Kenneth Zadeck
2007-11-03 15:27           ` Zdenek Dvorak
2007-11-03 16:23             ` Bingfeng Mei
2007-11-03 16:37             ` Richard Guenther
2007-11-04  3:13               ` Daniel Berlin
2007-11-04 23:51       ` Mark Mitchell
2007-11-05 19:30         ` Paul Brook
2007-11-03 14:47 J.C. Pizarro
2007-11-03 14:55 ` Kenneth Zadeck
2007-11-03 14:59   ` J.C. Pizarro
2007-11-03 14:55 ` J.C. Pizarro
