public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/42586]  New: load-modify-store on x86 should be single instruction
@ 2010-01-03  6:04 andi-gcc at firstfloor dot org
  2010-01-03  6:27 ` [Bug rtl-optimization/42586] " pinskia at gcc dot gnu dot org
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: andi-gcc at firstfloor dot org @ 2010-01-03  6:04 UTC (permalink / raw)
  To: gcc-bugs

Continuation from 42585

>From one of the examples of
http://embed.cs.utah.edu/embarrassing/dec_09/harvest/gcc-head_llvm-gcc-head/

struct _fat_ptr
{
  unsigned char *curr;
  unsigned char *base;
  unsigned char *last_plus_one;
};
int Cyc_string_ungetc (int ignore, struct _fat_ptr *sptr);
int
Cyc_string_ungetc (int ignore, struct _fat_ptr *sptr)
{
  struct _fat_ptr *_T0;
  struct _fat_ptr *_T1;
  struct _fat_ptr _T2;
  int _T3;
  struct _fat_ptr _ans;
  int _change;

  {
    _T0 = sptr;
    _T1 = sptr;
    _T2 = *sptr;
    _T3 = -1;
    _ans = _T2;
    _change = -1;
    _ans.curr += 4294967295U;
    *sptr = _ans;
    return (0);
  }
}

when compiled with -O2 -m32 on 4.5.0 20091219 generates

Cyc_string_ungetc:
        subl    $32, %esp
        movl    40(%esp), %eax
        movl    (%eax), %edx
        subl    $1, %edx
        movl    %edx, (%eax)
        xorl    %eax, %eax
        addl    $32, %esp
        ret

Apart from the useless stack frame manipulation it should use

       subl $1,(%eax) 

not load-modify-store as recommended in the Intel/AMD optimization
manuals for any modern CPU.

I experimented with different -mtune=s
(thinking it was maybe optimizing for Pentium5 where this made sense), but that
didn't help (apart from turning the subl into a decl). The code snippet
above is for -mtune=generic -m32


-- 
           Summary: load-modify-store on x86 should be single instruction
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: andi-gcc at firstfloor dot org
  GCC host triplet: x86_64-linux
GCC target triplet: x86_64-linux -m32


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/42586] load-modify-store on x86 should be single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
@ 2010-01-03  6:27 ` pinskia at gcc dot gnu dot org
  2010-01-03  6:28 ` pinskia at gcc dot gnu dot org
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  6:27 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from pinskia at gcc dot gnu dot org  2010-01-03 06:27 -------
Well one thing is SRA does not do its job which is why there is useless stack
frame changes.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug rtl-optimization/42586] load-modify-store on x86 should be single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
  2010-01-03  6:27 ` [Bug rtl-optimization/42586] " pinskia at gcc dot gnu dot org
@ 2010-01-03  6:28 ` pinskia at gcc dot gnu dot org
  2010-01-03  6:30 ` [Bug tree-optimization/42586] SRA does not always work pinskia at gcc dot gnu dot org
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  6:28 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from pinskia at gcc dot gnu dot org  2010-01-03 06:27 -------
Oh and -Os is very bad looking :).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] SRA does not always work
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
  2010-01-03  6:27 ` [Bug rtl-optimization/42586] " pinskia at gcc dot gnu dot org
  2010-01-03  6:28 ` pinskia at gcc dot gnu dot org
@ 2010-01-03  6:30 ` pinskia at gcc dot gnu dot org
  2010-01-03  6:33 ` pinskia at gcc dot gnu dot org
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  6:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from pinskia at gcc dot gnu dot org  2010-01-03 06:30 -------
Oh in fact I think the reason is the SRA issue.  In that:
int
Cyc_string_ungetc (int ignore, struct _fat_ptr *sptr)
{
  sptr->curr += 4294967295U;
}
Works.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |tree-optimization
           Keywords|                            |missed-optimization
            Summary|load-modify-store on x86    |SRA does not always work
                   |should be single instruction|
            Version|unknown                     |4.5.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] SRA does not always work
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (2 preceding siblings ...)
  2010-01-03  6:30 ` [Bug tree-optimization/42586] SRA does not always work pinskia at gcc dot gnu dot org
@ 2010-01-03  6:33 ` pinskia at gcc dot gnu dot org
  2010-01-03  6:35 ` pinskia at gcc dot gnu dot org
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  6:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from pinskia at gcc dot gnu dot org  2010-01-03 06:33 -------
Hmm, 4.5 is worse off than 4.4 with respect of the stack space usage.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] SRA does not always work
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (3 preceding siblings ...)
  2010-01-03  6:33 ` pinskia at gcc dot gnu dot org
@ 2010-01-03  6:35 ` pinskia at gcc dot gnu dot org
  2010-01-03  6:39 ` pinskia at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  6:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from pinskia at gcc dot gnu dot org  2010-01-03 06:35 -------
4.5 has:
  _T2 = *sptr_1(D);
  _T2$curr_14 = sptr_1(D)->curr;
  _ans = _T2;
  D.2697_7 = _T2$curr_14 + -1;
  *sptr_1(D) = _ans;
  sptr_1(D)->curr = D.2697_7;

While 4.4 does:
  _T2$base = sptr->base;
  D.1587 = sptr->curr + -1;
  sptr->last_plus_one = sptr->last_plus_one;
  sptr->base = _T2$base;
  sptr->curr = D.1587;


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] SRA does not always work
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (4 preceding siblings ...)
  2010-01-03  6:35 ` pinskia at gcc dot gnu dot org
@ 2010-01-03  6:39 ` pinskia at gcc dot gnu dot org
  2010-01-03  6:55 ` [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction andi-gcc at firstfloor dot org
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  6:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from pinskia at gcc dot gnu dot org  2010-01-03 06:38 -------
I think if we get the old SRA behavior back for this code, we will get this
optimization in 4.5 since we remove the sptr->base and the other unnecessary
store during PRE.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (5 preceding siblings ...)
  2010-01-03  6:39 ` pinskia at gcc dot gnu dot org
@ 2010-01-03  6:55 ` andi-gcc at firstfloor dot org
  2010-01-03  6:57 ` pinskia at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: andi-gcc at firstfloor dot org @ 2010-01-03  6:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from andi-gcc at firstfloor dot org  2010-01-03 06:55 -------
This bug was not about the unnecessary stack frame, but about unnecessary 
read-modify-write
(it seems you didn't read my description completely)

For stack frame probably another bug should be opened. I can do that.

I reverted the summary to the original one.


-- 

andi-gcc at firstfloor dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|SRA does not always work    |New: load-modify-store on
                   |                            |x86 should be \ single
                   |                            |instruction


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (6 preceding siblings ...)
  2010-01-03  6:55 ` [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction andi-gcc at firstfloor dot org
@ 2010-01-03  6:57 ` pinskia at gcc dot gnu dot org
  2010-01-03  7:00 ` andi-gcc at firstfloor dot org
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  6:57 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from pinskia at gcc dot gnu dot org  2010-01-03 06:57 -------
(In reply to comment #7)
> I reverted the summary to the original one.

Well it is all interrelated :).  Which is what my point was about that.  Did
you notice comment #3 which shows that load-modify-store actually does happen
in some cases.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (7 preceding siblings ...)
  2010-01-03  6:57 ` pinskia at gcc dot gnu dot org
@ 2010-01-03  7:00 ` andi-gcc at firstfloor dot org
  2010-01-03  7:02 ` pinskia at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: andi-gcc at firstfloor dot org @ 2010-01-03  7:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from andi-gcc at firstfloor dot org  2010-01-03 07:00 -------
SRA issue handled in #42590


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (8 preceding siblings ...)
  2010-01-03  7:00 ` andi-gcc at firstfloor dot org
@ 2010-01-03  7:02 ` pinskia at gcc dot gnu dot org
  2010-01-03  7:03 ` andi-gcc at firstfloor dot org
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  7:02 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from pinskia at gcc dot gnu dot org  2010-01-03 07:02 -------
(In reply to comment #9)
> SRA issue handled in #42590

And that is the same as the -Os issue really ...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (9 preceding siblings ...)
  2010-01-03  7:02 ` pinskia at gcc dot gnu dot org
@ 2010-01-03  7:03 ` andi-gcc at firstfloor dot org
  2010-01-03  7:07 ` pinskia at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: andi-gcc at firstfloor dot org @ 2010-01-03  7:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from andi-gcc at firstfloor dot org  2010-01-03 07:03 -------
My naive assumption was that the read-modify-write pattern is handled late by
the RTL backend when generating instructions while SRA is somewhere early in
the tree oriented middle end.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (10 preceding siblings ...)
  2010-01-03  7:03 ` andi-gcc at firstfloor dot org
@ 2010-01-03  7:07 ` pinskia at gcc dot gnu dot org
  2010-01-03  7:10 ` pinskia at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  7:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from pinskia at gcc dot gnu dot org  2010-01-03 07:07 -------
(In reply to comment #11)
> My naive assumption was that the read-modify-write pattern is handled late by
> the RTL backend when generating instructions while SRA is somewhere early in
> the tree oriented middle end.

Well the other issue is that only post-reload-cse deletes the load/stores
without SRA doing what needs to be done.  And since the deletion happens so
late, combine does not combine the three instruction which does the
read-modify-write.  :).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (11 preceding siblings ...)
  2010-01-03  7:07 ` pinskia at gcc dot gnu dot org
@ 2010-01-03  7:10 ` pinskia at gcc dot gnu dot org
  2010-01-03  7:10 ` pinskia at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  7:10 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from pinskia at gcc dot gnu dot org  2010-01-03 07:09 -------
(In reply to comment #12)
> Well the other issue is that only post-reload-cse deletes the load/stores
> without SRA doing what needs to be done.  

This was PR 38513 which we still should be done at the RTL level before the
register allocation :).  In fact PR 38513 was fixed at the tree level  for 4.5
which causes the read-modify-store will work after SRA is fixed.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (12 preceding siblings ...)
  2010-01-03  7:10 ` pinskia at gcc dot gnu dot org
@ 2010-01-03  7:10 ` pinskia at gcc dot gnu dot org
  2010-01-03  7:47 ` andi-gcc at firstfloor dot org
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  7:10 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from pinskia at gcc dot gnu dot org  2010-01-03 07:10 -------
If we manually do the SRA like SRA would do it in 4.4:
  unsigned char *a, *b, *c;
  a = sptr->curr;
  b = sptr->base;
  c = sptr->last_plus_one;
  a +=4294967295U;
  sptr->curr = a;
  sptr->base = b;
  sptr->last_plus_one = c;

We get the behavior you want.  This is why I said this is all interrelated. 
PRE is able to remove the stores at the tree level on the trunk which then it
looks like what I showed in comment #3.  Again this is all the same issue
really.  And there is no need to file more bugs about the same issue.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (13 preceding siblings ...)
  2010-01-03  7:10 ` pinskia at gcc dot gnu dot org
@ 2010-01-03  7:47 ` andi-gcc at firstfloor dot org
  2010-01-03  7:49 ` pinskia at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: andi-gcc at firstfloor dot org @ 2010-01-03  7:47 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from andi-gcc at firstfloor dot org  2010-01-03 07:47 -------
As an addendum the unnecessary stack frame seems to be a very common problem,
it's in a lot of the examples near the end of 
http://embed.cs.utah.edu/embarrassing/dec_09/harvest/gcc-head_suncc-5.10/
typically combined with tail call optimization


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (14 preceding siblings ...)
  2010-01-03  7:47 ` andi-gcc at firstfloor dot org
@ 2010-01-03  7:49 ` pinskia at gcc dot gnu dot org
  2010-01-03  8:13 ` andi-gcc at firstfloor dot org
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-03  7:49 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from pinskia at gcc dot gnu dot org  2010-01-03 07:49 -------
(In reply to comment #15)
> As an addendum the unnecessary stack frame seems to be a very common problem,
> it's in a lot of the examples near the end of 
> http://embed.cs.utah.edu/embarrassing/dec_09/harvest/gcc-head_suncc-5.10/
> typically combined with tail call optimization

Well the unnecessary stack frame is due to frame pointer is not omitted by
default with GCC due to debugging/backtrace reasons.  Try -fomit-frame-pointer
:).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (15 preceding siblings ...)
  2010-01-03  7:49 ` pinskia at gcc dot gnu dot org
@ 2010-01-03  8:13 ` andi-gcc at firstfloor dot org
  2010-01-03 11:11 ` [Bug tree-optimization/42586] load-modify-store on x86 should be a " rguenth at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: andi-gcc at firstfloor dot org @ 2010-01-03  8:13 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from andi-gcc at firstfloor dot org  2010-01-03 08:12 -------
Nope, the examples all disable frame pointer
(the first version of the tester had this problem, but that was fixed)

It's all not frame pointer set up anyways, but sub $...,%esp ... ; add
$...,%esp
with no use of the stack frame (or often even nothing at all) inbetween


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] load-modify-store on x86 should be a single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (16 preceding siblings ...)
  2010-01-03  8:13 ` andi-gcc at firstfloor dot org
@ 2010-01-03 11:11 ` rguenth at gcc dot gnu dot org
  2010-01-03 13:19 ` andi-gcc at firstfloor dot org
  2010-01-25 17:51 ` jamborm at gcc dot gnu dot org
  19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-01-03 11:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from rguenth at gcc dot gnu dot org  2010-01-03 11:11 -------
Confirmed.  This is indeed showing two cases of SRA behaving oddly.  First
early SRA produces

<bb 2>:
  _T2 = *sptr_1(D);
  _ans = _T2;
  _ans$curr_12 = _T2.curr;
  D.1965_6 = _ans$curr_12;
  D.1966_7 = D.1965_6 + -1;
  _ans$curr_2 = D.1966_7;
  *sptr_1(D) = _ans;
  sptr_1(D)->curr = _ans$curr_2;

from

<bb 2>:
  _T2 = *sptr_1(D);
  _ans = _T2;
  D.1965_6 = _ans.curr;
  D.1966_7 = D.1965_6 + -1;
  _ans.curr = D.1966_7;
  *sptr_1(D) = _ans;

then late SRA continues to obfuscate things by producing

<bb 2>:
  _T2 = *sptr_1(D);
  _T2$curr_14 = sptr_1(D)->curr;
  _ans = _T2;
  _ans.curr = _T2$curr_14;
  _ans$curr_12 = _T2$curr_14;
  D.1966_7 = _ans$curr_12 + -1;
  *sptr_1(D) = _ans;
  sptr_1(D)->curr = D.1966_7;

note how both times it increases the amount of variables and their lifetime
(I think it's bad we run SRA twice anyway).  The first SRA pass should
have produced

  _T2$curr_10 = sptr_1(D)->curr;
  D.1965_6 = _T2$curr_10;
  D.1966_7 = D.1965_6 + -1;
  _T2$curr_11 = D.1966_7;
  sptr_1(D)->curr = _T2$curr_11;

The old ESRA implementation decomposed all structure copies completely
and thus left to produce the above by further scalar optimizations:

<bb 2>:
  _T2$last_plus_one_2 = sptr_1(D)->last_plus_one;
  _T2$base_3 = sptr_1(D)->base;
  _T2$curr_4 = sptr_1(D)->curr;
  _ans$last_plus_one_5 = _T2$last_plus_one_2;
  _ans$base_8 = _T2$base_3;
  _ans$curr_9 = _T2$curr_4;
  D.1262_6 = _ans$curr_9;
  D.1263_7 = D.1262_6 + -1;
  _ans$curr_10 = D.1263_7;
  sptr_1(D)->last_plus_one ={v} _ans$last_plus_one_5;
  sptr_1(D)->base ={v} _ans$base_8;
  sptr_1(D)->curr ={v} _ans$curr_10;


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2010-01-03 11:11:22
               date|                            |
            Summary|New: load-modify-store on   |load-modify-store on x86
                   |x86 should be \ single      |should be a single
                   |instruction                 |instruction


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] load-modify-store on x86 should be a single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (17 preceding siblings ...)
  2010-01-03 11:11 ` [Bug tree-optimization/42586] load-modify-store on x86 should be a " rguenth at gcc dot gnu dot org
@ 2010-01-03 13:19 ` andi-gcc at firstfloor dot org
  2010-01-25 17:51 ` jamborm at gcc dot gnu dot org
  19 siblings, 0 replies; 21+ messages in thread
From: andi-gcc at firstfloor dot org @ 2010-01-03 13:19 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from andi-gcc at firstfloor dot org  2010-01-03 13:18 -------

Here's another example with the bogus stack frame problem with a tail call
(the embarassing website has quite a lot of similar cases, normally
near the tail of the tables with 16 vs 22 bytes e.g. against sun cc
which also does tail optimization):


typedef long __clock_t;

typedef __clock_t clock_t;

struct lprofS_sSTACK_RECORD;

typedef struct lprofS_sSTACK_RECORD lprofS_STACK_RECORD;

struct lprofS_sSTACK_RECORD

{

  clock_t time_marker_function_local_time;

  clock_t time_marker_function_total_time;

  char *file_defined;

  char *function_name;

  char *source_code;

  long line_defined;

  long current_line;

  float local_time;

  float total_time;

  lprofS_STACK_RECORD *next;

};

typedef lprofS_STACK_RECORD *lprofS_STACK;

struct lprofP_sSTATE;

typedef struct lprofP_sSTATE lprofP_STATE;

struct lprofP_sSTATE

{

  int stack_level;

  lprofS_STACK stack_top;

};

extern void lprofC_start_timer (clock_t * time_marker);

void lprofM_resume_local_time (lprofP_STATE * S);

void

lprofM_resume_local_time (lprofP_STATE * S)

{



  {

    lprofC_start_timer (&(S->stack_top)->time_marker_function_local_time);

    return;

  }

}

gives with -m32 -fomit-frame-pointer -O2 and 4.5.0 20091219

 subl    $12, %esp
        movl    16(%esp), %eax
        movl    4(%eax), %eax
        movl    %eax, 16(%esp)
        addl    $12, %esp
        jmp     lprofC_start_timer

The subl/addl are completely unnecessary


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Bug tree-optimization/42586] load-modify-store on x86 should be a single instruction
  2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
                   ` (18 preceding siblings ...)
  2010-01-03 13:19 ` andi-gcc at firstfloor dot org
@ 2010-01-25 17:51 ` jamborm at gcc dot gnu dot org
  19 siblings, 0 replies; 21+ messages in thread
From: jamborm at gcc dot gnu dot org @ 2010-01-25 17:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from jamborm at gcc dot gnu dot org  2010-01-25 17:51 -------
(In reply to comment #6)
> I think if we get the old SRA behavior back for this code, we will get this
> optimization in 4.5 since we remove the sptr->base and the other unnecessary
> store during PRE.
> 

We now should have the "old SRA behavior" back (this was PR 42585).  I
have checked only very briefly but I believe that this testcase is now
compiled as requested in the bug summary.  I'd be grateful if someone
could double-check this and close this bug if it is indeed so.
Thanks.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42586


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2010-01-25 17:51 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-01-03  6:04 [Bug rtl-optimization/42586] New: load-modify-store on x86 should be single instruction andi-gcc at firstfloor dot org
2010-01-03  6:27 ` [Bug rtl-optimization/42586] " pinskia at gcc dot gnu dot org
2010-01-03  6:28 ` pinskia at gcc dot gnu dot org
2010-01-03  6:30 ` [Bug tree-optimization/42586] SRA does not always work pinskia at gcc dot gnu dot org
2010-01-03  6:33 ` pinskia at gcc dot gnu dot org
2010-01-03  6:35 ` pinskia at gcc dot gnu dot org
2010-01-03  6:39 ` pinskia at gcc dot gnu dot org
2010-01-03  6:55 ` [Bug tree-optimization/42586] New: load-modify-store on x86 should be \ single instruction andi-gcc at firstfloor dot org
2010-01-03  6:57 ` pinskia at gcc dot gnu dot org
2010-01-03  7:00 ` andi-gcc at firstfloor dot org
2010-01-03  7:02 ` pinskia at gcc dot gnu dot org
2010-01-03  7:03 ` andi-gcc at firstfloor dot org
2010-01-03  7:07 ` pinskia at gcc dot gnu dot org
2010-01-03  7:10 ` pinskia at gcc dot gnu dot org
2010-01-03  7:10 ` pinskia at gcc dot gnu dot org
2010-01-03  7:47 ` andi-gcc at firstfloor dot org
2010-01-03  7:49 ` pinskia at gcc dot gnu dot org
2010-01-03  8:13 ` andi-gcc at firstfloor dot org
2010-01-03 11:11 ` [Bug tree-optimization/42586] load-modify-store on x86 should be a " rguenth at gcc dot gnu dot org
2010-01-03 13:19 ` andi-gcc at firstfloor dot org
2010-01-25 17:51 ` jamborm at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).