[Bug target/23302] [4.1 Regression] extra move generated on x86

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug target/23302] [4.1 Regression] extra move generated on x86
       [not found] <bug-23302-1008@http.gcc.gnu.org/bugzilla/>
@ 2005-10-17  9:31 ` steven at gcc dot gnu dot org
  2005-10-31  4:43 ` mmitchel at gcc dot gnu dot org
  2005-10-31 18:15 ` hubicka at gcc dot gnu dot org
  2 siblings, 0 replies; 7+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-10-17  9:31 UTC (permalink / raw)
  To: gcc-bugs



-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |minor


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/23302] [4.1 Regression] extra move generated on x86
       [not found] <bug-23302-1008@http.gcc.gnu.org/bugzilla/>
  2005-10-17  9:31 ` [Bug target/23302] [4.1 Regression] extra move generated on x86 steven at gcc dot gnu dot org
@ 2005-10-31  4:43 ` mmitchel at gcc dot gnu dot org
  2005-10-31 18:15 ` hubicka at gcc dot gnu dot org
  2 siblings, 0 replies; 7+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-10-31  4:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from mmitchel at gcc dot gnu dot org  2005-10-31 04:43 -------
I think we should look for some solution to this problem, without reverting the
previous patch.  If this problem is amenable to a peephole, let's solve it that
way.

That said, I'm going to downgrade this to P4; if we can't fix it for 4.1, we'll
look again for 4.2.


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P2                          |P4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/23302] [4.1 Regression] extra move generated on x86
       [not found] <bug-23302-1008@http.gcc.gnu.org/bugzilla/>
  2005-10-17  9:31 ` [Bug target/23302] [4.1 Regression] extra move generated on x86 steven at gcc dot gnu dot org
  2005-10-31  4:43 ` mmitchel at gcc dot gnu dot org
@ 2005-10-31 18:15 ` hubicka at gcc dot gnu dot org
  2 siblings, 0 replies; 7+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-10-31 18:15 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from hubicka at gcc dot gnu dot org  2005-10-31 18:15 -------
Actually the cited 4.0 sequence do not obey the const int x86_read_modify =
~(m_PENT | m_PPRO);
setting -march=athlon or using -Os makes the move go away.  Additionally I get
following sequence out of 4.0 in SUSE distro:
        movl    term, %eax
        movl    4012(%ebx), %ecx
        movl    4028(%eax), %eax
        imull   %ecx, %eax
that also contains the extra memory move.
So I am going to close this as invalid.
Honza


-- 

hubicka at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug rtl-optimization/23302] New: extra move generated on x86
@ 2005-08-09 22:05 dann at godzilla dot ics dot uci dot edu
  2005-08-10  0:06 ` [Bug target/23302] [4.1 Regression] " pinskia at gcc dot gnu dot org
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: dann at godzilla dot ics dot uci dot edu @ 2005-08-09 22:05 UTC (permalink / raw)
  To: gcc-bugs

Compiling:

typedef char Boolean;
typedef unsigned char Char;
typedef Char *ScrnPtr;
typedef ScrnPtr *ScrnBuf;

typedef struct _WidgetRec *Widget;


typedef struct {
  int foo [1000];
  int max_col;
  int max_row;
  Widget scrollWidget;
  int savelines;
  ScrnBuf visbuf;
  ScrnBuf allbuf;
  Char *sbuf_address;
} TScreen;

typedef struct _XtermWidgetRec {
  TScreen screen;
  int num_ptrs;
}  *XtermWidget;


extern ScrnBuf Allocate (int nrow, int ncol, Char **addr);

extern XtermWidget term;

void
VTallocbuf(void)
{
  TScreen *screen = &term->screen;
  int nrows = screen->max_row + 1;

  if (screen->scrollWidget)
    nrows += screen->savelines;
  screen->allbuf = Allocate(nrows, screen->max_col + 1,
			    &screen->sbuf_address);
  if (screen->scrollWidget)
    screen->visbuf = &screen->allbuf[term->num_ptrs * screen->savelines];
  else
    screen->visbuf = screen->allbuf;
  return;
}

with 4.0 and 4.1 -march=i686 -O2 generates: (the significant part of the sdiff 
is shown) 

        movl    term, %edx                                    |         movl   
term, %eax
        movl    4012(%ebx), %eax                              |         movl   
4012(%ebx), %ecx
        imull   4028(%edx), %eax                              |         movl   
4028(%eax), %eax
        leal    (%ecx,%eax,4), %eax                           |         imull  
%ecx, %eax
                                                              >         leal   
(%edx,%eax,4), %eax


note that 4.1 generates an extra movl instruction. 
This is one of the reasons for 4.1 the code size regression in PR23153.

-- 
           Summary: extra move generated on x86
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dann at godzilla dot ics dot uci dot edu
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/23302] [4.1 Regression] extra move generated on x86
  2005-08-09 22:05 [Bug rtl-optimization/23302] New: " dann at godzilla dot ics dot uci dot edu
@ 2005-08-10  0:06 ` pinskia at gcc dot gnu dot org
  2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-08-10  0:06 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-08-10 00:05 -------
Confirmed, the rtl dumps at .combinea are almost the same except for:
-(insn 45 44 46 3 (set (reg:SI 70 [ <variable>.savelines ])
-        (mem/s:SI (plus:SI (reg/v/f:SI 59 [ screen ])
-                (const_int 4012 [0xfac])) [5 <variable>.savelines+0 S4 A32])) 35 {*movsi_1} (nil)
-    (nil))
+(insn 39 38 40 3 (set (reg:SI 65 [ <variable>.num_ptrs ])
+        (mem/s:SI (plus:SI (reg/f:SI 63 [ term ])
+                (const_int 4028 [0xfbc])) [4 <variable>.num_ptrs+0 S4 A32])) 34 {*movsi_1} (insn_list:
REG_DEP_TRUE 38 (nil))
+    (expr_list:REG_DEAD (reg/f:SI 63 [ term ])
+        (nil)))
 
-(insn 46 45 47 3 (parallel [
-            (set (reg:SI 71)
-                (mult:SI (mem/s:SI (plus:SI (reg/f:SI 68 [ term ])
-                            (const_int 4028 [0xfbc])) [5 <variable>.num_ptrs+0 S4 A32])
-                    (reg:SI 70 [ <variable>.savelines ])))
+(insn 40 39 41 3 (parallel [
+            (set (reg:SI 64)
+                (mult:SI (reg:SI 65 [ <variable>.num_ptrs ])
+                    (mem/s:SI (plus:SI (reg/v/f:SI 59 [ screen ])
+                            (const_int 4012 [0xfac])) [4 <variable>.savelines+0 S4 A32])))
             (clobber (reg:CC 17 flags))
-        ]) 173 {*mulsi3_1} (insn_list:REG_DEP_TRUE 43 (insn_list:REG_DEP_TRUE 45 (nil)))
-    (expr_list:REG_DEAD (reg/f:SI 68 [ term ])
+        ]) 182 {*mulsi3_1} (insn_list:REG_DEP_TRUE 39 (nil))
+    (expr_list:REG_DEAD (reg:SI 65 [ <variable>.num_ptrs ])
         (expr_list:REG_UNUSED (reg:CC 17 flags)
-            (expr_list:REG_DEAD (reg:SI 70 [ <variable>.savelines ])
-                (nil)))))
+            (nil))))


Which is because we expand the multiply (from tree to rtl) as:
(insn 38 36 39 (set (reg/f:SI 63)
        (mem/f/i:SI (symbol_ref:SI ("term") [flags 0x40] <var_decl 0xb7cf60b0 term>) [9 term+0 S4 A32])) 
-1 (nil)
    (nil))

(insn 39 38 40 (set (reg:SI 65)
        (mem/s:SI (plus:SI (reg/f:SI 63)
                (const_int 4028 [0xfbc])) [4 <variable>.num_ptrs+0 S4 A32])) -1 (nil)
    (nil))

(insn 40 39 41 (parallel [
            (set (reg:SI 64)
                (mult:SI (reg:SI 65)
                    (mem/s:SI (plus:SI (reg/v/f:SI 59 [ screen ])
                            (const_int 4012 [0xfac])) [4 <variable>.savelines+0 S4 A32])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

before we actually expanded the load of the screen->savelines seperately:
(insn 43 41 44 (set (reg/f:SI 68)
        (mem/f/i:SI (symbol_ref:SI ("term") [flags 0x40] <var_decl 0xb7cc8a8c term>) [9 term+0 S4 A32])) 
-1 (nil)
    (nil))
                
(insn 44 43 45 (set (reg:SI 69)
        (mem/s:SI (plus:SI (reg/f:SI 68)
                (const_int 4028 [0xfbc])) [5 <variable>.num_ptrs+0 S4 A32])) -1 (nil)
    (nil))
            
(insn 45 44 46 (set (reg:SI 70)
        (mem/s:SI (plus:SI (reg/v/f:SI 59 [ screen ])
                (const_int 4012 [0xfac])) [5 <variable>.savelines+0 S4 A32])) -1 (nil)
    (nil))
                
(insn 46 45 47 (parallel [
            (set (reg:SI 71)
                (mult:SI (reg:SI 69)
                    (reg:SI 70)))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil)) 


This is either a target problem for the expansion or a middle-end problem while expand, I am going to 
say a target issue.

I think this was caused by:
2005-07-30  Jan Hubicka  <jh@suse.cz>

        * expr.c (expand_expr_real_1): Do not load mem targets into register.
        * i386.c (ix86_fixup_binary_operands): Likewise.
        (ix86_expand_unary_operator): Likewise.
        (ix86_expand_fp_absneg_operator): Likewise.
        * optabs.c (expand_vec_cond_expr): Validate dest.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu dot
                   |                            |org
             Status|UNCONFIRMED                 |NEW
          Component|rtl-optimization            |target
     Ever Confirmed|                            |1
  GCC build triplet|i686-pc-linux-gnu           |
   GCC host triplet|i686-pc-linux-gnu           |
           Keywords|                            |missed-optimization
   Last reconfirmed|0000-00-00 00:00:00         |2005-08-10 00:05:55
               date|                            |
            Summary|extra move generated on x86 |[4.1 Regression] extra move
                   |                            |generated on x86
   Target Milestone|---                         |4.1.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/23302] [4.1 Regression] extra move generated on x86
  2005-08-09 22:05 [Bug rtl-optimization/23302] New: " dann at godzilla dot ics dot uci dot edu
  2005-08-10  0:06 ` [Bug target/23302] [4.1 Regression] " pinskia at gcc dot gnu dot org
@ 2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
  2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
  2005-09-28 17:30 ` dann at godzilla dot ics dot uci dot edu
  3 siblings, 0 replies; 7+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-09-28 16:52 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From hubicka at gcc dot gnu dot org  2005-09-28 16:52 -------
The actual problem here is that from combine's point of view the two
alternatives (lea preceeded by loads, or add with memory operand followed by
shift) looks equivalent and previously the shorter sequence was purely choosed
by luck because combine followed the right edge first. It is not quite possible
to solve this by combine's splitting mechanizm as the number of instruction
don't change before the read-modify instructions are broken up.

While it might be probably possible to design peephole or combiner insn patter
I am tempted to close this and PR 23303 as WONTFIX as it seems to me we was
optimizing this by pure luck and the patch seems to have overall positive effect
on code size...

Honza

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/23302] [4.1 Regression] extra move generated on x86
  2005-08-09 22:05 [Bug rtl-optimization/23302] New: " dann at godzilla dot ics dot uci dot edu
  2005-08-10  0:06 ` [Bug target/23302] [4.1 Regression] " pinskia at gcc dot gnu dot org
  2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
@ 2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
  2005-09-28 17:30 ` dann at godzilla dot ics dot uci dot edu
  3 siblings, 0 replies; 7+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-09-28 16:52 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From hubicka at gcc dot gnu dot org  2005-09-28 16:51 -------
The actual problem here is that from combine's point of view the two
alternatives (lea preceeded by loads, or add with memory operand followed by
shift) looks equivalent and previously the shorter sequence was purely choosed
by luck because combine followed the right edge first. It is not quite possible
to solve this by combine's splitting mechanizm as the number of instruction
don't change before the read-modify instructions are broken up.

While it might be probably possible to design peephole or combiner insn patter
I am tempted to close this and PR 23303 as WONTFIX as it seems to me we was
optimizing this by pure luck and the patch seems to have overall positive effect
on code size...

Honza

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/23302] [4.1 Regression] extra move generated on x86
  2005-08-09 22:05 [Bug rtl-optimization/23302] New: " dann at godzilla dot ics dot uci dot edu
                   ` (2 preceding siblings ...)
  2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
@ 2005-09-28 17:30 ` dann at godzilla dot ics dot uci dot edu
  3 siblings, 0 replies; 7+ messages in thread
From: dann at godzilla dot ics dot uci dot edu @ 2005-09-28 17:30 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dann at godzilla dot ics dot uci dot edu  2005-09-28 17:29 -------
(In reply to comment #2)
> While it might be probably possible to design peephole or combiner insn patter
> I am tempted to close this and PR 23303 as WONTFIX as it seems to me we was
> optimizing this by pure luck and the patch seems to have overall positive effect
> on code size...

IMHO closing these bugs as WONTFIX is not the right thing to do. This is clearly
a missed optimization opportunity. The fact that it worked by chance before your
(overall good) patch does not make fixing this issue less desirable.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2005-10-31 18:15 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-23302-1008@http.gcc.gnu.org/bugzilla/>
2005-10-17  9:31 ` [Bug target/23302] [4.1 Regression] extra move generated on x86 steven at gcc dot gnu dot org
2005-10-31  4:43 ` mmitchel at gcc dot gnu dot org
2005-10-31 18:15 ` hubicka at gcc dot gnu dot org
2005-08-09 22:05 [Bug rtl-optimization/23302] New: " dann at godzilla dot ics dot uci dot edu
2005-08-10  0:06 ` [Bug target/23302] [4.1 Regression] " pinskia at gcc dot gnu dot org
2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
2005-09-28 17:30 ` dann at godzilla dot ics dot uci dot edu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).