public inbox for gcc-bugs@sourceware.org
* [Bug target/23302] [4.1 Regression] extra move generated on x86
[not found] <bug-23302-1008@http.gcc.gnu.org/bugzilla/>
@ 2005-10-17 9:31 ` steven at gcc dot gnu dot org
2005-10-31 4:43 ` mmitchel at gcc dot gnu dot org
2005-10-31 18:15 ` hubicka at gcc dot gnu dot org
2 siblings, 0 replies; 7+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-10-17 9:31 UTC (permalink / raw)
To: gcc-bugs
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |minor
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302
* [Bug target/23302] [4.1 Regression] extra move generated on x86
[not found] <bug-23302-1008@http.gcc.gnu.org/bugzilla/>
2005-10-17 9:31 ` [Bug target/23302] [4.1 Regression] extra move generated on x86 steven at gcc dot gnu dot org
@ 2005-10-31 4:43 ` mmitchel at gcc dot gnu dot org
2005-10-31 18:15 ` hubicka at gcc dot gnu dot org
2 siblings, 0 replies; 7+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-10-31 4:43 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from mmitchel at gcc dot gnu dot org 2005-10-31 04:43 -------
I think we should look for some solution to this problem, without reverting the
previous patch. If this problem is amenable to a peephole, let's solve it that
way.
That said, I'm going to downgrade this to P4; if we can't fix it for 4.1, we'll
look again for 4.2.
--
mmitchel at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P2 |P4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302
* [Bug target/23302] [4.1 Regression] extra move generated on x86
[not found] <bug-23302-1008@http.gcc.gnu.org/bugzilla/>
2005-10-17 9:31 ` [Bug target/23302] [4.1 Regression] extra move generated on x86 steven at gcc dot gnu dot org
2005-10-31 4:43 ` mmitchel at gcc dot gnu dot org
@ 2005-10-31 18:15 ` hubicka at gcc dot gnu dot org
2 siblings, 0 replies; 7+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-10-31 18:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from hubicka at gcc dot gnu dot org 2005-10-31 18:15 -------
Actually, the cited 4.0 sequence does not obey the tuning setting
const int x86_read_modify = ~(m_PENT | m_PPRO);
Setting -march=athlon or using -Os makes the move go away. Additionally, I get
the following sequence out of the 4.0 in the SUSE distro:
movl term, %eax
movl 4012(%ebx), %ecx
movl 4028(%eax), %eax
imull %ecx, %eax
that also contains the extra memory move.
So I am going to close this as invalid.
Honza
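
The tuning mask quoted in the comment above can be illustrated with a minimal C sketch. It assumes that each processor owns one bit of the mask and that -march=i686 tunes for PentiumPro; the enum, its ordering, and the helper read_modify_enabled() are hypothetical names for illustration, not GCC's actual declarations. The mask is declared unsigned here only to keep the bit arithmetic simple, and the effect of -Os is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down processor list; the real list in GCC is longer
   and may be ordered differently. */
enum processor { PROC_I386, PROC_PENTIUM, PROC_PENTIUMPRO, PROC_ATHLON };

#define m_PENT (1u << PROC_PENTIUM)
#define m_PPRO (1u << PROC_PENTIUMPRO)

/* Same expression as quoted in the comment: read-modify instructions are
   considered profitable on every tuning target except Pentium and PentiumPro. */
static const unsigned x86_read_modify = ~(m_PENT | m_PPRO);

/* Returns true when the selected tuning target permits an arithmetic
   instruction to take one of its operands directly from memory. */
static bool read_modify_enabled(enum processor tune)
{
    return (x86_read_modify & (1u << tune)) != 0;
}

/* Small self-check of the two cases discussed in the comment. */
static void self_check(void)
{
    assert(!read_modify_enabled(PROC_PENTIUMPRO)); /* assumed -march=i686 tuning */
    assert(read_modify_enabled(PROC_ATHLON));      /* -march=athlon */
}
```

Under these assumptions, an imull with a memory operand is the form that the i686 tuning asks the compiler to avoid, which is consistent with the move disappearing under -march=athlon.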
--
hubicka at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |INVALID
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302
* [Bug rtl-optimization/23302] New: extra move generated on x86
@ 2005-08-09 22:05 dann at godzilla dot ics dot uci dot edu
2005-08-10 0:06 ` [Bug target/23302] [4.1 Regression] " pinskia at gcc dot gnu dot org
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: dann at godzilla dot ics dot uci dot edu @ 2005-08-09 22:05 UTC (permalink / raw)
To: gcc-bugs
Compiling:
typedef char Boolean;
typedef unsigned char Char;
typedef Char *ScrnPtr;
typedef ScrnPtr *ScrnBuf;
typedef struct _WidgetRec *Widget;
typedef struct {
int foo [1000];
int max_col;
int max_row;
Widget scrollWidget;
int savelines;
ScrnBuf visbuf;
ScrnBuf allbuf;
Char *sbuf_address;
} TScreen;
typedef struct _XtermWidgetRec {
TScreen screen;
int num_ptrs;
} *XtermWidget;
extern ScrnBuf Allocate (int nrow, int ncol, Char **addr);
extern XtermWidget term;
void
VTallocbuf(void)
{
TScreen *screen = &term->screen;
int nrows = screen->max_row + 1;
if (screen->scrollWidget)
nrows += screen->savelines;
screen->allbuf = Allocate(nrows, screen->max_col + 1,
&screen->sbuf_address);
if (screen->scrollWidget)
screen->visbuf = &screen->allbuf[term->num_ptrs * screen->savelines];
else
screen->visbuf = screen->allbuf;
return;
}
with 4.0 and 4.1 -march=i686 -O2 generates: (the significant part of the sdiff
is shown)
movl term, %edx              | movl term, %eax
movl 4012(%ebx), %eax        | movl 4012(%ebx), %ecx
imull 4028(%edx), %eax       | movl 4028(%eax), %eax
leal (%ecx,%eax,4), %eax     | imull %ecx, %eax
                             > leal (%edx,%eax,4), %eax
note that 4.1 generates an extra movl instruction.
This is one of the reasons for the 4.1 code size regression in PR23153.
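
For readers mapping the displacements in the assembly back to the test case, here is a small sketch that checks where 4012 and 4028 come from. It assumes the i686 layout (4-byte int and 4-byte pointers) and models every pointer as a 32-bit integer so the offsets can be checked on any host; the names ptr32, TScreen32, XtermWidgetRec32 and visbuf_address() are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: pointers are modelled as 32-bit integers to mimic i686. */
typedef uint32_t ptr32;

typedef struct {
    int32_t foo[1000];   /* bytes 0..3999 */
    int32_t max_col;     /* 4000 */
    int32_t max_row;     /* 4004 */
    ptr32 scrollWidget;  /* 4008 */
    int32_t savelines;   /* 4012 */
    ptr32 visbuf;        /* 4016 */
    ptr32 allbuf;        /* 4020 */
    ptr32 sbuf_address;  /* 4024 */
} TScreen32;

typedef struct {
    TScreen32 screen;    /* at offset 0, so &term->screen == term */
    int32_t num_ptrs;    /* 4028 */
} XtermWidgetRec32;

/* The two displacements that appear in the assembly above. */
_Static_assert(offsetof(TScreen32, savelines) == 4012, "savelines offset");
_Static_assert(offsetof(XtermWidgetRec32, num_ptrs) == 4028, "num_ptrs offset");

/* Byte address of &allbuf[num_ptrs * savelines] with 4-byte elements:
   the base plus the product scaled by 4, as in the final leal. */
static uint32_t visbuf_address(uint32_t allbuf, int32_t num_ptrs, int32_t savelines)
{
    return allbuf + 4u * ((uint32_t)num_ptrs * (uint32_t)savelines);
}
```

So 4012(%ebx) is the load of screen->savelines, 4028(%edx) or 4028(%eax) is term->num_ptrs, and the leal with scale 4 forms the address stored into screen->visbuf.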
--
Summary: extra move generated on x86
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: dann at godzilla dot ics dot uci dot edu
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302
* [Bug target/23302] [4.1 Regression] extra move generated on x86
2005-08-09 22:05 [Bug rtl-optimization/23302] New: " dann at godzilla dot ics dot uci dot edu
@ 2005-08-10 0:06 ` pinskia at gcc dot gnu dot org
2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-08-10 0:06 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-08-10 00:05 -------
Confirmed, the rtl dumps at .combinea are almost the same except for:
-(insn 45 44 46 3 (set (reg:SI 70 [ <variable>.savelines ])
- (mem/s:SI (plus:SI (reg/v/f:SI 59 [ screen ])
- (const_int 4012 [0xfac])) [5 <variable>.savelines+0 S4 A32])) 35 {*movsi_1} (nil)
- (nil))
+(insn 39 38 40 3 (set (reg:SI 65 [ <variable>.num_ptrs ])
+ (mem/s:SI (plus:SI (reg/f:SI 63 [ term ])
+ (const_int 4028 [0xfbc])) [4 <variable>.num_ptrs+0 S4 A32])) 34 {*movsi_1} (insn_list:REG_DEP_TRUE 38 (nil))
+ (expr_list:REG_DEAD (reg/f:SI 63 [ term ])
+ (nil)))
-(insn 46 45 47 3 (parallel [
- (set (reg:SI 71)
- (mult:SI (mem/s:SI (plus:SI (reg/f:SI 68 [ term ])
- (const_int 4028 [0xfbc])) [5 <variable>.num_ptrs+0 S4 A32])
- (reg:SI 70 [ <variable>.savelines ])))
+(insn 40 39 41 3 (parallel [
+ (set (reg:SI 64)
+ (mult:SI (reg:SI 65 [ <variable>.num_ptrs ])
+ (mem/s:SI (plus:SI (reg/v/f:SI 59 [ screen ])
+ (const_int 4012 [0xfac])) [4 <variable>.savelines+0 S4 A32])))
(clobber (reg:CC 17 flags))
- ]) 173 {*mulsi3_1} (insn_list:REG_DEP_TRUE 43 (insn_list:REG_DEP_TRUE 45 (nil)))
- (expr_list:REG_DEAD (reg/f:SI 68 [ term ])
+ ]) 182 {*mulsi3_1} (insn_list:REG_DEP_TRUE 39 (nil))
+ (expr_list:REG_DEAD (reg:SI 65 [ <variable>.num_ptrs ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
- (expr_list:REG_DEAD (reg:SI 70 [ <variable>.savelines ])
- (nil)))))
+ (nil))))
Which is because we expand the multiply (from tree to rtl) as:
(insn 38 36 39 (set (reg/f:SI 63)
(mem/f/i:SI (symbol_ref:SI ("term") [flags 0x40] <var_decl 0xb7cf60b0 term>) [9 term+0 S4 A32]))
-1 (nil)
(nil))
(insn 39 38 40 (set (reg:SI 65)
(mem/s:SI (plus:SI (reg/f:SI 63)
(const_int 4028 [0xfbc])) [4 <variable>.num_ptrs+0 S4 A32])) -1 (nil)
(nil))
(insn 40 39 41 (parallel [
(set (reg:SI 64)
(mult:SI (reg:SI 65)
(mem/s:SI (plus:SI (reg/v/f:SI 59 [ screen ])
(const_int 4012 [0xfac])) [4 <variable>.savelines+0 S4 A32])))
(clobber (reg:CC 17 flags))
]) -1 (nil)
(nil))
Before, we expanded the load of screen->savelines separately:
(insn 43 41 44 (set (reg/f:SI 68)
(mem/f/i:SI (symbol_ref:SI ("term") [flags 0x40] <var_decl 0xb7cc8a8c term>) [9 term+0 S4 A32]))
-1 (nil)
(nil))
(insn 44 43 45 (set (reg:SI 69)
(mem/s:SI (plus:SI (reg/f:SI 68)
(const_int 4028 [0xfbc])) [5 <variable>.num_ptrs+0 S4 A32])) -1 (nil)
(nil))
(insn 45 44 46 (set (reg:SI 70)
(mem/s:SI (plus:SI (reg/v/f:SI 59 [ screen ])
(const_int 4012 [0xfac])) [5 <variable>.savelines+0 S4 A32])) -1 (nil)
(nil))
(insn 46 45 47 (parallel [
(set (reg:SI 71)
(mult:SI (reg:SI 69)
(reg:SI 70)))
(clobber (reg:CC 17 flags))
]) -1 (nil)
(nil))
This is either a target problem in the expansion or a middle-end problem during expand; I am going to
say it is a target issue.
I think this was caused by:
2005-07-30 Jan Hubicka <jh@suse.cz>
* expr.c (expand_expr_real_1): Do not load mem targets into register.
* i386.c (ix86_fixup_binary_operands): Likewise.
(ix86_expand_unary_operator): Likewise.
(ix86_expand_fp_absneg_operator): Likewise.
* optabs.c (expand_vec_cond_expr): Validate dest.
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |hubicka at gcc dot gnu dot
| |org
Status|UNCONFIRMED |NEW
Component|rtl-optimization |target
Ever Confirmed| |1
GCC build triplet|i686-pc-linux-gnu |
GCC host triplet|i686-pc-linux-gnu |
Keywords| |missed-optimization
Last reconfirmed|0000-00-00 00:00:00 |2005-08-10 00:05:55
date| |
Summary|extra move generated on x86 |[4.1 Regression] extra move
| |generated on x86
Target Milestone|--- |4.1.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302
* [Bug target/23302] [4.1 Regression] extra move generated on x86
2005-08-09 22:05 [Bug rtl-optimization/23302] New: " dann at godzilla dot ics dot uci dot edu
2005-08-10 0:06 ` [Bug target/23302] [4.1 Regression] " pinskia at gcc dot gnu dot org
@ 2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
2005-09-28 17:30 ` dann at godzilla dot ics dot uci dot edu
3 siblings, 0 replies; 7+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-09-28 16:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From hubicka at gcc dot gnu dot org 2005-09-28 16:52 -------
The actual problem here is that, from combine's point of view, the two
alternatives (lea preceded by loads, or add with memory operand followed by
shift) look equivalent, and previously the shorter sequence was chosen purely
by luck because combine followed the right edge first. It is not really possible
to solve this with combine's splitting mechanism, as the number of instructions
doesn't change before the read-modify instructions are broken up.
While it is probably possible to design a peephole or combiner insn pattern,
I am tempted to close this and PR 23303 as WONTFIX, as it seems to me we were
optimizing this by pure luck and the patch seems to have an overall positive effect
on code size...
Honza
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302
* [Bug target/23302] [4.1 Regression] extra move generated on x86
2005-08-09 22:05 [Bug rtl-optimization/23302] New: " dann at godzilla dot ics dot uci dot edu
` (2 preceding siblings ...)
2005-09-28 16:52 ` hubicka at gcc dot gnu dot org
@ 2005-09-28 17:30 ` dann at godzilla dot ics dot uci dot edu
3 siblings, 0 replies; 7+ messages in thread
From: dann at godzilla dot ics dot uci dot edu @ 2005-09-28 17:30 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dann at godzilla dot ics dot uci dot edu 2005-09-28 17:29 -------
(In reply to comment #2)
> While it is probably possible to design a peephole or combiner insn pattern,
> I am tempted to close this and PR 23303 as WONTFIX, as it seems to me we were
> optimizing this by pure luck and the patch seems to have an overall positive effect
> on code size...
IMHO closing these bugs as WONTFIX is not the right thing to do. This is clearly
a missed optimization opportunity. The fact that it worked by chance before your
(overall good) patch does not make fixing this issue less desirable.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23302