public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/41505] New: GCC choosing poor code sequence for certain stores (x86)
@ 2009-09-29 15:53 law at redhat dot com
2009-09-29 16:07 ` [Bug target/41505] " rguenth at gcc dot gnu dot org
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: law at redhat dot com @ 2009-09-29 15:53 UTC (permalink / raw)
To: gcc-bugs
Consider:
foo(int *x)
{
*x = 0;
}
Compile with -Os -fomit-frame-pointer and you'll get something like this:
movl 4(%esp), %eax
movl $0, (%eax)
It would be 2 bytes shorter to instead load the constant 0 via an xor
instruction into a scratch register, then store the scratch register into the
memory location. Something like this:
movl 4(%esp), %eax
xor %edx, %edx
movl %edx, (%eax)
ISTM this could easily be implemented with a peep2.
I'm not well versed enough in x86 instruction timings to know if the xor
sequence is going to generally be faster.
--
Summary: GCC choosing poor code sequence for certain stores (x86)
Product: gcc
Version: unknown
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: law at redhat dot com
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/41505] GCC choosing poor code sequence for certain stores (x86)
2009-09-29 15:53 [Bug target/41505] New: GCC choosing poor code sequence for certain stores (x86) law at redhat dot com
@ 2009-09-29 16:07 ` rguenth at gcc dot gnu dot org
2009-09-29 17:12 ` law at redhat dot com
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-09-29 16:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from rguenth at gcc dot gnu dot org 2009-09-29 16:07 -------
difficult
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/41505] GCC choosing poor code sequence for certain stores (x86)
2009-09-29 15:53 [Bug target/41505] New: GCC choosing poor code sequence for certain stores (x86) law at redhat dot com
2009-09-29 16:07 ` [Bug target/41505] " rguenth at gcc dot gnu dot org
@ 2009-09-29 17:12 ` law at redhat dot com
2009-09-29 21:18 ` rth at gcc dot gnu dot org
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: law at redhat dot com @ 2009-09-29 17:12 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from law at redhat dot com 2009-09-29 17:12 -------
I don't understand your comment Richard. Isn't it just something like this?
(define_peephole2
[(match_scratch:SI 2 "r")
(set (match_operand:SI 0 "memory_operand" "")
(match_operand:SI 1 "const_0_operand" ""))]
""
[(set (match_dup 2) (match_dup 1))
(set (match_dup 0) (match_dup 2))]
"")
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/41505] GCC choosing poor code sequence for certain stores (x86)
2009-09-29 15:53 [Bug target/41505] New: GCC choosing poor code sequence for certain stores (x86) law at redhat dot com
2009-09-29 16:07 ` [Bug target/41505] " rguenth at gcc dot gnu dot org
2009-09-29 17:12 ` law at redhat dot com
@ 2009-09-29 21:18 ` rth at gcc dot gnu dot org
2009-09-29 21:56 ` law at redhat dot com
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: rth at gcc dot gnu dot org @ 2009-09-29 21:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from rth at gcc dot gnu dot org 2009-09-29 21:18 -------
There are already peepholes for this, though the condition appears to be
slightly wrong for -Os. See i386.md:21121 :
(define_peephole2
[(match_scratch:SI 1 "r")
(set (match_operand:SI 0 "memory_operand" "")
(const_int 0))]
"optimize_insn_for_speed_p ()
&& ! TARGET_USE_MOV0
&& TARGET_SPLIT_LONG_MOVES
&& get_attr_length (insn) >= ix86_cur_cost ()->large_insn
&& peep2_regno_dead_p (0, FLAGS_REG)"
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/41505] GCC choosing poor code sequence for certain stores (x86)
2009-09-29 15:53 [Bug target/41505] New: GCC choosing poor code sequence for certain stores (x86) law at redhat dot com
` (2 preceding siblings ...)
2009-09-29 21:18 ` rth at gcc dot gnu dot org
@ 2009-09-29 21:56 ` law at redhat dot com
2009-09-29 23:43 ` rth at gcc dot gnu dot org
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: law at redhat dot com @ 2009-09-29 21:56 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from law at redhat dot com 2009-09-29 21:55 -------
Subject: Re: GCC choosing poor code sequence for certain
stores (x86)
On 09/29/09 15:18, rth at gcc dot gnu dot org wrote:
> ------- Comment #3 from rth at gcc dot gnu dot org 2009-09-29 21:18 -------
> There are already peepholes for this, though the condition appears to be
> slightly wrong for -Os. See i386.md:21121 :
>
> (define_peephole2
> [(match_scratch:SI 1 "r")
> (set (match_operand:SI 0 "memory_operand" "")
> (const_int 0))]
> "optimize_insn_for_speed_p ()
> && ! TARGET_USE_MOV0
> && TARGET_SPLIT_LONG_MOVES
> && get_attr_length (insn)>= ix86_cur_cost ()->large_insn
> && peep2_regno_dead_p (0, FLAGS_REG)"
>
>
Ah, yes, the flags register needs to be available.
As for the condition, after reading optimization guides for the various
x86 chips that
mov $0, <mem>
is generally going to be faster than
xor temp, temp
mov temp, <mem>
So I was thinking we'd want something like this for the condition.
((optimize_insn_for_size_p ()
|| (!TARGET_USE_MOV0
&& TARGET_SPLIT_LONG_MOVES
&& get_attr_length (insn) >= ix86_cur_cost()->large_insn))
&& peep2_regno_dead_p (0, FLAGS_REG)
Which I think should always give us the xor sequence when optimizing for
size or when optimizing for the odd x86 implementation where the xor
sequence is faster.
I can easily bundle that up as a patch if it looks right to you...
Jeff
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/41505] GCC choosing poor code sequence for certain stores (x86)
2009-09-29 15:53 [Bug target/41505] New: GCC choosing poor code sequence for certain stores (x86) law at redhat dot com
` (3 preceding siblings ...)
2009-09-29 21:56 ` law at redhat dot com
@ 2009-09-29 23:43 ` rth at gcc dot gnu dot org
2009-09-30 9:22 ` jakub at gcc dot gnu dot org
2009-09-30 14:47 ` law at redhat dot com
6 siblings, 0 replies; 9+ messages in thread
From: rth at gcc dot gnu dot org @ 2009-09-29 23:43 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from rth at gcc dot gnu dot org 2009-09-29 23:43 -------
Yeah, that looks right.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/41505] GCC choosing poor code sequence for certain stores (x86)
2009-09-29 15:53 [Bug target/41505] New: GCC choosing poor code sequence for certain stores (x86) law at redhat dot com
` (4 preceding siblings ...)
2009-09-29 23:43 ` rth at gcc dot gnu dot org
@ 2009-09-30 9:22 ` jakub at gcc dot gnu dot org
2009-09-30 14:47 ` law at redhat dot com
6 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-09-30 9:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from jakub at gcc dot gnu dot org 2009-09-30 09:22 -------
For x86-64 we perhaps want further checks for the size optimization - if the
scratch register is %r8d through %r15d, 3 byte xorl %r8d, %r8d and e.g. 3 byte
movl %r8d, (%rdx) won't be shorter than movl $0, (%rdx) which is 6 bytes).
And likely the 2 insns will be slower.
But if the address already needs rex prefix, it is still a win.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/41505] GCC choosing poor code sequence for certain stores (x86)
2009-09-29 15:53 [Bug target/41505] New: GCC choosing poor code sequence for certain stores (x86) law at redhat dot com
` (5 preceding siblings ...)
2009-09-30 9:22 ` jakub at gcc dot gnu dot org
@ 2009-09-30 14:47 ` law at redhat dot com
6 siblings, 0 replies; 9+ messages in thread
From: law at redhat dot com @ 2009-09-30 14:47 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from law at redhat dot com 2009-09-30 14:47 -------
Subject: Re: GCC choosing poor code sequence for certain
stores (x86)
On 09/30/09 03:22, jakub at gcc dot gnu dot org wrote:
> ------- Comment #6 from jakub at gcc dot gnu dot org 2009-09-30 09:22 -------
> For x86-64 we perhaps want further checks for the size optimization - if the
> scratch register is %r8d through %r15d, 3 byte xorl %r8d, %r8d and e.g. 3 byte
> movl %r8d, (%rdx) won't be shorter than movl $0, (%rdx) which is 6 bytes).
> And likely the 2 insns will be slower.
> But if the address already needs rex prefix, it is still a win.
>
>
>
Do we have any good way to test if the address needs a rex prefix? I
see the rex_prefix attribute in i386.md, but that's for testing an
entire insn and based on my quick reading of i386.md it's not complete
as many insns set the attribute explicitly.
Jeff
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug target/41505] GCC choosing poor code sequence for certain stores (x86)
[not found] <bug-41505-4@http.gcc.gnu.org/bugzilla/>
@ 2021-07-26 18:57 ` pinskia at gcc dot gnu.org
0 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-26 18:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |DUPLICATE
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is a dup of bug 11877 which is now fixed on the trunk.
*** This bug has been marked as a duplicate of bug 11877 ***
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2021-07-26 18:57 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-29 15:53 [Bug target/41505] New: GCC choosing poor code sequence for certain stores (x86) law at redhat dot com
2009-09-29 16:07 ` [Bug target/41505] " rguenth at gcc dot gnu dot org
2009-09-29 17:12 ` law at redhat dot com
2009-09-29 21:18 ` rth at gcc dot gnu dot org
2009-09-29 21:56 ` law at redhat dot com
2009-09-29 23:43 ` rth at gcc dot gnu dot org
2009-09-30 9:22 ` jakub at gcc dot gnu dot org
2009-09-30 14:47 ` law at redhat dot com
[not found] <bug-41505-4@http.gcc.gnu.org/bugzilla/>
2021-07-26 18:57 ` pinskia at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).