public inbox for gcc-bugs@sourceware.org
* [Bug target/11877] gcc should use xor trick with -Os
[not found] <bug-11877-4@http.gcc.gnu.org/bugzilla/>
@ 2021-06-21 7:56 ` cvs-commit at gcc dot gnu.org
2021-06-22 8:18 ` cvs-commit at gcc dot gnu.org
` (4 subsequent siblings)
5 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-06-21 7:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:
https://gcc.gnu.org/g:9cedbaab8e048b90ceb9ceef0d851385fae67cde
commit r12-1668-g9cedbaab8e048b90ceb9ceef0d851385fae67cde
Author: Roger Sayle <roger@nextmovesoftware.com>
Date: Mon Jun 21 08:54:50 2021 +0100
PR target/11877: Use xor to write zero to memory with -Os
The following patch attempts to resolve PR target/11877 (without
triggering PR/23102). On x86_64, writing an SImode or DImode zero
to memory uses an instruction encoding that is larger than first
clearing a register (using xor) then writing that to memory. Hence,
after reload, the peephole2 pass can determine if there's a suitable
free register, and if so, use that to shrink the code size with -Os.
To improve code size, and avoid inserting a large number of xor
instructions (PR target/23102), this patch makes use of peephole2's
efficient pattern matching to use a single temporary for a run of
consecutive writes. In theory, one could do better still with a
new target-specific pass, gated on -Os, to shrink these instructions
(like stv), but that's probably overkill for the little remaining
space savings.
Evaluating this patch on the CSiBE benchmark (v2.1.1) results in a
0.26% code size improvement (3715273 bytes down to 3705477) on x86_64
with -Os [saving 1 byte every 400]. 549 of 894 tests improve, two
tests grow larger. Analysis of these 2 pathological cases reveals
that although peephole2's match_scratch prefers to use a call-clobbered
register (to avoid requiring a new stack frame), very rarely this
interacts with GCC's shrink wrapping optimization, which may previously
have avoided saving/restoring a call clobbered register, such as %eax,
in the calling function.
2021-06-21 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/11877
* config/i386/i386.md: New define_peephole2s to shrink writing
1, 2 or 4 consecutive zeros to memory when optimizing for size.
gcc/testsuite/ChangeLog
PR target/11877
* gcc.target/i386/pr11877.c: New test case.
* [Bug target/11877] gcc should use xor trick with -Os
[not found] <bug-11877-4@http.gcc.gnu.org/bugzilla/>
2021-06-21 7:56 ` [Bug target/11877] gcc should use xor trick with -Os cvs-commit at gcc dot gnu.org
@ 2021-06-22 8:18 ` cvs-commit at gcc dot gnu.org
2021-07-10 8:24 ` roger at nextmovesoftware dot com
` (3 subsequent siblings)
5 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-06-22 8:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:d58a66aa0faa64bfbd85e528be5104293dd41d0e
commit r12-1712-gd58a66aa0faa64bfbd85e528be5104293dd41d0e
Author: Jakub Jelinek <jakub@redhat.com>
Date: Tue Jun 22 10:16:18 2021 +0200
i386: Use xor to write zero to memory with -Os even for more than 4 stores [PR11877]
> > 2021-06-20 Roger Sayle <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> > PR target/11877
> > * config/i386/i386.md: New define_peephole2s to shrink writing
> > 1, 2 or 4 consecutive zeros to memory when optimizing for size.
It unfortunately doesn't extend well to larger memory clearing.
Consider e.g.
void
foo (int *p)
{
  p[0] = 0;
  p[7] = 0;
  p[23] = 0;
  p[41] = 0;
  p[48] = 0;
  p[59] = 0;
  p[69] = 0;
  p[78] = 0;
  p[83] = 0;
  p[89] = 0;
  p[98] = 0;
  p[121] = 0;
  p[132] = 0;
  p[143] = 0;
  p[154] = 0;
}
where with the patch we emit:
        xorl    %eax, %eax
        xorl    %edx, %edx
        xorl    %ecx, %ecx
        xorl    %esi, %esi
        xorl    %r8d, %r8d
        movl    %eax, (%rdi)
        movl    %eax, 28(%rdi)
        movl    %eax, 92(%rdi)
        movl    %eax, 164(%rdi)
        movl    %edx, 192(%rdi)
        movl    %edx, 236(%rdi)
        movl    %edx, 276(%rdi)
        movl    %edx, 312(%rdi)
        movl    %ecx, 332(%rdi)
        movl    %ecx, 356(%rdi)
        movl    %ecx, 392(%rdi)
        movl    %ecx, 484(%rdi)
        movl    %esi, 528(%rdi)
        movl    %esi, 572(%rdi)
        movl    %r8d, 616(%rdi)
Here is an incremental patch that emits:
        xorl    %eax, %eax
        movl    %eax, (%rdi)
        movl    %eax, 28(%rdi)
        movl    %eax, 92(%rdi)
        movl    %eax, 164(%rdi)
        movl    %eax, 192(%rdi)
        movl    %eax, 236(%rdi)
        movl    %eax, 276(%rdi)
        movl    %eax, 312(%rdi)
        movl    %eax, 332(%rdi)
        movl    %eax, 356(%rdi)
        movl    %eax, 392(%rdi)
        movl    %eax, 484(%rdi)
        movl    %eax, 528(%rdi)
        movl    %eax, 572(%rdi)
        movl    %eax, 616(%rdi)
instead.
2021-06-22 Jakub Jelinek <jakub@redhat.com>
PR target/11877
* config/i386/i386-protos.h (ix86_last_zero_store_uid): Declare.
* config/i386/i386-expand.c (ix86_last_zero_store_uid): New
variable.
* config/i386/i386.c (ix86_expand_prologue): Clear it.
* config/i386/i386.md (peephole2s for 1/2/4 stores of const0_rtx):
Remove "" from match_operand. Emit new insns using emit_move_insn and
set ix86_last_zero_store_uid to INSN_UID of the last store.
Add peephole2s for 1/2/4 stores of const0_rtx following previous
successful peep2s.
* gcc.target/i386/pr11877-2.c: New test.
* [Bug target/11877] gcc should use xor trick with -Os
[not found] <bug-11877-4@http.gcc.gnu.org/bugzilla/>
2021-06-21 7:56 ` [Bug target/11877] gcc should use xor trick with -Os cvs-commit at gcc dot gnu.org
2021-06-22 8:18 ` cvs-commit at gcc dot gnu.org
@ 2021-07-10 8:24 ` roger at nextmovesoftware dot com
2021-07-26 18:54 ` pinskia at gcc dot gnu.org
` (2 subsequent siblings)
5 siblings, 0 replies; 16+ messages in thread
From: roger at nextmovesoftware dot com @ 2021-07-10 8:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
CC| |roger at nextmovesoftware dot com
Status|ASSIGNED |RESOLVED
Target Milestone|--- |12.0
--- Comment #12 from Roger Sayle <roger at nextmovesoftware dot com> ---
Fixed on mainline thanks to Jakub's patch.
* [Bug target/11877] gcc should use xor trick with -Os
[not found] <bug-11877-4@http.gcc.gnu.org/bugzilla/>
` (2 preceding siblings ...)
2021-07-10 8:24 ` roger at nextmovesoftware dot com
@ 2021-07-26 18:54 ` pinskia at gcc dot gnu.org
2021-07-26 18:57 ` pinskia at gcc dot gnu.org
2021-07-26 22:20 ` pinskia at gcc dot gnu.org
5 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-26 18:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |msharov at users dot sourceforge.n
| |et
--- Comment #13 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 49127 has been marked as a duplicate of this bug. ***
* [Bug target/11877] gcc should use xor trick with -Os
[not found] <bug-11877-4@http.gcc.gnu.org/bugzilla/>
` (3 preceding siblings ...)
2021-07-26 18:54 ` pinskia at gcc dot gnu.org
@ 2021-07-26 18:57 ` pinskia at gcc dot gnu.org
2021-07-26 22:20 ` pinskia at gcc dot gnu.org
5 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-26 18:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jeffreyalaw at gmail dot com
--- Comment #14 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 41505 has been marked as a duplicate of this bug. ***
* [Bug target/11877] gcc should use xor trick with -Os
[not found] <bug-11877-4@http.gcc.gnu.org/bugzilla/>
` (4 preceding siblings ...)
2021-07-26 18:57 ` pinskia at gcc dot gnu.org
@ 2021-07-26 22:20 ` pinskia at gcc dot gnu.org
5 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-26 22:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |andi-gcc at firstfloor dot org
--- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 32629 has been marked as a duplicate of this bug. ***
* [Bug target/11877] gcc should use xor trick with -Os
[not found] <bug-11877-5724@http.gcc.gnu.org/bugzilla/>
@ 2006-01-05 20:22 ` dann at godzilla dot ics dot uci dot edu
0 siblings, 0 replies; 16+ messages in thread
From: dann at godzilla dot ics dot uci dot edu @ 2006-01-05 20:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from dann at godzilla dot ics dot uci dot edu 2006-01-05 20:22 -------
(In reply to comment #7)
> *** Bug 23338 has been marked as a duplicate of this bug. ***
>
Bug 23338 contained a patch that might fix this issue. Here it is, so
that it can be evaluated.
*** i386.md 08 Aug 2005 16:38:37 -0700 1.652
--- i386.md 11 Aug 2005 11:27:11 -0700
***************
*** 18874,18881 ****
    [(match_scratch:SI 1 "r")
     (set (match_operand:SI 0 "memory_operand" "")
          (const_int 0))]
!   "! optimize_size
!    && ! TARGET_USE_MOV0
     && TARGET_SPLIT_LONG_MOVES
     && get_attr_length (insn) >= ix86_cost->large_insn
     && peep2_regno_dead_p (0, FLAGS_REG)"
--- 18874,18880 ----
    [(match_scratch:SI 1 "r")
     (set (match_operand:SI 0 "memory_operand" "")
          (const_int 0))]
!   "! TARGET_USE_MOV0
     && TARGET_SPLIT_LONG_MOVES
     && get_attr_length (insn) >= ix86_cost->large_insn
     && peep2_regno_dead_p (0, FLAGS_REG)"
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
* [Bug target/11877] gcc should use xor trick with -Os
2003-08-10 15:47 [Bug target/11877] New: " debian-gcc at lists dot debian dot org
` (6 preceding siblings ...)
2004-03-25 19:16 ` kazu at cs dot umass dot edu
@ 2005-08-12 5:28 ` pinskia at gcc dot gnu dot org
2005-08-12 5:28 ` pinskia at gcc dot gnu dot org
8 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-08-12 5:28 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-08-12 05:27 -------
*** Bug 23338 has been marked as a duplicate of this bug. ***
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |dann at godzilla dot ics dot
| |uci dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
* [Bug target/11877] gcc should use xor trick with -Os
2003-08-10 15:47 [Bug target/11877] New: " debian-gcc at lists dot debian dot org
` (7 preceding siblings ...)
2005-08-12 5:28 ` pinskia at gcc dot gnu dot org
@ 2005-08-12 5:28 ` pinskia at gcc dot gnu dot org
8 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-08-12 5:28 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-08-12 05:28 -------
PR 23102 is the bug for multiple xors.
--
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |23102
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
* [Bug target/11877] gcc should use xor trick with -Os
2003-08-10 15:47 [Bug target/11877] New: " debian-gcc at lists dot debian dot org
` (5 preceding siblings ...)
2004-02-01 17:24 ` kazu at cs dot umass dot edu
@ 2004-03-25 19:16 ` kazu at cs dot umass dot edu
2005-08-12 5:28 ` pinskia at gcc dot gnu dot org
2005-08-12 5:28 ` pinskia at gcc dot gnu dot org
8 siblings, 0 replies; 16+ messages in thread
From: kazu at cs dot umass dot edu @ 2004-03-25 19:16 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kazu at cs dot umass dot edu 2004-03-25 19:16 -------
Even if you split the long move in ix86_expand_move,
the constant 0 is propagated into the two moves.
I guess the right way may be to un-cse sometime after register allocation.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
* [Bug target/11877] gcc should use xor trick with -Os
2003-08-10 15:47 [Bug target/11877] New: " debian-gcc at lists dot debian dot org
` (4 preceding siblings ...)
2004-01-04 7:50 ` pinskia at gcc dot gnu dot org
@ 2004-02-01 17:24 ` kazu at cs dot umass dot edu
2004-03-25 19:16 ` kazu at cs dot umass dot edu
` (2 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: kazu at cs dot umass dot edu @ 2004-02-01 17:24 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kazu at cs dot umass dot edu 2004-02-01 17:24 -------
Then we need something like an un-cse pass.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
* [Bug target/11877] gcc should use xor trick with -Os
2003-08-10 15:47 [Bug target/11877] New: " debian-gcc at lists dot debian dot org
` (3 preceding siblings ...)
2004-01-04 6:55 ` kazu at cs dot umass dot edu
@ 2004-01-04 7:50 ` pinskia at gcc dot gnu dot org
2004-02-01 17:24 ` kazu at cs dot umass dot edu
` (3 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-01-04 7:50 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-01-04 07:50 -------
What about expanding (set (mem:DI ...) (const_int 0)) at expand time? This would
create more opportunities, and the decision would then be up to other parts of the compiler.
It looks like an easy change to ix86_expand_move.
Also interesting is that this testcase:
void
foo (long *p)
{
  *p = 0;
  p[1] = 0;
}
Does not use the xor trick either.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
* [Bug target/11877] gcc should use xor trick with -Os
2003-08-10 15:47 [Bug target/11877] New: " debian-gcc at lists dot debian dot org
` (2 preceding siblings ...)
2003-12-31 21:39 ` kazu at cs dot umass dot edu
@ 2004-01-04 6:55 ` kazu at cs dot umass dot edu
2004-01-04 7:50 ` pinskia at gcc dot gnu dot org
` (4 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: kazu at cs dot umass dot edu @ 2004-01-04 6:55 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kazu at cs dot umass dot edu 2004-01-04 06:55 -------
Patch posted:
http://gcc.gnu.org/ml/gcc-patches/2004-01/msg00153.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
* [Bug target/11877] gcc should use xor trick with -Os
2003-08-10 15:47 [Bug target/11877] New: " debian-gcc at lists dot debian dot org
2003-08-10 16:03 ` [Bug target/11877] " pinskia at gcc dot gnu dot org
2003-08-23 1:15 ` dhazeghi at yahoo dot com
@ 2003-12-31 21:39 ` kazu at cs dot umass dot edu
2004-01-04 6:55 ` kazu at cs dot umass dot edu
` (5 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: kazu at cs dot umass dot edu @ 2003-12-31 21:39 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From kazu at cs dot umass dot edu 2003-12-31 21:19 -------
(set (mem:DI ...) (const_int 0)) is split into two moves in SImode after reload.
We could delay the split until after peephole2.
In peephole2, if a scratch reg is available,
load 0 into it with XOR and then copy that reg to two mem:SI locations.
Reduced to:
void
foo (long long *p)
{
  *p = 0;
}
The reduction from 13 bytes down to 7 bytes sounds impressive.
My proposed solution would still leave two XORs, though.
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |kazu at cs dot umass dot edu
AssignedTo|unassigned at gcc dot gnu |kazu at cs dot umass dot edu
|dot org |
Status|NEW |ASSIGNED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
* [Bug target/11877] gcc should use xor trick with -Os
2003-08-10 15:47 [Bug target/11877] New: " debian-gcc at lists dot debian dot org
2003-08-10 16:03 ` [Bug target/11877] " pinskia at gcc dot gnu dot org
@ 2003-08-23 1:15 ` dhazeghi at yahoo dot com
2003-12-31 21:39 ` kazu at cs dot umass dot edu
` (6 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: dhazeghi at yahoo dot com @ 2003-08-23 1:15 UTC (permalink / raw)
To: gcc-bugs
PLEASE REPLY TO gcc-bugzilla@gcc.gnu.org ONLY, *NOT* gcc-bugs@gcc.gnu.org.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
dhazeghi at yahoo dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
GCC build triplet|386-linux |i386-linux
GCC target triplet|386-linux |i386-linux
Target Milestone|3.4 |---
* [Bug target/11877] gcc should use xor trick with -Os
2003-08-10 15:47 [Bug target/11877] New: " debian-gcc at lists dot debian dot org
@ 2003-08-10 16:03 ` pinskia at gcc dot gnu dot org
2003-08-23 1:15 ` dhazeghi at yahoo dot com
` (7 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2003-08-10 16:03 UTC (permalink / raw)
To: gcc-bugs
PLEASE REPLY TO gcc-bugzilla@gcc.gnu.org ONLY, *NOT* gcc-bugs@gcc.gnu.org.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11877
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Status|UNCONFIRMED |NEW
Ever Confirmed| |1
Last reconfirmed|0000-00-00 00:00:00 |2003-08-10 16:03:50
date| |
------- Additional Comments From pinskia at gcc dot gnu dot org 2003-08-10 16:03 -------
I can confirm that the xor trick will be a size win.
The movl still happens with the mainline (20030810).