* [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.
@ 2021-12-21 12:27 Roger Sayle
From: Roger Sayle @ 2021-12-21 12:27 UTC
To: 'GCC Patches'
[-- Attachment #1: Type: text/plain, Size: 684 bytes --]
My apologies for the inconvenience. The new support for -Oz using
push/pop for small integer constants on x86_64 is only a win/correct
for loading registers. Fixed by adding !MEM_P tests in the appropriate
locations.
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures. Ok for mainline?
2021-12-21 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/103773
* config/i386/i386.md (*movdi_internal): Only use short
push/pop sequence for register (non-memory) destinations.
(*movsi_internal): Likewise.
gcc/testsuite/ChangeLog
PR target/103773
* gcc.target/i386/pr103773.c: New test case.
Roger
--
[-- Attachment #2: patcho2b.txt --]
[-- Type: text/plain, Size: 1295 bytes --]
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index d25453f..e596f8b 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2217,7 +2217,8 @@
       if (optimize_size > 1
           && TARGET_64BIT
           && CONST_INT_P (operands[1])
-          && IN_RANGE (INTVAL (operands[1]), -128, 127))
+          && IN_RANGE (INTVAL (operands[1]), -128, 127)
+          && !MEM_P (operands[0]))
         return "push{q}\t%1\n\tpop{q}\t%0";
       return "mov{l}\t{%k1, %k0|%k0, %k1}";
     }
@@ -2440,7 +2441,8 @@
         return "lea{l}\t{%E1, %0|%0, %E1}";
       else if (optimize_size > 1
                && CONST_INT_P (operands[1])
-               && IN_RANGE (INTVAL (operands[1]), -128, 127))
+               && IN_RANGE (INTVAL (operands[1]), -128, 127)
+               && !MEM_P (operands[0]))
         {
           if (TARGET_64BIT)
             return "push{q}\t%1\n\tpop{q}\t%q0";
diff --git a/gcc/testsuite/gcc.target/i386/pr103773.c b/gcc/testsuite/gcc.target/i386/pr103773.c
new file mode 100644
index 0000000..1e4b8ce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103773.c
@@ -0,0 +1,12 @@
+/* { dg-do run } */
+/* { dg-options "-Oz" } */
+
+unsigned long long x;
+
+int main (void)
+{
+  __builtin_memset (&x, 0xff, 4);
+  if (x != 0xffffffff)
+    __builtin_abort ();
+  return 0;
+}
* Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.
From: Uros Bizjak @ 2021-12-22 8:10 UTC
To: Roger Sayle; +Cc: GCC Patches
On Tue, Dec 21, 2021 at 1:27 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> My apologies for the inconvenience. The new support for -Oz using
> push/pop for small integer constants on x86_64 is only a win/correct
> for loading registers. Fixed by adding !MEM_P tests in the appropriate
> locations.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures. Ok for mainline?
>
>
> 2021-12-21 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
> PR target/103773
> * config/i386/i386.md (*movdi_internal): Only use short
> push/pop sequence for register (non-memory) destinations.
> (*movsi_internal): Likewise.
>
> gcc/testsuite/ChangeLog
> PR target/103773
> * gcc.target/i386/pr103773.c: New test case.
Ouch, as pointed out in the PR, this approach clobbers the red zone.
Please revert the original patch.
Thanks,
Uros.
>
> Roger
> --
>
* Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.
From: Uros Bizjak @ 2021-12-22 8:19 UTC
To: Roger Sayle; +Cc: GCC Patches
On Wed, Dec 22, 2021 at 9:10 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> Ouch, as pointed out in the PR, this approach clobbers the red zone.
>
> Please revert the original patch.
*Maybe* we can use frame->red_zone_size here, but the frame is
recalculated several times during the compilation. I think it is just
too dangerous to use push/pop w.r.t. red zone clobbering.
Uros.
* RE: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.
From: Roger Sayle @ 2021-12-22 9:26 UTC
To: 'Uros Bizjak'; +Cc: 'GCC Patches'
[-- Attachment #1: Type: text/plain, Size: 2910 bytes --]
Hi Uros,
Would you consider the following variant that disables this optimization when a
red zone is used by the current function? You're right that cfun's red_zone_size is
recalculated dynamically, but ix86_red_zone_used should be a better "gate" given
that this logic resides very late during compilation, in the output templates, where
whether or not a red zone is used is known.
On CSiBE, disabling this optimization in non-leaf functions that use a red zone costs
219 bytes, but remains a significant win over -Os. (Alas the absolute numbers aren't
comparable as this testing included the 0/-1 write to memory changes).
Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and make -k check
with no new failures.
2021-12-22 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/103773
* config/i386/i386.md (*movdi_internal): Only use short
push/pop sequence for register (non-memory) destinations
when the current function doesn't make use of a red zone.
(*movsi_internal): Likewise.
gcc/testsuite/ChangeLog
PR target/103773
* gcc.target/i386/pr103773.c: New test case.
Please let me know what you think. I'll revert, if this tweak doesn't address
your concerns.
Roger
--
[-- Attachment #2: patcho4.txt --]
[-- Type: text/plain, Size: 927 bytes --]
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index d25453f..489cede 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2217,7 +2217,9 @@
       if (optimize_size > 1
           && TARGET_64BIT
           && CONST_INT_P (operands[1])
-          && IN_RANGE (INTVAL (operands[1]), -128, 127))
+          && IN_RANGE (INTVAL (operands[1]), -128, 127)
+          && !MEM_P (operands[0])
+          && !ix86_red_zone_used)
         return "push{q}\t%1\n\tpop{q}\t%0";
       return "mov{l}\t{%k1, %k0|%k0, %k1}";
     }
@@ -2440,7 +2442,9 @@
         return "lea{l}\t{%E1, %0|%0, %E1}";
       else if (optimize_size > 1
                && CONST_INT_P (operands[1])
-               && IN_RANGE (INTVAL (operands[1]), -128, 127))
+               && IN_RANGE (INTVAL (operands[1]), -128, 127)
+               && !MEM_P (operands[0])
+               && !ix86_red_zone_used)
         {
           if (TARGET_64BIT)
             return "push{q}\t%1\n\tpop{q}\t%q0";
* Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.
From: Uros Bizjak @ 2021-12-22 10:26 UTC
To: Roger Sayle; +Cc: GCC Patches
On Wed, Dec 22, 2021 at 10:26 AM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
> Would you consider the following variant that disables this optimization when a
> red zone is used by the current function? You're right that cfun's red_zone_size is
> recalculated dynamically, but ix86_red_zone_used should be a better "gate" given
> that this logic resides very late during compilation, in the output templates, where
> whether or not a red zone is used is known.
>
> On CSiBE, disabling this optimization in non-leaf functions that use a red zone costs
> 219 bytes, but remains a significant win over -Os. (Alas the absolute numbers aren't
> comparable as this testing included the 0/-1 write to memory changes).
>
> Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and make -k check
> with no new failures.
>
> 2021-12-22 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
> PR target/103773
> * config/i386/i386.md (*movdi_internal): Only use short
> push/pop sequence for register (non-memory) destinations
> when the current function doesn't make use of a red zone.
> (*movsi_internal): Likewise.
>
> gcc/testsuite/ChangeLog
> PR target/103773
> * gcc.target/i386/pr103773.c: New test case.
>
> Please let me know what you think. I'll revert, if this tweak doesn't address
> your concerns.
Yes, using ix86_red_zone_used looks safe.
OTOH, is there a reason the transformation is not implemented via
peephole2 pass? IIRC, frame is stable after pro_and_epilogue_pass, and
peephole2 pass is instanced well after register allocation.
Uros.
* Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory.
From: Uros Bizjak @ 2021-12-22 11:27 UTC
To: Roger Sayle; +Cc: GCC Patches
[-- Attachment #1: Type: text/plain, Size: 1825 bytes --]
On Wed, Dec 22, 2021 at 11:26 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> Yes, using ix86_red_zone_used looks safe.
>
> OTOH, is there a reason the transformation is not implemented via
> peephole2 pass? IIRC, frame is stable after pro_and_epilogue_pass, and
> peephole2 pass is instanced well after register allocation.
Something like the attached patch.
Uros.
[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 1000 bytes --]
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 58b10643fcb..e5d603f0025 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2514,6 +2514,24 @@
            ]
            (symbol_ref "true")))])
 
+(define_peephole2
+  [(set (match_operand:SWI48 0 "general_reg_operand")
+        (match_operand:SWI48 1 "const_int_operand"))]
+  "optimize_insn_for_size_p () && optimize_size > 1
+   && IN_RANGE (INTVAL (operands[1]), -128, 127)
+   && !ix86_red_zone_used"
+  [(set (match_dup 2) (match_dup 1))
+   (set (match_dup 0) (match_dup 3))]
+{
+  if (GET_MODE (operands[0]) != word_mode)
+    operands[0] = gen_rtx_REG (word_mode, REGNO (operands[0]));
+
+  operands[2] = gen_rtx_MEM (word_mode,
+                             gen_rtx_PRE_DEC (Pmode, stack_pointer_rtx));
+  operands[3] = gen_rtx_MEM (word_mode,
+                             gen_rtx_POST_INC (Pmode, stack_pointer_rtx));
+})
+
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand"
     "=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m")