* [4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops
@ 2014-04-23 18:03 Jeff Law
2014-04-24 8:24 ` Richard Biener
0 siblings, 1 reply; 4+ messages in thread
From: Jeff Law @ 2014-04-23 18:03 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1195 bytes --]
The more aggressive threading across loop backedges requires
invalidating equivalences that do not hold across all iterations of a loop.
At first glance, invaliding at PHI nodes should be sufficient as any
statement which potentially generated a new equivalence would be
reprocessed as we come across the backedge. However, there is one
important case where that does not hold.
Specifically we might have derived a value from a conditional and the
conditional might have been fed by a statement that doesn't produce
useful equivalences (such as a GIMPLE_ASM). Thus the equivalence from
the conditional is still visible because no new equivalence will be
recorded for the GIMPLE_ASM.
So if the result of the GIMPLE_ASM that gets used in the conditional
varies from one loop iteration to the next, we could use a stale value
from a prior iteration to thread the current iteration. That's exactly
what happens in the ffmpeg code.
Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Also
verified that the sample audio in the referenced BZs no longer chops off
after ~2 seconds.
Installed on the trunk. OK for 4.9.1 after a suitable soak period on
the trunk?
[-- Attachment #2: PATCH --]
[-- Type: text/plain, Size: 3539 bytes --]
commit 02269351ce3a81b5470b8137fb3c34bca27011da
Author: Jeff Law <law@redhat.com>
Date: Wed Apr 23 00:25:47 2014 -0600
PR tree-optimization/60902
* tree-ssa-threadedge.c
(record_temporary_equivalences_from_stmts_at_dest): Make sure to
invalidate outputs from statements that do not produce useful
outputs for threading.
PR tree-optimization/60902
* gcc.target/i386/pr60902.c: New test.
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 638c0da..ddebba7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2014-04-23 Jeff Law <law@redhat.com>
+
+ PR tree-optimization/60902
+ * tree-ssa-threadedge.c
+ (record_temporary_equivalences_from_stmts_at_dest): Make sure to
+ invalidate outputs from statements that do not produce useful
+ outputs for threading.
+
2014-04-23 Venkataramanan Kumar <venkataramanan.kumar@linaro.org>
* config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 126ad08..62b07f4 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-04-23 Jeff Law <law@redhat.com>
+
+ PR tree-optimization/60902
+ * gcc.target/i386/pr60902.c: New test.
+
2014-04-23 Alex Velenko <Alex.Velenko@arm.com>
* gcc.target/aarch64/vdup_lane_1.c: New testcase.
diff --git a/gcc/testsuite/gcc.target/i386/pr60902.c b/gcc/testsuite/gcc.target/i386/pr60902.c
new file mode 100644
index 0000000..b81dcd7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr60902.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+extern void abort ();
+extern void exit (int);
+
+int x;
+
+foo()
+{
+ static int count;
+ count++;
+ if (count > 1)
+ abort ();
+}
+
+static inline int
+frob ()
+{
+ int a;
+ __asm__ ("mov %1, %0\n\t" : "=r" (a) : "m" (x));
+ x++;
+ return a;
+}
+
+int
+main ()
+{
+ int i;
+ for (i = 0; i < 10 && frob () == 0; i++)
+ foo();
+ exit (0);
+}
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index c447b72..8a0103b 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -387,7 +387,34 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
&& (gimple_code (stmt) != GIMPLE_CALL
|| gimple_call_lhs (stmt) == NULL_TREE
|| TREE_CODE (gimple_call_lhs (stmt)) != SSA_NAME))
- continue;
+ {
+ /* STMT might still have DEFS and we need to invalidate any known
+ equivalences for them.
+
+ Consider if STMT is a GIMPLE_ASM with one or more outputs that
+ feeds a conditional inside a loop. We might derive an equivalence
+ due to the conditional. */
+ tree op;
+ ssa_op_iter iter;
+
+ if (backedge_seen)
+ FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_ALL_DEFS)
+ {
+ /* This call only invalidates equivalences created by
+ PHI nodes. This is by design to keep the cost of
+ of invalidation reasonable. */
+ invalidate_equivalences (op, stack, src_map, dst_map);
+
+ /* However, conditionals can imply values for real
+ operands as well. And those won't be recorded in the
+ maps. In fact, those equivalences may be recorded totally
+ outside the threading code. We can just create a new
+ temporary NULL equivalence here. */
+ record_temporary_equivalence (op, NULL_TREE, stack);
+ }
+
+ continue;
+ }
/* The result of __builtin_object_size depends on all the arguments
of a phi node. Temporarily using only one edge produces invalid
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops
2014-04-23 18:03 [4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops Jeff Law
@ 2014-04-24 8:24 ` Richard Biener
2014-04-24 15:44 ` Jeff Law
0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2014-04-24 8:24 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches
On Wed, Apr 23, 2014 at 8:01 PM, Jeff Law <law@redhat.com> wrote:
>
> The more aggressive threading across loop backedges requires invalidating
> equivalences that do not hold across all iterations of a loop.
>
> At first glance, invaliding at PHI nodes should be sufficient as any
> statement which potentially generated a new equivalence would be reprocessed
> as we come across the backedge. However, there is one important case where
> that does not hold.
>
> Specifically we might have derived a value from a conditional and the
> conditional might have been fed by a statement that doesn't produce useful
> equivalences (such as a GIMPLE_ASM). Thus the equivalence from the
> conditional is still visible because no new equivalence will be recorded for
> the GIMPLE_ASM.
>
> So if the result of the GIMPLE_ASM that gets used in the conditional varies
> from one loop iteration to the next, we could use a stale value from a prior
> iteration to thread the current iteration. That's exactly what happens in
> the ffmpeg code.
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Also
> verified that the sample audio in the referenced BZs no longer chops off
> after ~2 seconds.
>
> Installed on the trunk. OK for 4.9.1 after a suitable soak period on the
> trunk?
Ok, but ...
>
>
>
> commit 02269351ce3a81b5470b8137fb3c34bca27011da
> Author: Jeff Law <law@redhat.com>
> Date: Wed Apr 23 00:25:47 2014 -0600
>
> PR tree-optimization/60902
> * tree-ssa-threadedge.c
> (record_temporary_equivalences_from_stmts_at_dest): Make sure to
> invalidate outputs from statements that do not produce useful
> outputs for threading.
>
> PR tree-optimization/60902
> * gcc.target/i386/pr60902.c: New test.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 638c0da..ddebba7 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,11 @@
> +2014-04-23 Jeff Law <law@redhat.com>
> +
> + PR tree-optimization/60902
> + * tree-ssa-threadedge.c
> + (record_temporary_equivalences_from_stmts_at_dest): Make sure to
> + invalidate outputs from statements that do not produce useful
> + outputs for threading.
> +
> 2014-04-23 Venkataramanan Kumar <venkataramanan.kumar@linaro.org>
>
> * config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 126ad08..62b07f4 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-04-23 Jeff Law <law@redhat.com>
> +
> + PR tree-optimization/60902
> + * gcc.target/i386/pr60902.c: New test.
> +
> 2014-04-23 Alex Velenko <Alex.Velenko@arm.com>
>
> * gcc.target/aarch64/vdup_lane_1.c: New testcase.
> diff --git a/gcc/testsuite/gcc.target/i386/pr60902.c
> b/gcc/testsuite/gcc.target/i386/pr60902.c
> new file mode 100644
> index 0000000..b81dcd7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr60902.c
> @@ -0,0 +1,32 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +extern void abort ();
> +extern void exit (int);
> +
> +int x;
> +
> +foo()
> +{
> + static int count;
> + count++;
> + if (count > 1)
> + abort ();
> +}
> +
> +static inline int
> +frob ()
> +{
> + int a;
> + __asm__ ("mov %1, %0\n\t" : "=r" (a) : "m" (x));
> + x++;
> + return a;
> +}
> +
> +int
> +main ()
> +{
> + int i;
> + for (i = 0; i < 10 && frob () == 0; i++)
> + foo();
> + exit (0);
> +}
> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> index c447b72..8a0103b 100644
> --- a/gcc/tree-ssa-threadedge.c
> +++ b/gcc/tree-ssa-threadedge.c
> @@ -387,7 +387,34 @@ record_temporary_equivalences_from_stmts_at_dest (edge
> e,
> && (gimple_code (stmt) != GIMPLE_CALL
> || gimple_call_lhs (stmt) == NULL_TREE
> || TREE_CODE (gimple_call_lhs (stmt)) != SSA_NAME))
> - continue;
> + {
> + /* STMT might still have DEFS and we need to invalidate any known
> + equivalences for them.
> +
> + Consider if STMT is a GIMPLE_ASM with one or more outputs that
> + feeds a conditional inside a loop. We might derive an
> equivalence
> + due to the conditional. */
> + tree op;
> + ssa_op_iter iter;
> +
> + if (backedge_seen)
> + FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_ALL_DEFS)
You only need SSA_OP_DEF here, no need to process virtual
operands.
> + {
> + /* This call only invalidates equivalences created by
> + PHI nodes. This is by design to keep the cost of
> + of invalidation reasonable. */
> + invalidate_equivalences (op, stack, src_map, dst_map);
> +
> + /* However, conditionals can imply values for real
> + operands as well. And those won't be recorded in the
> + maps. In fact, those equivalences may be recorded
> totally
> + outside the threading code. We can just create a new
> + temporary NULL equivalence here. */
> + record_temporary_equivalence (op, NULL_TREE, stack);
> + }
> +
> + continue;
> + }
>
> /* The result of __builtin_object_size depends on all the arguments
> of a phi node. Temporarily using only one edge produces invalid
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops
2014-04-24 8:24 ` Richard Biener
@ 2014-04-24 15:44 ` Jeff Law
2014-04-25 8:24 ` Richard Biener
0 siblings, 1 reply; 4+ messages in thread
From: Jeff Law @ 2014-04-24 15:44 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches
On 04/24/14 02:22, Richard Biener wrote:
>> +
>> + if (backedge_seen)
>> + FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_ALL_DEFS)
>
> You only need SSA_OP_DEF here, no need to process virtual
> operands.
I went back and forth on this. I couldn't come up with a case where
we'd do the wrong thing with virtuals, but processing all the DEFs is
conservatively correct, so I left it.
If you'd like to see this changed to only walk the real defs, I'm happy
to make that change.
jeff
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops
2014-04-24 15:44 ` Jeff Law
@ 2014-04-25 8:24 ` Richard Biener
0 siblings, 0 replies; 4+ messages in thread
From: Richard Biener @ 2014-04-25 8:24 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches
On Thu, Apr 24, 2014 at 5:34 PM, Jeff Law <law@redhat.com> wrote:
> On 04/24/14 02:22, Richard Biener wrote:
>>>
>>> +
>>> + if (backedge_seen)
>>> + FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_ALL_DEFS)
>>
>>
>> You only need SSA_OP_DEF here, no need to process virtual
>> operands.
>
> I went back and forth on this. I couldn't come up with a case where we'd do
> the wrong thing with virtuals, but processing all the DEFs is conservatively
> correct, so I left it.
>
> If you'd like to see this changed to only walk the real defs, I'm happy to
> make that change.
Well, it's cheaper. And we don't (and can't) record equivalencies on
virtuals.
So, yes please.
Richard.
> jeff
>
>
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2014-04-25 8:24 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-23 18:03 [4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops Jeff Law
2014-04-24 8:24 ` Richard Biener
2014-04-24 15:44 ` Jeff Law
2014-04-25 8:24 ` Richard Biener
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).