public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Atom: Enabling unroll at O2 optimization level
@ 2012-04-10 18:43 Igor Zamyatin
  2012-04-11  8:39 ` Richard Guenther
  0 siblings, 1 reply; 8+ messages in thread
From: Igor Zamyatin @ 2012-04-10 18:43 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 594 bytes --]

Hi All!

Here is a patch that enables unroll at O2 for Atom.

This gives good performance boost on EEMBC 2.0 (~+8% in Geomean for 32
bits) with quite moderate code size increase (~5% for EEMBC2.0, 32
bits).

Tested for i386 and x86-64, ok for trunk?

Thanks,
Igor

ChangeLog:

2012-04-10  Yakovlev Vladimir  <vladimir.b.yakovlev@intel.com>

       * gcc/config/i386/i386.c (check_imull): New routine.
       (ix86_loop_unroll_adjust): New target hook.
       (ix86_option_override_internal): Enable unrolling on Atom at -O2.
       (TARGET_LOOP_UNROLL_ADJUST): New define.

[-- Attachment #2: unroll.patch --]
[-- Type: application/octet-stream, Size: 3405 bytes --]

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8974ddc..1a25678 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "fibheap.h"
 #include "opts.h"
 #include "diagnostic.h"
+#include "cfgloop.h"
 
 enum upper_128bits_state
 {
@@ -2635,6 +2636,47 @@ static const char *const cpu_names[TARGET_CPU_DEFAULT_max] =
   "btver1"
 };
 \f
+static int
+check_imull (rtx *x, unsigned *op_count)
+{
+  if (*x && GET_CODE (*x) == MULT)
+    (*op_count)++;
+  return 0;
+}
+
+/* This implementation of the TARGET_LOOP_UNROLL_ADJUST target hook computes
+   how many times struct loop *loop should be unrolled when tuning for Atom
+   at -O2.  The loop body is scanned for the number of imull operations.  */
+static unsigned
+ix86_loop_unroll_adjust (unsigned nunroll, struct loop *loop)
+{
+  basic_block *bbs;
+  rtx insn;
+  unsigned i;
+  unsigned imull_count = 0;
+
+  if (optimize != 2
+      || optimize_size
+      || ix86_tune != PROCESSOR_ATOM)
+    return nunroll;
+
+  /* Count the number of imull operations within the loop body.  */
+  bbs = get_loop_body (loop);
+  for (i = 0; i < loop->num_nodes; i++)
+    {
+      FOR_BB_INSNS (bbs[i], insn)
+	if (INSN_P (insn) && INSN_CODE (insn) != -1)
+          for_each_rtx (&insn, (rtx_function) check_imull, &imull_count);
+    }
+  free (bbs);
+
+  /* Do not unroll loops that contain five or more imull operations.  */
+  if (imull_count >= 5)
+    return 0;
+
+  return nunroll;
+}
+
 /* Return true if a red-zone is in use.  */
 
 static inline bool
@@ -3815,6 +3857,33 @@ ix86_option_override_internal (bool main_args_p)
       && TARGET_SOFTWARE_PREFETCHING_BENEFICIAL)
     flag_prefetch_loop_arrays = 1;
 
+  /* Enable unrolling at -O2 on Atom.  */
+  if (optimize == 2
+      && !optimize_size
+      && ix86_tune == PROCESSOR_ATOM
+      && !global_options_set.x_flag_unroll_loops
+      && !flag_unroll_loops)
+    {
+      int default_max_unrolled_insns = TARGET_64BIT == 0 ? 72 : 150;
+      int default_max_completely_peeled_insns = TARGET_64BIT == 0 ? 150 : 400;
+      flag_unroll_loops = 1;
+      flag_rename_registers = 0;
+      maybe_set_param_value (PARAM_MAX_UNROLL_TIMES,
+			     2,
+			     global_options.x_param_values,
+			     global_options_set.x_param_values);
+      if (!global_options_set.x_param_values[PARAM_MAX_UNROLLED_INSNS])
+	maybe_set_param_value (PARAM_MAX_UNROLLED_INSNS,
+			       default_max_unrolled_insns,
+			       global_options.x_param_values,
+			       global_options_set.x_param_values);
+      if (!global_options_set.x_param_values[PARAM_MAX_COMPLETELY_PEELED_INSNS])
+	maybe_set_param_value (PARAM_MAX_COMPLETELY_PEELED_INSNS,
+			       default_max_completely_peeled_insns,
+			       global_options.x_param_values,
+			       global_options_set.x_param_values);
+    }
+
   /* If using typedef char *va_list, signal that __builtin_va_start (&ap, 0)
      can be optimized to ap = __builtin_next_arg (0).  */
   if (!TARGET_64BIT && !flag_split_stack)
@@ -39258,6 +39327,9 @@ ix86_autovectorize_vector_sizes (void)
 #define TARGET_INIT_LIBFUNCS darwin_rename_builtins
 #endif
 
+#undef TARGET_LOOP_UNROLL_ADJUST
+#define TARGET_LOOP_UNROLL_ADJUST ix86_loop_unroll_adjust
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 #include "gt-i386.h"
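
The decision made by the new hook can be modelled outside of GCC. The following is a minimal standalone sketch, not the patch itself: `op_kind`, `count_mults` and `model_unroll_adjust` are hypothetical names, and a loop body is assumed to be a flat array of operation kinds rather than RTL. The threshold of 5 and the return value of 0 for multiply-heavy loops are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an RTL operation code; not GCC's rtx_code.  */
enum op_kind { OP_ADD, OP_LOAD, OP_STORE, OP_MULT };

/* Threshold used by the patch: loops with this many multiplies or more
   are not unrolled.  */
#define IMUL_LIMIT 5

/* Count the multiply operations in a flattened loop body.  */
static unsigned
count_mults (const enum op_kind *body, size_t n_ops)
{
  unsigned count = 0;

  assert (body != NULL || n_ops == 0);
  for (size_t i = 0; i < n_ops; i++)
    if (body[i] == OP_MULT)
      count++;
  return count;
}

/* Return 0 (as the patch does) when the loop is multiply-heavy,
   otherwise keep the unroll factor proposed by the caller.  */
static unsigned
model_unroll_adjust (unsigned nunroll, const enum op_kind *body, size_t n_ops)
{
  if (count_mults (body, n_ops) >= IMUL_LIMIT)
    return 0;
  return nunroll;
}
```

In this model every operation of the body is visited, including the last one.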


* Re: [PATCH] Atom: Enabling unroll at O2 optimization level
  2012-04-10 18:43 [PATCH] Atom: Enabling unroll at O2 optimization level Igor Zamyatin
@ 2012-04-11  8:39 ` Richard Guenther
  2012-04-11 13:35   ` Andi Kleen
  2012-04-12 11:06   ` Igor Zamyatin
  0 siblings, 2 replies; 8+ messages in thread
From: Richard Guenther @ 2012-04-11  8:39 UTC (permalink / raw)
  To: Igor Zamyatin; +Cc: gcc-patches

On Tue, Apr 10, 2012 at 8:43 PM, Igor Zamyatin <izamyatin@gmail.com> wrote:
> Hi All!
>
> Here is a patch that enables unroll at O2 for Atom.
>
> This gives good performance boost on EEMBC 2.0 (~+8% in Geomean for 32
> bits) with quite moderate code size increase (~5% for EEMBC2.0, 32
> bits).

5% is not moderate.  Your patch does enable unrolling at -O2 but not -O3,
why? Why do you disable register renaming?  check_imull requires a function
comment.

This completely looks like a hack for EEMBC2.0, so it's definitely not ok.

-O2 is not supposed to give best benchmark results.

Thanks,
Richard.

>
> Tested for i386 and x86-64, ok for trunk?
>
> Thanks,
> Igor
>
> ChangeLog:
>
> 2012-04-10  Yakovlev Vladimir  <vladimir.b.yakovlev@intel.com>
>
>        * gcc/config/i386/i386.c (check_imull): New routine.
>        (ix86_loop_unroll_adjust): New target hook.
>        (ix86_option_override_internal): Enable unrolling on Atom at -O2.
>        (TARGET_LOOP_UNROLL_ADJUST): New define.


* Re: [PATCH] Atom: Enabling unroll at O2 optimization level
  2012-04-11  8:39 ` Richard Guenther
@ 2012-04-11 13:35   ` Andi Kleen
  2012-04-12 11:06     ` Igor Zamyatin
  2012-04-12 11:06   ` Igor Zamyatin
  1 sibling, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2012-04-11 13:35 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Igor Zamyatin, gcc-patches

Richard Guenther <richard.guenther@gmail.com> writes:
>
> 5% is not moderate.  Your patch does enable unrolling at -O2 but not -O3,
> why? Why do you disable register renaming?  check_imull requires a function
> comment.
>
> This completely looks like a hack for EEMBC2.0, so it's definitely not ok.
>
> -O2 is not supposed to give best benchmark results.

Besides it is against the Intel Optimization Manual recommendation
to prefer small code on Atom to avoid falling out of the predecode hints
in the cache.

So would need much more benchmarking on macro workloads first at least.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only


* Re: [PATCH] Atom: Enabling unroll at O2 optimization level
  2012-04-11 13:35   ` Andi Kleen
@ 2012-04-12 11:06     ` Igor Zamyatin
  2012-04-12 13:23       ` Andi Kleen
  0 siblings, 1 reply; 8+ messages in thread
From: Igor Zamyatin @ 2012-04-12 11:06 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Richard Guenther, gcc-patches

On Wed, Apr 11, 2012 at 5:34 PM, Andi Kleen <andi@firstfloor.org> wrote:
> Richard Guenther <richard.guenther@gmail.com> writes:
>>
>> 5% is not moderate.  Your patch does enable unrolling at -O2 but not -O3,
>> why? Why do you disable register renaming?  check_imull requires a function
>> comment.
>>
>> This completely looks like a hack for EEMBC2.0, so it's definitely not ok.
>>
>> -O2 is not supposed to give best benchmark results.
>
> Besides it is against the Intel Optimization Manual recommendation
> to prefer small code on Atom to avoid falling out of the predecode hints
> in the cache.

Yes, this is a well-known concern for Atom. But at the same time unrolling
can help a lot on in-order machines because it gives the compiler's
scheduler more opportunities. And experiments showed that unrolling
can really help.
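
To illustrate the scheduling argument, here is a hand-written sketch of what unrolling by a factor of 2 (the value the patch uses for PARAM_MAX_UNROLL_TIMES) looks like at the source level. It is not compiler output; `scale_rolled` and `scale_unrolled2` are hypothetical names, and the arrays are assumed not to overlap. In the unrolled body the two multiply-adds are independent, so an in-order scheduler has two chains it can interleave.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: one multiply-add per iteration.  */
static void
scale_rolled (int *restrict dst, const int *restrict src,
              size_t n, int k, int b)
{
  assert (dst != NULL && src != NULL);
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] * k + b;
}

/* The same loop unrolled by 2.  The two computations in the body do not
   depend on each other, which gives the scheduler more freedom.  */
static void
scale_unrolled2 (int *restrict dst, const int *restrict src,
                 size_t n, int k, int b)
{
  size_t i = 0;

  assert (dst != NULL && src != NULL);
  for (; i + 1 < n; i += 2)
    {
      int first = src[i] * k + b;
      int second = src[i + 1] * k + b;
      dst[i] = first;
      dst[i + 1] = second;
    }

  /* Remainder iteration when n is odd.  */
  if (i < n)
    dst[i] = src[i] * k + b;
}
```

The remainder handling is part of the code size cost discussed in this thread: the unrolled version carries both the doubled body and the extra tail.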

>
> So would need much more benchmarking on macro workloads first at least.

Like what, for example? I believe in this case everything also
strongly depends on the test usage model (e.g. it is usually compiled with
Os, not O2) and, let's say, the internal test structure - whether there are
hot loops that are suitable for unrolling.

>
> -Andi
>
> --
> ak@linux.intel.com -- Speaking for myself only


* Re: [PATCH] Atom: Enabling unroll at O2 optimization level
  2012-04-11  8:39 ` Richard Guenther
  2012-04-11 13:35   ` Andi Kleen
@ 2012-04-12 11:06   ` Igor Zamyatin
  2012-04-12 11:17     ` Richard Guenther
  1 sibling, 1 reply; 8+ messages in thread
From: Igor Zamyatin @ 2012-04-12 11:06 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1653 bytes --]

On Wed, Apr 11, 2012 at 12:39 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, Apr 10, 2012 at 8:43 PM, Igor Zamyatin <izamyatin@gmail.com> wrote:
>> Hi All!
>>
>> Here is a patch that enables unroll at O2 for Atom.
>>
>> This gives good performance boost on EEMBC 2.0 (~+8% in Geomean for 32
>> bits) with quite moderate code size increase (~5% for EEMBC2.0, 32
>> bits).
>
> 5% is not moderate.  Your patch does enable unrolling at -O2 but not -O3,
> why? Why do you disable register renaming?  check_imull requires a function
> comment.

Sure, enabling unroll for O3 could be the next step.
We can't avoid a code size increase with unrolling - what number do you
think would be appropriate?
Register renaming was the cause of several degradations during the tuning process.
A comment for check_imull was added.

>
> This completely looks like a hack for EEMBC2.0, so it's definitely not ok.

Why? EEMBC was measured and the result provided here just because this
benchmark is considered to be very relevant for Atom.

>
> -O2 is not supposed to give best benchmark results.

O2 is widely used, so a performance improvement could be important for users.

>
> Thanks,
> Richard.
>
>>
>> Tested for i386 and x86-64, ok for trunk?

Updated patch attached

>>
>> Thanks,
>> Igor
>>
>> ChangeLog:
>>
>> 2012-04-10  Yakovlev Vladimir  <vladimir.b.yakovlev@intel.com>
>>
>>        * gcc/config/i386/i386.c (check_imul): New routine.
>>        (ix86_loop_unroll_adjust): New target hook.
>>        (ix86_option_override_internal): Enable unrolling on Atom at -O2.
>>        (TARGET_LOOP_UNROLL_ADJUST): New define.

[-- Attachment #2: unroll1.patch --]
[-- Type: application/octet-stream, Size: 3625 bytes --]

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index af4af7c..76a0837 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "fibheap.h"
 #include "opts.h"
 #include "diagnostic.h"
+#include "cfgloop.h"
 
 enum upper_128bits_state
 {
@@ -2635,6 +2636,54 @@ static const char *const cpu_names[TARGET_CPU_DEFAULT_max] =
   "btver1"
 };
 \f
+/* This routine is used by ix86_loop_unroll_adjust and helps to count the
+   number of imuls in a loop.  */
+
+static int
+check_imul (rtx *x, unsigned *op_count)
+{
+  if (*x && GET_CODE (*x) == MULT)
+    (*op_count)++;
+  return 0;
+}
+
+/* This implementation of the TARGET_LOOP_UNROLL_ADJUST target hook computes
+   how many times struct loop *loop should be unrolled when tuning for Atom
+   at -O2.  The loop body is scanned for the number of imul operations.  */
+
+static unsigned
+ix86_loop_unroll_adjust (unsigned nunroll, struct loop *loop)
+{
+  basic_block *bbs;
+  rtx insn;
+  unsigned i;
+  unsigned imul_count = 0;
+
+  if (optimize != 2
+      || optimize_size
+      || ix86_tune != PROCESSOR_ATOM)
+    return nunroll;
+
+  /* Count the number of imuls within the loop body.  Because of how Atom
+     handles imuls, unrolling loops that contain them could harm performance.  */
+  bbs = get_loop_body (loop);
+  for (i = 0; i < loop->num_nodes; i++)
+    {
+      for (insn = BB_HEAD (bbs[i]);
+           insn != NEXT_INSN (BB_END (bbs[i]));
+           insn = NEXT_INSN (insn))
+	if (INSN_P (insn) && INSN_CODE (insn) != -1)
+          for_each_rtx (&insn, (rtx_function) check_imul, &imul_count);
+    }
+  free (bbs);
+
+  /* Do not unroll loops that contain five or more imul operations.  */
+  if (imul_count >= 5)
+    return 0;
+
+  return nunroll;
+}
+
 /* Return true if a red-zone is in use.  */
 
 static inline bool
@@ -3815,6 +3864,33 @@ ix86_option_override_internal (bool main_args_p)
       && TARGET_SOFTWARE_PREFETCHING_BENEFICIAL)
     flag_prefetch_loop_arrays = 1;
 
+  /* Enable unrolling at -O2 on Atom.  */
+  if (optimize == 2
+      && !optimize_size
+      && ix86_tune == PROCESSOR_ATOM
+      && !global_options_set.x_flag_unroll_loops
+      && !flag_unroll_loops)
+    {
+      int default_max_unrolled_insns = TARGET_64BIT == 0 ? 72 : 150;
+      int default_max_completely_peeled_insns = TARGET_64BIT == 0 ? 150 : 400;
+      flag_unroll_loops = 1;
+      flag_rename_registers = 0;
+      maybe_set_param_value (PARAM_MAX_UNROLL_TIMES,
+			     2,
+			     global_options.x_param_values,
+			     global_options_set.x_param_values);
+      if (!global_options_set.x_param_values[PARAM_MAX_UNROLLED_INSNS])
+	maybe_set_param_value (PARAM_MAX_UNROLLED_INSNS,
+			       default_max_unrolled_insns,
+			       global_options.x_param_values,
+			       global_options_set.x_param_values);
+      if (!global_options_set.x_param_values[PARAM_MAX_COMPLETELY_PEELED_INSNS])
+	maybe_set_param_value (PARAM_MAX_COMPLETELY_PEELED_INSNS,
+			       default_max_completely_peeled_insns,
+			       global_options.x_param_values,
+			       global_options_set.x_param_values);
+    }
+
   /* If using typedef char *va_list, signal that __builtin_va_start (&ap, 0)
      can be optimized to ap = __builtin_next_arg (0).  */
   if (!TARGET_64BIT && !flag_split_stack)
@@ -39258,6 +39334,9 @@ ix86_autovectorize_vector_sizes (void)
 #define TARGET_INIT_LIBFUNCS darwin_rename_builtins
 #endif
 
+#undef TARGET_LOOP_UNROLL_ADJUST
+#define TARGET_LOOP_UNROLL_ADJUST ix86_loop_unroll_adjust
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 #include "gt-i386.h"
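
The option handling in the ix86_option_override_internal hunk can be summarised with a small standalone model. This is only a sketch under stated assumptions: `model_param`, `model_opts`, `set_default` and `model_override` are hypothetical names, and a default is assumed to be applied only when the user has not set the parameter explicitly. The numeric defaults (2, 72/150, 150/400) are the ones in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option record: a value plus a flag recording whether the
   user set it explicitly on the command line.  */
struct model_param
{
  int value;
  bool set_by_user;
};

struct model_opts
{
  int optimize_level;
  bool optimize_size;
  bool tune_atom;
  bool target_64bit;
  bool rename_registers;
  struct model_param unroll_loops;
  struct model_param max_unroll_times;
  struct model_param max_unrolled_insns;
  struct model_param max_completely_peeled_insns;
};

/* Apply a default only if the user did not choose a value.  */
static void
set_default (struct model_param *p, int value)
{
  if (!p->set_by_user)
    p->value = value;
}

/* Model of the patch: at exactly -O2, not -Os, tuning for Atom, and with
   unrolling neither requested nor already on, enable unrolling with
   conservative limits and turn register renaming off.  */
static void
model_override (struct model_opts *o)
{
  assert (o != NULL);
  if (o->optimize_level != 2 || o->optimize_size || !o->tune_atom
      || o->unroll_loops.set_by_user || o->unroll_loops.value)
    return;

  o->unroll_loops.value = 1;
  o->rename_registers = false;
  set_default (&o->max_unroll_times, 2);
  set_default (&o->max_unrolled_insns, o->target_64bit ? 150 : 72);
  set_default (&o->max_completely_peeled_insns, o->target_64bit ? 400 : 150);
}
```

Note that in this model, as in the patch, register renaming is switched off without checking whether the user asked for it explicitly.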


* Re: [PATCH] Atom: Enabling unroll at O2 optimization level
  2012-04-12 11:06   ` Igor Zamyatin
@ 2012-04-12 11:17     ` Richard Guenther
  2012-04-17 15:17       ` Igor Zamyatin
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Guenther @ 2012-04-12 11:17 UTC (permalink / raw)
  To: Igor Zamyatin; +Cc: gcc-patches

On Thu, Apr 12, 2012 at 1:05 PM, Igor Zamyatin <izamyatin@gmail.com> wrote:
> On Wed, Apr 11, 2012 at 12:39 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Tue, Apr 10, 2012 at 8:43 PM, Igor Zamyatin <izamyatin@gmail.com> wrote:
>>> Hi All!
>>>
>>> Here is a patch that enables unroll at O2 for Atom.
>>>
>>> This gives good performance boost on EEMBC 2.0 (~+8% in Geomean for 32
>>> bits) with quite moderate code size increase (~5% for EEMBC2.0, 32
>>> bits).
>>
>> 5% is not moderate.  Your patch does enable unrolling at -O2 but not -O3,
>> why? Why do you disable register renaming?  check_imull requires a function
>> comment.
>
> Sure, enabling unroll for O3 could be the next step.
> We can't avoid a code size increase with unrolling - what number do you
> think would be appropriate?
> Register renaming was the cause of several degradations during the tuning process.
> A comment for check_imull was added.
>
>>
>> This completely looks like a hack for EEMBC2.0, so it's definitely not ok.
>
> Why? EEMBC was measured and the result provided here just because this
> benchmark is considered to be very relevant for Atom.

I'd say that SPEC INT (2000 / 2006) is more relevant for Atom (SPEC FP
would be irrelevant OTOH).  Similar code size for, say, Mozilla Firefox
or GCC itself would be important.

>> -O2 is not supposed to give best benchmark results.
>
> O2 is widely used, so a performance improvement could be important for users.

But not at a 5% size cost.  Please also always check the compile-time effect
which is important for -O2 as well.

Richard.

>>
>> Thanks,
>> Richard.
>>
>>>
>>> Tested for i386 and x86-64, ok for trunk?
>
> Updated patch attached
>
>>>
>>> Thanks,
>>> Igor
>>>
>>> ChangeLog:
>>>
>>> 2012-04-10  Yakovlev Vladimir  <vladimir.b.yakovlev@intel.com>
>>>
>>>        * gcc/config/i386/i386.c (check_imul): New routine.
>>>        (ix86_loop_unroll_adjust): New target hook.
>>>        (ix86_option_override_internal): Enable unrolling on Atom at -O2.
>>>        (TARGET_LOOP_UNROLL_ADJUST): New define.


* Re: [PATCH] Atom: Enabling unroll at O2 optimization level
  2012-04-12 11:06     ` Igor Zamyatin
@ 2012-04-12 13:23       ` Andi Kleen
  0 siblings, 0 replies; 8+ messages in thread
From: Andi Kleen @ 2012-04-12 13:23 UTC (permalink / raw)
  To: Igor Zamyatin; +Cc: Andi Kleen, Richard Guenther, gcc-patches

> > So would need much more benchmarking on macro workloads first at least.
> 
> Like what, for example? I believe in this case everything also
> strongly depends on the test usage model (e.g. it is usually compiled with
> Os, not O2) and, let's say, the internal test structure - whether there are
> hot loops that are suitable for unrolling.

Normally the compiler doesn't know if a loop is hot unless you use
profile feedback. So worst case on a big code base you may end up
with a lot of unnecessary unrolling. On cold code it's just wasted
bytes, but there could be already icache limited code where it 
would be worse.

How about just a compiler bootstrap on Atom as a "worst case"?
For the benchmark can you use profile feedback?

BTW I know some loops are unrolled at -O3 by default at tree level because
the vectorizer likes it.  I actually have an older patch to dial this
down for some common cases.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH] Atom: Enabling unroll at O2 optimization level
  2012-04-12 11:17     ` Richard Guenther
@ 2012-04-17 15:17       ` Igor Zamyatin
  0 siblings, 0 replies; 8+ messages in thread
From: Igor Zamyatin @ 2012-04-17 15:17 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, areg.melikadamyan

On Thu, Apr 12, 2012 at 3:16 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, Apr 12, 2012 at 1:05 PM, Igor Zamyatin <izamyatin@gmail.com> wrote:
>> On Wed, Apr 11, 2012 at 12:39 PM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>> On Tue, Apr 10, 2012 at 8:43 PM, Igor Zamyatin <izamyatin@gmail.com> wrote:
>>>> Hi All!
>>>>
>>>> Here is a patch that enables unroll at O2 for Atom.
>>>>
>>>> This gives good performance boost on EEMBC 2.0 (~+8% in Geomean for 32
>>>> bits) with quite moderate code size increase (~5% for EEMBC2.0, 32
>>>> bits).
>>>
>>> 5% is not moderate.  Your patch does enable unrolling at -O2 but not -O3,
>>> why? Why do you disable register renaming?  check_imull requires a function
>>> comment.
>>
>> Sure, enabling unroll for O3 could be the next step.
>> We can't avoid a code size increase with unrolling - what number do you
>> think would be appropriate?
>> Register renaming was the cause of several degradations during the tuning process.
>> A comment for check_imull was added.
>>
>>>
>>> This completely looks like a hack for EEMBC2.0, so it's definitely not ok.
>>
>> Why? EEMBC was measured and the result provided here just because this
>> benchmark is considered to be very relevant for Atom.
>
> I'd say that SPEC INT (2000 / 2006) is more relevant for Atom (SPEC FP
> would be irrelevant OTOH).  Similar code size for, say, Mozilla Firefox
> or GCC itself would be important.
>
>>> -O2 is not supposed to give best benchmark results.
>>
>> O2 is widely used, so a performance improvement could be important for users.
>
> But not at a 5% size cost.  Please also always check the compile-time effect
> which is important for -O2 as well.

What would be an acceptable size cost/compile-time increase
for O2 and O3 on EEMBC, SPEC INT 2000 and Mozilla?

Is it possible in general to put Atom-specific unroll heuristics under
some option which could be mentioned in the GCC docs?
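
For comparison, unrolling can already be requested for individual functions without changing what -O2 does globally. The following is a minimal sketch, assuming GCC's `optimize` function attribute; it is not something proposed in this thread, and `UNROLL_HINT` and `sum_squares` are hypothetical names. On compilers other than GCC the macro expands to nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Request -funroll-loops for one function only (GCC-specific).  */
#if defined(__GNUC__) && !defined(__clang__)
# define UNROLL_HINT __attribute__ ((optimize ("unroll-loops")))
#else
# define UNROLL_HINT
#endif

/* A hot loop that the author of the code has decided is worth the extra
   code size; the rest of the translation unit is left untouched.  */
UNROLL_HINT static long
sum_squares (const int *v, size_t n)
{
  long total = 0;

  assert (v != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    total += (long) v[i] * v[i];
  return total;
}
```

This keeps the size cost limited to loops that are known to be hot, which is the information the compiler lacks without profile feedback.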

>
> Richard.
>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>>
>>>> Tested for i386 and x86-64, ok for trunk?
>>
>> Updated patch attached
>>
>>>>
>>>> Thanks,
>>>> Igor
>>>>
>>>> ChangeLog:
>>>>
>>>> 2012-04-10  Yakovlev Vladimir  <vladimir.b.yakovlev@intel.com>
>>>>
>>>>        * gcc/config/i386/i386.c (check_imul): New routine.
>>>>        (ix86_loop_unroll_adjust): New target hook.
>>>>        (ix86_option_override_internal): Enable unrolling on Atom at -O2.
>>>>        (TARGET_LOOP_UNROLL_ADJUST): New define.

Thanks,
Igor


Thread overview: 8+ messages
2012-04-10 18:43 [PATCH] Atom: Enabling unroll at O2 optimization level Igor Zamyatin
2012-04-11  8:39 ` Richard Guenther
2012-04-11 13:35   ` Andi Kleen
2012-04-12 11:06     ` Igor Zamyatin
2012-04-12 13:23       ` Andi Kleen
2012-04-12 11:06   ` Igor Zamyatin
2012-04-12 11:17     ` Richard Guenther
2012-04-17 15:17       ` Igor Zamyatin
