public inbox for gcc-patches@gcc.gnu.org
* [PATCH 1/3] Lower sampling rate for autofdo bootstrap
@ 2019-01-14  8:20 Andi Kleen
  2019-01-14  8:20 ` [PATCH 2/3] Fix autoprofiledbootstrap Andi Kleen
  2019-01-14  9:06 ` [PATCH 1/3] Lower sampling rate for autofdo bootstrap Richard Biener
  0 siblings, 2 replies; 7+ messages in thread
From: Andi Kleen @ 2019-01-14  8:20 UTC (permalink / raw)
  To: gcc-patches; +Cc: amker.cheng, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

autofdo create_gcov uses a lot of memory for large sample files.
Since gcc runs for quite a long time, the sample files generated during
the bootstrap are fairly big.

Currently I can't even run make autoprofiledbootstrap on my system at
home because create_gcov needs more than 12GB and runs out of memory.

This should probably be fixed in create_gcov, but for now
lowering the sampling rate works well enough for me. The bootstrap
run is long enough that it gets good enough data in any case.

gcc/:
2019-01-14  Andi Kleen  <ak@linux.intel.com>

	* Makefile.in: Lower autofdo sampling rate by 10x.
	* Makefile.tpl: Ditto.
---
 Makefile.in  | 2 +-
 Makefile.tpl | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index aa41730528a..28539a45372 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -387,7 +387,7 @@ MAKEINFO = @MAKEINFO@
 EXPECT = @EXPECT@
 RUNTEST = @RUNTEST@
 
-AUTO_PROFILE = gcc-auto-profile -c 1000000
+AUTO_PROFILE = gcc-auto-profile -c 10000000
 
 # This just becomes part of the MAKEINFO definition passed down to
 # sub-makes.  It lets flags be given on the command line while still
diff --git a/Makefile.tpl b/Makefile.tpl
index 1ab65ac8ec4..126296fb49a 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -390,7 +390,7 @@ MAKEINFO = @MAKEINFO@
 EXPECT = @EXPECT@
 RUNTEST = @RUNTEST@
 
-AUTO_PROFILE = gcc-auto-profile -c 1000000
+AUTO_PROFILE = gcc-auto-profile -c 10000000
 
 # This just becomes part of the MAKEINFO definition passed down to
 # sub-makes.  It lets flags be given on the command line while still
-- 
2.19.1

* [PATCH 2/3] Fix autoprofiledbootstrap
  2019-01-14  8:20 [PATCH 1/3] Lower sampling rate for autofdo bootstrap Andi Kleen
@ 2019-01-14  8:20 ` Andi Kleen
  2019-01-14  8:20   ` [PATCH 3/3] Increase iterations for autofdo tests Andi Kleen
                     ` (2 more replies)
  2019-01-14  9:06 ` [PATCH 1/3] Lower sampling rate for autofdo bootstrap Richard Biener
  1 sibling, 3 replies; 7+ messages in thread
From: Andi Kleen @ 2019-01-14  8:20 UTC (permalink / raw)
  To: gcc-patches; +Cc: amker.cheng, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

autoprofiledbootstrap fails currently with

In file included from ../../gcc/gcc/hash-table.h:236,
                 from ../../gcc/gcc/coretypes.h:440,
                 from ../../gcc/gcc/ipa-devirt.c:110:
In static member function 'static void va_heap::release(vec<T, va_heap, vl_embed>*&) [with T = tree_node*]',
    inlined from 'void vec<T>::release() [with T = tree_node*]' at ../../gcc/gcc/vec.h:1679:20,
    inlined from 'auto_vec<T, N>::~auto_vec() [with T = tree_node*; long unsigned int N = 8]' at ../../gcc/gcc/vec.h:1436:5,
    inlined from 'vec<cgraph_node*> possible_polymorphic_call_targets(tree, long int, ipa_polymorphic_call_context, bool*, void**, bool)' at ../../gcc/gcc/ipa-devirt.c:3099:22:
../../gcc/gcc/vec.h:311:10: error: attempt to free a non-heap object 'bases_to_consider' [-Werror=free-nonheap-object]
  311 |   ::free (v);
      |   ~~~~~~~^~~
../../gcc/gcc/vec.h:311:10: error: attempt to free a non-heap object 'bases_to_consider' [-Werror=free-nonheap-object]
cc1plus: all warnings being treated as errors

The problem is that auto_vec uses a flag to keep track of whether the
vector is on the heap or in auto storage. Normally this gets resolved to
a constant, but only when the right functions are inlined. With autofdo,
for some reason the compiler decides not to inline these vec functions,
even though they are marked as "inline".

Mark them as ALWAYS_INLINE instead.

gcc/:

2019-01-14  Andi Kleen  <ak@linux.intel.com>

	* vec.h (using_auto_storage, release): Mark as ALWAYS_INLINE.
---
 gcc/vec.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/vec.h b/gcc/vec.h
index 407269c5ad3..1f5b78b1fac 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -1664,7 +1664,7 @@ vec<T, va_heap, vl_ptr>::create (unsigned nelems MEM_STAT_DECL)
 /* Free the memory occupied by the embedded vector.  */
 
 template<typename T>
-inline void
+ALWAYS_INLINE void
 vec<T, va_heap, vl_ptr>::release (void)
 {
   if (!m_vec)
@@ -1940,7 +1940,7 @@ vec<T, va_heap, vl_ptr>::reverse (void)
 }
 
 template<typename T>
-inline bool
+ALWAYS_INLINE bool
 vec<T, va_heap, vl_ptr>::using_auto_storage () const
 {
   return m_vec->m_vecpfx.m_using_auto_storage;
-- 
2.19.1

* [PATCH 3/3] Increase iterations for autofdo tests
  2019-01-14  8:20 ` [PATCH 2/3] Fix autoprofiledbootstrap Andi Kleen
@ 2019-01-14  8:20   ` Andi Kleen
  2019-01-14  9:08     ` Richard Biener
  2019-01-14  9:04   ` [PATCH 2/3] Fix autoprofiledbootstrap Richard Biener
  2019-01-14  9:11   ` Bin.Cheng
  2 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2019-01-14  8:20 UTC (permalink / raw)
  To: gcc-patches; +Cc: amker.cheng, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Bin Cheng pointed out that the autofdo tests are unstable because they
don't run enough iterations for the perf sampling to collect enough data.

Increase the iterations, but only for autofdo. This avoids any impact
on targets that use a slow emulator, which will never run the host-only
autofdo tests.

gcc/testsuite/:

2019-01-14  Andi Kleen  <ak@linux.intel.com>

	* g++.dg/tree-prof/morefunc.C (ITER): Add.
	(test1): Use.
	(test2): Use.
	* gcc.dg/tree-prof/cold_partition_label.c (ITER): Add.
	(main): Use.
	* gcc.dg/tree-prof/crossmodule-indircall-1.c (ITER): Add.
	(main): Use.
	* gcc.dg/tree-prof/indir-call-prof.c (ITER): Add.
	(main): Use.
	* gcc.dg/tree-prof/peel-1.c (ITER): Add.
	(t): Use.
	(main): Use.
	* gcc.dg/tree-prof/pr52027.c (ITER): Add.
	(main): Use.
	* gcc.dg/tree-prof/tracer-1.c (ITER): Add.
	(main): Use.
	* gcc.dg/tree-prof/unroll-1.c (ITER): Add.
	(t): Use.
	(main): Use.
	* gcc.dg/tree-prof/update-cunroll-2.c (ITER): Add.
	(main): Use.
	* lib/profopt.exp: Pass -DITER to autofdo compilations.
---
 gcc/testsuite/g++.dg/tree-prof/morefunc.C              |  8 ++++++--
 gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c  |  6 +++++-
 .../gcc.dg/tree-prof/crossmodule-indircall-1.c         | 10 +++++++---
 gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c       |  6 +++++-
 gcc/testsuite/gcc.dg/tree-prof/peel-1.c                | 10 +++++++---
 gcc/testsuite/gcc.dg/tree-prof/pr52027.c               |  6 +++++-
 gcc/testsuite/gcc.dg/tree-prof/tracer-1.c              |  7 ++++++-
 gcc/testsuite/gcc.dg/tree-prof/unroll-1.c              | 10 +++++++---
 gcc/testsuite/gcc.dg/tree-prof/update-cunroll-2.c      |  8 ++++++--
 gcc/testsuite/lib/profopt.exp                          |  4 ++--
 10 files changed, 56 insertions(+), 19 deletions(-)

diff --git a/gcc/testsuite/g++.dg/tree-prof/morefunc.C b/gcc/testsuite/g++.dg/tree-prof/morefunc.C
index a9bdc167f45..02b01c073e9 100644
--- a/gcc/testsuite/g++.dg/tree-prof/morefunc.C
+++ b/gcc/testsuite/g++.dg/tree-prof/morefunc.C
@@ -2,6 +2,10 @@
 #include "reorder_class1.h"
 #include "reorder_class2.h"
 
+#ifndef ITER
+#define ITER 1000
+#endif
+
 int g;
 
 #ifdef _PROFILE_USE
@@ -19,7 +23,7 @@ static __attribute__((always_inline))
 void test1 (A *tc)
 {
   int i;
-  for (i = 0; i < 1000; i++)
+  for (i = 0; i < ITER; i++)
      g += tc->foo(); 
    if (g<100) g++;
 }
@@ -28,7 +32,7 @@ static __attribute__((always_inline))
 void test2 (B *tc)
 {
   int i;
-  for (i = 0; i < 1000000; i++)
+  for (i = 0; i < ITER; i++)
      g += tc->foo();
 }
 
diff --git a/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c b/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
index 450308d6407..099069da6a7 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
@@ -9,6 +9,10 @@ const char *sarr[SIZE];
 const char *buf_hot;
 const char *buf_cold;
 
+#ifndef ITER
+#define ITER 1000000
+#endif
+
 __attribute__((noinline))
 void 
 foo (int path)
@@ -32,7 +36,7 @@ main (int argc, char *argv[])
   int i;
   buf_hot =  "hello";
   buf_cold = "world";
-  for (i = 0; i < 1000000; i++)
+  for (i = 0; i < ITER; i++)
     foo (argc);
   return 0;
 }
diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1.c
index 58109d54dc7..32d22c69c6c 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1.c
@@ -2,6 +2,10 @@
 /* { dg-additional-sources "crossmodule-indircall-1a.c" } */
 /* { dg-options "-O3 -flto -DDOJOB=1" } */
 
+#ifndef ITER
+#define ITER 1000
+#endif
+
 int a;
 extern void (*p[2])(int n);
 void abort (void);
@@ -10,12 +14,12 @@ main()
 { int i;
 
   /* This call shall be converted.  */
-  for (i = 0;i<1000;i++)
+  for (i = 0;i<ITER;i++)
     p[0](1);
   /* This call shall not be converted.  */
-  for (i = 0;i<1000;i++)
+  for (i = 0;i<ITER;i++)
     p[i%2](2);
-  if (a != 1000)
+  if (a != ITER)
     abort ();
 
   return 0;
diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c
index 53063c3e7fa..8b9dfbb78c7 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c
@@ -1,5 +1,9 @@
 /* { dg-options "-O2 -fdump-tree-optimized -fdump-ipa-profile -fdump-ipa-afdo" } */
 
+#ifndef ITER
+#define ITER 100000
+#endif
+
 static int a1 (void)
 {
     return 10;
@@ -28,7 +32,7 @@ main (void)
   int (*p) (void);
   int  i;
 
-  for (i = 0; i < 10000000; i ++)
+  for (i = 0; i < ITER*100; i++)
     {
 	setp (&p, i);
 	p ();
diff --git a/gcc/testsuite/gcc.dg/tree-prof/peel-1.c b/gcc/testsuite/gcc.dg/tree-prof/peel-1.c
index 7245b68c1ee..b6ed178e1ad 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/peel-1.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/peel-1.c
@@ -1,13 +1,17 @@
 /* { dg-options "-O3 -fdump-tree-cunroll-details -fno-unroll-loops -fpeel-loops" } */
 void abort();
 
-int a[1000];
+#ifndef ITER
+#define ITER 1000
+#endif
+
+int a[ITER];
 int
 __attribute__ ((noinline))
 t()
 {
   int i;
-  for (i=0;i<1000;i++)
+  for (i=0;i<ITER;i++)
     if (!a[i])
       return 1;
   abort ();
@@ -16,7 +20,7 @@ int
 main()
 {
   int i;
-  for (i=0;i<1000;i++)
+  for (i=0;i<ITER;i++)
     t();
   return 0;
 }
diff --git a/gcc/testsuite/gcc.dg/tree-prof/pr52027.c b/gcc/testsuite/gcc.dg/tree-prof/pr52027.c
index c46a14b2e86..bf2a83a336d 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/pr52027.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/pr52027.c
@@ -2,6 +2,10 @@
 /* { dg-require-effective-target freorder } */
 /* { dg-options "-O2 -freorder-blocks-and-partition -fno-reorder-functions" } */
 
+#ifndef ITER
+#define ITER 1000
+#endif
+
 void
 foo (int len)
 {
@@ -13,7 +17,7 @@ int
 main ()
 {
   int i;
-  for (i = 0; i < 1000; i++)
+  for (i = 0; i < ITER; i++)
     foo (8);
   return 0;
 }
diff --git a/gcc/testsuite/gcc.dg/tree-prof/tracer-1.c b/gcc/testsuite/gcc.dg/tree-prof/tracer-1.c
index 1e64f284ac0..65570a5e96d 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/tracer-1.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/tracer-1.c
@@ -1,9 +1,14 @@
 /* { dg-options "-O2 -ftracer -fdump-tree-tracer" } */
+
+#ifndef ITER
+#define ITER 1000
+#endif
+
 volatile int a, b, c;
 int main ()
 {
   int i;
-  for (i = 0; i < 1000; i++)
+  for (i = 0; i < ITER; i++)
     {
       if (i % 17)
 	a++;
diff --git a/gcc/testsuite/gcc.dg/tree-prof/unroll-1.c b/gcc/testsuite/gcc.dg/tree-prof/unroll-1.c
index 3ad0cf019b3..3027e75a241 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/unroll-1.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/unroll-1.c
@@ -1,13 +1,17 @@
 /* { dg-options "-O3 -fdump-rtl-loop2_unroll-details -funroll-loops -fno-peel-loops" } */
 void abort ();
 
-int a[1000];
+#ifndef ITER
+#define ITER 1000
+#endif
+
+int a[ITER];
 int
 __attribute__ ((noinline))
 t()
 {
   int i;
-  for (i=0;i<1000;i++)
+  for (i=0;i<ITER;i++)
     if (!a[i])
       return 1;
   abort ();
@@ -16,7 +20,7 @@ int
 main()
 {
   int i;
-  for (i=0;i<1000;i++)
+  for (i=0;i<ITER;i++)
     t();
   return 0;
 }
diff --git a/gcc/testsuite/gcc.dg/tree-prof/update-cunroll-2.c b/gcc/testsuite/gcc.dg/tree-prof/update-cunroll-2.c
index c286816cdf8..de2d03ebaee 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/update-cunroll-2.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/update-cunroll-2.c
@@ -1,5 +1,9 @@
-
 /* { dg-options "-O2 -fdump-tree-optimized-blocks" } */
+
+#ifndef ITER
+#define ITER 1000
+#endif
+
 int a[8];
 __attribute__ ((noinline))
 int t()
@@ -14,7 +18,7 @@ int
 main ()
 {
   int i;
-  for (i = 0; i < 1000; i++)
+  for (i = 0; i < ITER; i++)
     t ();
   return 0;
 }
diff --git a/gcc/testsuite/lib/profopt.exp b/gcc/testsuite/lib/profopt.exp
index 65494cfd4f6..13e7828bf32 100644
--- a/gcc/testsuite/lib/profopt.exp
+++ b/gcc/testsuite/lib/profopt.exp
@@ -289,8 +289,8 @@ proc auto-profopt-execute { src } {
         return
     }
     set profile_wrapper [profopt-perf-wrapper]
-    set profile_option "-g"
-    set feedback_option "-fauto-profile"
+    set profile_option "-g -DITER=1000000"
+    set feedback_option "-fauto-profile -DITER=1000000"
     set run_autofdo 1
     profopt-execute $src
     unset profile_wrapper
-- 
2.19.1

* Re: [PATCH 2/3] Fix autoprofiledbootstrap
  2019-01-14  8:20 ` [PATCH 2/3] Fix autoprofiledbootstrap Andi Kleen
  2019-01-14  8:20   ` [PATCH 3/3] Increase iterations for autofdo tests Andi Kleen
@ 2019-01-14  9:04   ` Richard Biener
  2019-01-14  9:11   ` Bin.Cheng
  2 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2019-01-14  9:04 UTC (permalink / raw)
  To: Andi Kleen; +Cc: GCC Patches, Amker.Cheng, Andi Kleen

On Mon, Jan 14, 2019 at 9:20 AM Andi Kleen <andi@firstfloor.org> wrote:
>
> From: Andi Kleen <ak@linux.intel.com>
>
> autoprofiledbootstrap fails currently with
>
> In file included from ../../gcc/gcc/hash-table.h:236,
>                  from ../../gcc/gcc/coretypes.h:440,
>                  from ../../gcc/gcc/ipa-devirt.c:110:
> In static member function 'static void va_heap::release(vec<T, va_heap, vl_embed>*&) [with T = tree_node*]',
>     inlined from 'void vec<T>::release() [with T = tree_node*]' at ../../gcc/gcc/vec.h:1679:20,
>     inlined from 'auto_vec<T, N>::~auto_vec() [with T = tree_node*; long unsigned int N = 8]' at ../../gcc/gcc/vec.h:1436:5,
>     inlined from 'vec<cgraph_node*> possible_polymorphic_call_targets(tree, long int, ipa_polymorphic_call_context, bool*, void**, bool)' at ../../gcc/gcc/ipa-devirt.c:3099:22:
> ../../gcc/gcc/vec.h:311:10: error: attempt to free a non-heap object 'bases_to_consider' [-Werror=free-nonheap-object]
>   311 |   ::free (v);
>       |   ~~~~~~~^~~
> ../../gcc/gcc/vec.h:311:10: error: attempt to free a non-heap object 'bases_to_consider' [-Werror=free-nonheap-object]
> cc1plus: all warnings being treated as errors
>
> The problem is that auto_vec uses a variable to keep track if the vector
> is on the heap or auto. Normally this gets constant resolved, but only
> when the right functions are inlined. With autofdo for some reason
> the compiler decides to not inline these vec functions, even though
> they are marked as "inline"
>
> Mark them as ALWAYS_INLINE instead.

This might fix your case but I think it only papers over the issue.  Consider

 auto_vec<...> vec;
 not-inlined-foo (vec);

where the function can end up re-allocating the vector.  I think the more
appropriate fix is to add #pragma GCC diagnostic push/pop and
ignored "-Wfree-nonheap-object" around the inline function (and hope
for the best that this works in the inlined contexts...)

Richard.

> gcc/:
>
> 2019-01-14  Andi Kleen  <ak@linux.intel.com>
>
>         * vec.h (using_auto_storage, release): Mark as ALWAYS_INLINE.
> ---
>  gcc/vec.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/vec.h b/gcc/vec.h
> index 407269c5ad3..1f5b78b1fac 100644
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -1664,7 +1664,7 @@ vec<T, va_heap, vl_ptr>::create (unsigned nelems MEM_STAT_DECL)
>  /* Free the memory occupied by the embedded vector.  */
>
>  template<typename T>
> -inline void
> +ALWAYS_INLINE void
>  vec<T, va_heap, vl_ptr>::release (void)
>  {
>    if (!m_vec)
> @@ -1940,7 +1940,7 @@ vec<T, va_heap, vl_ptr>::reverse (void)
>  }
>
>  template<typename T>
> -inline bool
> +ALWAYS_INLINE bool
>  vec<T, va_heap, vl_ptr>::using_auto_storage () const
>  {
>    return m_vec->m_vecpfx.m_using_auto_storage;
> --
> 2.19.1
>

* Re: [PATCH 1/3] Lower sampling rate for autofdo bootstrap
  2019-01-14  8:20 [PATCH 1/3] Lower sampling rate for autofdo bootstrap Andi Kleen
  2019-01-14  8:20 ` [PATCH 2/3] Fix autoprofiledbootstrap Andi Kleen
@ 2019-01-14  9:06 ` Richard Biener
  1 sibling, 0 replies; 7+ messages in thread
From: Richard Biener @ 2019-01-14  9:06 UTC (permalink / raw)
  To: Andi Kleen; +Cc: GCC Patches, Amker.Cheng, Andi Kleen

On Mon, Jan 14, 2019 at 9:20 AM Andi Kleen <andi@firstfloor.org> wrote:
>
> From: Andi Kleen <ak@linux.intel.com>
>
> autofdo create_gcov uses a lot of memory for large sample files.
> Since gcc runs quite long the sample files generated during
> the bootstrap are fairly big.
>
> Currently I can't even build make autoprofiledbootstrap on my system at
> home because create_gcov needs more than 12GB and runs out of memory.
>
> This should probably be fixed in create_gcov, but for now
> lowering the sampling rate works well enough for me. The bootstrap
> run is long enough that it gets good enough data in any case.

OK.

Richard.

> gcc/:
> 2019-01-14  Andi Kleen  <ak@linux.intel.com>
>
>         * Makefile.in: Lower autofdo sampling rate by 10x.
>         * Makefile.tpl: Ditto.
> ---
>  Makefile.in  | 2 +-
>  Makefile.tpl | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Makefile.in b/Makefile.in
> index aa41730528a..28539a45372 100644
> --- a/Makefile.in
> +++ b/Makefile.in
> @@ -387,7 +387,7 @@ MAKEINFO = @MAKEINFO@
>  EXPECT = @EXPECT@
>  RUNTEST = @RUNTEST@
>
> -AUTO_PROFILE = gcc-auto-profile -c 1000000
> +AUTO_PROFILE = gcc-auto-profile -c 10000000
>
>  # This just becomes part of the MAKEINFO definition passed down to
>  # sub-makes.  It lets flags be given on the command line while still
> diff --git a/Makefile.tpl b/Makefile.tpl
> index 1ab65ac8ec4..126296fb49a 100644
> --- a/Makefile.tpl
> +++ b/Makefile.tpl
> @@ -390,7 +390,7 @@ MAKEINFO = @MAKEINFO@
>  EXPECT = @EXPECT@
>  RUNTEST = @RUNTEST@
>
> -AUTO_PROFILE = gcc-auto-profile -c 1000000
> +AUTO_PROFILE = gcc-auto-profile -c 10000000
>
>  # This just becomes part of the MAKEINFO definition passed down to
>  # sub-makes.  It lets flags be given on the command line while still
> --
> 2.19.1
>

* Re: [PATCH 3/3] Increase iterations for autofdo tests
  2019-01-14  8:20   ` [PATCH 3/3] Increase iterations for autofdo tests Andi Kleen
@ 2019-01-14  9:08     ` Richard Biener
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2019-01-14  9:08 UTC (permalink / raw)
  To: Andi Kleen; +Cc: GCC Patches, Amker.Cheng, Andi Kleen

On Mon, Jan 14, 2019 at 9:20 AM Andi Kleen <andi@firstfloor.org> wrote:
>
> From: Andi Kleen <ak@linux.intel.com>
>
> Bin cheng pointed out that the autofdo tests are unstable because they
> don't have enough iterations for the perf sampling to get enough data.
>
> Increase the iterations, but only for autofdo. This avoids any impact
> on targets that use a slow emulator, which will never run the host
> only autofdo tests.

Can you instead use something like AFDO_ITER_FACTOR, #defined to 1 if
not already defined?

> gcc/testsuite/:
>
> 2019-01-14  Andi Kleen  <ak@linux.intel.com>
>
>         * g++.dg/tree-prof/morefunc.C (ITER): Add.
>         (test1): Use.
>         (test2): Use.
>         * gcc.dg/tree-prof/cold_partition_label.c (ITER): Add.
>         (main): Use.
>         * gcc.dg/tree-prof/crossmodule-indircall-1.c (ITER): Add.
>         (main): Use
>         * gcc.dg/tree-prof/indir-call-prof.c (ITER): Add.
>         (main): Use.
>         * gcc.dg/tree-prof/peel-1.c (ITER): Add.
>         (t): Use.
>         (main): Use.
>         * gcc.dg/tree-prof/pr52027.c (ITER): Add.
>         (main): Use.
>         * gcc.dg/tree-prof/tracer-1.c (ITER): Add.
>         (main): Use.
>         * gcc.dg/tree-prof/unroll-1.c (ITER): Add.
>         (t): Use.
>         (main): Use.
>         * gcc.dg/tree-prof/update-cunroll-2.c (ITER): Add.
>         (main): Use.
>         * lib/profopt.exp: Pass -DITER to autofdo compilations.
> ---
>  gcc/testsuite/g++.dg/tree-prof/morefunc.C              |  8 ++++++--
>  gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c  |  6 +++++-
>  .../gcc.dg/tree-prof/crossmodule-indircall-1.c         | 10 +++++++---
>  gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c       |  6 +++++-
>  gcc/testsuite/gcc.dg/tree-prof/peel-1.c                | 10 +++++++---
>  gcc/testsuite/gcc.dg/tree-prof/pr52027.c               |  6 +++++-
>  gcc/testsuite/gcc.dg/tree-prof/tracer-1.c              |  7 ++++++-
>  gcc/testsuite/gcc.dg/tree-prof/unroll-1.c              | 10 +++++++---
>  gcc/testsuite/gcc.dg/tree-prof/update-cunroll-2.c      |  8 ++++++--
>  gcc/testsuite/lib/profopt.exp                          |  4 ++--
>  10 files changed, 56 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/testsuite/g++.dg/tree-prof/morefunc.C b/gcc/testsuite/g++.dg/tree-prof/morefunc.C
> index a9bdc167f45..02b01c073e9 100644
> --- a/gcc/testsuite/g++.dg/tree-prof/morefunc.C
> +++ b/gcc/testsuite/g++.dg/tree-prof/morefunc.C
> @@ -2,6 +2,10 @@
>  #include "reorder_class1.h"
>  #include "reorder_class2.h"
>
> +#ifndef ITER
> +#define ITER 1000
> +#endif
> +
>  int g;
>
>  #ifdef _PROFILE_USE
> @@ -19,7 +23,7 @@ static __attribute__((always_inline))
>  void test1 (A *tc)
>  {
>    int i;
> -  for (i = 0; i < 1000; i++)
> +  for (i = 0; i < ITER; i++)
>       g += tc->foo();
>     if (g<100) g++;
>  }
> @@ -28,7 +32,7 @@ static __attribute__((always_inline))
>  void test2 (B *tc)
>  {
>    int i;
> -  for (i = 0; i < 1000000; i++)
> +  for (i = 0; i < ITER; i++)
>       g += tc->foo();
>  }
>
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c b/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
> index 450308d6407..099069da6a7 100644
> --- a/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
> +++ b/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
> @@ -9,6 +9,10 @@ const char *sarr[SIZE];
>  const char *buf_hot;
>  const char *buf_cold;
>
> +#ifndef ITER
> +#define ITER 1000000
> +#endif
> +
>  __attribute__((noinline))
>  void
>  foo (int path)
> @@ -32,7 +36,7 @@ main (int argc, char *argv[])
>    int i;
>    buf_hot =  "hello";
>    buf_cold = "world";
> -  for (i = 0; i < 1000000; i++)
> +  for (i = 0; i < ITER; i++)
>      foo (argc);
>    return 0;
>  }
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1.c
> index 58109d54dc7..32d22c69c6c 100644
> --- a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1.c
> @@ -2,6 +2,10 @@
>  /* { dg-additional-sources "crossmodule-indircall-1a.c" } */
>  /* { dg-options "-O3 -flto -DDOJOB=1" } */
>
> +#ifndef ITER
> +#define ITER 1000
> +#endif
> +
>  int a;
>  extern void (*p[2])(int n);
>  void abort (void);
> @@ -10,12 +14,12 @@ main()
>  { int i;
>
>    /* This call shall be converted.  */
> -  for (i = 0;i<1000;i++)
> +  for (i = 0;i<ITER;i++)
>      p[0](1);
>    /* This call shall not be converted.  */
> -  for (i = 0;i<1000;i++)
> +  for (i = 0;i<ITER;i++)
>      p[i%2](2);
> -  if (a != 1000)
> +  if (a != ITER)
>      abort ();
>
>    return 0;
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c
> index 53063c3e7fa..8b9dfbb78c7 100644
> --- a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c
> +++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c
> @@ -1,5 +1,9 @@
>  /* { dg-options "-O2 -fdump-tree-optimized -fdump-ipa-profile -fdump-ipa-afdo" } */
>
> +#ifndef ITER
> +#define ITER 100000
> +#endif
> +
>  static int a1 (void)
>  {
>      return 10;
> @@ -28,7 +32,7 @@ main (void)
>    int (*p) (void);
>    int  i;
>
> -  for (i = 0; i < 10000000; i ++)
> +  for (i = 0; i < ITER*100; i++)
>      {
>         setp (&p, i);
>         p ();
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/peel-1.c b/gcc/testsuite/gcc.dg/tree-prof/peel-1.c
> index 7245b68c1ee..b6ed178e1ad 100644
> --- a/gcc/testsuite/gcc.dg/tree-prof/peel-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-prof/peel-1.c
> @@ -1,13 +1,17 @@
>  /* { dg-options "-O3 -fdump-tree-cunroll-details -fno-unroll-loops -fpeel-loops" } */
>  void abort();
>
> -int a[1000];
> +#ifndef ITER
> +#define ITER 1000
> +#endif
> +
> +int a[ITER];
>  int
>  __attribute__ ((noinline))
>  t()
>  {
>    int i;
> -  for (i=0;i<1000;i++)
> +  for (i=0;i<ITER;i++)
>      if (!a[i])
>        return 1;
>    abort ();
> @@ -16,7 +20,7 @@ int
>  main()
>  {
>    int i;
> -  for (i=0;i<1000;i++)
> +  for (i=0;i<ITER;i++)
>      t();
>    return 0;
>  }
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/pr52027.c b/gcc/testsuite/gcc.dg/tree-prof/pr52027.c
> index c46a14b2e86..bf2a83a336d 100644
> --- a/gcc/testsuite/gcc.dg/tree-prof/pr52027.c
> +++ b/gcc/testsuite/gcc.dg/tree-prof/pr52027.c
> @@ -2,6 +2,10 @@
>  /* { dg-require-effective-target freorder } */
>  /* { dg-options "-O2 -freorder-blocks-and-partition -fno-reorder-functions" } */
>
> +#ifndef ITER
> +#define ITER 1000
> +#endif
> +
>  void
>  foo (int len)
>  {
> @@ -13,7 +17,7 @@ int
>  main ()
>  {
>    int i;
> -  for (i = 0; i < 1000; i++)
> +  for (i = 0; i < ITER; i++)
>      foo (8);
>    return 0;
>  }
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/tracer-1.c b/gcc/testsuite/gcc.dg/tree-prof/tracer-1.c
> index 1e64f284ac0..65570a5e96d 100644
> --- a/gcc/testsuite/gcc.dg/tree-prof/tracer-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-prof/tracer-1.c
> @@ -1,9 +1,14 @@
>  /* { dg-options "-O2 -ftracer -fdump-tree-tracer" } */
> +
> +#ifndef ITER
> +#define ITER 1000
> +#endif
> +
>  volatile int a, b, c;
>  int main ()
>  {
>    int i;
> -  for (i = 0; i < 1000; i++)
> +  for (i = 0; i < ITER; i++)
>      {
>        if (i % 17)
>         a++;
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/unroll-1.c b/gcc/testsuite/gcc.dg/tree-prof/unroll-1.c
> index 3ad0cf019b3..3027e75a241 100644
> --- a/gcc/testsuite/gcc.dg/tree-prof/unroll-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-prof/unroll-1.c
> @@ -1,13 +1,17 @@
>  /* { dg-options "-O3 -fdump-rtl-loop2_unroll-details -funroll-loops -fno-peel-loops" } */
>  void abort ();
>
> -int a[1000];
> +#ifndef ITER
> +#define ITER 1000
> +#endif
> +
> +int a[ITER];
>  int
>  __attribute__ ((noinline))
>  t()
>  {
>    int i;
> -  for (i=0;i<1000;i++)
> +  for (i=0;i<ITER;i++)
>      if (!a[i])
>        return 1;
>    abort ();
> @@ -16,7 +20,7 @@ int
>  main()
>  {
>    int i;
> -  for (i=0;i<1000;i++)
> +  for (i=0;i<ITER;i++)
>      t();
>    return 0;
>  }
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/update-cunroll-2.c b/gcc/testsuite/gcc.dg/tree-prof/update-cunroll-2.c
> index c286816cdf8..de2d03ebaee 100644
> --- a/gcc/testsuite/gcc.dg/tree-prof/update-cunroll-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-prof/update-cunroll-2.c
> @@ -1,5 +1,9 @@
> -
>  /* { dg-options "-O2 -fdump-tree-optimized-blocks" } */
> +
> +#ifndef ITER
> +#define ITER 1000
> +#endif
> +
>  int a[8];
>  __attribute__ ((noinline))
>  int t()
> @@ -14,7 +18,7 @@ int
>  main ()
>  {
>    int i;
> -  for (i = 0; i < 1000; i++)
> +  for (i = 0; i < ITER; i++)
>      t ();
>    return 0;
>  }
> diff --git a/gcc/testsuite/lib/profopt.exp b/gcc/testsuite/lib/profopt.exp
> index 65494cfd4f6..13e7828bf32 100644
> --- a/gcc/testsuite/lib/profopt.exp
> +++ b/gcc/testsuite/lib/profopt.exp
> @@ -289,8 +289,8 @@ proc auto-profopt-execute { src } {
>          return
>      }
>      set profile_wrapper [profopt-perf-wrapper]
> -    set profile_option "-g"
> -    set feedback_option "-fauto-profile"
> +    set profile_option "-g -DITER=1000000"
> +    set feedback_option "-fauto-profile -DITER=1000000"
>      set run_autofdo 1
>      profopt-execute $src
>      unset profile_wrapper
> --
> 2.19.1
>

* Re: [PATCH 2/3] Fix autoprofiledbootstrap
  2019-01-14  8:20 ` [PATCH 2/3] Fix autoprofiledbootstrap Andi Kleen
  2019-01-14  8:20   ` [PATCH 3/3] Increase iterations for autofdo tests Andi Kleen
  2019-01-14  9:04   ` [PATCH 2/3] Fix autoprofiledbootstrap Richard Biener
@ 2019-01-14  9:11   ` Bin.Cheng
  2 siblings, 0 replies; 7+ messages in thread
From: Bin.Cheng @ 2019-01-14  9:11 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches List, Andi Kleen

On Mon, Jan 14, 2019 at 4:20 PM Andi Kleen <andi@firstfloor.org> wrote:
>
> From: Andi Kleen <ak@linux.intel.com>
>
> autoprofiledbootstrap fails currently with
>
> In file included from ../../gcc/gcc/hash-table.h:236,
>                  from ../../gcc/gcc/coretypes.h:440,
>                  from ../../gcc/gcc/ipa-devirt.c:110:
> In static member function 'static void va_heap::release(vec<T, va_heap, vl_embed>*&) [with T = tree_node*]',
>     inlined from 'void vec<T>::release() [with T = tree_node*]' at ../../gcc/gcc/vec.h:1679:20,
>     inlined from 'auto_vec<T, N>::~auto_vec() [with T = tree_node*; long unsigned int N = 8]' at ../../gcc/gcc/vec.h:1436:5,
>     inlined from 'vec<cgraph_node*> possible_polymorphic_call_targets(tree, long int, ipa_polymorphic_call_context, bool*, void**, bool)' at ../../gcc/gcc/ipa-devirt.c:3099:22:
> ../../gcc/gcc/vec.h:311:10: error: attempt to free a non-heap object 'bases_to_consider' [-Werror=free-nonheap-object]
>   311 |   ::free (v);
>       |   ~~~~~~~^~~
> ../../gcc/gcc/vec.h:311:10: error: attempt to free a non-heap object 'bases_to_consider' [-Werror=free-nonheap-object]
> cc1plus: all warnings being treated as errors
>
> The problem is that auto_vec uses a variable to keep track if the vector
> is on the heap or auto. Normally this gets constant resolved, but only
> when the right functions are inlined. With autofdo for some reason
> the compiler decides to not inline these vec functions, even though
> they are marked as "inline"
A comment not closely related to this patch: we observed the same
inlining behavior when the perf data is inadequate, and sometimes it has
a non-trivial impact on kernel compilation.  We have a patch that falls
back to the guessed profile count if the profiled count is of low
quality.  We will send it out for GCC 10.

Thanks,
bin
>
> Mark them as ALWAYS_INLINE instead.
>
> gcc/:
>
> 2019-01-14  Andi Kleen  <ak@linux.intel.com>
>
>         * vec.h (using_auto_storage, release): Mark as ALWAYS_INLINE.
> ---
>  gcc/vec.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/vec.h b/gcc/vec.h
> index 407269c5ad3..1f5b78b1fac 100644
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -1664,7 +1664,7 @@ vec<T, va_heap, vl_ptr>::create (unsigned nelems MEM_STAT_DECL)
>  /* Free the memory occupied by the embedded vector.  */
>
>  template<typename T>
> -inline void
> +ALWAYS_INLINE void
>  vec<T, va_heap, vl_ptr>::release (void)
>  {
>    if (!m_vec)
> @@ -1940,7 +1940,7 @@ vec<T, va_heap, vl_ptr>::reverse (void)
>  }
>
>  template<typename T>
> -inline bool
> +ALWAYS_INLINE bool
>  vec<T, va_heap, vl_ptr>::using_auto_storage () const
>  {
>    return m_vec->m_vecpfx.m_using_auto_storage;
> --
> 2.19.1
>
