public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] switch lowering: limit number of cluster attemps
@ 2020-09-22 11:22 Martin Liška
  2020-09-22 12:19 ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Liška @ 2020-09-22 11:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Richard Biener

Hi.

The patch is about a bail out limit that needs to be added to switch lowering.
Currently the algorithm is quadratic and needs some bail out. I've tested value
of 100K which corresponds to about 0.2s in the problematic test-case before
it's reached.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

	PR tree-optimization/96979
	* doc/invoke.texi: Document new param max-switch-clustering-attempts.
	* params.opt: Add new parameter.
	* tree-switch-conversion.c (jump_table_cluster::find_jump_tables):
	Limit number of attempts.
	(bit_test_cluster::find_bit_tests): Likewise.

gcc/testsuite/ChangeLog:

	PR tree-optimization/96979
	* g++.dg/tree-ssa/pr96979.C: New test.
---
  gcc/doc/invoke.texi                     |  4 ++
  gcc/params.opt                          |  4 ++
  gcc/testsuite/g++.dg/tree-ssa/pr96979.C | 50 +++++++++++++++++++++++++
  gcc/tree-switch-conversion.c            | 17 +++++++++
  4 files changed, 75 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr96979.C

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 665c0ffc4a1..6a7833b1d75 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13452,6 +13452,10 @@ The smallest number of different values for which it is best to use a
  jump-table instead of a tree of conditional branches.  If the value is
  0, use the default for the machine.
  
+@item max-switch-clustering-attempts
+The maximum number of clustering attempts used
+in bit-test and jump-table switch expansion.
+
  @item jump-table-max-growth-ratio-for-size
  The maximum code size growth ratio when expanding
  into a jump table (in percent).  The parameter is used when
diff --git a/gcc/params.opt b/gcc/params.opt
index 1d864047ad2..f4dcb5426c7 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -82,6 +82,10 @@ The maximum length of a constant string for a builtin string cmp call eligible f
  Common Joined UInteger Var(param_case_values_threshold) Param Optimization
  The smallest number of different values for which it is best to use a jump-table instead of a tree of conditional branches, if 0, use the default for the machine.
  
+-param=max-switch-clustering-attempts=
+Common Joined UInteger Var(param_max_switch_clustering_attempts) Param Optimization Init(100000)
+The maximum number of clustering attempts used in bit-test and jump-table switch expansion.
+
  -param=comdat-sharing-probability=
  Common Joined UInteger Var(param_comdat_sharing_probability) Init(20) Param Optimization
  Probability that COMDAT function will be shared with different compilation unit.
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr96979.C b/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
new file mode 100644
index 00000000000..85c703a140d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
@@ -0,0 +1,50 @@
+/* PR tree-optimization/96979 */
+/* { dg-do compile } */
+/* { dg-options "-std=c++17 -O2 -fdump-tree-switchlower1" } */
+
+using u64 = unsigned long long;
+
+constexpr inline u64
+foo (const char *str) noexcept
+{
+  u64 value = 0xcbf29ce484222325ULL;
+  for (u64 i = 0; str[i]; i++)
+    value = (value ^ u64(str[i])) * 0x100000001b3ULL;
+  return value;
+}
+
+struct V
+{
+  enum W
+  {
+#define A(n) n,
+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
+#define E D(foo1) D(foo2) D(foo3)
+    E
+    last
+  };
+
+  constexpr static W
+  bar (const u64 h) noexcept
+  {
+    switch (h)
+      {
+#undef A
+#define F(n) #n
+#define A(n) case foo (F(n)): return n;
+        E
+      }
+    return last;
+  }
+};
+
+int
+baz (const char *s)
+{
+  const u64 h = foo (s);
+  return V::bar (h);
+}
+
+/* { dg-final { scan-tree-dump-times ";; Bail out: --param=max-switch-clustering-attempts reached" 2 "switchlower1" } } */
diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index 186411ff3c4..e6a2c7a6a84 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -1183,6 +1183,7 @@ jump_table_cluster::find_jump_tables (vec<cluster *> &clusters)
  
    min.quick_push (min_cluster_item (0, 0, 0));
  
+  HOST_WIDE_INT attempts = 0;
    for (unsigned i = 1; i <= l; i++)
      {
        /* Set minimal # of clusters with i-th item to infinite.  */
@@ -1194,6 +1195,14 @@ jump_table_cluster::find_jump_tables (vec<cluster *> &clusters)
  	  if (i - j < case_values_threshold ())
  	    s += i - j;
  
+	  if (attempts++ == param_max_switch_clustering_attempts)
+	    {
+	      if (dump_file)
+		fprintf (dump_file, ";; Bail out: "
+			 "--param=max-switch-clustering-attempts reached\n");
+	      return clusters.copy ();
+	    }
+
  	  /* Prefer clusters with smaller number of numbers covered.  */
  	  if ((min[j].m_count + 1 < min[i].m_count
  	       || (min[j].m_count + 1 == min[i].m_count
@@ -1308,6 +1317,7 @@ bit_test_cluster::find_bit_tests (vec<cluster *> &clusters)
  
    min.quick_push (min_cluster_item (0, 0, 0));
  
+  HOST_WIDE_INT attempts = 0;
    for (unsigned i = 1; i <= l; i++)
      {
        /* Set minimal # of clusters with i-th item to infinite.  */
@@ -1315,6 +1325,13 @@ bit_test_cluster::find_bit_tests (vec<cluster *> &clusters)
  
        for (unsigned j = 0; j < i; j++)
  	{
+	  if (attempts++ == param_max_switch_clustering_attempts)
+	    {
+	      if (dump_file)
+		fprintf (dump_file, ";; Bail out: "
+			 "--param=max-switch-clustering-attempts reached\n");
+	      return clusters.copy ();
+	    }
  	  if (min[j].m_count + 1 < min[i].m_count
  	      && can_be_handled (clusters, j, i - 1))
  	    min[i] = min_cluster_item (min[j].m_count + 1, j, INT_MAX);
-- 
2.28.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-22 11:22 [PATCH] switch lowering: limit number of cluster attemps Martin Liška
@ 2020-09-22 12:19 ` Richard Biener
  2020-09-22 12:39   ` Martin Liška
  2020-09-23 10:10   ` Jakub Jelinek
  0 siblings, 2 replies; 12+ messages in thread
From: Richard Biener @ 2020-09-22 12:19 UTC (permalink / raw)
  To: Martin Liška, gcc-patches; +Cc: Jakub Jelinek

On September 22, 2020 1:22:12 PM GMT+02:00, "Martin Liška" <mliska@suse.cz> wrote:
>Hi.
>
>The patch is about a bail out limit that needs to be added to switch
>lowering.
>Currently the algorithm is quadratic and needs some bail out. I've
>tested value
>of 100K which corresponds to about 0.2s in the problematic test-case
>before
>it's reached.
>
>Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
>Ready to be installed?

OK. Though the default limit looks high? 

Richard. 

>Thanks,
>Martin
>
>gcc/ChangeLog:
>
>	PR tree-optimization/96979
>	* doc/invoke.texi: Document new param max-switch-clustering-attempts.
>	* params.opt: Add new parameter.
>	* tree-switch-conversion.c (jump_table_cluster::find_jump_tables):
>	Limit number of attempts.
>	(bit_test_cluster::find_bit_tests): Likewise.
>
>gcc/testsuite/ChangeLog:
>
>	PR tree-optimization/96979
>	* g++.dg/tree-ssa/pr96979.C: New test.
>---
>  gcc/doc/invoke.texi                     |  4 ++
>  gcc/params.opt                          |  4 ++
> gcc/testsuite/g++.dg/tree-ssa/pr96979.C | 50 +++++++++++++++++++++++++
>  gcc/tree-switch-conversion.c            | 17 +++++++++
>  4 files changed, 75 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr96979.C
>
>diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>index 665c0ffc4a1..6a7833b1d75 100644
>--- a/gcc/doc/invoke.texi
>+++ b/gcc/doc/invoke.texi
>@@ -13452,6 +13452,10 @@ The smallest number of different values for
>which it is best to use a
> jump-table instead of a tree of conditional branches.  If the value is
>  0, use the default for the machine.
>  
>+@item max-switch-clustering-attempts
>+The maximum number of clustering attempts used
>+in bit-test and jump-table switch expansion.
>+
>  @item jump-table-max-growth-ratio-for-size
>  The maximum code size growth ratio when expanding
>  into a jump table (in percent).  The parameter is used when
>diff --git a/gcc/params.opt b/gcc/params.opt
>index 1d864047ad2..f4dcb5426c7 100644
>--- a/gcc/params.opt
>+++ b/gcc/params.opt
>@@ -82,6 +82,10 @@ The maximum length of a constant string for a
>builtin string cmp call eligible f
>Common Joined UInteger Var(param_case_values_threshold) Param
>Optimization
>The smallest number of different values for which it is best to use a
>jump-table instead of a tree of conditional branches, if 0, use the
>default for the machine.
>  
>+-param=max-switch-clustering-attempts=
>+Common Joined UInteger Var(param_max_switch_clustering_attempts) Param
>Optimization Init(100000)
>+The maximum number of clustering attempts used in bit-test and
>jump-table switch expansion.
>+
>  -param=comdat-sharing-probability=
>Common Joined UInteger Var(param_comdat_sharing_probability) Init(20)
>Param Optimization
>Probability that COMDAT function will be shared with different
>compilation unit.
>diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
>b/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
>new file mode 100644
>index 00000000000..85c703a140d
>--- /dev/null
>+++ b/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
>@@ -0,0 +1,50 @@
>+/* PR tree-optimization/96979 */
>+/* { dg-do compile } */
>+/* { dg-options "-std=c++17 -O2 -fdump-tree-switchlower1" } */
>+
>+using u64 = unsigned long long;
>+
>+constexpr inline u64
>+foo (const char *str) noexcept
>+{
>+  u64 value = 0xcbf29ce484222325ULL;
>+  for (u64 i = 0; str[i]; i++)
>+    value = (value ^ u64(str[i])) * 0x100000001b3ULL;
>+  return value;
>+}
>+
>+struct V
>+{
>+  enum W
>+  {
>+#define A(n) n,
>+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6)
>A(n##7) A(n##8) A(n##9)
>+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6)
>B(n##7) B(n##8) B(n##9)
>+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6)
>C(n##7) C(n##8) C(n##9)
>+#define E D(foo1) D(foo2) D(foo3)
>+    E
>+    last
>+  };
>+
>+  constexpr static W
>+  bar (const u64 h) noexcept
>+  {
>+    switch (h)
>+      {
>+#undef A
>+#define F(n) #n
>+#define A(n) case foo (F(n)): return n;
>+        E
>+      }
>+    return last;
>+  }
>+};
>+
>+int
>+baz (const char *s)
>+{
>+  const u64 h = foo (s);
>+  return V::bar (h);
>+}
>+
>+/* { dg-final { scan-tree-dump-times ";; Bail out:
>--param=max-switch-clustering-attempts reached" 2 "switchlower1" } } */
>diff --git a/gcc/tree-switch-conversion.c
>b/gcc/tree-switch-conversion.c
>index 186411ff3c4..e6a2c7a6a84 100644
>--- a/gcc/tree-switch-conversion.c
>+++ b/gcc/tree-switch-conversion.c
>@@ -1183,6 +1183,7 @@ jump_table_cluster::find_jump_tables (vec<cluster
>*> &clusters)
>  
>    min.quick_push (min_cluster_item (0, 0, 0));
>  
>+  HOST_WIDE_INT attempts = 0;
>    for (unsigned i = 1; i <= l; i++)
>      {
>        /* Set minimal # of clusters with i-th item to infinite.  */
>@@ -1194,6 +1195,14 @@ jump_table_cluster::find_jump_tables
>(vec<cluster *> &clusters)
>  	  if (i - j < case_values_threshold ())
>  	    s += i - j;
>  
>+	  if (attempts++ == param_max_switch_clustering_attempts)
>+	    {
>+	      if (dump_file)
>+		fprintf (dump_file, ";; Bail out: "
>+			 "--param=max-switch-clustering-attempts reached\n");
>+	      return clusters.copy ();
>+	    }
>+
>  	  /* Prefer clusters with smaller number of numbers covered.  */
>  	  if ((min[j].m_count + 1 < min[i].m_count
>  	       || (min[j].m_count + 1 == min[i].m_count
>@@ -1308,6 +1317,7 @@ bit_test_cluster::find_bit_tests (vec<cluster *>
>&clusters)
>  
>    min.quick_push (min_cluster_item (0, 0, 0));
>  
>+  HOST_WIDE_INT attempts = 0;
>    for (unsigned i = 1; i <= l; i++)
>      {
>        /* Set minimal # of clusters with i-th item to infinite.  */
>@@ -1315,6 +1325,13 @@ bit_test_cluster::find_bit_tests (vec<cluster *>
>&clusters)
>  
>        for (unsigned j = 0; j < i; j++)
>  	{
>+	  if (attempts++ == param_max_switch_clustering_attempts)
>+	    {
>+	      if (dump_file)
>+		fprintf (dump_file, ";; Bail out: "
>+			 "--param=max-switch-clustering-attempts reached\n");
>+	      return clusters.copy ();
>+	    }
>  	  if (min[j].m_count + 1 < min[i].m_count
>  	      && can_be_handled (clusters, j, i - 1))
>  	    min[i] = min_cluster_item (min[j].m_count + 1, j, INT_MAX);


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-22 12:19 ` Richard Biener
@ 2020-09-22 12:39   ` Martin Liška
  2020-09-23 10:10   ` Jakub Jelinek
  1 sibling, 0 replies; 12+ messages in thread
From: Martin Liška @ 2020-09-22 12:39 UTC (permalink / raw)
  To: Richard Biener, gcc-patches; +Cc: Jakub Jelinek

On 9/22/20 2:19 PM, Richard Biener wrote:
> OK. Though the default limit looks high?

Yep, I'm going to install it with the param default value
equal to 10000.

Thanks,
Martin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-22 12:19 ` Richard Biener
  2020-09-22 12:39   ` Martin Liška
@ 2020-09-23 10:10   ` Jakub Jelinek
  2020-09-25  9:13     ` Martin Liška
  1 sibling, 1 reply; 12+ messages in thread
From: Jakub Jelinek @ 2020-09-23 10:10 UTC (permalink / raw)
  To: Richard Biener; +Cc: Martin Liška, gcc-patches

On Tue, Sep 22, 2020 at 02:19:12PM +0200, Richard Biener via Gcc-patches wrote:
> On September 22, 2020 1:22:12 PM GMT+02:00, "Martin Liška" <mliska@suse.cz> wrote:
> >Hi.
> >
> >The patch is about a bail out limit that needs to be added to switch
> >lowering.
> >Currently the algorithm is quadratic and needs some bail out. I've
> >tested value
> >of 100K which corresponds to about 0.2s in the problematic test-case
> >before
> >it's reached.
> >
> >Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >
> >Ready to be installed?
> 
> OK. Though the default limit looks high? 

This (with the new very low limit of 10000) broke bootstrap on i686-linux
PATH=~/hbin:$PATH i386 ../configure --enable-languages=default,obj-c++,lto,go,brig,d --enable-checking=yes,rtl,extra
(where ~/hbin contains as/ld/gcc/g++ wrappers that ensure 32-bit
compilation).
When compiling insn-extract.c which contains a single large function with
a huge switch in it, we punt on the switch optimization and run out of
memory during DF (sure, latent problem).
I get
cc1plus: out of memory allocating 65536 bytes after a total of 2854985728 bytes
even with --param=max-switch-clustering-attempts=100000 though, or even with
=1000000.  It works fine with =10000000 (and compiles pretty quickly, 1m50sec).
Surprisingly, with =5000000 it takes huge amounts of time (in the switchconv
code - btw, couldn't it use wide_ints rather than const_binop to save
memory/compile time?, over 6 minutes, before it dies in DF).
What I see is that the function has slightly less than 3500 switch clusters,
so for O(n^2) that is definitely more than the 10000 (which is for 100
switch clusters).

With =10000000, I see that it finishes in find_jump_tables very quickly with
attempts equal to 5997916 when it exits successfully the loop.

Now, I wonder if it is really a good idea to measure attempts as the number
of the can_be_handled calls or something similar.

And
  static unsigned HOST_WIDE_INT get_range (tree low, tree high)
  {
    tree r = fold_build2 (MINUS_EXPR, TREE_TYPE (low), high, low);
    if (!tree_fits_uhwi_p (r))
      return 0;

    return tree_to_uhwi (r) + 1;
  }
is a big waste of time and memory.
If all labels are already checked to be INTEGER_CSTs of the same type
(fold_build2 wouldn't work well if they have different types), then
  static unsigned HOST_WIDE_INT get_range (tree low, tree high)
  {
    wide_int w = wi::to_wide (high) - wi::to_wide (low);
    if (wi::neg_p (w, TYPE_SIGN (TREE_TYPE (low))) || !wi::fits_uhwi_p (w))
      return 0;
    return wi::to_uhwi (w) + 1;
  }
should do it without having to look up new and new INTEGER_CSTs in the hash
tables and/or create them?

As for the DF problem, seems it is the MIR DF problem used by REE that runs
out of memory, perhaps REE should punt on certain kinds of cfgs?

	Jakub


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-23 10:10   ` Jakub Jelinek
@ 2020-09-25  9:13     ` Martin Liška
  2020-09-25 13:18       ` Richard Biener
  2020-09-25 13:52       ` Jakub Jelinek
  0 siblings, 2 replies; 12+ messages in thread
From: Martin Liška @ 2020-09-25  9:13 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 423 bytes --]

Hello.

All right, I come up with a rapid speed up that can allow us to remove
the introduced parameter. It contains 2 parts:
- BIT TEST: we allow at maximum a range that is smaller GET_MODE_BITSIZE
- JT: we spent quite some time in density calculation, we can guess it first
   and it leads to a fast bail out.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

[-- Attachment #2: 0002-switch-conversion-make-a-rapid-speed-up.patch --]
[-- Type: text/x-patch, Size: 5292 bytes --]

From dc4c1d129a50c7f51d28235506479f29d51dae07 Mon Sep 17 00:00:00 2001
From: Martin Liska <mliska@suse.cz>
Date: Thu, 24 Sep 2020 13:34:13 +0200
Subject: [PATCH 2/2] switch conversion: make a rapid speed up

gcc/ChangeLog:

	PR tree-optimization/96979
	* tree-switch-conversion.c (jump_table_cluster::can_be_handled):
	Make a fast bail out.
	(bit_test_cluster::can_be_handled): Likewise here.
	* tree-switch-conversion.h (get_range): Use wi::to_wide instead
	of a folding.

gcc/testsuite/ChangeLog:

	PR tree-optimization/96979
	* g++.dg/tree-ssa/pr96979.C: New test.
---
 gcc/testsuite/g++.dg/tree-ssa/pr96979.C | 48 +++++++++++++++++++++++++
 gcc/tree-switch-conversion.c            | 32 ++++++++++++-----
 gcc/tree-switch-conversion.h            |  7 ++--
 3 files changed, 74 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr96979.C

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr96979.C b/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
new file mode 100644
index 00000000000..ec0f57a8548
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
@@ -0,0 +1,48 @@
+/* PR tree-optimization/96979 */
+/* { dg-do compile } */
+/* { dg-options "-std=c++17 -O2" } */
+
+using u64 = unsigned long long;
+
+constexpr inline u64
+foo (const char *str) noexcept
+{
+  u64 value = 0xcbf29ce484222325ULL;
+  for (u64 i = 0; str[i]; i++)
+    value = (value ^ u64(str[i])) * 0x100000001b3ULL;
+  return value;
+}
+
+struct V
+{
+  enum W
+  {
+#define A(n) n,
+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
+#define E D(foo1) D(foo2) D(foo3)
+    E
+    last
+  };
+
+  constexpr static W
+  bar (const u64 h) noexcept
+  {
+    switch (h)
+      {
+#undef A
+#define F(n) #n
+#define A(n) case foo (F(n)): return n;
+        E
+      }
+    return last;
+  }
+};
+
+int
+baz (const char *s)
+{
+  const u64 h = foo (s);
+  return V::bar (h);
+}
diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index 186411ff3c4..3212e964b84 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -1268,6 +1268,15 @@ jump_table_cluster::can_be_handled (const vec<cluster *> &clusters,
   if (range == 0)
     return false;
 
+  unsigned HOST_WIDE_INT lhs = 100 * range;
+  if (lhs < range)
+    return false;
+
+  /* First make quick guess as each cluster
+     can add at maximum 2 to the comparison_count.  */
+  if (lhs > 2 * max_ratio * (end - start + 1))
+    return false;
+
   unsigned HOST_WIDE_INT comparison_count = 0;
   for (unsigned i = start; i <= end; i++)
     {
@@ -1275,10 +1284,6 @@ jump_table_cluster::can_be_handled (const vec<cluster *> &clusters,
       comparison_count += sc->m_range_p ? 2 : 1;
     }
 
-  unsigned HOST_WIDE_INT lhs = 100 * range;
-  if (lhs < range)
-    return false;
-
   return lhs <= max_ratio * comparison_count;
 }
 
@@ -1364,12 +1369,12 @@ bit_test_cluster::can_be_handled (unsigned HOST_WIDE_INT range,
 {
   /* Check overflow.  */
   if (range == 0)
-    return 0;
+    return false;
 
   if (range >= GET_MODE_BITSIZE (word_mode))
     return false;
 
-  return uniq <= 3;
+  return uniq <= m_max_case_bit_tests;
 }
 
 /* Return true when cluster starting at START and ending at END (inclusive)
@@ -1379,6 +1384,7 @@ bool
 bit_test_cluster::can_be_handled (const vec<cluster *> &clusters,
 				  unsigned start, unsigned end)
 {
+  auto_vec<int> dest_bbs;
   /* For algorithm correctness, bit test for a single case must return
      true.  We bail out in is_beneficial if it's called just for
      a single case.  */
@@ -1387,15 +1393,23 @@ bit_test_cluster::can_be_handled (const vec<cluster *> &clusters,
 
   unsigned HOST_WIDE_INT range = get_range (clusters[start]->get_low (),
 					    clusters[end]->get_high ());
-  auto_bitmap dest_bbs;
+
+  /* Make a guess first.  */
+  if (!can_be_handled (range, m_max_case_bit_tests))
+    return false;
 
   for (unsigned i = start; i <= end; i++)
     {
       simple_cluster *sc = static_cast<simple_cluster *> (clusters[i]);
-      bitmap_set_bit (dest_bbs, sc->m_case_bb->index);
+      if (!dest_bbs.contains (sc->m_case_bb->index))
+	{
+	  dest_bbs.safe_push (sc->m_case_bb->index);
+	  if (dest_bbs.length () > m_max_case_bit_tests)
+	    return false;
+	}
     }
 
-  return can_be_handled (range, bitmap_count_bits (dest_bbs));
+  return true;
 }
 
 /* Return true when COUNT of cases of UNIQ labels is beneficial for bit test
diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
index 9ebcf109319..dbfd9eecba2 100644
--- a/gcc/tree-switch-conversion.h
+++ b/gcc/tree-switch-conversion.h
@@ -84,11 +84,10 @@ public:
      then return 0.  */
   static unsigned HOST_WIDE_INT get_range (tree low, tree high)
   {
-    tree r = fold_build2 (MINUS_EXPR, TREE_TYPE (low), high, low);
-    if (!tree_fits_uhwi_p (r))
+    wide_int w = wi::to_wide (high) - wi::to_wide (low);
+    if (wi::neg_p (w, TYPE_SIGN (TREE_TYPE (low))) || !wi::fits_uhwi_p (w))
       return 0;
-
-    return tree_to_uhwi (r) + 1;
+    return w.to_uhwi () + 1;
   }
 
   /* Case label.  */
-- 
2.28.0


[-- Attachment #3: 0001-Revert-switch-lowering-limit-number-of-cluster-attem.patch --]
[-- Type: text/x-patch, Size: 5353 bytes --]

From e1956c0019d8df4a022ea17bc76f9e62fac182c7 Mon Sep 17 00:00:00 2001
From: Martin Liska <mliska@suse.cz>
Date: Thu, 24 Sep 2020 13:34:58 +0200
Subject: [PATCH 1/2] Revert "switch lowering: limit number of cluster attemps"

This reverts commit c6df6039e9180c580945266302ec14047d358364.
---
 gcc/doc/invoke.texi                     |  4 --
 gcc/params.opt                          |  4 --
 gcc/testsuite/g++.dg/tree-ssa/pr96979.C | 50 -------------------------
 gcc/tree-switch-conversion.c            | 17 ---------
 4 files changed, 75 deletions(-)
 delete mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr96979.C

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 75203ba2420..08f1102030c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13471,10 +13471,6 @@ The smallest number of different values for which it is best to use a
 jump-table instead of a tree of conditional branches.  If the value is
 0, use the default for the machine.
 
-@item max-switch-clustering-attempts
-The maximum number of clustering attempts used
-in bit-test and jump-table switch expansion.
-
 @item jump-table-max-growth-ratio-for-size
 The maximum code size growth ratio when expanding
 into a jump table (in percent).  The parameter is used when
diff --git a/gcc/params.opt b/gcc/params.opt
index 5f2e11d6c57..dcf5e020f01 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -82,10 +82,6 @@ The maximum length of a constant string for a builtin string cmp call eligible f
 Common Joined UInteger Var(param_case_values_threshold) Param Optimization
 The smallest number of different values for which it is best to use a jump-table instead of a tree of conditional branches, if 0, use the default for the machine.
 
--param=max-switch-clustering-attempts=
-Common Joined UInteger Var(param_max_switch_clustering_attempts) Param Optimization Init(10000)
-The maximum number of clustering attempts used in bit-test and jump-table switch expansion.
-
 -param=comdat-sharing-probability=
 Common Joined UInteger Var(param_comdat_sharing_probability) Init(20) Param Optimization
 Probability that COMDAT function will be shared with different compilation unit.
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr96979.C b/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
deleted file mode 100644
index 85c703a140d..00000000000
--- a/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
+++ /dev/null
@@ -1,50 +0,0 @@
-/* PR tree-optimization/96979 */
-/* { dg-do compile } */
-/* { dg-options "-std=c++17 -O2 -fdump-tree-switchlower1" } */
-
-using u64 = unsigned long long;
-
-constexpr inline u64
-foo (const char *str) noexcept
-{
-  u64 value = 0xcbf29ce484222325ULL;
-  for (u64 i = 0; str[i]; i++)
-    value = (value ^ u64(str[i])) * 0x100000001b3ULL;
-  return value;
-}
-
-struct V
-{
-  enum W
-  {
-#define A(n) n,
-#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
-#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
-#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
-#define E D(foo1) D(foo2) D(foo3)
-    E
-    last
-  };
-
-  constexpr static W
-  bar (const u64 h) noexcept
-  {
-    switch (h)
-      {
-#undef A
-#define F(n) #n
-#define A(n) case foo (F(n)): return n;
-        E
-      }
-    return last;
-  }
-};
-
-int
-baz (const char *s)
-{
-  const u64 h = foo (s);
-  return V::bar (h);
-}
-
-/* { dg-final { scan-tree-dump-times ";; Bail out: --param=max-switch-clustering-attempts reached" 2 "switchlower1" } } */
diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index e6a2c7a6a84..186411ff3c4 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -1183,7 +1183,6 @@ jump_table_cluster::find_jump_tables (vec<cluster *> &clusters)
 
   min.quick_push (min_cluster_item (0, 0, 0));
 
-  HOST_WIDE_INT attempts = 0;
   for (unsigned i = 1; i <= l; i++)
     {
       /* Set minimal # of clusters with i-th item to infinite.  */
@@ -1195,14 +1194,6 @@ jump_table_cluster::find_jump_tables (vec<cluster *> &clusters)
 	  if (i - j < case_values_threshold ())
 	    s += i - j;
 
-	  if (attempts++ == param_max_switch_clustering_attempts)
-	    {
-	      if (dump_file)
-		fprintf (dump_file, ";; Bail out: "
-			 "--param=max-switch-clustering-attempts reached\n");
-	      return clusters.copy ();
-	    }
-
 	  /* Prefer clusters with smaller number of numbers covered.  */
 	  if ((min[j].m_count + 1 < min[i].m_count
 	       || (min[j].m_count + 1 == min[i].m_count
@@ -1317,7 +1308,6 @@ bit_test_cluster::find_bit_tests (vec<cluster *> &clusters)
 
   min.quick_push (min_cluster_item (0, 0, 0));
 
-  HOST_WIDE_INT attempts = 0;
   for (unsigned i = 1; i <= l; i++)
     {
       /* Set minimal # of clusters with i-th item to infinite.  */
@@ -1325,13 +1315,6 @@ bit_test_cluster::find_bit_tests (vec<cluster *> &clusters)
 
       for (unsigned j = 0; j < i; j++)
 	{
-	  if (attempts++ == param_max_switch_clustering_attempts)
-	    {
-	      if (dump_file)
-		fprintf (dump_file, ";; Bail out: "
-			 "--param=max-switch-clustering-attempts reached\n");
-	      return clusters.copy ();
-	    }
 	  if (min[j].m_count + 1 < min[i].m_count
 	      && can_be_handled (clusters, j, i - 1))
 	    min[i] = min_cluster_item (min[j].m_count + 1, j, INT_MAX);
-- 
2.28.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-25  9:13     ` Martin Liška
@ 2020-09-25 13:18       ` Richard Biener
  2020-09-25 13:32         ` Martin Liška
  2020-09-25 13:52       ` Jakub Jelinek
  1 sibling, 1 reply; 12+ messages in thread
From: Richard Biener @ 2020-09-25 13:18 UTC (permalink / raw)
  To: Martin Liška; +Cc: Jakub Jelinek, GCC Patches

On Fri, Sep 25, 2020 at 11:13 AM Martin Liška <mliska@suse.cz> wrote:
>
> Hello.
>
> All right, I come up with a rapid speed up that can allow us to remove
> the introduced parameter. It contains 2 parts:
> - BIT TEST: we allow at maximum a range that is smaller GET_MODE_BITSIZE
> - JT: we spent quite some time in density calculation, we can guess it first
>    and it leads to a fast bail out.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

Err

+  auto_vec<int> dest_bbs;
-  auto_bitmap dest_bbs;

-      bitmap_set_bit (dest_bbs, sc->m_case_bb->index);
+      if (!dest_bbs.contains (sc->m_case_bb->index))
+       {
+         dest_bbs.safe_push (sc->m_case_bb->index);
+         if (dest_bbs.length () > m_max_case_bit_tests)
+           return false;
+       }

vec::contains is linear search so no.  Was this for the length check?
Just do

     if (bitmap_set_bit (...))
      {
        length++;
        if (length > ...)

> Thanks,
> Martin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-25 13:18       ` Richard Biener
@ 2020-09-25 13:32         ` Martin Liška
  2020-09-25 13:45           ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Liška @ 2020-09-25 13:32 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, GCC Patches

On 9/25/20 3:18 PM, Richard Biener wrote:
> On Fri, Sep 25, 2020 at 11:13 AM Martin Liška <mliska@suse.cz> wrote:
>>
>> Hello.
>>
>> All right, I come up with a rapid speed up that can allow us to remove
>> the introduced parameter. It contains 2 parts:
>> - BIT TEST: we allow at maximum a range that is smaller GET_MODE_BITSIZE
>> - JT: we spent quite some time in density calculation, we can guess it first
>>     and it leads to a fast bail out.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
> 
> Err
> 
> +  auto_vec<int> dest_bbs;
> -  auto_bitmap dest_bbs;
> 
> -      bitmap_set_bit (dest_bbs, sc->m_case_bb->index);
> +      if (!dest_bbs.contains (sc->m_case_bb->index))
> +       {
> +         dest_bbs.safe_push (sc->m_case_bb->index);
> +         if (dest_bbs.length () > m_max_case_bit_tests)
> +           return false;
> +       }

That's intentional as m_max_case_bit_tests is a very small number (3) and
I want to track *distinct* indices in dest_bbs. So dest_bbs.contains
is a constant operation.

> 
> vec::contains is linear search so no.  Was this for the length check?
> Just do
> 
>       if (bitmap_set_bit (...))
>        {
>          length++;
>          if (length > ...)

I would need here bitmap_count_bits. Do you prefer it?

Martin

> 
>> Thanks,
>> Martin


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-25 13:32         ` Martin Liška
@ 2020-09-25 13:45           ` Richard Biener
  2020-09-25 14:15             ` Martin Liška
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2020-09-25 13:45 UTC (permalink / raw)
  To: Martin Liška; +Cc: Jakub Jelinek, GCC Patches

On Fri, Sep 25, 2020 at 3:32 PM Martin Liška <mliska@suse.cz> wrote:
>
> On 9/25/20 3:18 PM, Richard Biener wrote:
> > On Fri, Sep 25, 2020 at 11:13 AM Martin Liška <mliska@suse.cz> wrote:
> >>
> >> Hello.
> >>
> >> All right, I come up with a rapid speed up that can allow us to remove
> >> the introduced parameter. It contains 2 parts:
> >> - BIT TEST: we allow at maximum a range that is smaller GET_MODE_BITSIZE
> >> - JT: we spent quite some time in density calculation, we can guess it first
> >>     and it leads to a fast bail out.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >
> > Err
> >
> > +  auto_vec<int> dest_bbs;
> > -  auto_bitmap dest_bbs;
> >
> > -      bitmap_set_bit (dest_bbs, sc->m_case_bb->index);
> > +      if (!dest_bbs.contains (sc->m_case_bb->index))
> > +       {
> > +         dest_bbs.safe_push (sc->m_case_bb->index);
> > +         if (dest_bbs.length () > m_max_case_bit_tests)
> > +           return false;
> > +       }
>
> That's intentional as m_max_case_bit_tests is a very small number (3) and
> I want to track *distinct* indices in dest_bbs. So dest_bbs.contains
> is a constant operation.

You're storing bb->index and formerly set bb->index bit, what's the difference?

For max 3 elements a vector is OK, of course but there should be a comment
that says this ;)  The static const is 'int' so it can in principle
hold up to two billion ;)

> >
> > vec::contains is linear search so no.  Was this for the length check?
> > Just do
> >
> >       if (bitmap_set_bit (...))
> >        {
> >          length++;
> >          if (length > ...)
>
> I would need here bitmap_count_bits. Do you prefer it?

bitmap_set_bit returns false if the bit was already set so you can
count as you add bits, see the length++ above.

For three elements the vec will be faster though.  May I suggest
to use

 auto_vec<int, m_max_case_bit_tests> dest_bbs;

then and quick_push rather than safe_push (need to guard the
push with the max_case_bit_test).

Richard.



> Martin
>
> >
> >> Thanks,
> >> Martin
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-25  9:13     ` Martin Liška
  2020-09-25 13:18       ` Richard Biener
@ 2020-09-25 13:52       ` Jakub Jelinek
  2020-09-25 14:13         ` Martin Liška
  1 sibling, 1 reply; 12+ messages in thread
From: Jakub Jelinek @ 2020-09-25 13:52 UTC (permalink / raw)
  To: Martin Liška; +Cc: Richard Biener, gcc-patches

On Fri, Sep 25, 2020 at 11:13:06AM +0200, Martin Liška wrote:
> --- a/gcc/tree-switch-conversion.c
> +++ b/gcc/tree-switch-conversion.c
> @@ -1268,6 +1268,15 @@ jump_table_cluster::can_be_handled (const vec<cluster *> &clusters,
>    if (range == 0)
>      return false;
>  
> +  unsigned HOST_WIDE_INT lhs = 100 * range;
> +  if (lhs < range)
> +    return false;

If this test is meant to detect when 100 * range has overflowed,
then I think it is insufficient.
Perhaps do
  if (range > HOST_WIDE_INT_M1U / 100)
    return false;

  unsigned HOST_WIDE_INT lhs = 100 * range;
instead?

	Jakub


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-25 13:52       ` Jakub Jelinek
@ 2020-09-25 14:13         ` Martin Liška
  0 siblings, 0 replies; 12+ messages in thread
From: Martin Liška @ 2020-09-25 14:13 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

On 9/25/20 3:52 PM, Jakub Jelinek wrote:
> On Fri, Sep 25, 2020 at 11:13:06AM +0200, Martin Liška wrote:
>> --- a/gcc/tree-switch-conversion.c
>> +++ b/gcc/tree-switch-conversion.c
>> @@ -1268,6 +1268,15 @@ jump_table_cluster::can_be_handled (const vec<cluster *> &clusters,
>>     if (range == 0)
>>       return false;
>>   
>> +  unsigned HOST_WIDE_INT lhs = 100 * range;
>> +  if (lhs < range)
>> +    return false;
> 
> If this test is meant to detect when 100 * range has overflowed,
> then I think it is insufficient.
> Perhaps do
>    if (range > HOST_WIDE_INT_M1U / 100)
>      return false;
> 
>    unsigned HOST_WIDE_INT lhs = 100 * range;
> instead?

Yes, I'll add the check.

Thanks,
Martin

> 
> 	Jakub
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-25 13:45           ` Richard Biener
@ 2020-09-25 14:15             ` Martin Liška
  2020-09-28  8:27               ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Liška @ 2020-09-25 14:15 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, GCC Patches

On 9/25/20 3:45 PM, Richard Biener wrote:
> On Fri, Sep 25, 2020 at 3:32 PM Martin Liška <mliska@suse.cz> wrote:
>>
>> On 9/25/20 3:18 PM, Richard Biener wrote:
>>> On Fri, Sep 25, 2020 at 11:13 AM Martin Liška <mliska@suse.cz> wrote:
>>>>
>>>> Hello.
>>>>
>>>> All right, I come up with a rapid speed up that can allow us to remove
>>>> the introduced parameter. It contains 2 parts:
>>>> - BIT TEST: we allow at maximum a range that is smaller GET_MODE_BITSIZE
>>>> - JT: we spent quite some time in density calculation, we can guess it first
>>>>      and it leads to a fast bail out.
>>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>
>>> Err
>>>
>>> +  auto_vec<int> dest_bbs;
>>> -  auto_bitmap dest_bbs;
>>>
>>> -      bitmap_set_bit (dest_bbs, sc->m_case_bb->index);
>>> +      if (!dest_bbs.contains (sc->m_case_bb->index))
>>> +       {
>>> +         dest_bbs.safe_push (sc->m_case_bb->index);
>>> +         if (dest_bbs.length () > m_max_case_bit_tests)
>>> +           return false;
>>> +       }
>>
>> That's intentional as m_max_case_bit_tests is a very small number (3) and
>> I want to track *distinct* indices in dest_bbs. So dest_bbs.contains
>> is a constant operation.
> 
> You're storing bb->index and formerly set bb->index bit, what's the difference?
> 
> For max 3 elements a vector is OK, of course but there should be a comment
> that says this ;)  The static const is 'int' so it can in principle
> hold up to two billion ;)

Sure, comment is needed.

> 
>>>
>>> vec::contains is linear search so no.  Was this for the length check?
>>> Just do
>>>
>>>        if (bitmap_set_bit (...))
>>>         {
>>>           length++;
>>>           if (length > ...)
>>
>> I would need here bitmap_count_bits. Do you prefer it?
> 
> bitmap_set_bit returns false if the bit was already set so you can
> count as you add bits, see the length++ above.

Ah, got it!

> 
> For three elements the vec will be faster though.  May I suggest
> to use
> 
>   auto_vec<int, m_max_case_bit_tests> dest_bbs;
> 
> then and quick_push rather than safe_push (need to guard the
> push with the max_case_bit_test).

Yes.

Is the patch fine with that (and Jakub's comment)?

Martin

> 
> Richard.
> 
> 
> 
>> Martin
>>
>>>
>>>> Thanks,
>>>> Martin
>>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] switch lowering: limit number of cluster attemps
  2020-09-25 14:15             ` Martin Liška
@ 2020-09-28  8:27               ` Richard Biener
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Biener @ 2020-09-28  8:27 UTC (permalink / raw)
  To: Martin Liška; +Cc: Jakub Jelinek, GCC Patches

On Fri, Sep 25, 2020 at 4:15 PM Martin Liška <mliska@suse.cz> wrote:
>
> On 9/25/20 3:45 PM, Richard Biener wrote:
> > On Fri, Sep 25, 2020 at 3:32 PM Martin Liška <mliska@suse.cz> wrote:
> >>
> >> On 9/25/20 3:18 PM, Richard Biener wrote:
> >>> On Fri, Sep 25, 2020 at 11:13 AM Martin Liška <mliska@suse.cz> wrote:
> >>>>
> >>>> Hello.
> >>>>
> >>>> All right, I come up with a rapid speed up that can allow us to remove
> >>>> the introduced parameter. It contains 2 parts:
> >>>> - BIT TEST: we allow at maximum a range that is smaller GET_MODE_BITSIZE
> >>>> - JT: we spent quite some time in density calculation, we can guess it first
> >>>>      and it leads to a fast bail out.
> >>>>
> >>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>>>
> >>>> Ready to be installed?
> >>>
> >>> Err
> >>>
> >>> +  auto_vec<int> dest_bbs;
> >>> -  auto_bitmap dest_bbs;
> >>>
> >>> -      bitmap_set_bit (dest_bbs, sc->m_case_bb->index);
> >>> +      if (!dest_bbs.contains (sc->m_case_bb->index))
> >>> +       {
> >>> +         dest_bbs.safe_push (sc->m_case_bb->index);
> >>> +         if (dest_bbs.length () > m_max_case_bit_tests)
> >>> +           return false;
> >>> +       }
> >>
> >> That's intentional as m_max_case_bit_tests is a very small number (3) and
> >> I want to track *distinct* indices in dest_bbs. So dest_bbs.contains
> >> is a constant operation.
> >
> > You're storing bb->index and formerly set bb->index bit, what's the difference?
> >
> > For max 3 elements a vector is OK, of course but there should be a comment
> > that says this ;)  The static const is 'int' so it can in principle
> > hold up to two billion ;)
>
> Sure, comment is needed.
>
> >
> >>>
> >>> vec::contains is linear search so no.  Was this for the length check?
> >>> Just do
> >>>
> >>>        if (bitmap_set_bit (...))
> >>>         {
> >>>           length++;
> >>>           if (length > ...)
> >>
> >> I would need here bitmap_count_bits. Do you prefer it?
> >
> > bitmap_set_bit returns false if the bit was already set so you can
> > count as you add bits, see the length++ above.
>
> Ah, got it!
>
> >
> > For three elements the vec will be faster though.  May I suggest
> > to use
> >
> >   auto_vec<int, m_max_case_bit_tests> dest_bbs;
> >
> > then and quick_push rather than safe_push (need to guard the
> > push with the max_case_bit_test).
>
> Yes.
>
> Is the patch fine with that (and Jakub's comment)?

Yes.

Thanks,
Richard.

> Martin
>
> >
> > Richard.
> >
> >
> >
> >> Martin
> >>
> >>>
> >>>> Thanks,
> >>>> Martin
> >>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-09-28  8:27 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-22 11:22 [PATCH] switch lowering: limit number of cluster attemps Martin Liška
2020-09-22 12:19 ` Richard Biener
2020-09-22 12:39   ` Martin Liška
2020-09-23 10:10   ` Jakub Jelinek
2020-09-25  9:13     ` Martin Liška
2020-09-25 13:18       ` Richard Biener
2020-09-25 13:32         ` Martin Liška
2020-09-25 13:45           ` Richard Biener
2020-09-25 14:15             ` Martin Liška
2020-09-28  8:27               ` Richard Biener
2020-09-25 13:52       ` Jakub Jelinek
2020-09-25 14:13         ` Martin Liška

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).