public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] expr.cc: avoid unexpected side effects in expand_expr_divmod optimization
Date: Thu,  8 Dec 2022 11:59:44 +0100	[thread overview]
Message-ID: <20221208105944.660323-1-jose.marchesi@oracle.com> (raw)

The expand_expr_divmod function in expr.cc attempts to optimize cases
where both arguments of a division/modulus are known to be positive
when interpreted as signed.  In these cases, both signed division and
unsigned division will raise the same value, and therefore the
cheapest option can be used.

In order to determine what is the cheaper option in the current
target, expand_expr_divmod actually expands both a signed divmod and
an unsigned divmod using local "sequences":

  start_sequence ();
  ...
  expand_divmod (... signed ...);
  ...
  end_sequence ();

  start_sequence ();
  ...
  expand_divmod (... unsigned ...);
  ...
  end_sequence ();

And then compares the cost of each generated sequence, choosing the
best one.  Finally, it emits the selected expanded sequence and
returns the rtx with the result.

This approach has a caveat.  Some targets do not provide instructions
for division/modulus instructions.  In the case of BPF, it provides
unsigned division/modulus, but not signed division/modulus.

In these cases, the expand_divmod tries can contain calls to funcalls.
For example, in BPF:

  start_sequence ();
  ...
  expand_divmod (... signed ...); -> This generates funcall to __divdi3
  ...
  end_sequence ();

  start_sequence ();
  ...
  expand_divmod (... unsigned ...); -> This generates direct `div' insn.
  ...
  end_sequence ();

The problem is that when a funcall is expanded, an accompanying global
symbol definition is written in the output stream:

  .global __divdi3

And this symbol definition remains in the compiled assembly file, even
if the sequence using the direct `div' instruction above is used.

This is particularly bad in BPF, because the kernel bpf loader chokes
on the spurious symbol __divdi3 and makes the resulting BPF object
unloadable (note that BPF objects are not linked before processed by
the kernel.)

In order to fix this, this patch modifies expand_expr_divmod in the
following way:

- When trying each sequence (signed, unsigned) the expand_divmod calls
  are told to _not_ use libcalls if everything else fails.  This is
  done by passing OPTAB_WIDEN as the `methods' argument.  (Before it
  was using the default value OPTAB_LIB_WIDEN.)

- If any of the tried expanded sequences contain a funcall, then the
  optimization is not attempted.

A couple of BPF tests are also added to make sure this doesn't break
at any point in the future.

Tested in bpf-unknown-none and x86_64-linux-gnu.
Regtested in x86_64-linux-gnu.  No regressions.

gcc/ChangeLog

	* expr.cc (expand_expr_divmod): Avoid side-effects of trying
	sequences involving funcalls in optimization.

gcc/testsuite/ChangeLog:

	* gcc.target/bpf/divmod-funcall-1.c: New test.
	* gcc.target/bpf/divmod-funcall-2.c: Likewise.
---
 gcc/expr.cc                                   | 44 +++++++++++--------
 .../gcc.target/bpf/divmod-funcall-1.c         |  8 ++++
 .../gcc.target/bpf/divmod-funcall-2.c         |  8 ++++
 3 files changed, 41 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/bpf/divmod-funcall-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/divmod-funcall-2.c

diff --git a/gcc/expr.cc b/gcc/expr.cc
index d9407432ea5..4d4be5d7bda 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -9168,32 +9168,38 @@ expand_expr_divmod (tree_code code, machine_mode mode, tree treeop0,
       do_pending_stack_adjust ();
       start_sequence ();
       rtx uns_ret = expand_divmod (mod_p, code, mode, treeop0, treeop1,
-				   op0, op1, target, 1);
+				   op0, op1, target, 1, OPTAB_WIDEN);
       rtx_insn *uns_insns = get_insns ();
       end_sequence ();
       start_sequence ();
       rtx sgn_ret = expand_divmod (mod_p, code, mode, treeop0, treeop1,
-				   op0, op1, target, 0);
+				   op0, op1, target, 0, OPTAB_WIDEN);
       rtx_insn *sgn_insns = get_insns ();
       end_sequence ();
-      unsigned uns_cost = seq_cost (uns_insns, speed_p);
-      unsigned sgn_cost = seq_cost (sgn_insns, speed_p);
 
-      /* If costs are the same then use as tie breaker the other other
-	 factor.  */
-      if (uns_cost == sgn_cost)
-	{
-	  uns_cost = seq_cost (uns_insns, !speed_p);
-	  sgn_cost = seq_cost (sgn_insns, !speed_p);
-	}
-
-      if (uns_cost < sgn_cost || (uns_cost == sgn_cost && unsignedp))
-	{
-	  emit_insn (uns_insns);
-	  return uns_ret;
-	}
-      emit_insn (sgn_insns);
-      return sgn_ret;
+      /* Do not try to optimize if any of the sequences tried above
+         resulted in a funcall.  */
+      if (uns_ret && sgn_ret)
+        {
+          unsigned uns_cost = seq_cost (uns_insns, speed_p);
+          unsigned sgn_cost = seq_cost (sgn_insns, speed_p);
+
+          /* If costs are the same then use as tie breaker the other
+             other factor.  */
+          if (uns_cost == sgn_cost)
+            {
+              uns_cost = seq_cost (uns_insns, !speed_p);
+              sgn_cost = seq_cost (sgn_insns, !speed_p);
+            }
+
+          if (uns_cost < sgn_cost || (uns_cost == sgn_cost && unsignedp))
+            {
+              emit_insn (uns_insns);
+              return uns_ret;
+            }
+          emit_insn (sgn_insns);
+          return sgn_ret;
+        }
     }
   return expand_divmod (mod_p, code, mode, treeop0, treeop1,
 			op0, op1, target, unsignedp);
diff --git a/gcc/testsuite/gcc.target/bpf/divmod-funcall-1.c b/gcc/testsuite/gcc.target/bpf/divmod-funcall-1.c
new file mode 100644
index 00000000000..dffb1506f06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/divmod-funcall-1.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-final { scan-assembler-not "__divdi3" } } */
+
+int
+foo (unsigned int len)
+{
+  return ((unsigned long)len) * 234 / 5;
+}
diff --git a/gcc/testsuite/gcc.target/bpf/divmod-funcall-2.c b/gcc/testsuite/gcc.target/bpf/divmod-funcall-2.c
new file mode 100644
index 00000000000..41e8e40c35c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/divmod-funcall-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-final { scan-assembler-not "__moddi3" } } */
+
+int
+foo (unsigned int len)
+{
+  return ((unsigned long)len) * 234 % 5;
+}
-- 
2.30.2


             reply	other threads:[~2022-12-08 10:56 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-08 10:59 Jose E. Marchesi [this message]
2022-12-08 12:20 ` Jakub Jelinek
2022-12-08 13:02   ` Jose E. Marchesi
2022-12-08 13:19     ` Jakub Jelinek
2022-12-08 22:40       ` Jose E. Marchesi
2023-01-04  8:58         ` Jose E. Marchesi
2023-01-09  8:05           ` Richard Biener
2023-01-09  9:57             ` Jakub Jelinek
2023-01-09 13:04               ` Richard Biener
2023-01-09 13:25                 ` Jakub Jelinek
2023-01-09 14:01               ` Jeff Law
2022-12-08 13:42 ` Richard Biener
2022-12-08 16:03   ` Jose E. Marchesi
2023-01-30 18:45 ` Andrew Pinski
2023-01-30 18:55   ` Jose E. Marchesi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221208105944.660323-1-jose.marchesi@oracle.com \
    --to=jose.marchesi@oracle.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).