public inbox for gcc-patches@gcc.gnu.org
* [x86 PATCH] Avoid andn and generate shorter not;and with -Oz.
@ 2022-04-13  8:09 Roger Sayle
  2022-04-13 13:11 ` Michael Matz
  0 siblings, 1 reply; 2+ messages in thread
From: Roger Sayle @ 2022-04-13  8:09 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 984 bytes --]


The x86 instruction encoding for SImode andn is longer than the
equivalent notl/andl sequence when the source for the not operand
is the same register as the destination.  This patch adds post-reload
splitters to i386.md so that "-mbmi" (which enables andn) does not
increase code size with "-Oz".

One minor subtlety with this patch is that the splitter for
*andn_si_ccno swaps the order of operands (match_dup 2 and match_dup 3)
as memory operands need to appear first in the *test<mode>_1 patterns.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures (and, like the previous patch, on CSiBE).
Ok for mainline?


2022-04-13  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.md (define_split): Split *andnsi_1 and
	*andn_si_ccno after reload with -Oz.

gcc/testsuite/ChangeLog
	* gcc.target/i386/bmi-andn-3.c: New test case.


Thanks in advance,
Roger
--

[-- Attachment #2: patchoz3.txt --]
[-- Type: text/plain, Size: 2212 bytes --]

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c74edd1..5e5cdb7 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -10385,6 +10385,37 @@
   [(set_attr "type" "bitmanip")
    (set_attr "btver2_decode" "direct, double")
    (set_attr "mode" "<MODE>")])
+
+;; Split *andnsi_1 after reload with -Oz when not;and is shorter.
+(define_split
+  [(set (match_operand:SI 0 "register_operand")
+	(and:SI (not:SI (match_operand:SI 1 "register_operand"))
+		(match_operand:SI 2 "nonimmediate_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "reload_completed
+   && optimize_insn_for_size_p () && optimize_size > 1
+   && REGNO (operands[0]) == REGNO (operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(set (match_dup 0) (not:SI (match_dup 1)))
+   (parallel [(set (match_dup 0) (and:SI (match_dup 0) (match_dup 2)))
+	      (clobber (reg:CC FLAGS_REG))])])
+
+;; Split *andn_si_ccno with -Oz when not;test is shorter.
+(define_peephole2
+  [(parallel[
+     (set (match_operand 0 "flags_reg_operand")
+	  (match_operator 1 "compare_operator"
+	    [(and:SI (not:SI (match_operand:SI 2 "general_reg_operand"))
+		     (match_operand:SI 3 "nonimmediate_operand"))
+	     (const_int 0)]))
+     (clobber (match_operand:SI 4 "general_reg_operand"))])]
+  "optimize_insn_for_size_p () && optimize_size > 1
+   && !reg_overlap_mentioned_p (operands[2], operands[3])
+   && peep2_reg_dead_p (1, operands[2])"
+  [(set (match_dup 2) (not:SI (match_dup 2)))
+   (set (match_dup 0) (match_op_dup 1
+                        [(and:SI (match_dup 3) (match_dup 2))
+			 (const_int 0)]))])
 \f
 ;; Logical inclusive and exclusive OR instructions
 
diff --git a/gcc/testsuite/gcc.target/i386/bmi-andn-3.c b/gcc/testsuite/gcc.target/i386/bmi-andn-3.c
new file mode 100644
index 0000000..21313b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/bmi-andn-3.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-Oz -mbmi" } */
+int m;
+
+int foo(int x, int y)
+{
+    return (x & ~y) != 0;
+}
+
+int bar(int x)
+{
+  return (~x & m) != 0;
+}
+/* { dg-final { scan-assembler-not "andn\[ \\t\]+" } } */
+


* Re: [x86 PATCH] Avoid andn and generate shorter not;and with -Oz.
  2022-04-13  8:09 [x86 PATCH] Avoid andn and generate shorter not;and with -Oz Roger Sayle
@ 2022-04-13 13:11 ` Michael Matz
  0 siblings, 0 replies; 2+ messages in thread
From: Michael Matz @ 2022-04-13 13:11 UTC (permalink / raw)
  To: Roger Sayle; +Cc: gcc-patches

Hello,

On Wed, 13 Apr 2022, Roger Sayle wrote:

> The x86 instruction encoding for SImode andn is longer than the
> equivalent notl/andl sequence when the source for the not operand
> is the same register as the destination.

_And_ when no REX prefixes are necessary for the notl/andl, which they are
if the respective registers are %r8 or beyond.  As you seem to be fine
with saving just a byte, you ought to test for that as well so as not to
waste one again :-)


Ciao,
Michael.
