public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] RL78 one_cmplhi2 improvement
@ 2018-02-20 12:15 Sebastian Perta
  2018-02-20 19:44 ` DJ Delorie
  0 siblings, 1 reply; 4+ messages in thread
From: Sebastian Perta @ 2018-02-20 12:15 UTC (permalink / raw)
  To: gcc-patches; +Cc: 'DJ Delorie'

Hello,

The following patch defines one_cmplhi2 pattern.  The improvement does not
come from the two xor instructions used 
(in fact for the second xor the pattern uses xor saddr , #byte is used which
is 1 bytes longer than xor a, #byte) it comes 
from the fact that that the number of move instructions is reduced (16 bit
movw is used instead of 8 bit mov).

This can be seen on a very simple test case:

uint16_t test_one_cmplhi(uint16_t x) 
{
 return ~x; 
}

_test_one_cmplhi:
	mov	a, [sp+4]
	xor	a, #-1
	mov	r8, a
	mov	a, [sp+5]
	xor	a, #-1
	mov	r9, a
	ret

_test_one_cmplhi:
	movw	ax, [sp+4]
	xor a, #-1 
	xor 0xFFEF8, #-1 ;one_cmplhi2 ax, ax
	movw	r8, ax
	ret

In "gcc.c-torture/execute/" I have seen the patch being effective in 18
cases with the biggest improvement is occurring in
gcc.c-torture/execute/pr68376-2 where this patch reduces the code size from
792 bytes to 704.

Unfortunately there's also a downside to this patch. In cases when the
second xor is redundant GCC never gets the chance to remove the second xor. 
This can be seen in gcc.c-torture/execute/pr42833.c for example line 57: 
vdest.v1 = (vsrc1.v1 + (1 << (-1 - tmp))) >> -tmp;
This is because all variables are 8 bit while the operation is being
promoted to 16 bit (because of the constants being of type int by default)
this is why the one_cmplhi2 is used in this case.

The relevant insns for this case are:

(insn 35 257 260 14 (set (reg:HI 0 x)
        (xor:HI (reg:HI 0 x)
            (const_int -1 [0xffffffffffffffff])))
"./xgcc_rw_trunk2/gcc/testsuite/gcc.c-torture/execute/pr42833.c":57 59
{*one_cmplhi2_real}
     (nil))
(insn 260 35 261 14 (set (reg:QI 1 a)
        (reg:QI 0 x [8]))
"./xgcc_rw_trunk2/gcc/testsuite/gcc.c-torture/execute/pr42833.c":57 44
{*movqi_real}
     (expr_list:REG_DEAD (reg:QI 8 r8)
        (nil)))

As it can be seen it is quite obvious that reg "a" is dead after insn 35.
I think I can write a simple rtl_opt_pass (similar to pass_rl78_move_elim)
to look for such cases and replace the one_cmplhi2 with one_cmplqi2.

Even without this new pass (which I plan to do in the near future) I still
think (based on testing) the advantages of this patch outweigh the
disadvantage described above.
Is OK to check-in? Or do I need to add the new pass as well for this to be
considered? Thank you!

Regression test is OK, tested with the following command:
make -k check-gcc RUNTESTFLAGS=--target_board=rl78-sim


Best Regards,
Sebastian

2018-02-20  Sebastian Perta  <sebastian.perta@renesas.com>

	* config/rl78/rl78-expand.md (one_cmplhi2): New define expand.
	* config/rl78/rl78-virt.md (one_cmplhi2_virt): New define insn.
	* config/rl78/rl78-real.md (one_cmplhi2_real): New define insn.


Index: rl78-expand.md
===================================================================
--- rl78-expand.md	(revision 257806)
+++ rl78-expand.md	(working copy)
@@ -211,6 +211,16 @@
      DONE;"
 )
 
+(define_expand "one_cmplhi2"
+ [(set (match_operand:HI 0 "nonimmediate_operand")
+  (xor:HI (match_operand:HI 1 "general_operand")
+  		(const_int -1)))
+  ]
+  ""
+  "if (rl78_force_nonfar_2 (operands, gen_one_cmplhi2))
+     DONE;"
+)
+
 ;;---------- Shifts ------------------------
 
 (define_expand "ashl<mode>3"
Index: rl78-real.md
===================================================================
--- rl78-real.md	(revision 257806)
+++ rl78-real.md	(working copy)
@@ -240,6 +240,16 @@
   [(set (attr "update_Z") (const_string "update_Z"))]
 )
 
+(define_insn "*one_cmplhi2_real"
+ [(set (match_operand:HI 0 "register_operand" "=A")
+       (xor:HI (match_operand:HI 1 "general_operand" "0")
+	     		(const_int -1)))
+ ]
+ "rl78_real_insns_ok ()"
+ "xor a, #-1 \;xor 0xFFEF8, #-1 ;one_cmplhi2 %0, %1"
+ [(set_attr "update_Z" "clobber")]
+)
+
 ;;---------- Shifts ------------------------
 
 (define_insn "*ashlqi3_real"
Index: rl78-virt.md
===================================================================
--- rl78-virt.md	(revision 257806)
+++ rl78-virt.md	(working copy)
@@ -165,6 +165,16 @@
   "v.xor\t%0, %1, %2"
 )
 
+(define_insn "*one_cmplhi2_virt"
+ [(set (match_operand:HI 0 "rl78_nonfar_nonimm_operand" "=v")
+  (xor:HI (match_operand:HI 1 "general_operand"  "ivU")
+    		(const_int -1)))
+  ]
+  "rl78_virt_insns_ok ()"
+  "v.one_cmplhi2\t%0, %1"
+ [(set_attr "valloc" "op1")]
+)
+
 ;;---------- Shifts ------------------------
 
 (define_insn "*ashl<mode>3_virt"

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] RL78 one_cmplhi2 improvement
  2018-02-20 12:15 [PATCH] RL78 one_cmplhi2 improvement Sebastian Perta
@ 2018-02-20 19:44 ` DJ Delorie
  2018-02-27 15:33   ` Sebastian Perta
  0 siblings, 1 reply; 4+ messages in thread
From: DJ Delorie @ 2018-02-20 19:44 UTC (permalink / raw)
  To: Sebastian Perta; +Cc: gcc-patches


Const type promotion is the bane of embedded developers...

One thing to try is to use (subreg:QI in a define_expand, so that
there's a one_cmplhi2 pattern that expands to two QImode insns that
operate on HImode input/outputs via SUBREGs.

I don't have high hopes of gcc optimizing this properly in all cases,
but it's worth trying.

If it doesn't work out, consider this patch approved, though.

Thanks!

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH] RL78 one_cmplhi2 improvement
  2018-02-20 19:44 ` DJ Delorie
@ 2018-02-27 15:33   ` Sebastian Perta
  2018-02-27 15:52     ` DJ Delorie
  0 siblings, 1 reply; 4+ messages in thread
From: Sebastian Perta @ 2018-02-27 15:33 UTC (permalink / raw)
  To: 'DJ Delorie'; +Cc: gcc-patches

HI DJ,

> One thing to try is to use (subreg:QI in a define_expand, so that
> there's a one_cmplhi2 pattern that expands to two QImode insns that
> operate on HImode input/outputs via SUBREGs.

Thank you for the suggestion! After several attempts the following is the
only successful one, however the code produced is identical with and without
the patch:

(define_expand "one_cmplhi2"
 [(set (subreg:QI (match_operand:HI 0 "nonimmediate_operand") 0)
  (xor:HI (subreg:QI (match_operand:HI 1 "general_operand") 0)
  		(const_int -1)))
  (set (subreg:QI (match_dup 0) 1)
  (xor:HI (subreg:QI (match_dup 1) 1)
  		(const_int -1)))
  ]
  ""
  "DONE;"
)

Is this similar to what you had in mind?

Output code (same as before the patch ... the patch makes no difference):
_test_one_cmplhi:
	mov	a, [sp+4]
	xor	a, #-1
	mov	r8, a
	mov	a, [sp+5]
	xor	a, #-1
	mov	r9, a
	ret

I also explored other options including define_split without any success.

> If it doesn't work out, consider this patch approved, though.
Can I checkin now?

Best Regards,
Sebastian


> -----Original Message-----
> From: DJ Delorie [mailto:dj@redhat.com]
> Sent: 20 February 2018 19:39
> To: Sebastian Perta <Sebastian.Perta@renesas.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] RL78 one_cmplhi2 improvement
> 
> 
> Const type promotion is the bane of embedded developers...
> 
> One thing to try is to use (subreg:QI in a define_expand, so that
> there's a one_cmplhi2 pattern that expands to two QImode insns that
> operate on HImode input/outputs via SUBREGs.
> 
> I don't have high hopes of gcc optimizing this properly in all cases,
> but it's worth trying.
> 
> If it doesn't work out, consider this patch approved, though.
> 
> Thanks!

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] RL78 one_cmplhi2 improvement
  2018-02-27 15:33   ` Sebastian Perta
@ 2018-02-27 15:52     ` DJ Delorie
  0 siblings, 0 replies; 4+ messages in thread
From: DJ Delorie @ 2018-02-27 15:52 UTC (permalink / raw)
  To: Sebastian Perta; +Cc: gcc-patches

"Sebastian Perta" <sebastian.perta@renesas.com> writes:
> Is this similar to what you had in mind?

Yes.  Did it affect code size in any of the larger tests?  I was hoping
that it wouldn't force too much into 8-bit registers and cause more
moves to be needed elsewhere.

(and even if it didn't, I think this one feels "more correct" than the
other, as it retains more of the "I'm 16 bits"-ness of the operand)

>> If it doesn't work out, consider this patch approved, though.
> Can I checkin now?

Yes.  Thanks!

Make sure the indentation is correct, of course.  It wasn't in the
email, and that confused me at first.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-02-27 15:52 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-20 12:15 [PATCH] RL78 one_cmplhi2 improvement Sebastian Perta
2018-02-20 19:44 ` DJ Delorie
2018-02-27 15:33   ` Sebastian Perta
2018-02-27 15:52     ` DJ Delorie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).