Re: [PATCH take 2] middle-end: Optimize (A&C)^(B&C) to (A^B)&C in simplify_rtx.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Richard Sandiford <richard.sandiford@arm.com>
To: "Roger Sayle" <roger@nextmovesoftware.com>
Cc: <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH take 2] middle-end: Optimize (A&C)^(B&C) to (A^B)&C in simplify_rtx.
Date: Mon, 22 Jun 2020 20:41:25 +0100	[thread overview]
Message-ID: <mptftamzy56.fsf@arm.com> (raw)
In-Reply-To: <000c01d643b2$845dfc80$8d19f580$@nextmovesoftware.com> (Roger Sayle's message of "Tue, 16 Jun 2020 08:48:24 +0100")

Hi Roger,

Thanks for the update and sorry for the slow reply.

"Roger Sayle" <roger@nextmovesoftware.com> writes:
> As suggested by Richard Sandiford, this patch splits out the
> previous RTL simplification, of (X&C)^(Y&C) to (X^Y)&C, to its
> own function, simplify_distributive_operation, and calls it when
> appropriate for IOR, XOR and AND.
>
> Instrumenting a bootstrap reveals this optimization triggers
> 393358 times during stage2, stage3 and building the libraries,
> and a further 263659 times during make check.  By order of
> occurrence the RTL transformation opportunities are:
>
>  284447 01 and ior
>  156798 00 xor and
>  131110 11 and ior
>   47253 00 and ior
>   28035 00 ior and
>    2804 01 ior and
>    2698 11 ior and
>    2262 01 xor and
>     602 11 xor and
>     312 00 xor xor
>     298 00 xor rotate
>     120 00 and ashift
>     108 00 ior lshiftrt
>      60 00 and ashiftrt
>      54 00 ior ashift
>      18 00 and lshiftrt
>      12 10 xor and
>      12 00 xor rotatert
>       8 00 xor lshiftrt
>       4 10 ior and
>       2 00 ior ashiftrt

That's an impressive number of hits :-)

> where the two binary digits denote the surviving inner unique operands,
> so "00 xor and" corresponds to the original (xor (and X Z) (and Y Z)),
> and "01 and ior" corresponds to (and (ior X Y) (ior Y Z)).
>
> Many thanks also to Richard for pointing out simplify_rtx_c_tests, the
> self-testing framework in simplify-rtx.c, which is new since my day.
> This patch supplements the existing vector mode testing, with a suite
> of scalar integer mode tests that confirm that many of the expected
> integer simplifications in simplify-rtx are being applied as expected.
> This includes three tests of the new simplify_distributive_operation.
>
> Before:
> xgcc -xc -fself-test: 59693 pass(es) in 0.820956 seconds
> xgcc -xc++ -fself-test: 59723 pass(es) in 0.786662 seconds
> After:
> xgcc -xc -fself-test: 60003 pass(es) in 0.860637 seconds
> xgcc -xc++ -fself-test: 60033 pass(es) in 0.794624 seconds
>
>
> I do have one thought/suggestion around test_scalar_ops for future
> generations.  These tests are extremely strict; instead of an
> unexpected failure in the testsuite, breaking a self-test stops
> the build.  Instead of reverting this patch, should anything go
> wrong (in future on a misbehaving platform), might I instead
> propose simply commenting out the call to test_scalar_ops in
> simplify_rtx_c_tests as a mitigation strategy whilst the build is
> restored.  In fact, removing the "static" from test_scalar_ops
> would avoid the "defined but not used" complication from this
> disaster recovery plan.

Yeah, we can work around it rather than revert the patch.

> This patch has been tested with "make bootstrap" and "make -k check"
> on x86_64-pc-linux-gnu with no regressions.
>
>
> 2020-06-16  Roger Sayle  <roger@nextmovesoftware.com>
>             Richard Sandiford  <richard.sandiford@arm.com>

Thanks for the gesture, but I don't think I should be co-author here.
I didn't write anything :-)

> […]
> @@ -3064,6 +3112,21 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
>  	    }
>  	}
>  
> +      /* Convert (ior (and A C) (and B C)) into (and (ior A B) C).  */
> +      if (GET_CODE (op0) == GET_CODE (op1)
> +	  && (GET_CODE (op0) == AND
> +	      || GET_CODE (op0) == IOR
> +	      || GET_CODE (op0) == LSHIFTRT
> +	      || GET_CODE (op0) == ASHIFTRT
> +	      || GET_CODE (op0) == ASHIFT
> +	      || GET_CODE (op0) == ROTATE
> +	      || GET_CODE (op0) == ROTATERT))
> +	{
> +	  tem = simplify_distributive_operation (code, mode, op0, op1);
> +	  if (tem)
> +	    return tem;
> +	}
> +
>        tem = simplify_byte_swapping_operation (code, mode, op0, op1);
>        if (tem)
>  	return tem;
> @@ -3302,6 +3365,20 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
>  	  && (reversed = reversed_comparison (op0, int_mode)))
>  	return reversed;
>  
> +      /* Convert (xor (and A C) (and B C)) into (and (xor A B) C).  */
> +      if (GET_CODE (op0) == GET_CODE (op1)
> +	  && (GET_CODE (op0) == AND
> +	      || GET_CODE (op0) == XOR
> +	      || GET_CODE (op0) == LSHIFTRT
> +	      || GET_CODE (op0) == ASHIFT
> +	      || GET_CODE (op0) == ROTATE
> +	      || GET_CODE (op0) == ROTATERT))
> +	{
> +	  tem = simplify_distributive_operation (code, mode, op0, op1);
> +	  if (tem)
> +	    return tem;
> +	}
> +
>        tem = simplify_byte_swapping_operation (code, mode, op0, op1);
>        if (tem)
>  	return tem;
> @@ -3500,6 +3577,21 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
>  	  && rtx_equal_p (op1, XEXP (XEXP (op0, 1), 0)))
>  	return simplify_gen_binary (AND, mode, op1, XEXP (op0, 0));
>  
> +      /* Convert (and (ior A C) (ior B C)) into (ior (and A B) C).  */
> +      if (GET_CODE (op0) == GET_CODE (op1)
> +	  && (GET_CODE (op0) == AND
> +	      || GET_CODE (op0) == IOR
> +	      || GET_CODE (op0) == LSHIFTRT
> +	      || GET_CODE (op0) == ASHIFTRT
> +	      || GET_CODE (op0) == ASHIFT
> +	      || GET_CODE (op0) == ROTATE
> +	      || GET_CODE (op0) == ROTATERT))
> +	{
> +	  tem = simplify_distributive_operation (code, mode, op0, op1);
> +	  if (tem)
> +	    return tem;
> +	}
> +
>        tem = simplify_byte_swapping_operation (code, mode, op0, op1);
>        if (tem)
>  	return tem;

Any reason to exclude ASHIFTRT from the XOR case?  AFAICT it distributes
in the same way: for equal shifts, the ASHIFTRT reproduces the same
number of sign bits, which are handled by all logical ops in the same
way as the original sign bit.  Thus doing the logical OP first on the
original sign bit and then shifting will have the same effect.

LGTM otherwise, thanks.  If you can update the patch then I'll push it.

Richard

     prev parent reply	other threads:[~2020-06-22 19:41 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-16  7:48 Roger Sayle
2020-06-22 19:41 ` Richard Sandiford [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptftamzy56.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=roger@nextmovesoftware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).