public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <law@redhat.com>
To: "Maciej W. Rozycki" <macro@linux-mips.org>, gcc-patches@gcc.gnu.org
Cc: Anders Magnusson <ragge@tethuvudet.se>,
	Paul Koning <ni1d@arrl.net>, Matt Thomas <matt@3am-software.com>
Subject: Re: [PATCH 30/31] PR target/95294: VAX: Convert backend to MODE_CC representation
Date: Sat, 21 Nov 2020 20:27:02 -0700	[thread overview]
Message-ID: <ab50453e-b307-dcdd-b0e4-cd309c6f2d8e@redhat.com> (raw)
In-Reply-To: <alpine.LFD.2.21.2011200302400.656242@eddie.linux-mips.org>



On 11/19/20 8:36 PM, Maciej W. Rozycki wrote:
> In the VAX ISA INSV bitfield insert instruction is the only computational
> operation that keeps the condition codes, held in the PSL or Processor
> Status Longword register, intact.  The instruction is flexible enough it
> could potentially be used for data moves post-reload, but then reportedly
> it is not the best choice performance-wise, and then we have no addition
> operation available that would keep the condition codes unchanged.
>
> Futhermore, as usually with a complex CISC ISA, for many operations we
> have several machine instructions or instruction sequences to choose
> from that set condition codes in a different manner.
>
> Use the approach then where the condition codes only get introduced by
> reload, by definining instruction splitters for RTL insns that change
> condition codes in some way, by default considering them clobbered.
>
> Then to prevent code generated from regressing too much provide insns
> that include a `compare' operation setting the condition codes in
> parallel to the main operation.  The manner condition codes are set by
> each insn is supposed to be provided by the whatever the SELECT_CC_MODE
> macro expands to.
>
> Given that individual patterns provided for the same RTL basic operation
> may set the condion codes differently keeping the information away from
> the insn patterns themselves would cause a maintenance nightmare and
> would be bound to fail in a horrible way sooner or later.  Therefore
> instead let the patterns themselves choose which condition modes they
> support, by having one or more subst iterators applied and then have
> individual comparison operators require the specific condition mode each
> according to the codes used by the operation.
>
> While subst iterators only support one alternative each, there is
> actually no problem with applying multiple ones to a single insn with
> the result as intended, and if the corresponding subst attribute
> supplies an empty NO-SUBST-VALUE, then no mess results even.  Make use
> of this observation.
>
> Add appropriate subst iterators to all the computational patterns then,
> according to the condition codes they usably set, including DImode ones
> and a substitute DImode comparison instruction in the absence of a CMPQ
> machine instruction, however do not provide a `cbranchdi4' named pattern
> as without a further development it regresses code quality by resorting
> to the `__cmpdi2' libcall where a simpler operation would do, e.g. to
> check for negativity the TSTL machine instruction may be executed over
> the upper longword only.  This is good material for further work.
>
> Do not apply subst iterators to the increment- or decrement-and-branch
> patterns at this time; these may yet have to be reviewed, in particular
> whether `*jsobneq_minus_one' is still relevant in the context of the
> recent integer constant cost review.
>
> Also add a couple of peepholes to help eliminating comparisons in some
> problematic cases, such as with the BIT instruction which is bitwise-AND
> for condition codes only that has no direct counterpart for the actual
> calculation, because the BIC instruction which does do bitwise-AND and
> produces a result implements the operation with a bitwise negation of
> its input `mask' operand.  Or the FFS instruction which sets the Z
> condition code according to its `field' input operand rather than the
> result produced.  Or the bitfield comparisons we don't have generic
> middle-end support for.
>
> Code size stats are as follows, obtained from 17640 and 9086 executables
> built in `check-c' and `check-c++' GCC testing respectively:
>
>                       check-c                 check-c++
>               samples average  median  samples average  median
> ---------------------------------------------------------------
> regressions      1813  0.578%  0.198%      289  0.349%  0.175%
> unchanged       15160  0.000%  0.000%     8662  0.000%  0.000%
> progressions      667 -0.589% -0.194%      135 -0.944% -0.191%
> ----------------------------------------------------------------
> total           17640  0.037%  0.000%     9086 -0.003%  0.000%
>
> Outliers:
>
> old     new     change  %change filename
> ----------------------------------------------------
> 2406    2950    +544    +22.610 20111208-1.exe
> 4314    5329    +1015   +23.528 pr39417.exe
> 2235    3055    +820    +36.689 990404-1.exe
> 2631    4213    +1582   +60.129 pr57521.exe
> 3063    5579    +2516   +82.142 20000422-1.exe
>
> and:
>
> old     new     change  %change filename
> ----------------------------------------------------
> 6317    4845    -1472   -23.302 vector-compare-1.exe
> 6313    4845    -1468   -23.254 vector-compare-1.exe
> 6474    5002    -1472   -22.737 vector-compare-1.exe
> 6470    5002    -1468   -22.689 vector-compare-1.exe
>
> We have some code quality regressions like:
>
>     10861:	9e ef d9 12 	movab 11b40 <p>,r0
>     10865:	00 00 50
>     10868:	90 a0 03 a0 	movb 0x3(r0),0x2(r0)
>     1086c:	02
>     1086d:	d1 60 8f 61 	cmpl (r0),$0x64646261
>     10871:	62 64 64
>     10874:	13 07 		beql 1087d <main_test+0x21>
>
> to:
>
>     10861:	9e ef e1 12	movab 11b48 <p>,r0
>     10865:	00 00 50
>     10868:	90 a0 03 a0 	movb 0x3(r0),0x2(r0)
>     1086c:	02
>     1086d:	d1 ef d5 12 	cmpl 11b48 <p>,$0x64646261
>     10871:	00 00 8f 61
>     10875:	62 64 64
>     10878:	13 07 		beql 10881 <main_test+0x25>
>
> (from `memmove-2.x2') due to the constant propagation passes eagerly
> replacing pseudo registers with direct symbol references where possible,
> which does not happen with CC0 even though the passes do run regardless.
>
> There are further code quality regressions due to earlier compilation
> stages trying to push expression evaluation earlier where possible so
> as to make data dependencies further apart from each other.  This works
> well for computations and architectures that do not involve condition
> codes set as a side effect of calculations.  However for integer
> negation that makes assembly code produced like:
>
> 	movb *8(%ap),%r0
> 	mnegb %r0,%r1
> 	tstb %r0
> 	jeql .L2
>
> the RTL equibvalent of which the comparison elimination pass cannot
> really do anything about, because the comparison is made on the source
> rather than the target operand of the negation (we could add a peephole
> for this, but this seems futile an effort, as one'd have to iterate over
> all the possible such cases), even though this is really equivalent to:
>
> 	movb *8(%ap),%r0
> 	mnegb %r0,%r1
> 	jeql .L2
>
> or, if R0 is dead at the conclusion of the branch, even:
>
> 	mnegb *8(%ap),%r1
> 	jeql .L2
>
> Since the compiler insists on doing the comparison on the source of the
> negation it obviously has to load it into a temporary so as to avoid
> accessing the original memory location twice, hence the sequence of
> three instructions rather than just a single one.  A similar phenomenon
> can be observed with the XOR operation and in other cases.
>
> In some cases a comparison does get eliminated, however useless moves
> into registers done in preparation to it remain, such as with:
>
> 	movb *8(%ap),%r2
> 	movb *12(%ap),%r1
> 	subb3 %r1,%r2,%r0
> 	jlssu .L2
>
> where R1 and R2 are both dead at conclusion and therefore:
>
> 	subb3 *12(%ap),*8(%ap),%r0
> 	jlssu .L2
>
> would obviously do, but there was to be a comparison before the branch:
>
> 	cmpb %r2,%r1
>
> All this looks like material for future improvement.
>
> Test cases for comparison elimination and the peepholes will be supplied
> separately.
>
> 	gcc/
> 	PR target/95294
> 	* config/vax/elf.h (REGISTER_NAMES): Append `%psl'.
> 	* config/vax/vax-modes.def (CCN, CCNZ, CCZ): New modes.
> 	* config/vax/vax-protos.h (vax_select_cc_mode): New prototype.
> 	(vax_maybe_split_dimode_move): Likewise.
> 	(vax_notice_update_cc): Remove prototype.
> 	* config/vax/vax.c (TARGET_FLAGS_REGNUM): New macro.
> 	(TARGET_CC_MODES_COMPATIBLE): Likewise.
> 	(TARGET_MD_ASM_ADJUST): Likewise.
> 	(vax_select_cc_mode): New function
> 	(vax_cc_modes_compatible): Likewise.
> 	(vax_md_asm_adjust): Likewise.
> 	(vax_notice_update_cc): Remove function.
> 	(vax_output_int_move): Factor out code checking if a DImode move
> 	may have to be split...
> 	(vax_maybe_split_dimode_move): ... into this new function.
> 	* config/vax/vax.h (FIRST_PSEUDO_REGISTER): Bump up.
> 	(FIXED_REGISTERS): Append an entry for PSL.
> 	(CALL_USED_REGISTERS): Likewise.
> 	(NOTICE_UPDATE_CC, OUTPUT_JUMP): Remove macros.
> 	(SELECT_CC_MODE): New macro.
> 	(REGISTER_NAMES): Append `psl'.
> 	* config/vax/predicates.md (const_zero_operand)
> 	(vax_cc_comparison_operator, vax_ccn_comparison_operator)
> 	(vax_ccnz_comparison_operator, vax_ccz_comparison_operator):
> 	New predicates.
> 	* config/vax/builtins.md: Rewrite for MODE_CC representation.
> 	* config/vax/vax.md: Likewise.
So I just spot-checked this and the bits I looked at were quite sane. 
Given that and the fact that the port is on the chopping block if we
don't get the conversion done, I think it is fine to ACK this now and if
we want to iterate on things further we certainly can.

Jeff


  reply	other threads:[~2020-11-22  3:27 UTC|newest]

Thread overview: 117+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-20  3:38 [PATCH 00/31] VAX: Bring the port up to date (yes, MODE_CC conversion is included) Maciej W. Rozycki
2020-11-20  3:34 ` [PATCH 01/31] PR target/58901: reload: Handle SUBREG of MEM with a mode-dependent address Maciej W. Rozycki
2020-11-20 10:55   ` Eric Botcazou
2020-11-20 15:30     ` Maciej W. Rozycki
2020-11-24  6:19       ` [PATCH v2 " Maciej W. Rozycki
2020-11-24 11:03         ` Eric Botcazou
2020-11-26 17:22           ` Maciej W. Rozycki
2020-11-27  3:51             ` Maciej W. Rozycki
2020-11-27 10:52         ` Ulrich Weigand
2020-11-27 19:22           ` Maciej W. Rozycki
2020-11-27 20:47             ` Maciej W. Rozycki
2020-11-27 20:50               ` [PATCH v3 " Maciej W. Rozycki
2020-11-30 18:51                 ` Jeff Law
2020-11-29 17:31             ` [PATCH v2 " Jeff Law
2020-11-20  3:34 ` [PATCH 02/31] VAX: Remove `c' operand format specifier overload Maciej W. Rozycki
2020-11-20 23:16   ` Jeff Law
2020-11-24  1:12   ` Segher Boessenkool
2020-11-20  3:34 ` [PATCH 03/31] VAX: Define LEGITIMATE_PIC_OPERAND_P Maciej W. Rozycki
2020-11-21  3:17   ` Jeff Law
2020-11-20  3:34 ` [PATCH 04/31] VAX/testsuite: Run target testing over all the usual optimization levels Maciej W. Rozycki
2020-11-20 23:17   ` Jeff Law
2020-11-20  3:34 ` [PATCH 05/31] VAX: Rationalize expression and address costs Maciej W. Rozycki
2020-11-21  3:48   ` Jeff Law
2020-11-20  3:34 ` [PATCH 06/31] VAX: Correct fatal issues with the `ffs' builtin Maciej W. Rozycki
2020-11-20 23:19   ` Jeff Law
2020-11-20  3:34 ` [PATCH 07/31] RTL: Also support HOST_WIDE_INT with int iterators Maciej W. Rozycki
2020-11-21  4:19   ` Jeff Law
2020-11-20  3:34 ` [PATCH 08/31] jump: Also handle jumps wrapped in UNSPEC or UNSPEC_VOLATILE Maciej W. Rozycki
2020-11-21  4:25   ` Jeff Law
2020-12-03  3:50     ` [PATCH v2 " Maciej W. Rozycki
2020-12-03 22:20       ` Jeff Law
2020-11-20  3:34 ` [PATCH 09/31] VAX: Use a mode iterator to produce individual interlocked branches Maciej W. Rozycki
2020-11-20 23:20   ` Jeff Law
2020-11-20  3:34 ` [PATCH 10/31] VAX: Use an int " Maciej W. Rozycki
2020-11-20 23:20   ` Jeff Law
2020-11-20  3:35 ` [PATCH 11/31] VAX: Correct `sync_lock_test_and_set' and `sync_lock_release' builtins Maciej W. Rozycki
2020-11-21  4:26   ` Jeff Law
2020-11-20  3:35 ` [PATCH 12/31] VAX: Actually enable `builtins.md' now that it is fully functional Maciej W. Rozycki
2020-11-20 23:21   ` Jeff Law
2020-11-20  3:35 ` [PATCH 13/31] VAX: Add a test for the SImode `ffs' operation Maciej W. Rozycki
2020-11-20 23:22   ` Jeff Law
2020-11-20  3:35 ` [PATCH 14/31] VAX: Add tests for `sync_lock_test_and_set' and `sync_lock_release' Maciej W. Rozycki
2020-11-20 23:22   ` Jeff Law
2020-11-20  3:35 ` [PATCH 15/31] VAX: Provide the `ctz' operation Maciej W. Rozycki
2020-11-20 23:23   ` Jeff Law
2020-11-20  3:35 ` [PATCH 16/31] VAX: Also provide QImode and HImode `ctz' and `ffs' operations Maciej W. Rozycki
2020-11-20 23:24   ` Jeff Law
2020-11-20  3:35 ` [PATCH 17/31] VAX: Actually produce QImode and HImode `ctz' operations Maciej W. Rozycki
2020-11-20 23:24   ` Jeff Law
2020-11-20  3:35 ` [PATCH 18/31] VAX: Add a test for the `cpymemhi' instruction Maciej W. Rozycki
2020-11-20 23:25   ` Jeff Law
2020-11-20  3:35 ` [PATCH 19/31] VAX: Add the `movmemhi' instruction Maciej W. Rozycki
2020-11-20 23:25   ` Jeff Law
2020-11-20  3:35 ` [PATCH 20/31] VAX: Fix predicates and constraints for EXTV/EXTZV/INSV insns Maciej W. Rozycki
2020-11-21 17:01   ` Jeff Law
2020-11-20  3:35 ` [PATCH 21/31] VAX: Remove EXTV/EXTZV/INSV instruction use from aligned case insns Maciej W. Rozycki
2020-11-21 17:25   ` Jeff Law
2020-11-20  3:35 ` [PATCH 22/31] VAX: Ensure PIC mode address is adjustable with aligned bitfield insns Maciej W. Rozycki
2020-11-21 17:03   ` Jeff Law
2020-11-20  3:36 ` [PATCH 23/31] VAX: Make `extv' an expander matching the remaining bitfield operations Maciej W. Rozycki
2020-11-21 17:26   ` Jeff Law
2020-11-20  3:36 ` [PATCH 24/31] VAX: Fix predicates and constraints for bitfield comparison insns Maciej W. Rozycki
2020-11-21 17:27   ` Jeff Law
2020-11-20  3:36 ` [PATCH 25/31] VAX: Fix predicates for widening multiply and multiply-add insns Maciej W. Rozycki
2020-11-21  4:05   ` Jeff Law
2020-11-30 16:02     ` Maciej W. Rozycki
2020-11-30 18:29       ` Jeff Law
2020-11-20  3:36 ` [PATCH 26/31] VAX: Correct issues with commented-out insns Maciej W. Rozycki
2020-11-21  4:05   ` Jeff Law
2020-11-20  3:36 ` [PATCH 27/31] VAX: Make the `divmoddisi4' and `*amulsi4' comment notation consistent Maciej W. Rozycki
2020-11-21  4:06   ` Jeff Law
2020-11-24  1:37   ` Segher Boessenkool
2020-11-20  3:36 ` [PATCH 28/31] RTL: Add `const_double_zero' syntactic rtx Maciej W. Rozycki
2020-11-21 17:29   ` Jeff Law
2020-11-20  3:36 ` [PATCH 29/31] PDP11: Use `const_double_zero' to express double zero constant Maciej W. Rozycki
2020-11-21  4:07   ` Jeff Law
2020-12-15  8:26   ` Martin Liška
2020-12-15 14:06     ` Maciej W. Rozycki
2020-12-15 18:02       ` Paul Koning
2020-12-15 18:38         ` Maciej W. Rozycki
2020-11-20  3:36 ` [PATCH 30/31] PR target/95294: VAX: Convert backend to MODE_CC representation Maciej W. Rozycki
2020-11-22  3:27   ` Jeff Law [this message]
2020-12-09 16:09   ` Maciej W. Rozycki
2020-11-20  3:37 ` [PATCH 31/31] PR target/95294: VAX: Add test cases for " Maciej W. Rozycki
2020-11-21  4:08   ` Jeff Law
2020-12-05 18:40     ` Maciej W. Rozycki
2020-11-20  7:58 ` [PATCH 00/31] VAX: Bring the port up to date (yes, MODE_CC conversion is included) Anders Magnusson
2020-11-23 20:31   ` Maciej W. Rozycki
2020-11-21 21:02 ` Toon Moene
2020-11-23 21:51   ` Maciej W. Rozycki
2020-11-23 22:12     ` Thomas Koenig
2020-11-24  4:28       ` Maciej W. Rozycki
2020-11-24  5:27         ` Maciej W. Rozycki
2020-11-24  6:04           ` Maciej W. Rozycki
2020-11-24  6:16             ` Thomas Koenig
2020-11-25 19:22               ` Maciej W. Rozycki
2020-11-25 18:26             ` Maciej W. Rozycki
2020-11-25 22:20               ` Joseph Myers
2020-11-26 18:01                 ` Maciej W. Rozycki
2020-11-26 18:08                   ` Martin Husemann
2020-12-08 14:38                     ` Maciej W. Rozycki
2020-12-08 15:22                       ` Martin Husemann
2020-11-25 22:26           ` coypu
2020-11-26 17:59             ` Maciej W. Rozycki
2020-11-26 19:35               ` Maciej W. Rozycki
2020-11-23 15:48 ` Paul Koning
2020-11-25 17:07   ` Maciej W. Rozycki
2020-11-28 18:48     ` Paul Koning
2020-12-09 14:06       ` Maciej W. Rozycki
2020-12-10  1:33         ` Paul Koning
2020-12-11 14:54           ` Maciej W. Rozycki
2020-12-11 21:50             ` Paul Koning
2020-11-25 18:36 ` Maciej W. Rozycki
2020-11-26 14:46   ` Ian Lance Taylor
2020-11-26 18:07     ` Maciej W. Rozycki
2020-11-29 17:56   ` Martin Sebor
2020-12-07 14:25     ` Maciej W. Rozycki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ab50453e-b307-dcdd-b0e4-cd309c6f2d8e@redhat.com \
    --to=law@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=macro@linux-mips.org \
    --cc=matt@3am-software.com \
    --cc=ni1d@arrl.net \
    --cc=ragge@tethuvudet.se \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).