From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id EA326388A435; Sat, 5 Dec 2020 18:30:22 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org EA326388A435 From: "cvs-commit at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug target/95294] [vax] Convert the backend to MODE_CC so it can be kept in future releases Date: Sat, 05 Dec 2020 18:30:22 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: target X-Bugzilla-Version: 10.0 X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: ASSIGNED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: macro@linux-mips.org X-Bugzilla-Target-Milestone: 11.0 X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 05 Dec 2020 18:30:23 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D95294 --- Comment #2 from CVS Commits --- The master branch has been updated by Maciej W. Rozycki : https://gcc.gnu.org/g:e552abe2ba27ccec0d7c0d050b4617f2f636dd63 commit r11-5796-ge552abe2ba27ccec0d7c0d050b4617f2f636dd63 Author: Maciej W. Rozycki Date: Sat Dec 5 18:26:27 2020 +0000 PR target/95294: VAX: Convert backend to MODE_CC representation In the VAX ISA INSV bit-field insert instruction is the only computatio= nal operation that keeps the condition codes, held in the PSL or Processor Status Longword register, intact. The instruction is flexible enough it could potentially be used for data moves post-reload, but then reported= ly it is not the best choice performance-wise, and then we have no addition operation available that would keep the condition codes unchanged. Futhermore, as usually with a complex CISC ISA, for many operations we have several machine instructions or instruction sequences to choose from that set condition codes in a different manner. Use the approach then where the condition codes only get introduced by reload, by definining instruction splitters for RTL insns that change condition codes in some way, by default considering them clobbered. Then to prevent code generated from regressing too much provide insns that include a `compare' operation setting the condition codes in parallel to the main operation. The manner condition codes are set by each insn is supposed to be provided by the whatever the SELECT_CC_MODE macro expands to. Given that individual patterns provided for the same RTL basic operation may set the condion codes differently keeping the information away from the insn patterns themselves would cause a maintenance nightmare and would be bound to fail in a horrible way sooner or later. Therefore instead let the patterns themselves choose which condition modes they support, by having one or more subst iterators applied and then have individual comparison operators require the specific condition mode each according to the codes used by the operation. While subst iterators only support one alternative each, there is actually no problem with applying multiple ones to a single insn with the result as intended, and if the corresponding subst attribute supplies an empty NO-SUBST-VALUE, then no mess results even. Make use of this observation. Add appropriate subst iterators to all the computational patterns then, according to the condition codes they usably set, including DImode ones and a substitute DImode comparison instruction in the absence of a CMPQ machine instruction, however do not provide a `cbranchdi4' named pattern as without a further development it regresses code quality by resorting to the `__cmpdi2' libcall where a simpler operation would do, e.g. to check for negativity the TSTL machine instruction may be executed over the upper longword only. This is good material for further work. Do not apply subst iterators to the increment- or decrement-and-branch patterns at this time; these may yet have to be reviewed, in particular whether `*jsobneq_minus_one' is still relevant in the context of the recent integer constant cost review. Also add a couple of peepholes to help eliminating comparisons in some problematic cases, such as with the BIT instruction which is bitwise-AND for condition codes only that has no direct counterpart for the actual calculation, because the BIC instruction which does do bitwise-AND and produces a result implements the operation with a bitwise negation of its input `mask' operand. Or the FFS instruction which sets the Z condition code according to its `field' input operand rather than the result produced. Or the bit-field comparisons we don't have generic middle-end support for. Code size stats are as follows, obtained from 17640 and 9086 executables built in `check-c' and `check-c++' GCC testing respectively: check-c check-c++ samples average median samples average median --------------------------------------------------------------- regressions 1813 0.578% 0.198% 289 0.349% 0.175% unchanged 15160 0.000% 0.000% 8662 0.000% 0.000% progressions 667 -0.589% -0.194% 135 -0.944% -0.191% ---------------------------------------------------------------- total 17640 0.037% 0.000% 9086 -0.003% 0.000% Outliers: old new change %change filename ---------------------------------------------------- 2406 2950 +544 +22.610 20111208-1.exe 4314 5329 +1015 +23.528 pr39417.exe 2235 3055 +820 +36.689 990404-1.exe 2631 4213 +1582 +60.129 pr57521.exe 3063 5579 +2516 +82.142 20000422-1.exe and: old new change %change filename ---------------------------------------------------- 6317 4845 -1472 -23.302 vector-compare-1.exe 6313 4845 -1468 -23.254 vector-compare-1.exe 6474 5002 -1472 -22.737 vector-compare-1.exe 6470 5002 -1468 -22.689 vector-compare-1.exe We have some code quality regressions like: 10861: 9e ef d9 12 movab 11b40

,r0 10865: 00 00 50 10868: 90 a0 03 a0 movb 0x3(r0),0x2(r0) 1086c: 02 1086d: d1 60 8f 61 cmpl (r0),$0x64646261 10871: 62 64 64 10874: 13 07 beql 1087d to: 10861: 9e ef e1 12 movab 11b48

,r0 10865: 00 00 50 10868: 90 a0 03 a0 movb 0x3(r0),0x2(r0) 1086c: 02 1086d: d1 ef d5 12 cmpl 11b48

,$0x64646261 10871: 00 00 8f 61 10875: 62 64 64 10878: 13 07 beql 10881 (from `memmove-2.x2') due to the constant propagation passes eagerly replacing pseudo registers with direct symbol references where possible, which does not happen with CC0 even though the passes do run regardless. There are further code quality regressions due to earlier compilation stages trying to push expression evaluation earlier where possible so as to make data dependencies further apart from each other. This works well for computations and architectures that do not involve condition codes set as a side effect of calculations. However for integer negation that makes assembly code produced like: movb *8(%ap),%r0 mnegb %r0,%r1 tstb %r0 jeql .L2 the RTL equibvalent of which the comparison elimination pass cannot really do anything about, because the comparison is made on the source rather than the target operand of the negation (we could add a peephole for this, but this seems futile an effort, as one'd have to iterate over all the possible such cases), even though this is really equivalent to: movb *8(%ap),%r0 mnegb %r0,%r1 jeql .L2 or, if R0 is dead at the conclusion of the branch, even: mnegb *8(%ap),%r1 jeql .L2 Since the compiler insists on doing the comparison on the source of the negation it obviously has to load it into a temporary so as to avoid accessing the original memory location twice, hence the sequence of three instructions rather than just a single one. A similar phenomenon can be observed with the XOR operation and in other cases. In some cases a comparison does get eliminated, however useless moves into registers done in preparation to it remain, such as with: movb *8(%ap),%r2 movb *12(%ap),%r1 subb3 %r1,%r2,%r0 jlssu .L2 where R1 and R2 are both dead at conclusion and therefore: subb3 *12(%ap),*8(%ap),%r0 jlssu .L2 would obviously do, but there was to be a comparison before the branch: cmpb %r2,%r1 All this looks like material for future improvement. Test cases for comparison elimination and the peepholes will be supplied separately. gcc/ PR target/95294 * config/vax/elf.h (REGISTER_NAMES): Append `%psl'. * config/vax/vax-modes.def (CCN, CCNZ, CCZ): New modes. * config/vax/vax-protos.h (vax_select_cc_mode): New prototype. (vax_maybe_split_dimode_move): Likewise. (vax_notice_update_cc): Remove prototype. * config/vax/vax.c (TARGET_FLAGS_REGNUM): New macro. (TARGET_CC_MODES_COMPATIBLE): Likewise. (TARGET_MD_ASM_ADJUST): Likewise. (vax_select_cc_mode): New function (vax_cc_modes_compatible): Likewise. (vax_md_asm_adjust): Likewise. (vax_notice_update_cc): Remove function. (vax_output_int_move): Factor out code checking if a DImode move may have to be split... (vax_maybe_split_dimode_move): ... into this new function. * config/vax/vax.h (FIRST_PSEUDO_REGISTER): Bump up. (FIXED_REGISTERS): Append an entry for PSL. (CALL_USED_REGISTERS): Likewise. (NOTICE_UPDATE_CC, OUTPUT_JUMP): Remove macros. (SELECT_CC_MODE): New macro. (REGISTER_NAMES): Append `psl'. * config/vax/predicates.md (const_zero_operand) (vax_cc_comparison_operator, vax_ccn_comparison_operator) (vax_ccnz_comparison_operator, vax_ccz_comparison_operator): New predicates. * config/vax/builtins.md: Rewrite for MODE_CC representation. * config/vax/vax.md: Likewise.=