From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
Date: Mon, 21 Feb 2022 15:51:16 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #12 from CVS Commits ---
The master branch has been updated by Tom de Vries:

https://gcc.gnu.org/g:02aedc6f269b5e3c1f354edcf5b84d27b0a15946

commit r12-7312-g02aedc6f269b5e3c1f354edcf5b84d27b0a15946
Author: Tom de Vries
Date:   Wed Feb 16 17:09:11 2022 +0100

[nvptx] Initialize ptx regs

With nvptx target, driver version 510.47.03 and board GT 1030 I, we run into:
...
FAIL: gcc.c-torture/execute/pr53465.c   -O1  execution test
FAIL: gcc.c-torture/execute/pr53465.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr53465.c   -O3 -g  execution test
...
while the test-cases pass with nvptx-none-run -O0.

The problem is that the generated ptx contains a read from an uninitialized
ptx register, and the driver JIT doesn't handle this well.

For -O2 and -O3, we can get rid of the FAIL using
--param logical-op-non-short-circuit=0.  But not for -O1.

At -O1, the test-case minimizes to:
...
void __attribute__((noinline, noclone))
foo (int y)
{
  int c;
  for (int i = 0; i < y; i++)
    {
      int d = i + 1;
      if (i && d <= c)
        __builtin_abort ();
      c = d;
    }
}

int main ()
{
  foo (2);
  return 0;
}
...

Note that the test-case does not contain an uninitialized use.  In the first
iteration, i is 0 and consequently c is not read.  In the second iteration,
c is read, but by that time it's already initialized by 'c = d' from the
first iteration.

AFAICT the problem is introduced as follows: the conditional use of c in
the loop body is translated into an unconditional use of c in the loop header:
...
  # c_1 = PHI
...
into which forwprop1 propagates the 'c_9 = d_7' assignment:
...
  # c_1 = PHI
...
which ends up being translated by expand into an unconditional:
...
(insn 13 12 0 (set (reg/v:SI 22 [ c ])
        (reg/v:SI 23 [ d ])) -1
     (nil))
...
at the start of the loop body, creating an uninitialized read of d on the
path from loop entry.

By disabling coalesce_ssa_name, we get the more usual copies on the incoming
edges.  The copy on the loop entry path still does an uninitialized read,
but that one's now initialized by init-regs.  The test-case passes, also
when disabling init-regs, so it's possible that the JIT driver doesn't
object to this type of uninitialized read.

Now that we have characterized the problem to some degree, we need to fix
this, because either:
- we're violating an undocumented ptx invariant, and this is a compiler bug,
  or
- this is a driver JIT bug and we need to work around it.

There are essentially two strategies to address this:
- stop the compiler from creating uninitialized reads
- patch up uninitialized reads using additional initialization

The former will probably involve:
- making some optimizations more conservative in the presence of
  uninitialized reads, and
- disabling some other optimizations (where making them more conservative is
  not possible, or cannot easily be achieved).
This will probably have a cost penalty for code that does not suffer from
the original problem.

The latter has the problem that it may paper over uninitialized reads
in the source code, or indeed over ones that were incorrectly introduced
by the compiler.  But it has the advantage that it allows for the problem to
be addressed at a single location.

There's an existing pass, init-regs, which implements a form of the latter,
but it doesn't work for this example because it only inserts additional
initialization for uses that have no reaching definition at all.

Fix this by adding initialization of uninitialized ptx regs in reorg.

Control the new functionality using -minit-regs=<0|1|2|3>, meaning:
- 0: disabled.
- 1: add initialization of all regs at the entry bb
- 2: add initialization of uninitialized regs at the entry bb
- 3: add initialization of uninitialized regs close to the use
and defaulting to 3.

Tested on nvptx.

gcc/ChangeLog:

2022-02-17  Tom de Vries

	PR target/104440
	* config/nvptx/nvptx.cc (workaround_uninit_method_1)
	(workaround_uninit_method_2, workaround_uninit_method_3)
	(workaround_uninit): New function.
	(nvptx_reorg): Use workaround_uninit.
	* config/nvptx/nvptx.opt (minit-regs): New option.