From: Thomas Schwinge
To: Richard Biener
CC: Jan Hubicka, "Tom de Vries"
Subject: Re: Fix wrong code in gnatmake
Date: Tue, 12 Apr 2022 17:07:38 +0200
Message-ID: <87h76yfih1.fsf@euler.schwinge.homeip.net>
In-Reply-To: <6r6qr236-163-9qo4-8086-2oq46737p9@fhfr.qr>
References: <1o79r0qs-6323-5o1q-494s-q0s41168rp4p@fhfr.qr> <87wnfu8mzo.fsf@euler.schwinge.homeip.net> <6r6qr236-163-9qo4-8086-2oq46737p9@fhfr.qr>
List-Id: Gcc-patches mailing list

Hi!

On 2022-04-12T15:45:03+0200, Richard Biener wrote:
> On Tue, 12 Apr 2022, Thomas Schwinge wrote:
>> On 2022-04-07T15:04:15+0200, Richard Biener via Gcc-patches wrote:
>> > On Thu, 7 Apr 2022, Jan Hubicka wrote:
>> >> > On Thu, 7 Apr 2022, Jan Hubicka wrote:
>> >> > > this patch fixes miscompilation of gnatmake. Modref attempts to track memory
>> >> > > accesses relative to the base pointers which are parameters of functions.
>> >> > > If it fails, it still makes difference between unknown memory access and
>> >> > > global memory access. The second makes it possible to disambiguate with
>> >> > > memory that is not accessible from outside world (i.e. everything that does
>> >> > > not escape from the caller function). This is useful so we do not punt
>> >> > > when unknown function is called.
>> >> > >
>> >> > > Now I added ref_may_access_global_memory_p to tree-ssa-alias whic is using
>> >> > > ptr_deref_may_alias_global_p. There is however a shift in meaning of this
>> >> > > predicate: the second tests that the dereference may alias with global variable.
>> >> > >
>> >> > > In the testcase we are disambiguating heap allocated escaping memory which is
>> >> > > not a global variable but it is still a global memory in the modref's sense.
>> >> > > So we need to test in addition contains_escaped.
>> >> > >
>> >> > > The patch simply copies logic from the predicate and adds the check.
>> >> > > I am not sure if there is better way to handle this?
>> >> >
>> >> > I'm testing the following variant which exposes this detail
>> >> > (escaped local memory global or not) in the APIs that say "global"
>> >> > which allows to remove ref_may_access_global_memory_p.
>> >>
>> >> Thank you. Indeed it is better to have an explicit flag, since the
>> >> clash of names is bit sensitive.
>> >
>> > OK - bootstrapped / tested on x86_64-unknown-linux-gnu including Ada
>> > and now pushed.
>>
>> This commit r12-8048-g8c0ebaf9f586100920a3c0849fb10e9985d7ae58
>> "ipa/104303 - miscompilation of gnatmake" is causing one regression in
>> nvptx offload testing:
>>
>>     [...]
>>     [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/private-variables.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O1 at line 142 (test for bogus messages, line 131)
>>     [...]
>>
>> I've done a before/after 'diff' of
>> '-fdump-tree-all -foffload-options=nvptx-none=-fdump-tree-all'
>> with all functions and calls other than 't4' commented out.
>
> I suppose the
>
> diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc
> index 2a13ea34829..34ce8abe33a 100644
> --- a/gcc/tree-ssa-dce.cc
> +++ b/gcc/tree-ssa-dce.cc
> @@ -315,7 +315,7 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
>      }
>
>    if ((gimple_vdef (stmt) && keep_all_vdefs_p ())
> -      || stmt_may_clobber_global_p (stmt))
> +      || stmt_may_clobber_global_p (stmt, true))
>      {
>        mark_stmt_necessary (stmt, true);
>        return;
>
> was overly conservative (I was probably misled by keep_all_vdefs_p ()),
> passing false to stmt_may_clobber_global_p should fix the regression?

It does, thanks. Per more 'diff'ing, this change again enables the '-O1'
host compilation 'a-private-variables.f90.117t.dce2' to clean this up,
and likewise for nvptx offload compilation
'a.xnvptx-none.mkoffload.117t.dce2', thus the extra diagnostic again
disappears.
In fact, for all optimization flag variants that I've tried, we again get
exactly the same dumps as before commit
r12-8048-g8c0ebaf9f586100920a3c0849fb10e9985d7ae58
"ipa/104303 - miscompilation of gnatmake"!

So I'll run that through standard bootstrap/regression testing, and push?
Got a suggestion for rationale to put into the commit log?


Grüße
 Thomas


>> For '-O0', there's no difference at all.
>>
>> For '-O1', for host compilation we see:
>>
>>     diff -ru 0-O1/a-private-variables.f90.117t.dce2 ./a-private-variables.f90.117t.dce2
>>     --- 0-O1/a-private-variables.f90.117t.dce2  2022-04-12 08:36:54.525302868 +0200
>>     +++ ./a-private-variables.f90.117t.dce2  2022-04-12 12:51:43.726304109 +0200
>>     @@ -30,9 +30,13 @@
>>
>>        [local count: 87490071]:
>>        # .offset.15_2 = PHI <0(2), .offset.15_63(5)>
>>     +  pt.x = .offset.15_2;
>>        _25 = .offset.15_2 * 2;
>>     +  pt.y = _25;
>>        _27 = .offset.15_2 * 4;
>>     +  pt.z = _27;
>>        _29 = .offset.15_2 * 6;
>>     +  pt.attr[4] = _29;
>>
>>        [local count: 1073741824]:
>>        # .offset.10_4 = PHI <0(3), .offset.10_56(4)>
>>     diff -ru 0-O1/a-private-variables.f90.118t.stdarg ./a-private-variables.f90.118t.stdarg
>>     --- 0-O1/a-private-variables.f90.118t.stdarg  2022-04-12 08:36:54.525302868 +0200
>>     +++ ./a-private-variables.f90.118t.stdarg  2022-04-12 12:51:43.726304109 +0200
>>     @@ -4,6 +4,7 @@
>>      __attribute__((oacc function (1, 1, 1), oacc parallel, omp target entrypoint))
>>      void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>      {
>>     +  struct vec3 pt;
>>        integer(kind=4) .offset.15_2;
>>        integer(kind=4) .offset.10_4;
>>        integer(kind=4) _25;
>>     @@ -25,9 +26,13 @@
>>
>>        [local count: 87490071]:
>>        # .offset.15_2 = PHI <0(2), .offset.15_63(5)>
>>     +  pt.x = .offset.15_2;
>>        _25 = .offset.15_2 * 2;
>>     +  pt.y = _25;
>>        _27 = .offset.15_2 * 4;
>>     +  pt.z = _27;
>>        _29 = .offset.15_2 * 6;
>>     +  pt.attr[4] = _29;
>>
>>        [local count: 1073741824]:
>>        # .offset.10_4 = PHI <0(3), .offset.10_56(4)>
>>     [Similar for following passes/dumps.]
>>     diff -ru 0-O1/a-private-variables.f90.141t.lim2 ./a-private-variables.f90.141t.lim2
>>     --- 0-O1/a-private-variables.f90.141t.lim2  2022-04-12 08:36:54.525302868 +0200
>>     +++ ./a-private-variables.f90.141t.lim2  2022-04-12 12:51:43.730304125 +0200
>>     @@ -24,11 +24,42 @@
>>      ;; 5 succs { 7 6 }
>>      ;; 7 succs { 3 }
>>      ;; 6 succs { 1 }
>>     +
>>     +Symbols to be put in SSA form
>>     +{ D.4340 D.4356 D.4357 D.4358 D.4359 D.4360 D.4361 D.4362 D.4363 }
>>     +Incremental SSA update started at block: 0
>>     +Number of blocks in CFG: 9
>>     +Number of blocks to update: 8 ( 89%)
>>     +
>>     +
>>     +
>>     +SSA replacement table
>>     +N_i -> { O_1 ... O_j } means that N_i replaces O_1, ..., O_j
>>     +
>>     +pt_x_lsm.22_1 -> { pt_x_lsm.22_72 }
>>     +pt_z_lsm.24_20 -> { pt_z_lsm.24_9 }
>>     +pt_attr_I_lsm.25_21 -> { pt_attr_I_lsm.25_10 }
>>     +pt_y_lsm.23_22 -> { pt_y_lsm.23_73 }
>>     +Incremental SSA update started at block: 3
>>     +Number of blocks in CFG: 9
>>     +Number of blocks to update: 3 ( 33%)
>>     +
>>     +
>>      __attribute__((oacc function (1, 1, 1), oacc parallel, omp target entrypoint))
>>      void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>      {
>>     +  integer(kind=4) D.4363;
>>     +  integer(kind=4) pt_attr_I_lsm.25;
>>     +  integer(kind=4) D.4361;
>>     +  integer(kind=4) pt_z_lsm.24;
>>     +  integer(kind=4) D.4359;
>>     +  integer(kind=4) pt_y_lsm.23;
>>     +  integer(kind=4) D.4357;
>>     +  integer(kind=4) pt_x_lsm.22;
>>     +  struct vec3 pt;
>>        integer(kind=4) .offset.15_2;
>>        integer(kind=4) .offset.10_4;
>>     +  integer(kind=4) _7(D);
>>        integer(kind=4) _25;
>>        integer(kind=4) _27;
>>        integer(kind=4) _29;
>>     @@ -37,21 +68,32 @@
>>        integer(kind=8) _43;
>>        integer(kind=4)[1025] * _45;
>>        integer(kind=4) _46;
>>     +  integer(kind=4) _47(D);
>>        integer(kind=4) _48;
>>        integer(kind=4) _50;
>>     +  integer(kind=4) _51(D);
>>        integer(kind=4) _52;
>>        integer(kind=4) _54;
>>        integer(kind=4) .offset.10_56;
>>        integer(kind=4) .offset.15_63;
>>     +  integer(kind=4) _70(D);
>>
>>        [local count: 7128820]:
>>     +  pt_x_lsm.22_8 = _7(D);
>>     +  pt_y_lsm.23_49 = _47(D);
>>     +  pt_z_lsm.24_53 = _51(D);
>>     +  pt_attr_I_lsm.25_71 = _70(D);
>>        _45 = *.omp_data_i_44(D).arr;
>>
>>        [local count: 87490071]:
>>        # .offset.15_2 = PHI <0(2), .offset.15_63(7)>
>>     +  pt_x_lsm.22_72 = .offset.15_2;
>>        _25 = .offset.15_2 * 2;
>>     +  pt_y_lsm.23_73 = _25;
>>        _27 = .offset.15_2 * 4;
>>     +  pt_z_lsm.24_9 = _27;
>>        _29 = .offset.15_2 * 6;
>>     +  pt_attr_I_lsm.25_10 = _29;
>>        _41 = .offset.15_2 * 32;
>>
>>        [local count: 1073741824]:
>>     @@ -84,6 +126,14 @@
>>          goto ; [100.00%]
>>
>>        [local count: 35644102]:
>>     +  # pt_z_lsm.24_20 = PHI
>>     +  # pt_attr_I_lsm.25_21 = PHI
>>     +  # pt_x_lsm.22_1 = PHI
>>     +  # pt_y_lsm.23_22 = PHI
>>     +  pt.attr[4] = pt_attr_I_lsm.25_21;
>>     +  pt.z = pt_z_lsm.24_20;
>>     +  pt.y = pt_y_lsm.23_22;
>>     +  pt.x = pt_x_lsm.22_1;
>>        return;
>>
>>      }
>>     [Similar for following passes/dumps.]
>>     diff -ru 0-O1/a-private-variables.f90.148t.dse3 ./a-private-variables.f90.148t.dse3
>>     --- 0-O1/a-private-variables.f90.148t.dse3  2022-04-12 08:36:54.525302868 +0200
>>     +++ ./a-private-variables.f90.148t.dse3  2022-04-12 12:51:43.730304125 +0200
>>     @@ -1,11 +1,24 @@
>>
>>      ;; Function t4_._omp_fn.0 (t4_._omp_fn.0, funcdef_no=3, decl_uid=4278, cgraph_uid=4, symbol_order=3)
>>
>>     +Removing basic block 7
>>     +Removing basic block 8
>>     +Removing basic block 9
>>      __attribute__((oacc function (1, 1, 1), oacc parallel, omp target entrypoint))
>>      void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>      {
>>     +  integer(kind=4) D.4363;
>>     +  integer(kind=4) pt_attr_I_lsm.25;
>>     +  integer(kind=4) D.4361;
>>     +  integer(kind=4) pt_z_lsm.24;
>>     +  integer(kind=4) D.4359;
>>     +  integer(kind=4) pt_y_lsm.23;
>>     +  integer(kind=4) D.4357;
>>     +  integer(kind=4) pt_x_lsm.22;
>>     +  struct vec3 pt;
>>        integer(kind=4) .offset.15_2;
>>        integer(kind=4) .offset.10_4;
>>     +  integer(kind=4) _7(D);
>>        integer(kind=4) _25;
>>        integer(kind=4) _27;
>>        integer(kind=4) _29;
>>     @@ -14,25 +27,28 @@
>>        integer(kind=8) _43;
>>        integer(kind=4)[1025] * _45;
>>        integer(kind=4) _46;
>>     +  integer(kind=4) _47(D);
>>        integer(kind=4) _48;
>>        integer(kind=4) _50;
>>     +  integer(kind=4) _51(D);
>>        integer(kind=4) _52;
>>        integer(kind=4) _54;
>>        integer(kind=4) .offset.10_56;
>>        integer(kind=4) .offset.15_63;
>>     +  integer(kind=4) _70(D);
>>
>>        [local count: 7128820]:
>>        _45 = *.omp_data_i_44(D).arr;
>>
>>        [local count: 87490071]:
>>     -  # .offset.15_2 = PHI <0(2), .offset.15_63(7)>
>>     +  # .offset.15_2 = PHI <0(2), .offset.15_63(5)>
>>        _25 = .offset.15_2 * 2;
>>        _27 = .offset.15_2 * 4;
>>        _29 = .offset.15_2 * 6;
>>        _41 = .offset.15_2 * 32;
>>
>>        [local count: 1073741824]:
>>     -  # .offset.10_4 = PHI <0(3), .offset.10_56(8)>
>>     +  # .offset.10_4 = PHI <0(3), .offset.10_56(4)>
>>        _42 = .offset.10_4 + _41;
>>        _43 = (integer(kind=8)) _42;
>>        _46 = (*_45)[_43];
>>     @@ -43,23 +59,17 @@
>>        (*_45)[_43] = _54;
>>        .offset.10_56 = .offset.10_4 + 1;
>>        if (.offset.10_56 <= 31)
>>     -    goto ; [89.00%]
>>     +    goto ; [89.00%]
>>        else
>>          goto ; [11.00%]
>>
>>     -  [local count: 955630224]:
>>     -  goto ; [100.00%]
>>     -
>>        [local count: 437450365]:
>>        .offset.15_63 = .offset.15_2 + 1;
>>        if (.offset.15_63 <= 31)
>>     -    goto ; [89.00%]
>>     +    goto ; [89.00%]
>>        else
>>          goto ; [11.00%]
>>
>>     -  [local count: 389330825]:
>>     -  goto ; [100.00%]
>>     -
>>        [local count: 35644102]:
>>        return;
>>
>>     [Similar for following passes/dumps.]
>>
>> ..., so in 'a-private-variables.f90.148t.dse3', the 'pt.{x,y,z,attr}'
>> assignments for the new 'struct vec3 pt;' get cleaned out, so that should
>> all be fine; no actual changes in the end.
>>
>> Comparing '-O1' nvptx offload target compilation before/after, the first
>> difference is in 'a.xnvptx-none.mkoffload.117t.dce2': similar to host
>> compilation. But then, in the following things do not get cleaned up as
>> they do for the host compilation; the 'pt.{x,y,z,attr}' assignments for
>> the new 'struct vec3 pt;' persist:
>>
>>     diff -ru 0-O1/a.xnvptx-none.mkoffload.252t.optimized ./a.xnvptx-none.mkoffload.252t.optimized
>>     --- 0-O1/a.xnvptx-none.mkoffload.252t.optimized  2022-04-12 08:36:54.569303204 +0200
>>     +++ ./a.xnvptx-none.mkoffload.252t.optimized  2022-04-12 12:51:43.774304292 +0200
>>     @@ -7,34 +7,36 @@
>>      __attribute__((oacc function (32, 8, 32), oacc parallel, omp target entrypoint))
>>      void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>      {
>>     -  unsigned int ivtmp$6;
>>     +  unsigned int ivtmp$8;
>>        unsigned int ivtmp$5;
>>     +  unsigned int ivtmp$3;
>>        int D.1527;
>>        int D.1524;
>>     +  struct vec3 pt;
>>        int _2;
>>     +  int[1025] * _4;
>>     +  int _22;
>>        int _23;
>>        int _25;
>>        int _27;
>>        int _29;
>>        int _34;
>>     -  int[1025] * _43;
>>     -  sizetype _45;
>>     -  sizetype _46;
>>     -  int[1025] * _47;
>>     -  unsigned int _48;
>>     +  sizetype _41;
>>     +  unsigned int _42;
>>     +  unsigned int _43;
>>     +  unsigned int _45;
>>     +  int _46;
>>        int _49;
>>     -  unsigned int _50;
>>        int _51;
>>     -  int _52;
>>     -  int _53;
>>     -  int _63;
>>     -  int _82;
>>     -  int _83;
>>     -  int _87;
>>     +  int _54;
>>     +  sizetype _63;
>>        int _95;
>>     +  int _96;
>>     +  int _99;
>>        int _102;
>>     -  int _103;
>>     +  int[1025] * _103;
>>        int _104;
>>     +  int _105;
>>        int _107;
>>
>>        [local count: 7128820]:
>>     @@ -47,43 +49,50 @@
>>          goto ; [73.00%]
>>
>>        [local count: 1924781]:
>>     -  _87 = _104 * 32;
>>     -  ivtmp$5_80 = (unsigned int) _87;
>>     -  _52 = _104 * 2;
>>     -  ivtmp$6_54 = (unsigned int) _52;
>>     +  ivtmp$3_61 = (unsigned int) _104;
>>     +  _54 = _104 * 2;
>>     +  ivtmp$5_56 = (unsigned int) _54;
>>     +  _46 = _104 * 32;
>>     +  ivtmp$8_48 = (unsigned int) _46;
>>     +  _42 = (unsigned int) _23;
>>
>>        [local count: 87490071]:
>>     -  # _2 = PHI <_104(3), _63(6)>
>>     -  # ivtmp$5_22 = PHI
>>     -  # ivtmp$6_72 = PHI
>>     -  _25 = (int) ivtmp$6_72;
>>     -  _50 = ivtmp$6_72 * 2;
>>     -  _27 = (int) _50;
>>     -  _48 = ivtmp$6_72 * 3;
>>     -  _29 = (int) _48;
>>     +  # ivtmp$3_106 = PHI
>>     +  # ivtmp$5_87 = PHI
>>     +  # ivtmp$8_52 = PHI
>>     +  _2 = (int) ivtmp$3_106;
>>     +  pt.x = _2;
>>     +  _25 = (int) ivtmp$5_87;
>>     +  pt.y = _25;
>>     +  _45 = ivtmp$5_87 * 2;
>>     +  _27 = (int) _45;
>>     +  pt.z = _27;
>>     +  _43 = ivtmp$5_87 * 3;
>>     +  _29 = (int) _43;
>>     +  pt.attr[4] = _29;
>>        _34 = .UNIQUE (OACC_FORK, 0, 2);
>>
>>        [local count: 437450365]:
>>        _107 = .GOACC_DIM_POS (2);
>>     -  _82 = (int) ivtmp$5_22;
>>     -  _83 = _82 + _107;
>>     -  _47 = *.omp_data_i_44(D).arr;
>>     -  _46 = (sizetype) _83;
>>     -  _45 = _46 * 4;
>>     -  _43 = _47 + _45;
>>     -  _49 = MEM [(int[1025] *)_43];
>>     -  _51 = _2 + _49;
>>     -  _53 = _25 + _51;
>>     -  _103 = _27 + _53;
>>     -  _95 = _29 + _103;
>>     -  MEM [(int[1025] *)_43] = _95;
>>     +  _49 = (int) ivtmp$8_52;
>>     +  _51 = _49 + _107;
>>     +  _103 = *.omp_data_i_44(D).arr;
>>     +  _63 = (sizetype) _51;
>>     +  _41 = _63 * 4;
>>     +  _4 = _103 + _41;
>>     +  _95 = MEM [(int[1025] *)_4];
>>     +  _96 = _2 + _95;
>>     +  _22 = _25 + _96;
>>     +  _105 = _22 + _27;
>>     +  _99 = _29 + _105;
>>     +  MEM [(int[1025] *)_4] = _99;
>>        .UNIQUE (OACC_JOIN, _34, 2);
>>
>>        [local count: 87490071]:
>>     -  _63 = _2 + 1;
>>     -  ivtmp$5_81 = ivtmp$5_22 + 32;
>>     -  ivtmp$6_56 = ivtmp$6_72 + 2;
>>     -  if (_23 != _63)
>>     +  ivtmp$3_47 = ivtmp$3_106 + 1;
>>     +  ivtmp$5_72 = ivtmp$5_87 + 2;
>>     +  ivtmp$8_50 = ivtmp$8_52 + 32;
>>     +  if (_42 != ivtmp$3_47)
>>          goto ; [89.00%]
>>        else
>>          goto ; [11.00%]
>>
>> ..., and thus the change in diagnostics:
>>
>>     [...]
>>      source-gcc/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90:131:63: note: variable ‘pt’ ought to be adjusted for OpenACC privatization level: ‘gang’
>>      source-gcc/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90:131:63: note: variable ‘pt’ ought to be adjusted for OpenACC privatization level: ‘gang’
>>     +source-gcc/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90: In function ‘t4_._omp_fn.0’:
>>     +source-gcc/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90:131:63: note: variable ‘pt’ adjusted for OpenACC privatization level: ‘gang’
>>     +  131 | !$acc loop gang private(pt) ! { dg-line l_loop[incr c_loop] }
>>     +      |                                                               ^
>>
>> For '-O2', host compilation begins same as '-O1', and again in
>> 'a-private-variables.f90.148t.dse3', the 'pt.{x,y,z,attr}' assignments
>> for the new 'struct vec3 pt;' get cleaned out:
>>
>>     --- ./a-private-variables.f90.144t.sink1  2022-04-12 14:28:19.173520425 +0200
>>     +++ ./a-private-variables.f90.148t.dse3  2022-04-12 14:28:19.173520425 +0200
>>     [...]
>>      __attribute__((oacc function (1, 1, 1), oacc parallel, omp target entrypoint))
>>      void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>      {
>>     @@ -97,10 +74,6 @@
>>          goto ; [100.00%]
>>
>>        [local count: 35644102]:
>>     -  pt.attr[4] = _29;
>>     -  pt.z = _27;
>>     -  pt.y = _25;
>>     -  pt.x = .offset.15_2;
>>        return;
>>
>>      }
>>     [...]
>>
>> For '-O2', nvptx offload target compilation looks very similar to host
>> compilation, and again in 'a.xnvptx-none.mkoffload.148t.dse3', the
>> 'pt.{x,y,z,attr}' assignments for the new 'struct vec3 pt;' get cleaned
>> out:
>>
>>     --- ./a.xnvptx-none.mkoffload.144t.sink1  2022-04-12 14:28:19.213520366 +0200
>>     +++ ./a.xnvptx-none.mkoffload.148t.dse3  2022-04-12 14:28:19.213520366 +0200
>>     [...]
>>      __attribute__((oacc function (32, 8, 32), oacc parallel, omp target entrypoint))
>>      void t4_._omp_fn.0 (const struct .omp_data_t.1 & restrict .omp_data_i)
>>      {
>>     @@ -34,13 +25,9 @@
>>
>>        [local count: 7128820]:
>>        _104 = .GOACC_DIM_POS (0);
>>     -  pt.x = _104;
>>        _25 = _104 * 2;
>>     -  pt.y = _25;
>>        _27 = _104 * 4;
>>     -  pt.z = _27;
>>        _29 = _104 * 6;
>>     -  pt.attr[4] = _29;
>>        _34 = .UNIQUE (OACC_FORK, 0, 2);
>>
>>        [local count: 437450365]:
>>
>> ..., so no actual changes in the end.
>>
>> I have not verified other ("higher") optimization levels, but given no
>> change in diagnostics, I suppose the same ("no actual changes") happens
>> for those.
>>
>> Is the '-O1' change/regression unexpected, and should be analyzed, or
>> should we just accept the slightly worse code generation (for '-O1'
>> only), and I accordingly adjust the test case for the change in
>> diagnostics?
>>
>>
>> Grüße
>>  Thomas
>> -----------------
>> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
>>
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
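
For readers following the thread, the effect of the extra boolean argument
discussed above can be illustrated with a toy model. This is only a sketch
under stated assumptions, not GCC code: 'MemKind', 'Store',
'may_clobber_global' and 'toy_dce' are hypothetical names, and it assumes
that 'pt' is treated as escaped local memory, which the dumps suggest but
the thread does not state outright.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: none of these names are GCC internals.
enum class MemKind { Global, EscapedLocal, PrivateLocal };

struct Store {
  std::string target;  // e.g. "pt.x"
  MemKind kind;        // kind of memory the store may write to
  bool read_later;     // whether a later statement reads the stored value
};

// Hypothetical stand-in for a "may clobber global memory" predicate.
// With escaped_local_p set, local memory whose address escaped is
// treated as global as well.
bool may_clobber_global(const Store &s, bool escaped_local_p) {
  if (s.kind == MemKind::Global)
    return true;
  if (s.kind == MemKind::EscapedLocal)
    return escaped_local_p;
  return false;
}

// Simplified dead-code elimination: a store survives if it is
// "obviously necessary" (may clobber global memory) or is read later.
std::vector<Store> toy_dce(const std::vector<Store> &stores,
                           bool escaped_local_p) {
  std::vector<Store> kept;
  for (const Store &s : stores)
    if (may_clobber_global(s, escaped_local_p) || s.read_later)
      kept.push_back(s);
  return kept;
}
```

In this model, passing 'true' keeps the unused 'pt.{x,y,z,attr}' style
stores alive, while passing 'false' lets the toy pass drop them, which
mirrors the before/after difference seen in the 'dce2' dumps.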