From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/98563] [10/11 Regression] vectorization fails while it worked on gcc 9 and earlier since since r10-2271-gd81ab49d0586fca0
Date: Tue, 26 Jan 2021 13:34:59 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 10.1.0
X-Bugzilla-Keywords: missed-optimization, openmp
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 10.3

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98563

--- Comment #9 from Jakub Jelinek ---
(In reply to Richard Biener from comment #8)
> (In reply to Jakub Jelinek from comment #7)
> > I'm afraid no.
> > The vectorization can handle addresses into the simd arrays, but right now
> > only if it accesses the whole element, i.e. when we can turn the simd array
> > into a vector register (or set thereof) that holds the variable.
> > In this case that is not the case, as in the end it uses the real and imag
> > parts separately.
> > So, either it can be handled in SRA, or we'd need to teach the vectorizer to
> > permute those for us.
>
> Hmm, I see.  The vectorizer can in theory handle "existing" vectors
> (currently only enabled for basic-block SLP though).  But of course the
> first hurdle is to not treat those as memory accesses (thus ignore the
> data-ref analysis failure or somehow make that treat the SIMD_LANE
> indexing "nicely").
>
> When we see
>
>   _13 = .GOMP_SIMD_LANE (simduid.0_12(D), 0);
>
> can we compute how _13 evolves with loop iteration?  Thus, can we
> SCEV analyze it?  Isn't it something like { .GOMP_SIMD_LANE_START
> (simduid.0_12(D), .GOMP_SIMD_LANE_STEP (simduid.0_12(D), 0) } thus an
> affine evolution in the end?

_13 has modulo semantics in the loop: it takes the values 0, 1, ... vf-1,
0, 1, ... vf-1 etc., where vf is the vectorization factor of the loop.
The intent is that after successful vectorization, the array can be promoted
to a vector containing those (or a set of vectors; it is a software vector
rather than necessarily a hardware vector), and on unsuccessful vectorization
it will shrink into a single-element array variable (a scalar).

> Simplified C testcase:
>
> typedef _Complex double cplx;
> void foo (cplx *);
> void test(cplx* __restrict__ a, const cplx* b, double c, int N)
> {
>   cplx tem;
> #pragma omp simd private (tem)
>     for (int i=0; i<8*N; i++) {
>         __real tem = __real b[i];
>         __imag tem = __imag b[i];
>         __real a[i] = __real tem;
>         __imag a[i] = __imag tem;
>     }
>     foo (&tem);

The private clause means the variable is undefined at the end of the
construct. If you want to inspect the value afterwards, the possible clauses
are lastprivate (the scalar variable receives the value from the last
iteration) or reduction (in that case the value is reduced from all the
vector elements using some base language reduction operator or a user
defined function).
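
For illustration only, here is a minimal sketch of the two clauses and of the
lane-index pattern described above. It is not code from the bug report: the
function names copy_lastprivate, sum_real and simd_lane_model are
hypothetical, and simd_lane_model is just an assumed arithmetic model
(i % vf) of the 0, 1, ... vf-1, 0, 1, ... sequence, not what the compiler
emits for .GOMP_SIMD_LANE. It uses the same GNU __real/__imag extension as
the testcase.

```c
#include <assert.h>

typedef _Complex double cplx;

/* Copy b[0..n-1] to a[0..n-1] through a temporary.  With lastprivate(tem)
   the original tem receives the value from the sequentially last
   iteration, so reading it after the loop is well defined.  With
   private(tem) it would be undefined there.  */
cplx copy_lastprivate(cplx *restrict a, const cplx *b, int n)
{
    assert(n > 0);  /* need at least one iteration to define tem */
    cplx tem = 0.0;
#pragma omp simd lastprivate(tem)
    for (int i = 0; i < n; i++) {
        __real tem = __real b[i];
        __imag tem = __imag b[i];
        __real a[i] = __real tem;
        __imag a[i] = __imag tem;
    }
    return tem;
}

/* Sum the real parts.  reduction(+ : sum) combines the per-lane partial
   sums into the original variable at the end of the construct.  */
double sum_real(const cplx *b, int n)
{
    double sum = 0.0;
#pragma omp simd reduction(+ : sum)
    for (int i = 0; i < n; i++)
        sum += __real b[i];
    return sum;
}

/* Hypothetical model of the lane index: iteration i of a loop with
   vectorization factor vf maps to lane i % vf.  */
int simd_lane_model(int i, int vf)
{
    assert(i >= 0 && vf > 0);
    return i % vf;
}
```

The functions behave the same whether or not the pragmas are honoured (for
example with or without -fopenmp-simd in GCC); the clauses only guarantee
that the values read after the loops stay defined once the loops are
vectorized.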