From mboxrd@z Thu Jan 1 00:00:00 1970
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/98137] New: Could use SLP to vectorize if split_constant_offset were smarter
Date: Fri, 04 Dec 2020 09:48:28 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98137

            Bug ID: 98137
           Summary: Could use SLP to vectorize if split_constant_offset were
                    smarter
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

void
gemm_m10_n9_k17_ldA20_ldB20_ldC10_beta0_alignedA1_alignedC1_pfsigonly(
    const double* __restrict__ A,
    const double* __restrict__ B,
    double* __restrict__ C,
    const double* A_prefetch,
    const double* B_prefetch,
    const double* C_prefetch)
{
  unsigned int l_m = 0;
  unsigned int l_n = 0;
  unsigned int l_k = 0;

  for ( l_n = 0; l_n < 9; l_n++ ) {
    for ( l_m = 0; l_m < 10; l_m++ ) { C[(l_n*10)+l_m] = 0.0; }
    for ( l_k = 0; l_k < 17; l_k++ ) {
      for ( l_m = 0; l_m < 10; l_m++ ) {
        C[(l_n*10)+l_m] += A[(l_k*20)+l_m] * B[(l_n*20)+l_k];
      }
    }
  }
}

is nicely vectorized with BB SLP when you make l_{m,n,k} signed but when
unsigned as above then split_constant_offset gives up when seeing

  C + ((unsigned long)(_286 + 1) * 8)

but we even have nice range-info:

  # RANGE [0, 80] NONZERO 126
  _286 = l_n_189 * 10;
  # RANGE [0, 80] NONZERO 126
  _288 = (long unsigned int) _286;
  # RANGE [0, 640] NONZERO 1008
  _289 = _288 * 8;
  # PT = null { D.2428 } (nonlocal, restrict)
  _290 = C_37(D) + _289;

    ^^  C + ((unsigned long)(_286) * 8)

  # RANGE [1, 81] NONZERO 127
  _296 = _286 + 1;
  # RANGE [1, 81] NONZERO 127
  _297 = (long unsigned int) _296;
  # RANGE [8, 648] NONZERO 1016
  _298 = _297 * 8;
  # PT = { D.2428 } (nonlocal, restrict)
  _299 = C_37(D) + _298;

    ^^  C + ((unsigned long)(_286 + 1) * 8)

giving up means DR group analysis doesn't relate them and we do not consider
SLP vectorization.
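
To illustrate why the range info matters here, the following is a minimal
standalone sketch, not GCC code. It assumes a 32-bit unsigned int and a 64-bit
long unsigned int (modelled with uint32_t and uint64_t); all function names
are hypothetical. It compares the offset as it appears in the dump, where the
+ 1 happens before the widening conversion, with the form in which the
constant is split out after widening. The two agree when the narrow add
cannot wrap, which the recorded range [0, 80] guarantees, and differ at
UINT32_MAX.

```c
#include <assert.h>
#include <stdint.h>

/* Offset as in the dump: the + 1 is done in 32-bit unsigned arithmetic
   and may wrap before the value is widened.  */
uint64_t
offset_narrow_add (uint32_t x)
{
  return (uint64_t) (uint32_t) (x + 1u) * 8u;
}

/* Offset with the constant split out after widening, so the second
   access is visibly "first access + 8 bytes".  */
uint64_t
offset_wide_add (uint32_t x)
{
  return (uint64_t) x * 8u + 8u;
}

/* Return 1 if both forms agree for every x in [lo, hi], else 0.
   Requires lo <= hi.  */
int
forms_agree_on_range (uint32_t lo, uint32_t hi)
{
  for (uint32_t x = lo; ; x++)
    {
      if (offset_narrow_add (x) != offset_wide_add (x))
        return 0;
      if (x == hi)
        break;
    }
  return 1;
}

/* Checks matching the ranges shown in the dump above.  */
void
self_check (void)
{
  /* _286 is in [0, 80], so _286 + 1 cannot wrap.  */
  assert (forms_agree_on_range (0, 80));
  /* Without range info the rewrite is unsafe: x + 1 wraps to 0.  */
  assert (!forms_agree_on_range (UINT32_MAX, UINT32_MAX));
}
```

Calling self_check () from a test driver exercises both cases. The upper bound
offset_narrow_add (80) == 648 corresponds to the RANGE [8, 648] recorded for
_298.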