From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/102591] Failure to optimize search for value in vector-sized area to use SIMD
Date: Tue, 05 Oct 2021 10:19:17 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102591

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
          Component|target                      |tree-optimization
             Blocks|                            |53947

--- Comment #3 from Richard Biener ---
(In reply to Gabriel Ravier from comment #2)
> memcpy can fail on unaligned memory ??? I used it specifically to avoid this
> problem !
>
> (also, LLVM's code, I am pretty sure, does not have any issue with
> alignment, as it uses either AVX instructions which care not for it, or
> specifically does a movdqu (i.e. unaligned load) of the memory)

Ah, sorry - I was reading the loop as

  for (int at = 0; at < 16; at++)
    if (tpl[at] == 0)
      {
        found = 1;
        break;
      }

thus as if the suggested transform would eventually access storage that
is not accessed originally...

Btw, we vectorize

bool match8(char *tpl)
{
  char found = 0;
  for (int at = 0; at < 16; at++)
    if (tpl[at] == 0)
      found = 1;
  return found;
}

but use

  vector(16) char vect_found_4.8;

  vect__3.7_29 = MEM [(char *)tpl_10(D)];
  _32 = vect__3.7_29 != { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  vect_found_4.8_33 = VEC_COND_EXPR <_32, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }>;
  _35 = .REDUC_MAX (vect_found_4.8_33);
  _8 = (bool) _35;
  return _8;

where we fail to apply "magic" to the .REDUC_MAX as we know the values
are all 0 or 1.  The conditional reduction support doesn't support
producing 'int' from char compares and we fail to narrow the reduction
vector.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
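
For illustration only, here is a scalar, lane-by-lane model of the GIMPLE
from comment #3, next to a cheaper "any lane set" form that the .REDUC_MAX
could in principle be reduced to when every lane is known to be 0 or 1.
The names match8_scalar, match8_reduc_max and match8_any are made up for
this sketch; it is not what GCC emits, and the "any" form is only an
assumption about what the missing "magic" might amount to.

```c
#include <stdbool.h>

/* Reference: the non-breaking loop that GCC does vectorize. */
bool match8_scalar(const char *tpl)
{
  char found = 0;
  for (int at = 0; at < 16; at++)
    if (tpl[at] == 0)
      found = 1;
  return found;
}

/* Lane-by-lane model of the vectorized GIMPLE: a != 0 compare,
   a VEC_COND_EXPR selecting 0 or 1 per lane, then a max reduction.  */
bool match8_reduc_max(const char *tpl)
{
  unsigned char found[16];
  unsigned char max = 0;

  for (int i = 0; i < 16; i++)
    found[i] = (tpl[i] != 0) ? 0 : 1;   /* VEC_COND_EXPR <_32, {0,...}, {1,...}> */

  for (int i = 0; i < 16; i++)          /* .REDUC_MAX over the 16 lanes */
    if (found[i] > max)
      max = found[i];

  return (bool) max;
}

/* Hypothetical simplification: with lanes restricted to 0 or 1, the
   maximum is nonzero exactly when some lane compared equal to zero,
   so an OR over the per-lane compare results gives the same answer.  */
bool match8_any(const char *tpl)
{
  unsigned char any = 0;
  for (int i = 0; i < 16; i++)
    any |= (unsigned char) (tpl[i] == 0);
  return any != 0;
}
```

All three functions should agree for any 16-byte input; on a target with a
byte compare plus a mask-extraction or test instruction, the last form
would presumably map to far fewer instructions than a general max
reduction.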