From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 48)
	id D3B843858403; Mon, 21 Mar 2022 13:08:49 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D3B843858403
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/104912] [12 Regression] 416.gamess regression after r12-7612-g69619acd8d9b58
Date: Mon, 21 Mar 2022 13:08:49 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
X-List-Received-Date: Mon, 21 Mar 2022 13:08:49 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104912

--- Comment #7 from Richard Biener ---
I'm noting that for skylake cost we have

_28 * _33 1 times scalar_stmt costs 16 in prologue

and

_28 * _33 1 times vector_stmt costs 16 in body

but the load/store costs are just 12. Compared to znver2, this tips the
balance toward vectorization, while for znver2 I currently see no
vectorization. For generic I also see vectorization.

Note that costing currently assumes that the cost model niter check is
performed first and short-cuts all the versioning conditions.
But since we emit

  _248 = (unsigned int) mk_113;
  _247 = _248 + 4294967295;
  _246 = _247 > 2;
  _245 = stride.4_74 != 0;
  _244 = _245 & _246;
...
  _183 = _184 | _211;
  _182 = _183 & _244;
  if (_182 != 0)
    goto ; [80.00%]
  else
    goto ; [20.00%]

on GIMPLE, how things are expanded depends on some luck, and with the
standalone testcase and -Ofast with generic tuning we emit the > 2 cost
model check quite late:

        addq    $1, %rdi
        imulq   %r13, %rdi
        leaq    (%rax,%rdi), %rcx
        movq    32(%rsp), %rax
        leaq    (%rax,%rcx), %rsi
        movq    (%rsp), %rax
        leaq    0(,%rsi,8), %rdx
        addq    %rax, %rcx
        leaq    0(,%rcx,8), %rax
        addq    %r13, %rcx
        salq    $3, %rcx
        cmpq    %rcx, %rdx
        setg    %cl
        addq    %r13, %rsi
        salq    $3, %rsi
        cmpq    %rsi, %rax
        setg    %sil
        orb     %cl, %sil
        je      .L8
        movl    -100(%rsp), %esi
        leal    -1(%rsi), %ecx
        cmpl    $2, %ecx    <-----
        movl    112(%rsp), %ecx
        seta    %sil
        testl   %ecx, %ecx
        setg    %cl
        testb   %cl, %sil
        je      .L8

let me try to hack^Wfix this.
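
For illustration only, here is a rough C sketch of the ordering issue
described above. It is not GCC code: the function names, the counter and
the simplified range-overlap test are all hypothetical, and only
niter_check() is modelled directly on the GIMPLE (_246 = _247 > 2, with
_247 = (unsigned int) mk + 4294967295). Both variants compute the same
answer; they differ only in whether the cheap niter test can skip the
more expensive versioning work.

```c
#include <stdbool.h>

/* Mirrors _246 in the GIMPLE: ((unsigned int) mk + 4294967295) > 2,
   i.e. (unsigned) mk - 1 > 2.  Because of unsigned wraparound this is
   false exactly for mk in 1..3.  */
static bool niter_check(int mk)
{
    return (unsigned int) mk - 1u > 2u;
}

/* Hypothetical counter, used only to show how often the expensive
   versioning check actually runs.  */
static int alias_checks_evaluated;

/* Hypothetical stand-in for a versioning-for-alias condition: true when
   the half-open ranges [a_lo, a_hi) and [b_lo, b_hi) do not overlap.  */
static bool ranges_disjoint(long a_lo, long a_hi, long b_lo, long b_hi)
{
    alias_checks_evaluated++;
    return a_hi <= b_lo || b_hi <= a_lo;
}

/* Order resembling the assembly above: versioning conditions first,
   the niter check only afterwards.  */
static bool versioning_niter_late(int mk, int stride,
                                  long a_lo, long a_hi,
                                  long b_lo, long b_hi)
{
    if (!ranges_disjoint(a_lo, a_hi, b_lo, b_hi))
        return false;
    return niter_check(mk) & (stride != 0);
}

/* Order the costing assumes: the niter check comes first and
   short-cuts everything else.  */
static bool versioning_niter_first(int mk, int stride,
                                   long a_lo, long a_hi,
                                   long b_lo, long b_hi)
{
    if (!niter_check(mk))
        return false;   /* short loop: skip all versioning checks */
    if (stride == 0)
        return false;
    return ranges_disjoint(a_lo, a_hi, b_lo, b_hi);
}
```

With mk = 2 both functions return false, but only
versioning_niter_late() evaluates the overlap test before finding that
out, which is the extra work the cost model does not account for.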