From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/105546] [11/12/13 Regression] ifconversion introduces many compares with loads
Date: Wed, 19 Oct 2022 10:06:12 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 11.3.0
X-Bugzilla-Keywords: missed-optimization, needs-bisection
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 11.4
X-Bugzilla-Changed-Fields: priority
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105546

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2

--- Comment #4 from Richard Biener ---
And it's sinking (of common stores) that turns

  [local count: 1073741824]:
  g_344.0_1 = g_344;
  if (g_344.0_1 != 0)
    goto ; [50.00%]
  else
    goto ; [50.00%]

  [local count: 536870913]:
  .f0 = 2738;
  .f1 = 27943;
  .f2 = -1;
  .f3 = 171;
  .f4 = 3;
  .f5 = 4499926296329723445;
  goto ; [100.00%]

  [local count: 536870913]:
  .f0 = 65526;
  .f1 = 1;
  .f2 = -8;
  .f3 = 161;
  .f4 = 3409572933270154779;
  .f5 = -6;

  [local count: 1073741824]:
  return ;

into

  [local count: 1073741824]:
  g_344.0_1 = g_344;
  if (g_344.0_1 != 0)
    goto ; [50.00%]
  else
    goto ; [50.00%]

  [local count: 536870913]:

  [local count: 1073741824]:
  # _16 = PHI <4499926296329723445(2), -6(3)>
  # _18 = PHI <3(2), 3409572933270154779(3)>
  # _20 = PHI <171(2), 161(3)>
  # _22 = PHI <-1(2), -8(3)>
  # _24 = PHI <27943(2), 1(3)>
  # _26 = PHI <2738(2), 65526(3)>
  .f0 = _26;
  .f1 = _24;
  .f2 = _22;
  .f3 = _20;
  .f4 = _18;
  .f5 = _16;
  return ;

without that (-fno-tree-sink) we'd get

func_1:
.LFB0:
        .cfi_startproc
        movq    %rdi, %rax
        cmpw    $0, g_344(%rip)
        je      .L2
        movw    $2738, (%rdi)
        movw    $27943, 2(%rdi)
        movw    $-1, 4(%rdi)
        movb    $-85, 6(%rdi)
        movq    $3, 8(%rdi)
        movabsq $4499926296329723445, %rdx
        movq    %rdx, 16(%rdi)
        ret
.L2:
        movw    $-10, (%rdi)
        movw    $1, 2(%rdi)
        movw    $-8, 4(%rdi)
        movb    $-95, 6(%rdi)
        movabsq $3409572933270154779, %rcx
        movq    %rcx, 8(%rdi)
        movq    $-6, 16(%rdi)
        ret

or at -O2, now with vectorization,

func_1:
.LFB0:
        .cfi_startproc
        cmpw    $0, g_344(%rip)
        movq    %rdi, %rax
        je      .L2
        movdqa  .LC1(%rip), %xmm0
        movl    $-1, %ecx
        movl    $1831275186, (%rdi)
        movw    %cx, 4(%rdi)
        movb    $-85, 6(%rdi)
        movups  %xmm0, 8(%rdi)
        ret
        .p2align 4,,10
        .p2align 3
.L2:
        movdqa  .LC3(%rip), %xmm0
        movl    $-8, %edx
        movl    $131062, (%rdi)
        movw    %dx, 4(%rdi)
        movb    $-95, 6(%rdi)
        movups  %xmm0, 8(%rdi)
        ret

we could probably improve things by storing into the padding but GIMPLE
doesn't know it is allowed to do that.

sinking notes

      /* Insert a PHI to merge differing stored values if necessary.
         Note that in general inserting PHIs isn't a very good idea as
         it makes the job of coalescing and register allocation harder.
         Even common SSA uses on the rhs/lhs might extend their lifetime
         across multiple edges by this code motion which makes
         register allocation harder.  */

but we don't limit ourselves in the number of PHI nodes to create.  Of course
since we have two sinking passes now we'd get inconsistent results here, also
since vectorization sits in between the two.
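
For readers without the original test case at hand, here is a rough C sketch of the two shapes discussed above. It is not the reporter's test case. The struct layout and field types are guesses from the store offsets and widths in the assembly (0, 2, 4, 6, 8, 16), and the names struct S, func_1_sunk, same_fields and check_both_arms are made up for illustration. func_1 stores in each arm, like the first dump; func_1_sunk picks one value per field and stores once, which mirrors the six PHIs in the second dump.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout inferred from the assembly; the real test case
   may declare the type differently.  */
struct S {
  uint16_t f0;   /* offset 0 */
  uint16_t f1;   /* offset 2 */
  int16_t  f2;   /* offset 4 */
  uint8_t  f3;   /* offset 6, followed by padding on x86-64 */
  int64_t  f4;   /* offset 8 */
  int64_t  f5;   /* offset 16 */
};

/* Assumed 16 bits wide because the assembly tests it with cmpw.  */
uint16_t g_344;

/* Shape before sinking: each arm stores its own constants.  */
struct S func_1(void)
{
  struct S r;
  if (g_344 != 0) {
    r.f0 = 2738;
    r.f1 = 27943;
    r.f2 = -1;
    r.f3 = 171;
    r.f4 = 3;
    r.f5 = INT64_C(4499926296329723445);
  } else {
    r.f0 = 65526;
    r.f1 = 1;
    r.f2 = -8;
    r.f3 = 161;
    r.f4 = INT64_C(3409572933270154779);
    r.f5 = -6;
  }
  return r;
}

/* Shape after sinking, written by hand: one temporary per field plays
   the role of a PHI, and the six stores happen once after the join.  */
struct S func_1_sunk(void)
{
  int taken = g_344 != 0;
  uint16_t t0 = taken ? 2738 : 65526;
  uint16_t t1 = taken ? 27943 : 1;
  int16_t  t2 = taken ? -1 : -8;
  uint8_t  t3 = taken ? 171 : 161;
  int64_t  t4 = taken ? 3 : INT64_C(3409572933270154779);
  int64_t  t5 = taken ? INT64_C(4499926296329723445) : -6;

  struct S r;
  r.f0 = t0;
  r.f1 = t1;
  r.f2 = t2;
  r.f3 = t3;
  r.f4 = t4;
  r.f5 = t5;
  return r;
}

/* Field-by-field comparison; padding bytes are deliberately ignored.  */
int same_fields(struct S a, struct S b)
{
  return a.f0 == b.f0 && a.f1 == b.f1 && a.f2 == b.f2
         && a.f3 == b.f3 && a.f4 == b.f4 && a.f5 == b.f5;
}

/* Both shapes must agree for a zero and a nonzero g_344.  */
void check_both_arms(void)
{
  for (int v = 0; v <= 1; v++) {
    g_344 = (uint16_t) v;
    assert(same_fields(func_1(), func_1_sunk()));
  }
}
```

Comparing the output of both functions with and without -fno-tree-sink should show whether the compiler keeps the two shapes apart or canonicalizes one into the other; the result will depend on the GCC version and target.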