From mboxrd@z Thu Jan  1 00:00:00 1970
From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/93930] [8/9/10 Regression] Unnecessary broadcast instructions for AVX512
Date: Mon, 09 Mar 2020 15:19:28 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 9.2.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jakub at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 8.5
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
List-Id: Gcc-bugs mailing list

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93930

--- Comment #3 from Jakub Jelinek ---
The cost changes affect the RTL LIM.
-Set in insn 22 is invariant (0), cost 32, depends on
-Set in insn 27 is invariant (1), cost 32, depends on
-Set in insn 32 is invariant (2), cost 32, depends on
-Set in insn 37 is invariant (3), cost 32, depends on
-Set in insn 61 is invariant (4), cost 32, depends on
-Set in insn 66 is invariant (5), cost 32, depends on
-Set in insn 71 is invariant (6), cost 32, depends on
-Set in insn 76 is invariant (7), cost 32, depends on
-Set in insn 101 is invariant (8), cost 32, depends on
-Set in insn 106 is invariant (9), cost 32, depends on
-Set in insn 111 is invariant (10), cost 32, depends on
-Set in insn 116 is invariant (11), cost 32, depends on
-Decided to move invariant 0 -- gain 32
-Decided to move invariant 1 -- gain 32
-Decided to move invariant 2 -- gain 32
-Decided to move invariant 3 -- gain 32
-Decided to move invariant 4 -- gain 32
-Decided to move invariant 5 -- gain 32
-Decided to move invariant 6 -- gain 32
-Decided to move invariant 7 -- gain 32
-Decided to move invariant 8 -- gain 32
-Decided to move invariant 9 -- gain 32
-Decided to move invariant 10 -- gain 10
-Decided to move invariant 11 -- gain 30
+Set in insn 22 is invariant (0), cost 4, depends on
+Set in insn 27 is invariant (1), cost 4, depends on
+Set in insn 32 is invariant (2), cost 4, depends on
+Set in insn 37 is invariant (3), cost 4, depends on
+Set in insn 61 is invariant (4), cost 4, depends on
+Set in insn 66 is invariant (5), cost 4, depends on
+Set in insn 71 is invariant (6), cost 4, depends on
+Set in insn 76 is invariant (7), cost 4, depends on
+Set in insn 101 is invariant (8), cost 4, depends on
+Set in insn 106 is invariant (9), cost 4, depends on
+Set in insn 111 is invariant (10), cost 4, depends on
+Set in insn 116 is invariant (11), cost 4, depends on
+Decided to move invariant 0 -- gain 4
+Decided to move invariant 1 -- gain 4
+Decided to move invariant 2 -- gain 4
+Decided to move invariant 3 -- gain 4
+Decided to move invariant 4 -- gain 4
+Decided to move invariant 5 -- gain 4
+Decided to move invariant 6 -- gain 4
+Decided to move invariant 7 -- gain 4
+Decided to move invariant 8 -- gain 4
+Decided to move invariant 9 -- gain 4
which means invariants 10 and 11 aren't moved anymore.  Those two are:
-(insn 111 106 116 3 (set (reg:V16SF 210)
-        (vec_duplicate:V16SF (vec_select:SF (reg:V4SF 234)
-                (parallel [
-                        (const_int 0 [0])
-                    ])))) "include/avx512fintrin.h":207 4206 {avx512f_vec_dupv16sf}
-     (expr_list:REG_EQUAL (const_vector:V16SF [
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-                (const_double:SF 2.3e+1 [0x0.b8p+5])
-            ])
-        (nil)))
-(insn 116 111 139 3 (set (reg:V16SF 214)
-        (vec_duplicate:V16SF (vec_select:SF (reg:V4SF 235)
-                (parallel [
-                        (const_int 0 [0])
-                    ])))) "include/avx512fintrin.h":207 4206 {avx512f_vec_dupv16sf}
-     (expr_list:REG_EQUAL (const_vector:V16SF [
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-                (const_double:SF 2.4e+1 [0x0.cp+5])
-            ])
-        (nil)))
and I bet the reason they are using the const costs is the REG_EQUAL notes.
The setters of their sources are:
(insn 169 168 170 3 (set (reg:V4SF 234)
        (mem/u/c:V4SF (symbol_ref/u:DI ("*.LC10") [flags 0x2]) [2 S16 A128])) -1
     (expr_list:REG_EQUAL (const_vector:V4SF [
                (const_double:SF 2.3e+1 [0x0.b8p+5])
                (const_double:SF 0.0 [0x0.0p+0])
                (const_double:SF 0.0 [0x0.0p+0])
                (const_double:SF 0.0 [0x0.0p+0])
            ])
        (nil)))
(insn 170 169 22 3 (set (reg:V4SF 235)
        (mem/u/c:V4SF (symbol_ref/u:DI ("*.LC11") [flags 0x2]) [2 S16 A128])) -1
     (expr_list:REG_EQUAL (const_vector:V4SF [
                (const_double:SF 2.4e+1 [0x0.cp+5])
                (const_double:SF 0.0 [0x0.0p+0])
                (const_double:SF 0.0 [0x0.0p+0])
                (const_double:SF 0.0 [0x0.0p+0])
            ])
        (nil)))
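
For anyone who wants to repeat the comparison on their own dumps, the check can be scripted. What follows is only a rough sketch, not part of GCC: it assumes the dump contains "Decided to move invariant N -- gain G" lines in exactly the form quoted above, and the helper names (moved_invariants, no_longer_moved) are made up for illustration. The two sample strings are rebuilt from the diff in this comment with the leading -/+ stripped.

```python
import re

# Matches the "Decided to move" lines exactly as they appear in the dump above.
MOVE_RE = re.compile(r"Decided to move invariant (\d+) -- gain (\d+)")


def moved_invariants(dump):
    """Return {invariant number: reported gain} for every 'Decided to move' line."""
    return {int(num): int(gain) for num, gain in MOVE_RE.findall(dump)}


def no_longer_moved(old_dump, new_dump):
    """Invariants hoisted according to old_dump but absent from new_dump."""
    old = moved_invariants(old_dump)
    new = moved_invariants(new_dump)
    return sorted(set(old) - set(new))


# Sample text rebuilt from the diff in this comment (diff markers removed).
old_dump = "\n".join(
    ["Decided to move invariant %d -- gain 32" % i for i in range(10)]
    + [
        "Decided to move invariant 10 -- gain 10",
        "Decided to move invariant 11 -- gain 30",
    ]
)
new_dump = "\n".join(
    "Decided to move invariant %d -- gain 4" % i for i in range(10)
)

print(no_longer_moved(old_dump, new_dump))  # [10, 11]
```

On real dumps the two strings would be read from the dump files produced before and after the cost change; the file names depend on the compiler invocation and are deliberately not assumed here.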