From mboxrd@z Thu Jan 1 00:00:00 1970
From: "crazylht at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/101185] pr96814 failed after r12-1669 on non-avx512 platform
Date: Thu, 24 Jun 2021 01:44:50 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Severity: normal
X-Bugzilla-Who: crazylht at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
List-Id: Gcc-bugs mailing list

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101185

--- Comment #1 from Hongtao.liu ---
Alloc order is just another kind of cost, which can be compensated for by
increasing the cost of mask->integer and integer->mask moves.

With the patch below, pr96814 doesn't generate any mask instructions except
for

        kmovd   %eax, %k1
        vpcmpeqd        %ymm1, %ymm1, %ymm1
        vmovdqu8        %ymm1, %ymm0{%k1}{z}

which is what we want.

modified   gcc/config/i386/i386.md
@@ -1335,7 +1335,7 @@
 (define_insn "*cmp_ccz_1"
   [(set (reg FLAGS_REG)
         (compare (match_operand:SWI1248_AVX512BWDQ_64 0
-                        "nonimmediate_operand" ",?m,$k")
+                        "nonimmediate_operand" ",?m,*k")
                  (match_operand:SWI1248_AVX512BWDQ_64 1 "const0_operand")))]
   "TARGET_AVX512F && ix86_match_ccmode (insn, CCZmode)"
   "@
modified   gcc/config/i386/x86-tune-costs.h
@@ -2768,7 +2768,7 @@ struct processor_costs intel_cost = {
   {6, 6, 6, 6, 6},                      /* cost of storing SSE registers
                                            in 32,64,128,256 and 512-bit */
   4, 4,                                 /* SSE->integer and integer->SSE moves */
-  4, 4,                                 /* mask->integer and integer->mask moves */
+  6, 6,                                 /* mask->integer and integer->mask moves */
   {4, 4, 4},                            /* cost of loading mask register
                                            in QImode, HImode, SImode.  */
   {6, 6, 6},                            /* cost if storing mask register
@@ -2882,7 +2882,7 @@ struct processor_costs generic_cost = {
   {6, 6, 6, 10, 15},                    /* cost of storing SSE registers
                                            in 32,64,128,256 and 512-bit */
   6, 6,                                 /* SSE->integer and integer->SSE moves */
-  6, 6,                                 /* mask->integer and integer->mask moves */
+  8, 8,                                 /* mask->integer and integer->mask moves */
   {6, 6, 6},                            /* cost of loading mask register
                                            in QImode, HImode, SImode.  */
   {6, 6, 6},                            /* cost if storing mask register

So would increasing the cost of mask->integer and integer->mask moves by one
more unit (or maybe more), as compensation for changing the alloc order, be an
acceptable solution for you? Or do you insist on reverting the
x86_order_regs_for_local_alloc part?