From: "xry111 at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/112935] [14 Regression] Performance regression in Coremarks crcu8 function
Date: Sat, 09 Dec 2023 07:48:01 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

--- Comment #8 from Xi Ruoyao ---
(In reply to Andrew Pinski from comment #7)
> (In reply to Xi Ruoyao from comment #5)
> >
> > so we still slightly penalty multiplication.  To me we should code
> > COSTS_N_INSNS (1) + 1 into loongarch_rtx_cost_optimize_size instead of
> > special casing it in loongarch_rtx_costs.
>
> Oh yes slightly penalty is definitely not going make a huge difference if
> the cost of an mult instruction is worse than an and and an neg.
>
> >
> > For the default value (used when -O2) I'll do some micro-benchmark...

I've changed it to

/* Default RTX cost initializer.  */
loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
  : fp_add (COSTS_N_INSNS (5)),
    fp_mult_sf (COSTS_N_INSNS (5)),
    fp_mult_df (COSTS_N_INSNS (5)),
    fp_div_sf (COSTS_N_INSNS (8)),
    fp_div_df (COSTS_N_INSNS (8)),
    int_mult_si (COSTS_N_INSNS (4)),
    int_mult_di (COSTS_N_INSNS (4)),
    int_div_si (COSTS_N_INSNS (5)),
    int_div_di (COSTS_N_INSNS (5)),
    branch_cost (6),
    memory_latency (4)
{}

based on micro-benchmark results.  This fixes the int * _Bool case and int * 17
case.  But for the original test case I still get a multiplication instruction.
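
For readers following along, here is a rough sketch of what the two fixed cases look like in C, and why a multiply cost of COSTS_N_INSNS (4) can tip the balance.  The function and enum names are hypothetical and are not the test cases from this bug.  COSTS_N_INSNS is redefined locally the way GCC's rtl.h defines it, and the "two simple instructions" cost is a simplifying assumption, not what loongarch_rtx_costs actually computes.

```c
/* Illustrative sketch only: all helper names here are hypothetical and
   do not come from the GCC sources or from the bug's test cases.  */

/* Same definition as in GCC's rtl.h: one instruction is 4 cost units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Cost of a 32-bit multiply under the new default above.  */
enum { MULT_SI_COST = COSTS_N_INSNS (4) };

/* Assumption: each simple ALU instruction costs COSTS_N_INSNS (1), so a
   two-instruction replacement sequence costs this much.  */
enum { TWO_ALU_INSNS_COST = 2 * COSTS_N_INSNS (1) };

/* int * _Bool written as a multiplication.  */
int mul_bool_plain (int x, _Bool b) { return x * b; }

/* The same value computed with a negate and an AND:
   -(int) b is 0 when b is 0 and all ones when b is 1.  */
int mul_bool_neg_and (int x, _Bool b) { return x & -(int) b; }

/* int * 17 written as a multiplication.  */
int mul17_plain (int x) { return x * 17; }

/* The same value as (x << 4) + x.  Unsigned arithmetic avoids undefined
   behaviour when shifting a negative value.  */
int mul17_shift_add (int x)
{
  return (int) (((unsigned) x << 4) + (unsigned) x);
}
```

Under that assumption the two-instruction forms cost 8 units against 16 for the multiply, so the expansion would be preferred.  It also shows that COSTS_N_INSNS (1) + 1 is 5, only one unit above a single instruction.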