From: Xi Ruoyao
To: WANG Xuerui, gcc-patches@gcc.gnu.org
Cc: Lulu Cheng, Chenghua Xu
Date: Tue, 18 Apr 2023 20:45:31 +0800
Subject: Re: [PATCH] LoongArch: Set 4 * (issue rate) as the default for -falign-functions and -falign-loops
Message-ID: <63443c167caee4a0f0631cf5188de778868fb452.camel@xry111.site>
In-Reply-To: <2957dad7-a211-06b8-168c-8649c508c399@xen0n.name>
References: <20230418121753.50830-1-xry111@xry111.site> <2957dad7-a211-06b8-168c-8649c508c399@xen0n.name>

On Tue, 2023-04-18 at 20:39 +0800, WANG Xuerui wrote:
> Hi,
>
> Thanks for helping confirm on GCC and porting this! I'd never know
> even GCC lacked this adaptation without someone actually checking... Too
> many things are taken for granted these days.
>
> On 2023/4/18 20:17, Xi Ruoyao wrote:
> > According to Xuerui's LLVM changeset [1], doing so can make a
> > significant performance gain.
> >
> > Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for GCC 14?
> >
> > [1]:https://reviews.llvm.org/D148622
> >
> > gcc/ChangeLog:
> >
> >         * config/loongarch/loongarch.cc
> >         (loongarch_option_override_internal): If -falign-functions is
> >         used but the alignment is not explicitly specified, set it to
> >         4 * loongarch_issue_rate ().  Likewise for -falign-loops.
> > ---
> >   gcc/config/loongarch/loongarch.cc | 11 +++++++++++
> >   1 file changed, 11 insertions(+)
> >
> > diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
> > index 06fc1cd0604..6552484de7c 100644
> > --- a/gcc/config/loongarch/loongarch.cc
> > +++ b/gcc/config/loongarch/loongarch.cc
> > @@ -6236,6 +6236,17 @@ loongarch_option_override_internal (struct gcc_options *opts)
> >         && !opts->x_optimize_size)
> >       opts->x_flag_prefetch_loop_arrays = 1;
> >
> > +  /* Align functions and loops to (issue rate) * (insn size) to improve
> > +     the throughput of the fetching units.  */
> What about gating all of these on !opts->x_optimize_size, similar to
> what aarch64 does?

opts->x_flag_align_functions and opts->x_flag_align_loops are only set
with -O2 or above unless the user manually uses -falign-functions or
-falign-loops.  If the user uses "-Os -falign-functions" as CFLAGS I'd
assume s(he) wants to optimize for size but keep the optimized function
alignment.

> > +  char *align = XNEWVEC (char, 16);
> > +  sprintf (align, "%d", loongarch_issue_rate () * 4);
> > +
> > +  if (opts->x_flag_align_functions && !opts->x_str_align_functions)
> > +    opts->x_str_align_functions = align;
> > +
> > +  if (opts->x_flag_align_loops && !opts->x_str_align_loops)
> > +    opts->x_str_align_loops = align;
> > +
> >     if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib)
> >       error ("%qs cannot be used for compiling a shared library",
> >            "-mdirect-extern-access");
> Otherwise LGTM, thanks!

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
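
For readers who want to try the defaulting rule outside of GCC, here is a minimal standalone sketch of the logic discussed above. It is not the patch itself: struct align_opts and set_default_alignment are hypothetical names standing in for the relevant x_* fields of struct gcc_options and for the hunk in loongarch_option_override_internal, and the issue rate is passed in as a plain parameter instead of coming from loongarch_issue_rate ().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the few option fields the patch touches.
   The names mirror the x_* fields, but this is not GCC code.  */
struct align_opts
{
  int flag_align_functions;        /* nonzero when -falign-functions is active */
  const char *str_align_functions; /* explicit "=N" argument, NULL if absent */
  int flag_align_loops;            /* nonzero when -falign-loops is active */
  const char *str_align_loops;     /* explicit "=N" argument, NULL if absent */
};

/* Fill in a default alignment only where the option is enabled but no
   explicit value was given.  ISSUE_RATE is an assumed input; INSN_SIZE is
   4 bytes because LoongArch instructions are 32 bits wide.  */
static void
set_default_alignment (struct align_opts *opts, int issue_rate)
{
  enum { INSN_SIZE = 4 };

  /* A static buffer keeps the sketch free of allocation; every call
     overwrites it, so this only suits a single configuration pass.  */
  static char align[16];
  int n = snprintf (align, sizeof align, "%d", issue_rate * INSN_SIZE);
  assert (n > 0 && n < (int) sizeof align);

  if (opts->flag_align_functions && !opts->str_align_functions)
    opts->str_align_functions = align;

  if (opts->flag_align_loops && !opts->str_align_loops)
    opts->str_align_loops = align;
}
```

With an assumed issue rate of 4, an enabled option with no explicit value ends up with the string "16", an explicit value such as "32" is left alone, and a disabled option keeps its NULL string. That last case matches the -Os behaviour described above: the flags are not set there unless the user passes them by hand, so nothing is overridden.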