Message-ID: <19d51ba96b1e0d509bedc2ee20ffc72a342f5d02.camel@xry111.site>
Subject: Re: [PATCH v2] LoongArch: Add prefetch instructions.
From: Xi Ruoyao
To: Lulu Cheng, gcc-patches@gcc.gnu.org
Cc: i@xen0n.name, xuchenghua@loongson.cn, xujiahao
Date: Sat, 12 Nov 2022 17:45:22 +0800
In-Reply-To: <20221112073756.912800-1-chenglulu@loongson.cn>
References: <20221112073756.912800-1-chenglulu@loongson.cn>

On Sat, 2022-11-12 at 15:37 +0800, Lulu Cheng wrote:
> Co-Authored-By: xujiahao
>
> gcc/ChangeLog:
>
> 	* config/loongarch/loongarch-def.c: Initial number of parallel prefetch.
> 	* config/loongarch/loongarch-tune.h (struct loongarch_cache):
> 	Define number of parallel prefetch.
> 	* config/loongarch/loongarch.cc (loongarch_option_override_internal):
> 	Set up parameters to be used in prefetching algorithm.
> 	(loongarch_prefetch_cookie): Select load or store based on the value of write.
> 	* config/loongarch/loongarch.md (prefetch): New template.
> 	(*prefetch_indexed_): New template.

Missing config/loongarch/constraints.md.

/* snip */

>  rtx
>  loongarch_prefetch_cookie (rtx write, rtx locality)
>  {
> -  /* store_streamed / load_streamed.  */
> -  if (INTVAL (locality) <= 0)
> -    return GEN_INT (INTVAL (write) + 4);
> +  if (INTVAL (locality) == 1 && INTVAL (write) == 0)
> +    return GEN_INT (INTVAL (write) + 2);

So __builtin_prefetch(ptr, 0, 1) will produce "preld 2,$r4,0", while the
document says

    hint has 32 optional values (0 to 31), 0 represents load to level 1
    Cache, and 8 represents store to level 1 Cache. The remaining hint
    values are not defined and are processed for nop instructions when
    the processor executes.

OTOH hint 2 is documented in preldx.  So does preld also support hint 2?

/* snip */

> +(define_insn "prefetch"
> +  [(prefetch (match_operand 0 "address_operand" "ZD,ZE")
> +            (match_operand 1 "const_int_operand" "n,n")
> +            (match_operand 2 "const_int_operand" "n,n"))]
> +  ""
> +{
> +  operands[1] = loongarch_prefetch_cookie (operands[1], operands[2]);
> +
> +  switch (which_alternative)
> +    {
> +    case 0:
> +      return "preld\t%1,%a0";
> +    case 1:
> +      return "preldx\t%1,%a0";

void prefetch(char *ptr, int off)
{
  return __builtin_prefetch(ptr + off);
}

This is compiled to "preldx 0,$r4,$r5".  I don't think that's correct,
because according to the doc, rk should contain several bit-fields
instead of an offset.

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
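
P.S. To make the hint question concrete, here is a minimal standalone sketch in plain C (not GCC code). It models only the branch visible in the quoted hunk and the two preld hint values that the quoted manual text defines. The function names cookie_shown_branch and preld_hint_is_documented are hypothetical, and the -1 return value merely marks cases the quoted hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the one branch shown in the quoted hunk:
   locality == 1 && write == 0 yields write + 2.  Every other case is
   outside the quoted hunk, so it is reported as -1 ("not shown").  */
static int
cookie_shown_branch (int write, int locality)
{
  if (locality == 1 && write == 0)
    return write + 2;
  return -1;
}

/* Per the manual text quoted above, preld defines only hint 0 (load to
   level 1 cache) and hint 8 (store to level 1 cache) out of 0..31.  */
static bool
preld_hint_is_documented (int hint)
{
  assert (hint >= 0 && hint <= 31);
  return hint == 0 || hint == 8;
}
```

Under these assumptions, cookie_shown_branch (0, 1) gives 2 while preld_hint_is_documented (2) is false, which is exactly the mismatch asked about for __builtin_prefetch(ptr, 0, 1).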