From: Lulu Cheng
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, xuchenghua@loongson.cn, Lulu Cheng, xujiahao
Subject: [PATCH v2] LoongArch: Add prefetch instructions.
Date: Sat, 12 Nov 2022 15:37:56 +0800
Message-Id: <20221112073756.912800-1-chenglulu@loongson.cn>

Co-Authored-By: xujiahao

gcc/ChangeLog:

	* config/loongarch/loongarch-def.c: Initial number of parallel
	prefetch.
	* config/loongarch/loongarch-tune.h (struct loongarch_cache):
	Define number of parallel prefetch.
	* config/loongarch/loongarch.cc (loongarch_option_override_internal):
	Set up parameters to be used in prefetching algorithm.
	(loongarch_prefetch_cookie): Select load or store based on
	the value of write.
	* config/loongarch/loongarch.md (prefetch): New template.
	(*prefetch_indexed_): New template.
---
 gcc/config/loongarch/constraints.md   | 20 +++++++++++
 gcc/config/loongarch/loongarch-def.c  |  2 ++
 gcc/config/loongarch/loongarch-tune.h |  1 +
 gcc/config/loongarch/loongarch.cc     | 50 +++++++++++++++++++++------
 gcc/config/loongarch/loongarch.md     | 20 +++++++++++
 5 files changed, 83 insertions(+), 10 deletions(-)

diff --git a/gcc/config/loongarch/constraints.md b/gcc/config/loongarch/constraints.md
index 43cb7b5f0f5..9ac5e4c00fb 100644
--- a/gcc/config/loongarch/constraints.md
+++ b/gcc/config/loongarch/constraints.md
@@ -86,6 +86,14 @@
 ;;    "ZB"
 ;;      "An address that is held in a general-purpose register.
 ;;      The offset is zero"
+;;    "ZD"
+;;      "An address operand whose address is formed by a base register
+;;       and offset that is suitable for use in instructions with the same
+;;       addressing mode as @code{preld}."
+;;    "ZE"
+;;      "An address operand whose address is formed by a base register
+;;       and index register that is suitable for use in instructions
+;;       with the same addressing mode as @code{preldx}."
 ;; "<" "Matches a pre-dec or post-dec operand." (Global non-architectural)
 ;; ">" "Matches a pre-inc or post-inc operand." (Global non-architectural)
 
@@ -190,3 +198,15 @@ (define_memory_constraint "ZB"
   The offset is zero"
   (and (match_code "mem")
        (match_test "REG_P (XEXP (op, 0))")))
+
+(define_address_constraint "ZD"
+  "An address operand whose address is formed by a base register
+   and offset that is suitable for use in instructions with the same
+   addressing mode as @code{preld}."
+   (match_test "loongarch_12bit_offset_address_p (op, mode)"))
+
+(define_address_constraint "ZE"
+  "An address operand whose address is formed by a base register
+   and index register that is suitable for use in instructions
+   with the same addressing mode as @code{preldx}."
+   (match_test "loongarch_base_index_address_p (op, mode)"))
diff --git a/gcc/config/loongarch/loongarch-def.c b/gcc/config/loongarch/loongarch-def.c
index cbf995d81b5..80ab10a52a8 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -62,11 +62,13 @@ loongarch_cpu_cache[N_TUNE_TYPES] = {
       .l1d_line_size = 64,
       .l1d_size = 64,
       .l2d_size = 256,
+      .simultaneous_prefetches = 4,
   },
   [CPU_LA464] = {
       .l1d_line_size = 64,
       .l1d_size = 64,
       .l2d_size = 256,
+      .simultaneous_prefetches = 4,
   },
 };
 
diff --git a/gcc/config/loongarch/loongarch-tune.h b/gcc/config/loongarch/loongarch-tune.h
index 6f3530f5c02..8e3eb29472b 100644
--- a/gcc/config/loongarch/loongarch-tune.h
+++ b/gcc/config/loongarch/loongarch-tune.h
@@ -45,6 +45,7 @@ struct loongarch_cache {
     int l1d_line_size;  /* bytes */
     int l1d_size;       /* KiB */
     int l2d_size;       /* kiB */
+    int simultaneous_prefetches; /* number of parallel prefetch */
 };
 
 #endif /* LOONGARCH_TUNE_H */
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 8d5d8d965dd..a36802fbbf2 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "context.h"
 #include "builtins.h"
 #include "rtl-iter.h"
+#include "opts.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -5958,22 +5959,24 @@ loongarch_variable_issue (FILE *file ATTRIBUTE_UNUSED,
   return more;
 }
 
-/* Given that we have an rtx of the form (prefetch ... WRITE LOCALITY),
-   return the first operand of the associated PREF or PREFX insn.  */
+/* LoongArch only implements preld hint=0 (prefetch for load) and hint=8
+   (prefetch for store), other hint just scale to hint = 0 and hint = 1.  */
 
 rtx
 loongarch_prefetch_cookie (rtx write, rtx locality)
 {
-  /* store_streamed / load_streamed.  */
-  if (INTVAL (locality) <= 0)
-    return GEN_INT (INTVAL (write) + 4);
+  if (INTVAL (locality) == 1 && INTVAL (write) == 0)
+    return GEN_INT (INTVAL (write) + 2);
 
-  /* store / load.  */
-  if (INTVAL (locality) <= 2)
-    return write;
+  /* store.  */
+  if (INTVAL (write) == 1)
+    return GEN_INT (INTVAL (write) + 7);
 
-  /* store_retained / load_retained.  */
-  return GEN_INT (INTVAL (write) + 6);
+  /* load.  */
+  if (INTVAL (write) == 0)
+    return GEN_INT (INTVAL (write));
+
+  gcc_unreachable ();
 }
 
 /* Implement TARGET_ASM_OUTPUT_MI_THUNK.  Generate rtl rather than asm text
@@ -6100,6 +6103,33 @@ loongarch_option_override_internal (struct gcc_options *opts)
   if (loongarch_branch_cost == 0)
     loongarch_branch_cost = loongarch_cost->branch_cost;
 
+  /* Set up parameters to be used in prefetching algorithm.  */
+  int simultaneous_prefetches
+    = loongarch_cpu_cache[LARCH_ACTUAL_TUNE].simultaneous_prefetches;
+
+  SET_OPTION_IF_UNSET (opts, &global_options_set,
+                       param_simultaneous_prefetches,
+                       simultaneous_prefetches);
+
+  SET_OPTION_IF_UNSET (opts, &global_options_set,
+                       param_l1_cache_line_size,
+                       loongarch_cpu_cache[LARCH_ACTUAL_TUNE].l1d_line_size);
+
+  SET_OPTION_IF_UNSET (opts, &global_options_set,
+                       param_l1_cache_size,
+                       loongarch_cpu_cache[LARCH_ACTUAL_TUNE].l1d_size);
+
+  SET_OPTION_IF_UNSET (opts, &global_options_set,
+                       param_l2_cache_size,
+                       loongarch_cpu_cache[LARCH_ACTUAL_TUNE].l2d_size);
+
+
+  /* Enable sw prefetching at -O3 and higher.  */
+  if (opts->x_flag_prefetch_loop_arrays < 0
+      && (opts->x_optimize >= 3 || opts->x_flag_profile_use)
+      && !opts->x_optimize_size)
+    opts->x_flag_prefetch_loop_arrays = 1;
+
   if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib)
     error ("%qs cannot be used for compiling a shared library",
            "-mdirect-extern-access");
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 682ab961741..fea6bf57239 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3282,6 +3282,26 @@ (define_expand "untyped_call"
 ;;  ....................
 ;;
 
+(define_insn "prefetch"
+  [(prefetch (match_operand 0 "address_operand" "ZD,ZE")
+             (match_operand 1 "const_int_operand" "n,n")
+             (match_operand 2 "const_int_operand" "n,n"))]
+  ""
+{
+  operands[1] = loongarch_prefetch_cookie (operands[1], operands[2]);
+
+  switch (which_alternative)
+    {
+    case 0:
+      return "preld\t%1,%a0";
+    case 1:
+      return "preldx\t%1,%a0";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "prefetch,prefetchx")])
+
 (define_insn "nop"
   [(const_int 0)]
   ""
-- 
2.31.1
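
Note on the hint selection: the mapping done by the new
loongarch_prefetch_cookie can be hard to read off the diff, so here is
a minimal stand-alone sketch of it. It is only a model for review, not
part of the patch. The name prefetch_hint_model is hypothetical, and
plain ints stand in for the rtx operands (WRITE is 0 for a load and 1
for a store, LOCALITY is the 0..3 value from the prefetch pattern).

```c
#include <assert.h>

/* Hypothetical model of loongarch_prefetch_cookie as written in this
   patch.  Plain ints replace the rtx operands; the return value is the
   hint that the preld/preldx template would print as operand 1.  */
int
prefetch_hint_model (int write, int locality)
{
  /* A load with locality 1 is given hint 2.  */
  if (locality == 1 && write == 0)
    return write + 2;

  /* Every store is given hint 8, whatever the locality.  */
  if (write == 1)
    return write + 7;

  /* Every remaining load is given hint 0.  */
  if (write == 0)
    return write;

  /* Stands in for gcc_unreachable (): WRITE must be 0 or 1.  */
  assert (!"write must be 0 or 1");
  return -1;
}
```

Under this model, a load with locality 0, 2 or 3 yields hint 0, a load
with locality 1 yields hint 2, and any store yields hint 8. So a call
such as __builtin_prefetch (p, 0, 1) would be expected to emit preld
with hint 2, assuming the operands reach the pattern unchanged.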