Subject: Re: [PATCH v3] LoongArch: Add prefetch instructions.
To: WANG Xuerui, gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, xuchenghua@loongson.cn, xujiahao
References: <20221116021027.519897-1-chenglulu@loongson.cn>
From: Lulu Cheng
Message-ID: <05ac7d65-3f1a-92ff-401c-71fcc36dd422@loongson.cn>
Date: Wed, 16 Nov 2022 11:19:23 +0800

On 2022/11/16 at 11:06 AM, WANG Xuerui wrote:
>
> On 2022/11/16 10:10, Lulu Cheng wrote:
>> v2 -> v3:
>> 1. Remove preldx support.
>>
>> ---------------------------------------
>> Enable sw prefetching at -O3 and higher.
>>
>> Co-Authored-By: xujiahao
>>
>> gcc/ChangeLog:
>>
>>     * config/loongarch/constraints.md (ZD): New constraint.
>>     * config/loongarch/loongarch-def.c: Initial number of parallel prefetch.
>>     * config/loongarch/loongarch-tune.h (struct loongarch_cache):
>>     Define number of parallel prefetch.
>>     * config/loongarch/loongarch.cc (loongarch_option_override_internal):
>>     Set up parameters to be used in prefetching algorithm.
>>     * config/loongarch/loongarch.md (prefetch): New template.
>> ---
>>  gcc/config/loongarch/constraints.md   | 10 ++++++++++
>>  gcc/config/loongarch/loongarch-def.c  |  2 ++
>>  gcc/config/loongarch/loongarch-tune.h |  1 +
>>  gcc/config/loongarch/loongarch.cc     | 28 +++++++++++++++++++++++++++
>>  gcc/config/loongarch/loongarch.md     | 14 ++++++++++++++
>>  5 files changed, 55 insertions(+)
>>
>> diff --git a/gcc/config/loongarch/constraints.md b/gcc/config/loongarch/constraints.md
>> index 43cb7b5f0f5..46f7f63ae31 100644
>> --- a/gcc/config/loongarch/constraints.md
>> +++ b/gcc/config/loongarch/constraints.md
>> @@ -86,6 +86,10 @@
>>  ;;    "ZB"
>>  ;;      "An address that is held in a general-purpose register.
>>  ;;      The offset is zero"
>> +;;    "ZD"
>> +;;    "An address operand whose address is formed by a base register
>> +;;     and offset that is suitable for use in instructions with the same
>> +;;     addressing mode as @code{preld}."
>>  ;; "<" "Matches a pre-dec or post-dec operand." (Global non-architectural)
>>  ;; ">" "Matches a pre-inc or post-inc operand." (Global non-architectural)
>>
>> @@ -190,3 +194,9 @@ (define_memory_constraint "ZB"
>>    The offset is zero"
>>    (and (match_code "mem")
>>         (match_test "REG_P (XEXP (op, 0))")))
>> +
>> +(define_address_constraint "ZD"
>> +  "An address operand whose address is formed by a base register
>> +   and offset that is suitable for use in instructions with the same
>> +   addressing mode as @code{preld}."
>> +   (match_test "loongarch_12bit_offset_address_p (op, mode)"))
>
> How is this different with the "m" constraint? AFAIK preld and ld
> share the same addressing mode (i.e. base register + 12-bit signed
> immediate offset).

The "m" constraint is defined as follows:

(define_memory_constraint "m"
  (and (match_code "mem")
       (match_test "loongarch_12bit_offset_address_p (XEXP (op, 0), mode)")))

That definition requires the operand to be a memory operand, and it
tests the address inside the MEM. The "ZD" constraint instead matches
an address operand directly. As I understand it,
(mem:mode (address operand)) is a memory operand.

>
>> diff --git a/gcc/config/loongarch/loongarch-def.c b/gcc/config/loongarch/loongarch-def.c
>> index cbf995d81b5..80ab10a52a8 100644
>> --- a/gcc/config/loongarch/loongarch-def.c
>> +++ b/gcc/config/loongarch/loongarch-def.c
>> @@ -62,11 +62,13 @@ loongarch_cpu_cache[N_TUNE_TYPES] = {
>>        .l1d_line_size = 64,
>>        .l1d_size = 64,
>>        .l2d_size = 256,
>> +      .simultaneous_prefetches = 4,
>>    },
>>    [CPU_LA464] = {
>>        .l1d_line_size = 64,
>>        .l1d_size = 64,
>>        .l2d_size = 256,
>> +      .simultaneous_prefetches = 4,
>>    },
>>  };
>>
>> diff --git a/gcc/config/loongarch/loongarch-tune.h b/gcc/config/loongarch/loongarch-tune.h
>> index 6f3530f5c02..8e3eb29472b 100644
>> --- a/gcc/config/loongarch/loongarch-tune.h
>> +++ b/gcc/config/loongarch/loongarch-tune.h
>> @@ -45,6 +45,7 @@ struct loongarch_cache {
>>      int l1d_line_size;  /* bytes */
>>      int l1d_size;       /* KiB */
>>      int l2d_size;       /* kiB */
>> +    int simultaneous_prefetches; /* number of parallel prefetch */
> nit: "prefetches" or "prefetch ops" or "int prefetch_width"?
>>  };
>>
>>  #endif /* LOONGARCH_TUNE_H */
>> diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
>> index 8d5d8d965dd..8ee32c90573 100644
>> --- a/gcc/config/loongarch/loongarch.cc
>> +++ b/gcc/config/loongarch/loongarch.cc
>> @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "context.h"
>>  #include "builtins.h"
>>  #include "rtl-iter.h"
>> +#include "opts.h"
>>
>>  /* This file should be included last.  */
>>  #include "target-def.h"
>> @@ -6100,6 +6101,33 @@ loongarch_option_override_internal (struct gcc_options *opts)
>>    if (loongarch_branch_cost == 0)
>>      loongarch_branch_cost = loongarch_cost->branch_cost;
>>
>> +  /* Set up parameters to be used in prefetching algorithm.  */
>> +  int simultaneous_prefetches
>> +    = loongarch_cpu_cache[LARCH_ACTUAL_TUNE].simultaneous_prefetches;
>> +
>> +  SET_OPTION_IF_UNSET (opts, &global_options_set,
>> +                       param_simultaneous_prefetches,
>> +                       simultaneous_prefetches);
>> +
>> +  SET_OPTION_IF_UNSET (opts, &global_options_set,
>> +                       param_l1_cache_line_size,
>> +                       loongarch_cpu_cache[LARCH_ACTUAL_TUNE].l1d_line_size);
>> +
>> +  SET_OPTION_IF_UNSET (opts, &global_options_set,
>> +                       param_l1_cache_size,
>> +                       loongarch_cpu_cache[LARCH_ACTUAL_TUNE].l1d_size);
>> +
>> +  SET_OPTION_IF_UNSET (opts, &global_options_set,
>> +                       param_l2_cache_size,
>> +                       loongarch_cpu_cache[LARCH_ACTUAL_TUNE].l2d_size);
>> +
>> +
>> +  /* Enable sw prefetching at -O3 and higher.  */
>> +  if (opts->x_flag_prefetch_loop_arrays < 0
>> +      && (opts->x_optimize >= 3 || opts->x_flag_profile_use)
>> +      && !opts->x_optimize_size)
>> +    opts->x_flag_prefetch_loop_arrays = 1;
>> +
>>    if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib)
>>      error ("%qs cannot be used for compiling a shared library",
>>             "-mdirect-extern-access");
>> diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
>> index 682ab961741..2fda5381904 100644
>> --- a/gcc/config/loongarch/loongarch.md
>> +++ b/gcc/config/loongarch/loongarch.md
>> @@ -3282,6 +3282,20 @@ (define_expand "untyped_call"
>>  ;;  ....................
>>  ;;
>>
>> +(define_insn "prefetch"
>> +  [(prefetch (match_operand 0 "address_operand" "ZD")
>> +             (match_operand 1 "const_int_operand" "n")
>> +             (match_operand 2 "const_int_operand" "n"))]
>> +  ""
>> +{
>> +  switch (INTVAL (operands[1]))
>> +  {
>> +    case 0: return "preld\t0,%a0";
>> +    case 1: return "preld\t8,%a0";
>> +    default: gcc_unreachable ();
>> +  }
>> +})
>> +
>>  (define_insn "nop"
>>    [(const_int 0)]
>>    ""
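
To illustrate the difference between the "m" and "ZD" constraints discussed in the reply above, here is a small toy model in plain C. It is only a sketch and not GCC code: every toy_* type, function, and object is hypothetical and stands in for RTL and for loongarch_12bit_offset_address_p. It also assumes that "12-bit signed offset" means the range -2048..2047 and that a bare base register counts as an address with offset zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for RTL expressions.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_MEM };

struct toy_rtx {
  enum toy_code code;
  long value;                 /* TOY_CONST_INT: the integer value.  */
  const struct toy_rtx *op0;  /* TOY_PLUS: base; TOY_MEM: the address.  */
  const struct toy_rtx *op1;  /* TOY_PLUS: the offset.  */
};

/* Toy version of a "base register + signed 12-bit offset" address test.  */
bool toy_12bit_offset_address_p (const struct toy_rtx *x)
{
  if (x == NULL)
    return false;
  if (x->code == TOY_REG)     /* Assumed: bare register means offset 0.  */
    return true;
  return x->code == TOY_PLUS
         && x->op0 != NULL && x->op0->code == TOY_REG
         && x->op1 != NULL && x->op1->code == TOY_CONST_INT
         && x->op1->value >= -2048 && x->op1->value <= 2047;
}

/* "m"-like test: the operand must be a MEM; its inner address is checked.  */
bool toy_matches_m (const struct toy_rtx *op)
{
  return op != NULL && op->code == TOY_MEM
         && toy_12bit_offset_address_p (op->op0);
}

/* "ZD"-like test: the operand itself is the address.  */
bool toy_matches_zd (const struct toy_rtx *op)
{
  return toy_12bit_offset_address_p (op);
}

/* Sample operands: (plus reg 16), (mem (plus reg 16)), (plus reg 2048).  */
const struct toy_rtx toy_base     = { TOY_REG, 0, NULL, NULL };
const struct toy_rtx toy_off      = { TOY_CONST_INT, 16, NULL, NULL };
const struct toy_rtx toy_addr     = { TOY_PLUS, 0, &toy_base, &toy_off };
const struct toy_rtx toy_mem      = { TOY_MEM, 0, &toy_addr, NULL };
const struct toy_rtx toy_far_off  = { TOY_CONST_INT, 2048, NULL, NULL };
const struct toy_rtx toy_far_addr = { TOY_PLUS, 0, &toy_base, &toy_far_off };

/* Usage example: the same address satisfies "ZD" bare and "m" when wrapped.  */
void toy_self_check (void)
{
  assert (toy_matches_zd (&toy_addr) && !toy_matches_m (&toy_addr));
  assert (toy_matches_m (&toy_mem) && !toy_matches_zd (&toy_mem));
}
```

In this model the two tests accept the same set of addresses and differ only in whether the operand is the bare address or a MEM wrapping it. That is the point made in the reply: the prefetch pattern's operand 0 is an address, not a MEM.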