public inbox for gcc-patches@gcc.gnu.org
From: "Kewen.Lin" <linkw@linux.ibm.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>, richard.sandiford@arm.com
Cc: Jim Wilson <jimw@sifive.com>,
	Richard Biener <richard.guenther@gmail.com>,
	Bill Schmidt <wschmidt@linux.ibm.com>,
	David Edelsohn <dje.gcc@gmail.com>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	Jim Wilson <wilson@tuliptree.org>
Subject: [PATCH 1/7 v8] ifn/optabs: Support vector load/store with length
Date: Wed, 1 Jul 2020 21:35:15 +0800
Message-ID: <7ea2bf61-5710-d3d4-26f0-665504e061c4@linux.ibm.com>
In-Reply-To: <mptimf84lja.fsf@arm.com>

[-- Attachment #1: Type: text/plain, Size: 3708 bytes --]

Hi Richard,

on 2020/6/30 11:32 PM, Richard Sandiford wrote:
> "Kewen.Lin" <linkw@linux.ibm.com> writes:
>> Hi Richard,
>>
>> Thanks for the comments!
>>
>> on 2020/6/29 6:07 PM, Richard Sandiford wrote:
>>> Thanks for the update.  I agree with the summary of the IRC discussion
>>> except for…
>>>
>>> "Kewen.Lin" <linkw@linux.ibm.com> writes:
>>>> Hi Richard S./Richi/Jim/Segher,
>>>>
>>>> Thanks a lot for your comments to make this patch more solid.
>>>>
>>>> Based on our discussion, for the vector load/store with length
>>>> optab, the length unit would be measured in lanes by default.
>>>> For the targets which support length measured in bytes like Power,
>>>> they should only define VnQI modes to wrap the other same size
>>>> vector modes.  If the length is larger than total lane/byte count
>>>> of the given mode, it's taken to load all lanes/bytes implicitly.
>>>
>>> …this last bit.  IMO the behaviour of the optab should be undefined
>>> when the supplied length is greater than the number of lanes.
>>>
>>> I think that also makes things better for the lxvl implementation,
>>> which ignores the upper 56 bits of the length.  It sounds like the
>>> above semantics would instead require Power to saturate the value
>>> at 255 before shifting it.
>>>
>>
>> Good catch.  I just realized that this part is inconsistent with what I
>> implemented in patch 5/7, where the function vect_gen_len still takes
>> the minimum of the given length and length_limit.
>>
>> This patch is updated accordingly to state that the behavior is undefined.
>> The other patches don't need to change.
>>
>> Could you take another look?  Thanks in advance!
>>
>> v6/v7: Updated optab descriptions.
>>
>> v5:
>>   - Updated lenload/lenstore optab to len_load/len_store and the docs.
>>   - Renamed expand_mask_{load,store}_optab_fn to expand_partial_{load,store}_optab_fn
>>   - Added/updated macros for expand_mask_{load,store}_optab_fn
>>     and expand_len_{load,store}_optab_fn
>>
>> v4: Update len_load_direct/len_store_direct to align with direct optab.
>>
>> v3: Get rid of length mode hook.
> 
> Thanks, mostly looks good, just some comments about the documentation…
> 

Thanks again!

v8 is attached, updated according to your comments.

Could you take another look?  Thanks!
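
For reference, the semantics documented for len_load/len_store in this
version can be modelled roughly as below.  This is only an illustrative
sketch, not code from the patch: NLANES, model_len_load and
model_len_store are made-up names, and the vector is assumed to have 16
byte-sized lanes.  The over-long case, which the documentation leaves
undefined, is modelled as an assertion failure, and the lanes that
len_load leaves undefined are simply left untouched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lane count, standing in for a V16QI-like mode.  */
#define NLANES 16

/* Toy model of len_load: copy LEN lanes from MEM into DEST.
   The remaining lanes of DEST are "undefined"; this model happens to
   leave them untouched.  LEN > NLANES is undefined behaviour, which
   the model turns into an assertion failure.  */
static void
model_len_load (uint8_t dest[NLANES], const uint8_t *mem, unsigned len)
{
  assert (len <= NLANES);
  memcpy (dest, mem, len);
}

/* Toy model of len_store: write the first LEN lanes of SRC to MEM and
   leave the rest of the memory unchanged.  */
static void
model_len_store (uint8_t *mem, const uint8_t src[NLANES], unsigned len)
{
  assert (len <= NLANES);
  memcpy (mem, src, len);
}
```

With this model, a load of 5 lanes followed by a store of 3 lanes only
touches the first 3 bytes of the destination memory.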

-----

v6/v7/v8: Updated optab descriptions.
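
As background for leaving over-long lengths undefined, here is a small
sketch of the lxvl point raised in the review.  It assumes a simplified
model in which the instruction takes its byte count from the top 8 bits
of a 64-bit register (so the expansion shifts the length left by 56) and
treats counts above 16 as a full vector.  The function names and the
model itself are illustrative assumptions, not rs6000 code.

```c
#include <stdint.h>

#define VEC_BYTES 16

/* Assumed model of an lxvl-style length operand: only the top 8 bits
   of RB are read; counts above VEC_BYTES access the whole vector.  */
static unsigned
model_lxvl_bytes (uint64_t rb)
{
  unsigned n = (unsigned) (rb >> 56);
  return n > VEC_BYTES ? VEC_BYTES : n;
}

/* Expansion that is enough when LEN > VEC_BYTES is undefined:
   a plain shift.  Any bits of LEN above the low 8 are lost.  */
static uint64_t
encode_len_plain (uint64_t len)
{
  return len << 56;
}

/* Expansion that would be needed if an over-long LEN had to mean
   "load everything": saturate at 255 before shifting.  */
static uint64_t
encode_len_saturating (uint64_t len)
{
  if (len > 255)
    len = 255;
  return len << 56;
}
```

In this model a length of 256 encoded with the plain shift wraps to a
byte count of 0, whereas the saturating form still yields a full vector;
making the over-long case undefined lets the target use the plain shift.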

v5:
  - Updated lenload/lenstore optab to len_load/len_store and the docs.
  - Renamed expand_mask_{load,store}_optab_fn to expand_partial_{load,store}_optab_fn
  - Added/updated macros for expand_mask_{load,store}_optab_fn
    and expand_len_{load,store}_optab_fn

v4: Update len_load_direct/len_store_direct to align with direct optab.

v3: Get rid of length mode hook.

BR,
Kewen
-----
gcc/ChangeLog:

2020-MM-DD  Kewen Lin  <linkw@gcc.gnu.org>

	* doc/md.texi (len_load_@var{m}): Document.
	(len_store_@var{m}): Likewise.
	* internal-fn.c (len_load_direct): New macro.
	(len_store_direct): Likewise.
	(expand_len_load_optab_fn): Likewise.
	(expand_len_store_optab_fn): Likewise.
	(direct_len_load_optab_supported_p): Likewise.
	(direct_len_store_optab_supported_p): Likewise.
	(expand_mask_load_optab_fn): New macro.  Original renamed to ...
	(expand_partial_load_optab_fn): ... here.  Add handling for
	len_load_optab.
	(expand_mask_store_optab_fn): New macro.  Original renamed to ...
	(expand_partial_store_optab_fn): ... here.  Add handling for
	len_store_optab.
	(internal_load_fn_p): Handle IFN_LEN_LOAD.
	(internal_store_fn_p): Handle IFN_LEN_STORE.
	(internal_fn_stored_value_index): Handle IFN_LEN_STORE.
	* internal-fn.def (LEN_LOAD): New internal function.
	(LEN_STORE): Likewise.
	* optabs.def (len_load_optab, len_store_optab): New optab.
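
The renaming described above follows a simple pattern: one shared
expander, with the per-IFN names kept as macro aliases, and the shared
body choosing the handler lookup by optab.  A toy sketch of that shape
is below.  All toy_* names and the enum values are hypothetical
stand-ins for illustration; the real functions take different arguments
and emit RTL.

```c
/* Stand-ins for the optabs that share one expander.  */
enum toy_optab { TOY_MASK_LOAD, TOY_MASK_LOAD_LANES, TOY_LEN_LOAD };

/* Which kind of handler lookup the shared expander would use.  */
enum toy_lookup { LOOKUP_MULTI_VECTOR, LOOKUP_DIRECT, LOOKUP_CONVERT };

/* Shared expander: the lanes variant uses the multi-vector lookup,
   the length variant is keyed on the vector mode alone (direct), and
   the mask variant is keyed on vector mode plus mask mode (convert).  */
static enum toy_lookup
toy_expand_partial_load (enum toy_optab optab)
{
  if (optab == TOY_MASK_LOAD_LANES)
    return LOOKUP_MULTI_VECTOR;
  else if (optab == TOY_LEN_LOAD)
    return LOOKUP_DIRECT;
  else
    return LOOKUP_CONVERT;
}

/* The old names survive as aliases, so code that pastes together
   "expand_<name>_optab_fn"-style identifiers keeps working.  */
#define toy_expand_mask_load toy_expand_partial_load
#define toy_expand_mask_load_lanes toy_expand_mask_load
#define toy_expand_len_load toy_expand_partial_load
```

All three aliases resolve to the same function, so adding the length
variant only needs one extra branch in the shared body.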

[-- Attachment #2: ifn_v8.diff --]
[-- Type: text/plain, Size: 9580 bytes --]

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 2c67c818da5..2b462869437 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5167,6 +5167,32 @@ mode @var{n}.
 
 This pattern is not allowed to @code{FAIL}.
 
+@cindex @code{len_load_@var{m}} instruction pattern
+@item @samp{len_load_@var{m}}
+Load the number of vector elements specified by operand 2 from memory
+operand 1 into vector register operand 0, setting the other elements of
+operand 0 to undefined values.  Operands 0 and 1 have mode @var{m},
+which must be a vector mode.  Operand 2 has whichever integer mode the
+target prefers.  If operand 2 exceeds the number of elements in mode
+@var{m}, the behavior is undefined.  If the target prefers the length
+to be measured in bytes rather than elements, it should only implement
+this pattern for vectors of @code{QI} elements.
+
+This pattern is not allowed to @code{FAIL}.
+
+@cindex @code{len_store_@var{m}} instruction pattern
+@item @samp{len_store_@var{m}}
+Store the number of vector elements specified by operand 2 from vector
+register operand 1 into memory operand 0, leaving the other elements of
+operand 0 unchanged.  Operands 0 and 1 have mode @var{m}, which must be
+a vector mode.  Operand 2 has whichever integer mode the target prefers.
+If operand 2 exceeds the number of elements in mode @var{m}, the behavior
+is undefined.  If the target prefers the length to be measured in bytes
+rather than elements, it should only implement this pattern for vectors
+of @code{QI} elements.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{vec_perm@var{m}} instruction pattern
 @item @samp{vec_perm@var{m}}
 Output a (variable) vector permutation.  Operand 0 is the destination
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 4f088de48d5..1e53ced60eb 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -104,10 +104,12 @@ init_internal_fns ()
 #define load_lanes_direct { -1, -1, false }
 #define mask_load_lanes_direct { -1, -1, false }
 #define gather_load_direct { 3, 1, false }
+#define len_load_direct { -1, -1, false }
 #define mask_store_direct { 3, 2, false }
 #define store_lanes_direct { 0, 0, false }
 #define mask_store_lanes_direct { 0, 0, false }
 #define scatter_store_direct { 3, 1, false }
+#define len_store_direct { 3, 3, false }
 #define unary_direct { 0, 0, true }
 #define binary_direct { 0, 0, true }
 #define ternary_direct { 0, 0, true }
@@ -2478,10 +2480,10 @@ expand_call_mem_ref (tree type, gcall *stmt, int index)
   return fold_build2 (MEM_REF, type, addr, build_int_cst (alias_ptr_type, 0));
 }
 
-/* Expand MASK_LOAD{,_LANES} call STMT using optab OPTAB.  */
+/* Expand MASK_LOAD{,_LANES} or LEN_LOAD call STMT using optab OPTAB.  */
 
 static void
-expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+expand_partial_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
   class expand_operand ops[3];
   tree type, lhs, rhs, maskt;
@@ -2497,6 +2499,8 @@ expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 
   if (optab == vec_mask_load_lanes_optab)
     icode = get_multi_vector_move (type, optab);
+  else if (optab == len_load_optab)
+    icode = direct_optab_handler (optab, TYPE_MODE (type));
   else
     icode = convert_optab_handler (optab, TYPE_MODE (type),
 				   TYPE_MODE (TREE_TYPE (maskt)));
@@ -2507,18 +2511,24 @@ expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   create_output_operand (&ops[0], target, TYPE_MODE (type));
   create_fixed_operand (&ops[1], mem);
-  create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
+  if (optab == len_load_optab)
+    create_convert_operand_from (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)),
+				 TYPE_UNSIGNED (TREE_TYPE (maskt)));
+  else
+    create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
   expand_insn (icode, 3, ops);
   if (!rtx_equal_p (target, ops[0].value))
     emit_move_insn (target, ops[0].value);
 }
 
+#define expand_mask_load_optab_fn expand_partial_load_optab_fn
 #define expand_mask_load_lanes_optab_fn expand_mask_load_optab_fn
+#define expand_len_load_optab_fn expand_partial_load_optab_fn
 
-/* Expand MASK_STORE{,_LANES} call STMT using optab OPTAB.  */
+/* Expand MASK_STORE{,_LANES} or LEN_STORE call STMT using optab OPTAB.  */
 
 static void
-expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+expand_partial_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
   class expand_operand ops[3];
   tree type, lhs, rhs, maskt;
@@ -2532,6 +2542,8 @@ expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 
   if (optab == vec_mask_store_lanes_optab)
     icode = get_multi_vector_move (type, optab);
+  else if (optab == len_store_optab)
+    icode = direct_optab_handler (optab, TYPE_MODE (type));
   else
     icode = convert_optab_handler (optab, TYPE_MODE (type),
 				   TYPE_MODE (TREE_TYPE (maskt)));
@@ -2542,11 +2554,17 @@ expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   reg = expand_normal (rhs);
   create_fixed_operand (&ops[0], mem);
   create_input_operand (&ops[1], reg, TYPE_MODE (type));
-  create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
+  if (optab == len_store_optab)
+    create_convert_operand_from (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)),
+				 TYPE_UNSIGNED (TREE_TYPE (maskt)));
+  else
+    create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
   expand_insn (icode, 3, ops);
 }
 
+#define expand_mask_store_optab_fn expand_partial_store_optab_fn
 #define expand_mask_store_lanes_optab_fn expand_mask_store_optab_fn
+#define expand_len_store_optab_fn expand_partial_store_optab_fn
 
 static void
 expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
@@ -3128,10 +3146,12 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 #define direct_load_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_load_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_gather_load_optab_supported_p convert_optab_supported_p
+#define direct_len_load_optab_supported_p direct_optab_supported_p
 #define direct_mask_store_optab_supported_p convert_optab_supported_p
 #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_store_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_scatter_store_optab_supported_p convert_optab_supported_p
+#define direct_len_store_optab_supported_p direct_optab_supported_p
 #define direct_while_optab_supported_p convert_optab_supported_p
 #define direct_fold_extract_optab_supported_p direct_optab_supported_p
 #define direct_fold_left_optab_supported_p direct_optab_supported_p
@@ -3498,6 +3518,7 @@ internal_load_fn_p (internal_fn fn)
     case IFN_MASK_LOAD_LANES:
     case IFN_GATHER_LOAD:
     case IFN_MASK_GATHER_LOAD:
+    case IFN_LEN_LOAD:
       return true;
 
     default:
@@ -3517,6 +3538,7 @@ internal_store_fn_p (internal_fn fn)
     case IFN_MASK_STORE_LANES:
     case IFN_SCATTER_STORE:
     case IFN_MASK_SCATTER_STORE:
+    case IFN_LEN_STORE:
       return true;
 
     default:
@@ -3577,6 +3599,7 @@ internal_fn_stored_value_index (internal_fn fn)
     case IFN_MASK_STORE:
     case IFN_SCATTER_STORE:
     case IFN_MASK_SCATTER_STORE:
+    case IFN_LEN_STORE:
       return 3;
 
     default:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 1d190d492ff..17dac128e83 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -49,11 +49,13 @@ along with GCC; see the file COPYING3.  If not see
    - load_lanes: currently just vec_load_lanes
    - mask_load_lanes: currently just vec_mask_load_lanes
    - gather_load: used for {mask_,}gather_load
+   - len_load: currently just len_load
 
    - mask_store: currently just maskstore
    - store_lanes: currently just vec_store_lanes
    - mask_store_lanes: currently just vec_mask_store_lanes
    - scatter_store: used for {mask_,}scatter_store
+   - len_store: currently just len_store
 
    - unary: a normal unary optab, such as vec_reverse_<mode>
    - binary: a normal binary optab, such as vec_interleave_lo_<mode>
@@ -127,6 +129,8 @@ DEF_INTERNAL_OPTAB_FN (GATHER_LOAD, ECF_PURE, gather_load, gather_load)
 DEF_INTERNAL_OPTAB_FN (MASK_GATHER_LOAD, ECF_PURE,
 		       mask_gather_load, gather_load)
 
+DEF_INTERNAL_OPTAB_FN (LEN_LOAD, ECF_PURE, len_load, len_load)
+
 DEF_INTERNAL_OPTAB_FN (SCATTER_STORE, 0, scatter_store, scatter_store)
 DEF_INTERNAL_OPTAB_FN (MASK_SCATTER_STORE, 0,
 		       mask_scatter_store, scatter_store)
@@ -136,6 +140,8 @@ DEF_INTERNAL_OPTAB_FN (STORE_LANES, ECF_CONST, vec_store_lanes, store_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_STORE_LANES, 0,
 		       vec_mask_store_lanes, mask_store_lanes)
 
+DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
+
 DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | ECF_NOTHROW, while_ult, while)
 DEF_INTERNAL_OPTAB_FN (CHECK_RAW_PTRS, ECF_CONST | ECF_NOTHROW,
 		       check_raw_ptrs, check_ptrs)
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 0c64eb52a8d..78409aa1453 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -435,3 +435,5 @@ OPTAB_D (check_war_ptrs_optab, "check_war_ptrs$a")
 OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)
 OPTAB_DC (vec_series_optab, "vec_series$a", VEC_SERIES)
 OPTAB_D (vec_shl_insert_optab, "vec_shl_insert_$a")
+OPTAB_D (len_load_optab, "len_load_$a")
+OPTAB_D (len_store_optab, "len_store_$a")


Thread overview: 80+ messages
2020-05-26  5:49 [PATCH 0/7] " Kewen.Lin
2020-05-26  5:51 ` [PATCH 1/7] ifn/optabs: " Kewen.Lin
2020-06-10  6:41   ` [PATCH 1/7 V2] " Kewen.Lin
2020-06-10  9:22     ` Richard Sandiford
2020-06-10 12:36       ` [PATCH 1/7 V3] " Kewen.Lin
2020-06-22  8:51         ` [PATCH 1/7 V4] " Kewen.Lin
2020-06-22 19:59           ` Richard Sandiford
2020-06-22 22:19             ` Segher Boessenkool
2020-06-23  3:54             ` [PATCH 1/7 v5] " Kewen.Lin
2020-06-23  9:52               ` Richard Sandiford
2020-06-23 11:25                 ` Richard Biener
2020-06-23 12:20                   ` Richard Sandiford
2020-06-24  2:40                     ` Jim Wilson
2020-06-24  7:34                       ` Richard Sandiford
2020-06-29  6:32                         ` [PATCH 1/7 v6] " Kewen.Lin
2020-06-29 10:07                           ` Richard Sandiford
2020-06-29 10:39                             ` [PATCH 1/7 v7] " Kewen.Lin
2020-06-30 15:32                               ` Richard Sandiford
2020-07-01 13:35                                 ` Kewen.Lin [this message]
2020-07-07  9:24                                   ` [PATCH 1/7 v8] " Richard Sandiford
2020-06-24 23:56                     ` [PATCH 1/7 v5] " Segher Boessenkool
2020-06-23  6:47             ` [PATCH 1/7 V4] " Richard Biener
2020-05-26  5:53 ` [PATCH 2/7] rs6000: lenload/lenstore optab support Kewen.Lin
2020-06-10  6:43   ` [PATCH 2/7 V2] " Kewen.Lin
2020-06-10 12:39     ` [PATCH 2/7 V3] " Kewen.Lin
2020-06-11 22:55       ` Segher Boessenkool
2020-06-12  3:02         ` Kewen.Lin
2020-06-23  3:58       ` [PATCH 2/7 v4] " Kewen.Lin
2020-06-29  6:32         ` [PATCH 2/7 v5] " Kewen.Lin
2020-06-29 17:57           ` Segher Boessenkool
2020-05-26  5:54 ` [PATCH 3/7] vect: Factor out codes for niters smaller than vf check Kewen.Lin
2020-05-26  5:55 ` [PATCH 4/7] hook/rs6000: Add vectorize length mode for vector with length Kewen.Lin
2020-06-10  6:44   ` [PATCH 4/7 V2] " Kewen.Lin
2020-05-26  5:57 ` [PATCH 5/7] vect: Support vector load/store with length in vectorizer Kewen.Lin
2020-05-26 12:49   ` Richard Sandiford
2020-05-26 12:52     ` Richard Sandiford
2020-05-27  8:25     ` Kewen.Lin
2020-05-27 10:02       ` Richard Sandiford
2020-05-28  1:21         ` Kewen.Lin
2020-05-29  8:32           ` Richard Sandiford
2020-05-29 12:38             ` Segher Boessenkool
2020-06-02  9:03             ` [PATCH 5/7 v3] " Kewen.Lin
2020-06-02 11:50               ` Richard Sandiford
2020-06-02 17:01                 ` Segher Boessenkool
2020-06-03  6:33                 ` Kewen.Lin
2020-06-10  9:19                   ` [PATCH 5/7 v4] " Kewen.Lin
2020-06-22  8:33                     ` [PATCH 5/7 v5] " Kewen.Lin
2020-06-29  6:33                       ` [PATCH 5/7 v6] " Kewen.Lin
2020-06-30 19:53                         ` Richard Sandiford
2020-07-01 13:23                           ` Kewen.Lin
2020-07-01 15:17                             ` Richard Sandiford
2020-07-02  5:20                               ` Kewen.Lin
2020-07-07  9:26                                 ` Kewen.Lin
2020-07-07 10:44                                   ` Richard Sandiford
2020-07-08  6:52                                     ` Kewen.Lin
2020-07-08 12:50                                       ` Richard Sandiford
2020-07-10  7:40                                         ` Kewen.Lin
2020-07-07 10:15                                 ` Richard Sandiford
2020-07-08  7:01                                   ` Kewen.Lin
2020-07-10  9:55                           ` [PATCH 5/7 v7] " Kewen.Lin
2020-07-17  9:54                             ` Richard Sandiford
2020-07-20  2:25                               ` Kewen.Lin
2020-05-26  5:58 ` [PATCH 6/7] ivopts: Add handlings for vector with length IFNs Kewen.Lin
2020-07-22 12:51   ` Richard Sandiford
2020-05-26  5:59 ` [PATCH 7/7] rs6000/testsuite: Vector with length test cases Kewen.Lin
2020-07-10 10:07   ` [PATCH 7/7 v2] " Kewen.Lin
2020-07-20 16:58     ` Segher Boessenkool
2020-07-21  2:53       ` Kewen.Lin
2020-05-26  7:12 ` [PATCH 0/7] Support vector load/store with length Richard Biener
2020-05-26  8:51   ` Kewen.Lin
2020-05-26  9:44     ` Richard Biener
2020-05-26 10:10       ` Kewen.Lin
2020-05-26 12:29         ` Richard Sandiford
2020-05-27  0:09           ` Segher Boessenkool
2020-05-27  7:25             ` Richard Biener
2020-05-27  8:50               ` Kewen.Lin
2020-05-27 14:08               ` Segher Boessenkool
2020-05-26 22:34   ` Jim Wilson
2020-05-27  7:21     ` Richard Biener
2020-05-27  7:46       ` Richard Sandiford
