From: Richard Biener <richard.guenther@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>,
GCC Patches <gcc-patches@gcc.gnu.org>,
Richard Sandiford <richard.sandiford@arm.com>
Subject: Re: [11/n] Support vectorisation with mixed vector sizes
Date: Tue, 12 Nov 2019 09:22:00 -0000 [thread overview]
Message-ID: <CAFiYyc1Tx-T7oqJbFUcz0k0aV1oNbut6J86khrGCtDf=Rpx=Sg@mail.gmail.com> (raw)
In-Reply-To: <mpt4kzhp4cw.fsf@arm.com>
On Wed, Nov 6, 2019 at 1:38 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener <richard.guenther@gmail.com> writes:
> > On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> After previous patches, it's now possible to make the vectoriser
> >> support multiple vector sizes in the same vector region, using
> >> related_vector_mode to pick the right vector mode for a given
> >> element mode. No port yet takes advantage of this, but I have
> >> a follow-on patch for AArch64.
> >>
> >> This patch also seemed like a good opportunity to add some more dump
> >> messages: one to make it clear which vector size/mode was being used
> >> when analysis passed or failed, and another to say when we've decided
> >> to skip a redundant vector size/mode.
> >
> > OK.
> >
> > I wonder if, when we requested a specific size previously, we now
> > have to verify we got that constraint satisfied after the change.
> > Esp. the epilogue vectorization cases want to get V2DI
> > from V4DI.
> >
> > sz /= 2;
> > - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
> > + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> > + scalar_type,
> > + sz / scalar_bytes);
> >
> > doesn't look like an improvement in readability to me there.
>
> Yeah, guess it isn't great.
>
> > Maybe re-formulating the whole code in terms of lanes instead of size
> > would make it easier to follow?
>
> OK, how about this version? It still won't win awards, but it's at
> least a bit more readable.
>
> Tested as before.
OK (and sorry for the delay, looking for leftovers of the series now).
Thanks,
Richard.
> Richard
>
>
> 2019-11-06 Richard Sandiford <richard.sandiford@arm.com>
>
> gcc/
> * machmode.h (opt_machine_mode::operator==): New function.
> (opt_machine_mode::operator!=): Likewise.
> * tree-vectorizer.h (vec_info::vector_mode): Update comment.
> (get_related_vectype_for_scalar_type): Delete.
> (get_vectype_for_scalar_type_and_size): Declare.
> * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
> whether analysis passed or failed, and with what vector modes.
> Use related_vector_mode to check whether trying a particular
> vector mode would be redundant with the autodetected mode,
> and print a dump message if we decide to skip it.
> * tree-vect-loop.c (vect_analyze_loop): Likewise.
> (vect_create_epilog_for_reduction): Use
> get_related_vectype_for_scalar_type instead of
> get_vectype_for_scalar_type_and_size.
> * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace
> with...
> (get_related_vectype_for_scalar_type): ...this new function.
> Take a starting/"prevailing" vector mode rather than a vector size.
> Take an optional nunits argument, with the same meaning as for
> related_vector_mode. Use related_vector_mode when not
> auto-detecting a mode, falling back to mode_for_vector if no
> target mode exists.
> (get_vectype_for_scalar_type): Update accordingly.
> (get_same_sized_vectype): Likewise.
> * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise.
>
> Index: gcc/machmode.h
> ===================================================================
> --- gcc/machmode.h 2019-11-06 12:35:12.460201615 +0000
> +++ gcc/machmode.h 2019-11-06 12:35:27.972093472 +0000
> @@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS)
> bool exists () const;
> template<typename U> bool exists (U *) const;
>
> + bool operator== (const T &m) const { return m_mode == m; }
> + bool operator!= (const T &m) const { return m_mode != m; }
> +
> private:
> machine_mode m_mode;
> };
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h 2019-11-06 12:35:12.764199495 +0000
> +++ gcc/tree-vectorizer.h 2019-11-06 12:35:27.976093444 +0000
> @@ -335,8 +335,9 @@ typedef std::pair<tree, tree> vec_object
> /* Cost data used by the target cost model. */
> void *target_cost_data;
>
> - /* If we've chosen a vector size for this vectorization region,
> - this is one mode that has such a size, otherwise it is VOIDmode. */
> + /* The argument we should pass to related_vector_mode when looking up
> + the vector mode for a scalar mode, or VOIDmode if we haven't yet
> + made any decisions about which vector modes to use. */
> machine_mode vector_mode;
>
> private:
> @@ -1609,8 +1610,9 @@ extern bool vect_can_advance_ivs_p (loop
> extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code);
>
> /* In tree-vect-stmts.c. */
> +extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
> + poly_uint64 = 0);
> extern tree get_vectype_for_scalar_type (vec_info *, tree);
> -extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64);
> extern tree get_mask_type_for_scalar_type (vec_info *, tree);
> extern tree get_same_sized_vectype (tree, tree);
> extern bool vect_get_loop_mask_type (loop_vec_info);
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2019-11-06 12:35:12.760199523 +0000
> +++ gcc/tree-vect-slp.c 2019-11-06 12:35:27.972093472 +0000
> @@ -3202,7 +3202,12 @@ vect_slp_bb_region (gimple_stmt_iterator
> && dbg_cnt (vect_slp))
> {
> if (dump_enabled_p ())
> - dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
> + {
> + dump_printf_loc (MSG_NOTE, vect_location,
> + "***** Analysis succeeded with vector mode"
> + " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode));
> + dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
> + }
>
> bb_vinfo->shared->check_datarefs ();
> vect_schedule_slp (bb_vinfo);
> @@ -3222,6 +3227,13 @@ vect_slp_bb_region (gimple_stmt_iterator
>
> vectorized = true;
> }
> + else
> + {
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_NOTE, vect_location,
> + "***** Analysis failed with vector mode %s\n",
> + GET_MODE_NAME (bb_vinfo->vector_mode));
> + }
>
> if (mode_i == 0)
> autodetected_vector_mode = bb_vinfo->vector_mode;
> @@ -3229,9 +3241,22 @@ vect_slp_bb_region (gimple_stmt_iterator
> delete bb_vinfo;
>
> if (mode_i < vector_modes.length ()
> - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
> - GET_MODE_SIZE (autodetected_vector_mode)))
> - mode_i += 1;
> + && VECTOR_MODE_P (autodetected_vector_mode)
> + && (related_vector_mode (vector_modes[mode_i],
> + GET_MODE_INNER (autodetected_vector_mode))
> + == autodetected_vector_mode)
> + && (related_vector_mode (autodetected_vector_mode,
> + GET_MODE_INNER (vector_modes[mode_i]))
> + == vector_modes[mode_i]))
> + {
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_NOTE, vect_location,
> + "***** Skipping vector mode %s, which would"
> + " repeat the analysis for %s\n",
> + GET_MODE_NAME (vector_modes[mode_i]),
> + GET_MODE_NAME (autodetected_vector_mode));
> + mode_i += 1;
> + }
>
> if (vectorized
> || mode_i == vector_modes.length ()
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c 2019-11-06 12:35:12.756199552 +0000
> +++ gcc/tree-vect-loop.c 2019-11-06 12:35:27.972093472 +0000
> @@ -2417,6 +2417,17 @@ vect_analyze_loop (class loop *loop, vec
> res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
> if (mode_i == 0)
> autodetected_vector_mode = loop_vinfo->vector_mode;
> + if (dump_enabled_p ())
> + {
> + if (res)
> + dump_printf_loc (MSG_NOTE, vect_location,
> + "***** Analysis succeeded with vector mode %s\n",
> + GET_MODE_NAME (loop_vinfo->vector_mode));
> + else
> + dump_printf_loc (MSG_NOTE, vect_location,
> + "***** Analysis failed with vector mode %s\n",
> + GET_MODE_NAME (loop_vinfo->vector_mode));
> + }
>
> loop->aux = NULL;
> if (res)
> @@ -2479,9 +2490,22 @@ vect_analyze_loop (class loop *loop, vec
> }
>
> if (mode_i < vector_modes.length ()
> - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
> - GET_MODE_SIZE (autodetected_vector_mode)))
> - mode_i += 1;
> + && VECTOR_MODE_P (autodetected_vector_mode)
> + && (related_vector_mode (vector_modes[mode_i],
> + GET_MODE_INNER (autodetected_vector_mode))
> + == autodetected_vector_mode)
> + && (related_vector_mode (autodetected_vector_mode,
> + GET_MODE_INNER (vector_modes[mode_i]))
> + == vector_modes[mode_i]))
> + {
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_NOTE, vect_location,
> + "***** Skipping vector mode %s, which would"
> + " repeat the analysis for %s\n",
> + GET_MODE_NAME (vector_modes[mode_i]),
> + GET_MODE_NAME (autodetected_vector_mode));
> + mode_i += 1;
> + }
>
> if (mode_i == vector_modes.length ()
> || autodetected_vector_mode == VOIDmode)
> @@ -4870,13 +4894,15 @@ vect_create_epilog_for_reduction (stmt_v
> in a vector mode of smaller size and first reduce upper/lower
> halves against each other. */
> enum machine_mode mode1 = mode;
> - unsigned sz = tree_to_uhwi (TYPE_SIZE_UNIT (vectype));
> - unsigned sz1 = sz;
> + unsigned nunits = TYPE_VECTOR_SUBPARTS (vectype).to_constant ();
> + unsigned nunits1 = nunits;
> if (!slp_reduc
> && (mode1 = targetm.vectorize.split_reduction (mode)) != mode)
> - sz1 = GET_MODE_SIZE (mode1).to_constant ();
> + nunits1 = GET_MODE_NUNITS (mode1).to_constant ();
>
> - tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1);
> + tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> + scalar_type,
> + nunits1);
> reduce_with_shift = have_whole_vector_shift (mode1);
> if (!VECTOR_MODE_P (mode1))
> reduce_with_shift = false;
> @@ -4890,11 +4916,13 @@ vect_create_epilog_for_reduction (stmt_v
> /* First reduce the vector to the desired vector size we should
> do shift reduction on by combining upper and lower halves. */
> new_temp = new_phi_result;
> - while (sz > sz1)
> + while (nunits > nunits1)
> {
> gcc_assert (!slp_reduc);
> - sz /= 2;
> - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
> + nunits /= 2;
> + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> + scalar_type, nunits);
> + unsigned int bitsize = tree_to_uhwi (TYPE_SIZE (vectype1));
>
> /* The target has to make sure we support lowpart/highpart
> extraction, either via direct vector extract or through
> @@ -4919,15 +4947,14 @@ vect_create_epilog_for_reduction (stmt_v
> = gimple_build_assign (dst2, BIT_FIELD_REF,
> build3 (BIT_FIELD_REF, vectype1,
> new_temp, TYPE_SIZE (vectype1),
> - bitsize_int (sz * BITS_PER_UNIT)));
> + bitsize_int (bitsize)));
> gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT);
> }
> else
> {
> /* Extract via punning to appropriately sized integer mode
> vector. */
> - tree eltype = build_nonstandard_integer_type (sz * BITS_PER_UNIT,
> - 1);
> + tree eltype = build_nonstandard_integer_type (bitsize, 1);
> tree etype = build_vector_type (eltype, 2);
> gcc_assert (convert_optab_handler (vec_extract_optab,
> TYPE_MODE (etype),
> @@ -4956,7 +4983,7 @@ vect_create_epilog_for_reduction (stmt_v
> = gimple_build_assign (tem, BIT_FIELD_REF,
> build3 (BIT_FIELD_REF, eltype,
> new_temp, TYPE_SIZE (eltype),
> - bitsize_int (sz * BITS_PER_UNIT)));
> + bitsize_int (bitsize)));
> gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT);
> dst2 = make_ssa_name (vectype1);
> epilog_stmt = gimple_build_assign (dst2, VIEW_CONVERT_EXPR,
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c 2019-11-06 12:35:12.796199272 +0000
> +++ gcc/tree-vect-stmts.c 2019-11-06 12:35:27.976093444 +0000
> @@ -11097,18 +11097,28 @@ vect_remove_stores (stmt_vec_info first_
> }
> }
>
> -/* Function get_vectype_for_scalar_type_and_size.
> -
> - Returns the vector type corresponding to SCALAR_TYPE and SIZE as supported
> - by the target. */
> +/* If NUNITS is nonzero, return a vector type that contains NUNITS
> + elements of type SCALAR_TYPE, or null if the target doesn't support
> + such a type.
> +
> + If NUNITS is zero, return a vector type that contains elements of
> + type SCALAR_TYPE, choosing whichever vector size the target prefers.
> +
> + If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode
> + for this vectorization region and want to "autodetect" the best choice.
> + Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE
> + and we want the new type to be interoperable with it. PREVAILING_MODE
> + in this case can be a scalar integer mode or a vector mode; when it
> + is a vector mode, the function acts like a tree-level version of
> + related_vector_mode. */
>
> tree
> -get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size)
> +get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
> + tree scalar_type, poly_uint64 nunits)
> {
> tree orig_scalar_type = scalar_type;
> scalar_mode inner_mode;
> machine_mode simd_mode;
> - poly_uint64 nunits;
> tree vectype;
>
> if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
> @@ -11148,10 +11158,11 @@ get_vectype_for_scalar_type_and_size (tr
> if (scalar_type == NULL_TREE)
> return NULL_TREE;
>
> - /* If no size was supplied use the mode the target prefers. Otherwise
> - lookup a vector mode of the specified size. */
> - if (known_eq (size, 0U))
> + /* If no prevailing mode was supplied, use the mode the target prefers.
> + Otherwise lookup a vector mode based on the prevailing mode. */
> + if (prevailing_mode == VOIDmode)
> {
> + gcc_assert (known_eq (nunits, 0U));
> simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
> if (SCALAR_INT_MODE_P (simd_mode))
> {
> @@ -11167,9 +11178,19 @@ get_vectype_for_scalar_type_and_size (tr
> return NULL_TREE;
> }
> }
> - else if (!multiple_p (size, nbytes, &nunits)
> - || !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
> - return NULL_TREE;
> + else if (SCALAR_INT_MODE_P (prevailing_mode)
> + || !related_vector_mode (prevailing_mode,
> + inner_mode, nunits).exists (&simd_mode))
> + {
> + /* Fall back to using mode_for_vector, mostly in the hope of being
> + able to use an integer mode. */
> + if (known_eq (nunits, 0U)
> + && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits))
> + return NULL_TREE;
> +
> + if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode))
> + return NULL_TREE;
> + }
>
> vectype = build_vector_type_for_mode (scalar_type, simd_mode);
>
> @@ -11197,9 +11218,8 @@ get_vectype_for_scalar_type_and_size (tr
> tree
> get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type)
> {
> - tree vectype;
> - poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode);
> - vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size);
> + tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode,
> + scalar_type);
> if (vectype && vinfo->vector_mode == VOIDmode)
> vinfo->vector_mode = TYPE_MODE (vectype);
> return vectype;
> @@ -11232,8 +11252,13 @@ get_same_sized_vectype (tree scalar_type
> if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
> return truth_type_for (vector_type);
>
> - return get_vectype_for_scalar_type_and_size
> - (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type)));
> + poly_uint64 nunits;
> + if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)),
> + GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits))
> + return NULL_TREE;
> +
> + return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type),
> + scalar_type, nunits);
> }
>
> /* Function vect_is_simple_use.
> Index: gcc/tree-vectorizer.c
> ===================================================================
> --- gcc/tree-vectorizer.c 2019-11-06 12:35:12.764199495 +0000
> +++ gcc/tree-vectorizer.c 2019-11-06 12:35:27.976093444 +0000
> @@ -1359,7 +1359,7 @@ get_vec_alignment_for_array_type (tree t
> poly_uint64 array_size, vector_size;
>
> tree scalar_type = strip_array_types (type);
> - tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0);
> + tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type);
> if (!vectype
> || !poly_int_tree_p (TYPE_SIZE (type), &array_size)
> || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)
next prev parent reply other threads:[~2019-11-12 9:22 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-25 12:32 [0/n] Support multiple vector sizes for vectorisation Richard Sandiford
2019-10-25 12:34 ` [7/n] Use consistent compatibility checks in vectorizable_shift Richard Sandiford
2019-10-30 14:33 ` Richard Biener
2019-10-25 12:34 ` [6/n] Use build_vector_type_for_mode in get_vectype_for_scalar_type_and_size Richard Sandiford
2019-10-30 14:32 ` Richard Biener
2019-10-25 12:39 ` [8/n] Replace autovectorize_vector_sizes with autovectorize_vector_modes Richard Sandiford
2019-10-30 14:48 ` Richard Biener
2019-10-30 16:33 ` Richard Sandiford
2019-11-11 10:30 ` Richard Sandiford
2019-11-11 14:33 ` Richard Biener
2019-11-12 17:55 ` Richard Sandiford
2019-11-13 14:32 ` Richard Biener
2019-11-13 16:16 ` Richard Sandiford
2019-10-25 12:41 ` [9/n] Replace vec_info::vector_size with vec_info::vector_mode Richard Sandiford
2019-11-05 12:47 ` Richard Biener
2019-10-25 12:43 ` [10/n] Make less use of get_same_sized_vectype Richard Sandiford
2019-11-05 12:50 ` Richard Biener
2019-11-05 15:34 ` Richard Sandiford
2019-11-05 16:09 ` Richard Biener
2019-10-25 12:44 ` [11/n] Support vectorisation with mixed vector sizes Richard Sandiford
2019-11-05 12:57 ` Richard Biener
2019-11-06 12:38 ` Richard Sandiford
2019-11-12 9:22 ` Richard Biener [this message]
2019-10-25 12:49 ` [12/n] [AArch64] Support vectorising with multiple " Richard Sandiford
2019-10-25 12:51 ` [13/n] Allow mixed vector sizes within a single vectorised stmt Richard Sandiford
2019-11-05 12:58 ` Richard Biener
2019-10-25 13:00 ` [14/n] Vectorise conversions between differently-sized integer vectors Richard Sandiford
2019-11-05 13:02 ` Richard Biener
2019-11-06 12:45 ` Richard Sandiford
2019-11-12 9:40 ` Richard Biener
2019-10-29 17:05 ` [15/n] Consider building nodes from scalars in vect_slp_analyze_node_operations Richard Sandiford
2019-11-05 13:07 ` Richard Biener
2019-10-29 17:14 ` [16/n] Apply maximum nunits for BB SLP Richard Sandiford
2019-11-05 13:22 ` Richard Biener
2019-11-05 14:09 ` Richard Sandiford
2019-11-14 12:22 ` Richard Biener
2019-11-05 20:10 ` [10a/n] Require equal type sizes for vectorised calls Richard Sandiford
2019-11-06 9:44 ` Richard Biener
2019-11-05 20:25 ` [11a/n] Avoid retrying with the same vector modes Richard Sandiford
2019-11-06 9:49 ` Richard Biener
2019-11-06 10:21 ` Richard Sandiford
2019-11-06 10:27 ` Richard Biener
2019-11-06 11:02 ` Richard Sandiford
2019-11-06 11:22 ` Richard Biener
2019-11-06 12:47 ` Richard Sandiford
2019-11-12 9:25 ` Richard Biener
2019-11-05 20:45 ` [17/17] Extend can_duplicate_and_interleave_p to mixed-size vectors Richard Sandiford
2019-11-14 12:23 ` Richard Biener
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAFiYyc1Tx-T7oqJbFUcz0k0aV1oNbut6J86khrGCtDf=Rpx=Sg@mail.gmail.com' \
--to=richard.guenther@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=richard.sandiford@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).