From: Richard Biener
Date: Tue, 05 Nov 2019 12:57:00 -0000
Subject: Re: [11/n] Support vectorisation with mixed vector sizes
To: Richard Sandiford
Cc: GCC Patches

On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford wrote:
>
> After previous patches, it's now possible to make the vectoriser
> support multiple vector sizes in the same vector region, using
> related_vector_mode to pick the right vector mode for a given
> element mode.  No port yet takes advantage of this, but I have
> a follow-on patch for AArch64.
>
> This patch also seemed like a good opportunity to add some more dump
> messages: one to make it clear which vector size/mode was being used
> when analysis passed or failed, and another to say when we've decided
> to skip a redundant vector size/mode.

OK.

I wonder if, when we requested a specific size previously, we now
have to verify we got that constraint satisfied after the change.
Esp. the epilogue vectorization cases want to get V2DI
from V4DI.

          sz /= 2;
-         vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
+         vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
+                                                         scalar_type,
+                                                         sz / scalar_bytes);

doesn't look like an improvement in readability to me there.  Maybe
re-formulating the whole code in terms of lanes instead of size
would make it easier to follow?

Thanks,
Richard.

>
> 2019-10-24  Richard Sandiford
>
> gcc/
>         * machmode.h (opt_machine_mode::operator==): New function.
>         (opt_machine_mode::operator!=): Likewise.
>         * tree-vectorizer.h (vec_info::vector_mode): Update comment.
>         (get_related_vectype_for_scalar_type): Declare.
>         (get_vectype_for_scalar_type_and_size): Delete.
>         * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
>         whether analysis passed or failed, and with what vector modes.
>         Use related_vector_mode to check whether trying a particular
>         vector mode would be redundant with the autodetected mode,
>         and print a dump message if we decide to skip it.
>         * tree-vect-loop.c (vect_analyze_loop): Likewise.
>         (vect_create_epilog_for_reduction): Use
>         get_related_vectype_for_scalar_type instead of
>         get_vectype_for_scalar_type_and_size.
>         * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace
>         with...
>         (get_related_vectype_for_scalar_type): ...this new function.
>         Take a starting/"prevailing" vector mode rather than a vector size.
>         Take an optional nunits argument, with the same meaning as for
>         related_vector_mode.  Use related_vector_mode when not
>         auto-detecting a mode, falling back to mode_for_vector if no
>         target mode exists.
>         (get_vectype_for_scalar_type): Update accordingly.
>         (get_same_sized_vectype): Likewise.
>         * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise.
>
> Index: gcc/machmode.h
> ===================================================================
> --- gcc/machmode.h      2019-10-25 13:26:59.053879364 +0100
> +++ gcc/machmode.h      2019-10-25 13:27:26.201687539 +0100
> @@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS)
>    bool exists () const;
>    template<typename U> bool exists (U *) const;
>
> +  bool operator== (const T &m) const { return m_mode == m; }
> +  bool operator!= (const T &m) const { return m_mode != m; }
> +
>  private:
>    machine_mode m_mode;
>  };
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2019-10-25 13:27:19.317736181 +0100
> +++ gcc/tree-vectorizer.h       2019-10-25 13:27:26.209687483 +0100
> @@ -329,8 +329,9 @@ typedef std::pair<tree, tree> vec_object
>    /* Cost data used by the target cost model.  */
>    void *target_cost_data;
>
> -  /* If we've chosen a vector size for this vectorization region,
> -     this is one mode that has such a size, otherwise it is VOIDmode.  */
> +  /* The argument we should pass to related_vector_mode when looking up
> +     the vector mode for a scalar mode, or VOIDmode if we haven't yet
> +     made any decisions about which vector modes to use.  */
>    machine_mode vector_mode;
>
>  private:
> @@ -1595,8 +1596,9 @@ extern dump_user_location_t find_loop_lo
>  extern bool vect_can_advance_ivs_p (loop_vec_info);
>
>  /* In tree-vect-stmts.c.  */
> +extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
> +                                                 poly_uint64 = 0);
>  extern tree get_vectype_for_scalar_type (vec_info *, tree);
> -extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64);
>  extern tree get_mask_type_for_scalar_type (vec_info *, tree);
>  extern tree get_same_sized_vectype (tree, tree);
>  extern bool vect_get_loop_mask_type (loop_vec_info);
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2019-10-25 13:27:19.313736209 +0100
> +++ gcc/tree-vect-slp.c 2019-10-25 13:27:26.205687511 +0100
> @@ -3118,7 +3118,12 @@ vect_slp_bb_region (gimple_stmt_iterator
>            && dbg_cnt (vect_slp))
>          {
>            if (dump_enabled_p ())
> -            dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
> +            {
> +              dump_printf_loc (MSG_NOTE, vect_location,
> +                               "***** Analysis succeeded with vector mode"
> +                               " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode));
> +              dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
> +            }
>
>            bb_vinfo->shared->check_datarefs ();
>            vect_schedule_slp (bb_vinfo);
> @@ -3138,6 +3143,13 @@ vect_slp_bb_region (gimple_stmt_iterator
>
>            vectorized = true;
>          }
> +      else
> +        {
> +          if (dump_enabled_p ())
> +            dump_printf_loc (MSG_NOTE, vect_location,
> +                             "***** Analysis failed with vector mode %s\n",
> +                             GET_MODE_NAME (bb_vinfo->vector_mode));
> +        }
>
>        if (mode_i == 0)
>          autodetected_vector_mode = bb_vinfo->vector_mode;
> @@ -3145,9 +3157,22 @@ vect_slp_bb_region (gimple_stmt_iterator
>        delete bb_vinfo;
>
>        if (mode_i < vector_modes.length ()
> -          && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
> -                       GET_MODE_SIZE (autodetected_vector_mode)))
> -        mode_i += 1;
> +          && VECTOR_MODE_P (autodetected_vector_mode)
> +          && (related_vector_mode (vector_modes[mode_i],
> +                                   GET_MODE_INNER (autodetected_vector_mode))
> +              == autodetected_vector_mode)
> +          && (related_vector_mode (autodetected_vector_mode,
> +                                   GET_MODE_INNER (vector_modes[mode_i]))
> +              == vector_modes[mode_i]))
> +        {
> +          if (dump_enabled_p ())
> +            dump_printf_loc (MSG_NOTE, vect_location,
> +                             "***** Skipping vector mode %s, which would"
> +                             " repeat the analysis for %s\n",
> +                             GET_MODE_NAME (vector_modes[mode_i]),
> +                             GET_MODE_NAME (autodetected_vector_mode));
> +          mode_i += 1;
> +        }
>
>        if (vectorized
>            || mode_i == vector_modes.length ()
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c        2019-10-25 13:27:19.309736237 +0100
> +++ gcc/tree-vect-loop.c        2019-10-25 13:27:26.201687539 +0100
> @@ -2367,6 +2367,17 @@ vect_analyze_loop (class loop *loop, loo
>        opt_result res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
>        if (mode_i == 0)
>          autodetected_vector_mode = loop_vinfo->vector_mode;
> +      if (dump_enabled_p ())
> +        {
> +          if (res)
> +            dump_printf_loc (MSG_NOTE, vect_location,
> +                             "***** Analysis succeeded with vector mode %s\n",
> +                             GET_MODE_NAME (loop_vinfo->vector_mode));
> +          else
> +            dump_printf_loc (MSG_NOTE, vect_location,
> +                             "***** Analysis failed with vector mode %s\n",
> +                             GET_MODE_NAME (loop_vinfo->vector_mode));
> +        }
>
>        if (res)
>          {
> @@ -2400,9 +2411,22 @@ vect_analyze_loop (class loop *loop, loo
>          }
>
>        if (mode_i < vector_modes.length ()
> -          && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
> -                       GET_MODE_SIZE (autodetected_vector_mode)))
> -        mode_i += 1;
> +          && VECTOR_MODE_P (autodetected_vector_mode)
> +          && (related_vector_mode (vector_modes[mode_i],
> +                                   GET_MODE_INNER (autodetected_vector_mode))
> +              == autodetected_vector_mode)
> +          && (related_vector_mode (autodetected_vector_mode,
> +                                   GET_MODE_INNER (vector_modes[mode_i]))
> +              == vector_modes[mode_i]))
> +        {
> +          if (dump_enabled_p ())
> +            dump_printf_loc (MSG_NOTE, vect_location,
> +                             "***** Skipping vector mode %s, which would"
> +                             " repeat the analysis for %s\n",
> +                             GET_MODE_NAME (vector_modes[mode_i]),
> +                             GET_MODE_NAME (autodetected_vector_mode));
> +          mode_i += 1;
> +        }
>
>        if (mode_i == vector_modes.length ()
>            || autodetected_vector_mode == VOIDmode)
> @@ -4763,7 +4787,10 @@ vect_create_epilog_for_reduction (stmt_v
>            && (mode1 = targetm.vectorize.split_reduction (mode)) != mode)
>          sz1 = GET_MODE_SIZE (mode1).to_constant ();
>
> -      tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1);
> +      unsigned int scalar_bytes = tree_to_uhwi (TYPE_SIZE_UNIT (scalar_type));
> +      tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> +                                                           scalar_type,
> +                                                           sz1 / scalar_bytes);
>        reduce_with_shift = have_whole_vector_shift (mode1);
>        if (!VECTOR_MODE_P (mode1))
>          reduce_with_shift = false;
> @@ -4781,7 +4808,9 @@ vect_create_epilog_for_reduction (stmt_v
>              {
>                gcc_assert (!slp_reduc);
>                sz /= 2;
> -              vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
> +              vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> +                                                              scalar_type,
> +                                                              sz / scalar_bytes);
>
>                /* The target has to make sure we support lowpart/highpart
>                   extraction, either via direct vector extract or through
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c       2019-10-25 13:27:22.985710263 +0100
> +++ gcc/tree-vect-stmts.c       2019-10-25 13:27:26.205687511 +0100
> @@ -11111,18 +11111,28 @@ vect_remove_stores (stmt_vec_info first_
>      }
>  }
>
> -/* Function get_vectype_for_scalar_type_and_size.
> -
> -   Returns the vector type corresponding to SCALAR_TYPE and SIZE as supported
> -   by the target.  */
> +/* If NUNITS is nonzero, return a vector type that contains NUNITS
> +   elements of type SCALAR_TYPE, or null if the target doesn't support
> +   such a type.
> +
> +   If NUNITS is zero, return a vector type that contains elements of
> +   type SCALAR_TYPE, choosing whichever vector size the target prefers.
> +
> +   If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode
> +   for this vectorization region and want to "autodetect" the best choice.
> +   Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE
> +   and we want the new type to be interoperable with it.  PREVAILING_MODE
> +   in this case can be a scalar integer mode or a vector mode; when it
> +   is a vector mode, the function acts like a tree-level version of
> +   related_vector_mode.  */
>
>  tree
> -get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size)
> +get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
> +                                     tree scalar_type, poly_uint64 nunits)
>  {
>    tree orig_scalar_type = scalar_type;
>    scalar_mode inner_mode;
>    machine_mode simd_mode;
> -  poly_uint64 nunits;
>    tree vectype;
>
>    if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
> @@ -11162,10 +11172,11 @@ get_vectype_for_scalar_type_and_size (tr
>    if (scalar_type == NULL_TREE)
>      return NULL_TREE;
>
> -  /* If no size was supplied use the mode the target prefers.  Otherwise
> -     lookup a vector mode of the specified size.  */
> -  if (known_eq (size, 0U))
> +  /* If no prevailing mode was supplied, use the mode the target prefers.
> +     Otherwise lookup a vector mode based on the prevailing mode.  */
> +  if (prevailing_mode == VOIDmode)
>      {
> +      gcc_assert (known_eq (nunits, 0U));
>        simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
>        if (SCALAR_INT_MODE_P (simd_mode))
>          {
> @@ -11181,9 +11192,19 @@ get_vectype_for_scalar_type_and_size (tr
>              return NULL_TREE;
>          }
>      }
> -  else if (!multiple_p (size, nbytes, &nunits)
> -           || !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
> -    return NULL_TREE;
> +  else if (SCALAR_INT_MODE_P (prevailing_mode)
> +           || !related_vector_mode (prevailing_mode,
> +                                    inner_mode, nunits).exists (&simd_mode))
> +    {
> +      /* Fall back to using mode_for_vector, mostly in the hope of being
> +         able to use an integer mode.  */
> +      if (known_eq (nunits, 0U)
> +          && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits))
> +        return NULL_TREE;
> +
> +      if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode))
> +        return NULL_TREE;
> +    }
>
>    vectype = build_vector_type_for_mode (scalar_type, simd_mode);
>
> @@ -11211,9 +11232,8 @@ get_vectype_for_scalar_type_and_size (tr
>  tree
>  get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type)
>  {
> -  tree vectype;
> -  poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode);
> -  vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size);
> +  tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode,
> +                                                      scalar_type);
>    if (vectype && vinfo->vector_mode == VOIDmode)
>      vinfo->vector_mode = TYPE_MODE (vectype);
>    return vectype;
> @@ -11246,8 +11266,13 @@ get_same_sized_vectype (tree scalar_type
>    if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
>      return truth_type_for (vector_type);
>
> -  return get_vectype_for_scalar_type_and_size
> -    (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type)));
> +  poly_uint64 nunits;
> +  if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)),
> +                   GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits))
> +    return NULL_TREE;
> +
> +  return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type),
> +                                              scalar_type, nunits);
>  }
>
>  /* Function vect_is_simple_use.
> Index: gcc/tree-vectorizer.c
> ===================================================================
> --- gcc/tree-vectorizer.c       2019-10-25 13:27:19.317736181 +0100
> +++ gcc/tree-vectorizer.c       2019-10-25 13:27:26.209687483 +0100
> @@ -1348,7 +1348,7 @@ get_vec_alignment_for_array_type (tree t
>    poly_uint64 array_size, vector_size;
>
>    tree scalar_type = strip_array_types (type);
> -  tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0);
> +  tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type);
>    if (!vectype
>        || !poly_int_tree_p (TYPE_SIZE (type), &array_size)
>        || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)
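
For illustration only, the lanes-based formulation of the epilogue halving
raised in the review comment could look roughly like the sketch below.  This
is a standalone C++ model and not GCC code: vec_shape, halve_by_size and
halve_by_lanes are hypothetical names invented for the example, and real code
would look up each type via get_related_vectype_for_scalar_type rather than
building shapes directly.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a vector type: element size in bytes plus
// the number of lanes.  Not a GCC data structure.
struct vec_shape
{
  unsigned elt_bytes;
  unsigned nunits;
  unsigned size_bytes () const { return elt_bytes * nunits; }
};

// Size-based formulation, mirroring the shape of the patched code:
// track a byte size, halve it, and convert back to a lane count
// (sz / scalar_bytes) every time a vector type is requested.
std::vector<vec_shape>
halve_by_size (vec_shape start, unsigned target_bytes)
{
  std::vector<vec_shape> steps;
  unsigned scalar_bytes = start.elt_bytes;
  unsigned sz = start.size_bytes ();
  while (sz > target_bytes)
    {
      sz /= 2;
      steps.push_back ({ scalar_bytes, sz / scalar_bytes });
    }
  return steps;
}

// Lanes-based formulation: track the lane count directly, so no
// division by the scalar size is needed at each lookup.
std::vector<vec_shape>
halve_by_lanes (vec_shape start, unsigned target_nunits)
{
  std::vector<vec_shape> steps;
  unsigned nunits = start.nunits;
  while (nunits > target_nunits)
    {
      nunits /= 2;
      steps.push_back ({ start.elt_bytes, nunits });
    }
  return steps;
}
```

Starting from a V4DI-like shape (4 lanes of 8 bytes) and stopping at 16 bytes
or at 2 lanes respectively, both loops yield a single V2DI-like step; the
lanes version simply drops the per-lookup size-to-lanes conversion.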