From: Richard Biener <richard.guenther@gmail.com>
To: Richard Sandiford <Richard.Sandiford@arm.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [5/6] Account for the cost of generating loop masks
Date: Wed, 06 Nov 2019 12:16:00 -0000
Message-ID: <CAFiYyc3LO8CDm6DLuReZ8vMmRCkwcuXLup=xjZHYyF9OK0UaLg@mail.gmail.com>
In-Reply-To: <mpto8xqs8d9.fsf@arm.com>
On Tue, Nov 5, 2019 at 3:31 PM Richard Sandiford
<Richard.Sandiford@arm.com> wrote:
>
> We didn't take the cost of generating loop masks into account, and so
> tended to underestimate the cost of loops that need multiple masks.
OK.
>
> 2019-11-05 Richard Sandiford <richard.sandiford@arm.com>
>
> gcc/
> * tree-vect-loop.c (vect_estimate_min_profitable_iters): Include
> the cost of generating loop masks.
>
> gcc/testsuite/
> * gcc.target/aarch64/sve/mask_struct_store_3.c: Add
> -fno-vect-cost-model.
> * gcc.target/aarch64/sve/mask_struct_store_3_run.c: Likewise.
> * gcc.target/aarch64/sve/peel_ind_3.c: Likewise.
> * gcc.target/aarch64/sve/peel_ind_3_run.c: Likewise.
>
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c 2019-11-05 14:19:58.781197820 +0000
> +++ gcc/tree-vect-loop.c 2019-11-05 14:20:40.188909187 +0000
> @@ -3435,6 +3435,32 @@ vect_estimate_min_profitable_iters (loop
>                                    si->kind, si->stmt_info, si->misalign,
>                                    vect_epilogue);
>         }
> +
> +      /* Calculate how many masks we need to generate. */
> +      unsigned int num_masks = 0;
> +      rgroup_masks *rgm;
> +      unsigned int num_vectors_m1;
> +      FOR_EACH_VEC_ELT (LOOP_VINFO_MASKS (loop_vinfo), num_vectors_m1, rgm)
> +        if (rgm->mask_type)
> +          num_masks += num_vectors_m1 + 1;
> +      gcc_assert (num_masks > 0);
> +
> +      /* In the worst case, we need to generate each mask in the prologue
> +         and in the loop body. One of the loop body mask instructions
> +         replaces the comparison in the scalar loop, and since we don't
> +         count the scalar comparison against the scalar body, we shouldn't
> +         count that vector instruction against the vector body either.
> +
> +         Sometimes we can use unpacks instead of generating prologue
> +         masks and sometimes the prologue mask will fold to a constant,
> +         so the actual prologue cost might be smaller. However, it's
> +         simpler and safer to use the worst-case cost; if this ends up
> +         being the tie-breaker between vectorizing or not, then it's
> +         probably better not to vectorize. */
> +      (void) add_stmt_cost (target_cost_data, num_masks, vector_stmt,
> +                            NULL, 0, vect_prologue);
> +      (void) add_stmt_cost (target_cost_data, num_masks - 1, vector_stmt,
> +                            NULL, 0, vect_body);
>      }
>    else if (npeel < 0)
>      {
> Index: gcc/testsuite/gcc.target/aarch64/sve/mask_struct_store_3.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/mask_struct_store_3.c 2019-03-08 18:14:29.768994780 +0000
> +++ gcc/testsuite/gcc.target/aarch64/sve/mask_struct_store_3.c 2019-11-05 14:20:40.184909216 +0000
> @@ -1,5 +1,5 @@
> /* { dg-do compile } */
> -/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
> +/* { dg-options "-O2 -ftree-vectorize -ffast-math -fno-vect-cost-model" } */
>
> #include <stdint.h>
>
> Index: gcc/testsuite/gcc.target/aarch64/sve/mask_struct_store_3_run.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/mask_struct_store_3_run.c 2019-03-08 18:14:29.772994767 +0000
> +++ gcc/testsuite/gcc.target/aarch64/sve/mask_struct_store_3_run.c 2019-11-05 14:20:40.184909216 +0000
> @@ -1,5 +1,5 @@
> /* { dg-do run { target aarch64_sve_hw } } */
> -/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
> +/* { dg-options "-O2 -ftree-vectorize -ffast-math -fno-vect-cost-model" } */
>
> #include "mask_struct_store_3.c"
>
> Index: gcc/testsuite/gcc.target/aarch64/sve/peel_ind_3.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/peel_ind_3.c 2019-03-08 18:14:29.776994751 +0000
> +++ gcc/testsuite/gcc.target/aarch64/sve/peel_ind_3.c 2019-11-05 14:20:40.184909216 +0000
> @@ -1,7 +1,7 @@
> /* { dg-do compile } */
> /* Pick an arbitrary target for which unaligned accesses are more
> expensive. */
> -/* { dg-options "-O3 -msve-vector-bits=256 -mtune=thunderx" } */
> +/* { dg-options "-O3 -msve-vector-bits=256 -mtune=thunderx -fno-vect-cost-model" } */
>
> #define N 32
> #define MAX_START 8
> Index: gcc/testsuite/gcc.target/aarch64/sve/peel_ind_3_run.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/peel_ind_3_run.c 2019-03-08 18:14:29.784994721 +0000
> +++ gcc/testsuite/gcc.target/aarch64/sve/peel_ind_3_run.c 2019-11-05 14:20:40.184909216 +0000
> @@ -1,6 +1,6 @@
> /* { dg-do run { target aarch64_sve_hw } } */
> -/* { dg-options "-O3 -mtune=thunderx" } */
> -/* { dg-options "-O3 -mtune=thunderx -msve-vector-bits=256" { target aarch64_sve256_hw } } */
> +/* { dg-options "-O3 -mtune=thunderx -fno-vect-cost-model" } */
> +/* { dg-options "-O3 -mtune=thunderx -msve-vector-bits=256 -fno-vect-cost-model" { target aarch64_sve256_hw } } */
>
> #include "peel_ind_3.c"
>
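
For readers following the tree-vect-loop.c hunk, the accounting it adds can be sketched in standalone C. This is only an illustration under stated assumptions, not GCC code: `rgroup_sketch`, `mask_cost` and `estimate_mask_cost` are hypothetical names, the array stands in for LOOP_VINFO_MASKS (entry i describing the rgroup that needs i + 1 masks), and the returned counts stand in for the two add_stmt_cost calls, which in the patch charge that many vector_stmt operations to the prologue and to the body.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one entry of LOOP_VINFO_MASKS.  Entry I
   describes the rgroup that needs I + 1 masks; HAS_MASK_TYPE plays the
   role of rgm->mask_type being non-null, i.e. the entry is in use.  */
struct rgroup_sketch
{
  bool has_mask_type;
};

/* Hypothetical result type: how many mask-generating vector statements
   are charged to the prologue and to the loop body.  */
struct mask_cost
{
  unsigned int prologue_stmts;
  unsigned int body_stmts;
};

/* Mirror the counting in the patch: sum I + 1 over the entries in use,
   charge every mask to the prologue, and charge all but one to the body,
   since one body mask instruction replaces the scalar loop comparison,
   which is not counted against the scalar body either.  */
static struct mask_cost
estimate_mask_cost (const struct rgroup_sketch *rgroups, unsigned int n)
{
  unsigned int num_masks = 0;
  for (unsigned int num_vectors_m1 = 0; num_vectors_m1 < n; ++num_vectors_m1)
    if (rgroups[num_vectors_m1].has_mask_type)
      num_masks += num_vectors_m1 + 1;

  /* A fully-masked loop needs at least one mask.  */
  assert (num_masks > 0);

  struct mask_cost cost = { num_masks, num_masks - 1 };
  return cost;
}
```

With entries 0, 1 and 3 in use, the sketch counts 1 + 2 + 4 = 7 masks, so 7 statements are charged to the prologue and 6 to the body. A loop that needs a single mask is charged 1 in the prologue and 0 in the body, which is why only loops needing multiple masks see a higher body cost.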