* [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm @ 2023-08-18 7:46 Richard Biener 2023-08-18 8:39 ` Richard Sandiford 0 siblings, 1 reply; 7+ messages in thread From: Richard Biener @ 2023-08-18 7:46 UTC (permalink / raw) To: gcc-patches; +Cc: richard.sandiford The following avoids running into somehow flawed logic in fold_vec_perm for non-VLA vectors. Bootstrap & regtest running on x86_64-unknown-linux-gnu. Richard. PR tree-optimization/111048 * fold-const.cc (fold_vec_perm_cst): Check for non-VLA vectors first. * gcc.dg/torture/pr111048.c: New testcase. --- gcc/fold-const.cc | 12 ++++++------ gcc/testsuite/gcc.dg/torture/pr111048.c | 24 ++++++++++++++++++++++++ 2 files changed, 30 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr111048.c diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 5c51c9d91be..144fd7481b3 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -10625,6 +10625,11 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, unsigned res_npatterns, res_nelts_per_pattern; unsigned HOST_WIDE_INT res_nelts; + if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) + { + res_npatterns = res_nelts; + res_nelts_per_pattern = 1; + } /* (1) If SEL is a suitable mask as determined by valid_mask_for_fold_vec_perm_cst_p, then: res_npatterns = max of npatterns between ARG0, ARG1, and SEL @@ -10634,7 +10639,7 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, res_npatterns = nelts in result vector. res_nelts_per_pattern = 1. This exception is made so that VLS ARG0, ARG1 and SEL work as before. */ - if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) + else if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) { res_npatterns = std::max (VECTOR_CST_NPATTERNS (arg0), @@ -10648,11 +10653,6 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, res_nelts = res_npatterns * res_nelts_per_pattern; } - else if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) - { - res_npatterns = res_nelts; - res_nelts_per_pattern = 1; - } else return NULL_TREE; diff --git a/gcc/testsuite/gcc.dg/torture/pr111048.c b/gcc/testsuite/gcc.dg/torture/pr111048.c new file mode 100644 index 00000000000..475978aae2b --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr111048.c @@ -0,0 +1,24 @@ +/* { dg-do run } */ +/* { dg-additional-options "-mavx2" { target avx2_runtime } } */ + +typedef unsigned char u8; + +__attribute__((noipa)) +static void check(const u8 * v) { + if (*v != 15) __builtin_trap(); +} + +__attribute__((noipa)) +static void bug(void) { + u8 in_lanes[32]; + for (unsigned i = 0; i < 32; i += 2) { + in_lanes[i + 0] = 0; + in_lanes[i + 1] = ((u8)0xff) >> (i & 7); + } + + check(&in_lanes[13]); + } + +int main() { + bug(); +} -- 2.35.3 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm 2023-08-18 7:46 [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm Richard Biener @ 2023-08-18 8:39 ` Richard Sandiford 2023-08-18 9:22 ` Richard Biener 0 siblings, 1 reply; 7+ messages in thread From: Richard Sandiford @ 2023-08-18 8:39 UTC (permalink / raw) To: Richard Biener; +Cc: gcc-patches, prathamesh.kulkarni Richard Biener <rguenther@suse.de> writes: > The following avoids running into somehow flawed logic in fold_vec_perm > for non-VLA vectors. > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. > > Richard. > > PR tree-optimization/111048 > * fold-const.cc (fold_vec_perm_cst): Check for non-VLA > vectors first. > > * gcc.dg/torture/pr111048.c: New testcase. Please don't do this as a permanent thing. It was a deliberate choice to have the is_constant be the fallback, so that the "generic" (VLA+VLS) logic gets more coverage. Like you say, if something is wrong for VLS then the chances are that it's also wrong for VLA. Thanks, Richard > --- > gcc/fold-const.cc | 12 ++++++------ > gcc/testsuite/gcc.dg/torture/pr111048.c | 24 ++++++++++++++++++++++++ > 2 files changed, 30 insertions(+), 6 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/torture/pr111048.c > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > index 5c51c9d91be..144fd7481b3 100644 > --- a/gcc/fold-const.cc > +++ b/gcc/fold-const.cc > @@ -10625,6 +10625,11 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > unsigned res_npatterns, res_nelts_per_pattern; > unsigned HOST_WIDE_INT res_nelts; > > + if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > + { > + res_npatterns = res_nelts; > + res_nelts_per_pattern = 1; > + } > /* (1) If SEL is a suitable mask as determined by > valid_mask_for_fold_vec_perm_cst_p, then: > res_npatterns = max of npatterns between ARG0, ARG1, and SEL > @@ -10634,7 +10639,7 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > res_npatterns = nelts in result vector. > res_nelts_per_pattern = 1. > This exception is made so that VLS ARG0, ARG1 and SEL work as before. */ > - if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > + else if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > { > res_npatterns > = std::max (VECTOR_CST_NPATTERNS (arg0), > @@ -10648,11 +10653,6 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > res_nelts = res_npatterns * res_nelts_per_pattern; > } > - else if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > - { > - res_npatterns = res_nelts; > - res_nelts_per_pattern = 1; > - } > else > return NULL_TREE; > > diff --git a/gcc/testsuite/gcc.dg/torture/pr111048.c b/gcc/testsuite/gcc.dg/torture/pr111048.c > new file mode 100644 > index 00000000000..475978aae2b > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/torture/pr111048.c > @@ -0,0 +1,24 @@ > +/* { dg-do run } */ > +/* { dg-additional-options "-mavx2" { target avx2_runtime } } */ > + > +typedef unsigned char u8; > + > +__attribute__((noipa)) > +static void check(const u8 * v) { > + if (*v != 15) __builtin_trap(); > +} > + > +__attribute__((noipa)) > +static void bug(void) { > + u8 in_lanes[32]; > + for (unsigned i = 0; i < 32; i += 2) { > + in_lanes[i + 0] = 0; > + in_lanes[i + 1] = ((u8)0xff) >> (i & 7); > + } > + > + check(&in_lanes[13]); > + } > + > +int main() { > + bug(); > +} ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm 2023-08-18 8:39 ` Richard Sandiford @ 2023-08-18 9:22 ` Richard Biener 2023-08-19 10:35 ` Prathamesh Kulkarni 0 siblings, 1 reply; 7+ messages in thread From: Richard Biener @ 2023-08-18 9:22 UTC (permalink / raw) To: Richard Sandiford; +Cc: gcc-patches, prathamesh.kulkarni On Fri, 18 Aug 2023, Richard Sandiford wrote: > Richard Biener <rguenther@suse.de> writes: > > The following avoids running into somehow flawed logic in fold_vec_perm > > for non-VLA vectors. > > > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. > > > > Richard. > > > > PR tree-optimization/111048 > > * fold-const.cc (fold_vec_perm_cst): Check for non-VLA > > vectors first. > > > > * gcc.dg/torture/pr111048.c: New testcase. > > Please don't do this as a permanent thing. It was a deliberate choice > to have the is_constant be the fallback, so that the "generic" (VLA+VLS) > logic gets more coverage. Like you say, if something is wrong for VLS > then the chances are that it's also wrong for VLA. Sure, feel free to undo this change together with the fix for the VLA case. Richard. > Thanks, > Richard > > > > --- > > gcc/fold-const.cc | 12 ++++++------ > > gcc/testsuite/gcc.dg/torture/pr111048.c | 24 ++++++++++++++++++++++++ > > 2 files changed, 30 insertions(+), 6 deletions(-) > > create mode 100644 gcc/testsuite/gcc.dg/torture/pr111048.c > > > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > > index 5c51c9d91be..144fd7481b3 100644 > > --- a/gcc/fold-const.cc > > +++ b/gcc/fold-const.cc > > @@ -10625,6 +10625,11 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > unsigned res_npatterns, res_nelts_per_pattern; > > unsigned HOST_WIDE_INT res_nelts; > > > > + if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > + { > > + res_npatterns = res_nelts; > > + res_nelts_per_pattern = 1; > > + } > > /* (1) If SEL is a suitable mask as determined by > > valid_mask_for_fold_vec_perm_cst_p, then: > > res_npatterns = max of npatterns between ARG0, ARG1, and SEL > > @@ -10634,7 +10639,7 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > res_npatterns = nelts in result vector. > > res_nelts_per_pattern = 1. > > This exception is made so that VLS ARG0, ARG1 and SEL work as before. */ > > - if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > + else if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > { > > res_npatterns > > = std::max (VECTOR_CST_NPATTERNS (arg0), > > @@ -10648,11 +10653,6 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > > > res_nelts = res_npatterns * res_nelts_per_pattern; > > } > > - else if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > - { > > - res_npatterns = res_nelts; > > - res_nelts_per_pattern = 1; > > - } > > else > > return NULL_TREE; > > > > diff --git a/gcc/testsuite/gcc.dg/torture/pr111048.c b/gcc/testsuite/gcc.dg/torture/pr111048.c > > new file mode 100644 > > index 00000000000..475978aae2b > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/torture/pr111048.c > > @@ -0,0 +1,24 @@ > > +/* { dg-do run } */ > > +/* { dg-additional-options "-mavx2" { target avx2_runtime } } */ > > + > > +typedef unsigned char u8; > > + > > +__attribute__((noipa)) > > +static void check(const u8 * v) { > > + if (*v != 15) __builtin_trap(); > > +} > > + > > +__attribute__((noipa)) > > +static void bug(void) { > > + u8 in_lanes[32]; > > + for (unsigned i = 0; i < 32; i += 2) { > > + in_lanes[i + 0] = 0; > > + in_lanes[i + 1] = ((u8)0xff) >> (i & 7); > > + } > > + > > + check(&in_lanes[13]); > > + } > > + > > +int main() { > > + bug(); > > +} > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm 2023-08-18 9:22 ` Richard Biener @ 2023-08-19 10:35 ` Prathamesh Kulkarni 2023-08-21 6:56 ` Richard Biener 0 siblings, 1 reply; 7+ messages in thread From: Prathamesh Kulkarni @ 2023-08-19 10:35 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Sandiford, gcc-patches [-- Attachment #1: Type: text/plain, Size: 4235 bytes --] On Fri, 18 Aug 2023 at 14:52, Richard Biener <rguenther@suse.de> wrote: > > On Fri, 18 Aug 2023, Richard Sandiford wrote: > > > Richard Biener <rguenther@suse.de> writes: > > > The following avoids running into somehow flawed logic in fold_vec_perm > > > for non-VLA vectors. > > > > > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. > > > > > > Richard. > > > > > > PR tree-optimization/111048 > > > * fold-const.cc (fold_vec_perm_cst): Check for non-VLA > > > vectors first. > > > > > > * gcc.dg/torture/pr111048.c: New testcase. > > > > Please don't do this as a permanent thing. It was a deliberate choice > > to have the is_constant be the fallback, so that the "generic" (VLA+VLS) > > logic gets more coverage. Like you say, if something is wrong for VLS > > then the chances are that it's also wrong for VLA. > > Sure, feel free to undo this change together with the fix for the > VLA case. Hi, The attached patch reverts the workaround, and fixes the issue. Bootstrapped+tested on aarch64-linux-gnu with and without SVE, and x64_64-linux-gnu. OK to commit ? Thanks, Prathamesh > > Richard. > > > Thanks, > > Richard > > > > > > > --- > > > gcc/fold-const.cc | 12 ++++++------ > > > gcc/testsuite/gcc.dg/torture/pr111048.c | 24 ++++++++++++++++++++++++ > > > 2 files changed, 30 insertions(+), 6 deletions(-) > > > create mode 100644 gcc/testsuite/gcc.dg/torture/pr111048.c > > > > > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > > > index 5c51c9d91be..144fd7481b3 100644 > > > --- a/gcc/fold-const.cc > > > +++ b/gcc/fold-const.cc > > > @@ -10625,6 +10625,11 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > > unsigned res_npatterns, res_nelts_per_pattern; > > > unsigned HOST_WIDE_INT res_nelts; > > > > > > + if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > > + { > > > + res_npatterns = res_nelts; > > > + res_nelts_per_pattern = 1; > > > + } > > > /* (1) If SEL is a suitable mask as determined by > > > valid_mask_for_fold_vec_perm_cst_p, then: > > > res_npatterns = max of npatterns between ARG0, ARG1, and SEL > > > @@ -10634,7 +10639,7 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > > res_npatterns = nelts in result vector. > > > res_nelts_per_pattern = 1. > > > This exception is made so that VLS ARG0, ARG1 and SEL work as before. */ > > > - if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > > + else if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > > { > > > res_npatterns > > > = std::max (VECTOR_CST_NPATTERNS (arg0), > > > @@ -10648,11 +10653,6 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > > > > > res_nelts = res_npatterns * res_nelts_per_pattern; > > > } > > > - else if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > > - { > > > - res_npatterns = res_nelts; > > > - res_nelts_per_pattern = 1; > > > - } > > > else > > > return NULL_TREE; > > > > > > diff --git a/gcc/testsuite/gcc.dg/torture/pr111048.c b/gcc/testsuite/gcc.dg/torture/pr111048.c > > > new file mode 100644 > > > index 00000000000..475978aae2b > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.dg/torture/pr111048.c > > > @@ -0,0 +1,24 @@ > > > +/* { dg-do run } */ > > > +/* { dg-additional-options "-mavx2" { target avx2_runtime } } */ > > > + > > > +typedef unsigned char u8; > > > + > > > +__attribute__((noipa)) > > > +static void check(const u8 * v) { > > > + if (*v != 15) __builtin_trap(); > > > +} > > > + > > > +__attribute__((noipa)) > > > +static void bug(void) { > > > + u8 in_lanes[32]; > > > + for (unsigned i = 0; i < 32; i += 2) { > > > + in_lanes[i + 0] = 0; > > > + in_lanes[i + 1] = ((u8)0xff) >> (i & 7); > > > + } > > > + > > > + check(&in_lanes[13]); > > > + } > > > + > > > +int main() { > > > + bug(); > > > +} > > > > -- > Richard Biener <rguenther@suse.de> > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) [-- Attachment #2: gnu-887-1.txt --] [-- Type: text/plain, Size: 4021 bytes --] PR111048: Set arg_npatterns correctly. In valid_mask_for_fold_vec_perm_cst we set arg_npatterns always to VECTOR_CST_NPATTERNS (arg0) because of (q1 & 0) == 0 in following condition: /* Ensure that the stepped sequence always selects from the same input pattern. */ unsigned arg_npatterns = ((q1 & 0) == 0) ? VECTOR_CST_NPATTERNS (arg0) : VECTOR_CST_NPATTERNS (arg1); resulting in wrong code-gen issues. The patch fixes this by changing the condition to (q1 & 1) == 0. gcc/ChangeLog: PR tree-optimization/111048 * fold-const.cc (valid_mask_for_fold_vec_perm_cst_p): Set arg_npatterns correctly. (fold_vec_perm_cst): Remove workaround and again call valid_mask_fold_vec_perm_cst_p for both VLS and VLA vectors. (test_fold_vec_perm_cst::test_nunits_min_4): Add test-case. diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 08530b6ae80..fa1d5834bc3 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -10599,7 +10599,7 @@ valid_mask_for_fold_vec_perm_cst_p (tree arg0, tree arg1, /* Ensure that the stepped sequence always selects from the same input pattern. */ unsigned arg_npatterns - = ((q1 & 0) == 0) ? VECTOR_CST_NPATTERNS (arg0) + = ((q1 & 1) == 0) ? VECTOR_CST_NPATTERNS (arg0) : VECTOR_CST_NPATTERNS (arg1); if (!multiple_p (step, arg_npatterns)) @@ -10625,11 +10625,6 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, unsigned res_npatterns, res_nelts_per_pattern; unsigned HOST_WIDE_INT res_nelts; - if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) - { - res_npatterns = res_nelts; - res_nelts_per_pattern = 1; - } /* (1) If SEL is a suitable mask as determined by valid_mask_for_fold_vec_perm_cst_p, then: res_npatterns = max of npatterns between ARG0, ARG1, and SEL @@ -10639,7 +10634,7 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, res_npatterns = nelts in result vector. res_nelts_per_pattern = 1. This exception is made so that VLS ARG0, ARG1 and SEL work as before. */ - else if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) + if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) { res_npatterns = std::max (VECTOR_CST_NPATTERNS (arg0), @@ -10653,6 +10648,11 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, res_nelts = res_npatterns * res_nelts_per_pattern; } + else if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) + { + res_npatterns = res_nelts; + res_nelts_per_pattern = 1; + } else return NULL_TREE; @@ -17518,6 +17518,36 @@ test_nunits_min_4 (machine_mode vmode) ARG0(1), ARG0(0), ARG0(2) }; validate_res (2, 3, res, expected_res); } + + /* Case 7: PR111048: Check that we set arg_npatterns correctly, + when arg0, arg1 and sel have different number of patterns. + arg0 is of shape (1, 1) + arg1 is of shape (4, 1) + sel is of shape (2, 3) = {1, len, 2, len+1, 3, len+2, ...} + + In this case the pattern: {len, len+1, len+2, ...} chooses arg1. + However, + step = (len+2) - (len+1) = 1 + arg_npatterns = VECTOR_CST_NPATTERNS (arg1) = 4 + Since step is not a multiple of arg_npatterns, + valid_mask_for_fold_vec_perm_cst should return false, + and thus fold_vec_perm_cst should return NULL_TREE. */ + { + tree arg0 = build_vec_cst_rand (vmode, 1, 1); + tree arg1 = build_vec_cst_rand (vmode, 4, 1); + poly_uint64 len = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0)); + + vec_perm_builder builder (len, 2, 3); + poly_uint64 mask_elems[] = { 0, len, 1, len + 1, 2, len + 2 }; + builder_push_elems (builder, mask_elems); + + vec_perm_indices sel (builder, 2, len); + const char *reason; + tree res = fold_vec_perm_cst (TREE_TYPE (arg0), arg0, arg1, sel, &reason); + + ASSERT_TRUE (res == NULL_TREE); + ASSERT_TRUE (!strcmp (reason, "step is not multiple of npatterns")); + } } } ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm 2023-08-19 10:35 ` Prathamesh Kulkarni @ 2023-08-21 6:56 ` Richard Biener 2023-08-21 9:59 ` Prathamesh Kulkarni 0 siblings, 1 reply; 7+ messages in thread From: Richard Biener @ 2023-08-21 6:56 UTC (permalink / raw) To: Prathamesh Kulkarni; +Cc: Richard Sandiford, gcc-patches On Sat, 19 Aug 2023, Prathamesh Kulkarni wrote: > On Fri, 18 Aug 2023 at 14:52, Richard Biener <rguenther@suse.de> wrote: > > > > On Fri, 18 Aug 2023, Richard Sandiford wrote: > > > > > Richard Biener <rguenther@suse.de> writes: > > > > The following avoids running into somehow flawed logic in fold_vec_perm > > > > for non-VLA vectors. > > > > > > > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. > > > > > > > > Richard. > > > > > > > > PR tree-optimization/111048 > > > > * fold-const.cc (fold_vec_perm_cst): Check for non-VLA > > > > vectors first. > > > > > > > > * gcc.dg/torture/pr111048.c: New testcase. > > > > > > Please don't do this as a permanent thing. It was a deliberate choice > > > to have the is_constant be the fallback, so that the "generic" (VLA+VLS) > > > logic gets more coverage. Like you say, if something is wrong for VLS > > > then the chances are that it's also wrong for VLA. > > > > Sure, feel free to undo this change together with the fix for the > > VLA case. > Hi, > The attached patch reverts the workaround, and fixes the issue. > Bootstrapped+tested on aarch64-linux-gnu with and without SVE, and > x64_64-linux-gnu. > OK to commit ? OK. > Thanks, > Prathamesh > > > > Richard. > > > > > Thanks, > > > Richard > > > > > > > > > > --- > > > > gcc/fold-const.cc | 12 ++++++------ > > > > gcc/testsuite/gcc.dg/torture/pr111048.c | 24 ++++++++++++++++++++++++ > > > > 2 files changed, 30 insertions(+), 6 deletions(-) > > > > create mode 100644 gcc/testsuite/gcc.dg/torture/pr111048.c > > > > > > > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > > > > index 5c51c9d91be..144fd7481b3 100644 > > > > --- a/gcc/fold-const.cc > > > > +++ b/gcc/fold-const.cc > > > > @@ -10625,6 +10625,11 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > > > unsigned res_npatterns, res_nelts_per_pattern; > > > > unsigned HOST_WIDE_INT res_nelts; > > > > > > > > + if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > > > + { > > > > + res_npatterns = res_nelts; > > > > + res_nelts_per_pattern = 1; > > > > + } > > > > /* (1) If SEL is a suitable mask as determined by > > > > valid_mask_for_fold_vec_perm_cst_p, then: > > > > res_npatterns = max of npatterns between ARG0, ARG1, and SEL > > > > @@ -10634,7 +10639,7 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > > > res_npatterns = nelts in result vector. > > > > res_nelts_per_pattern = 1. > > > > This exception is made so that VLS ARG0, ARG1 and SEL work as before. */ > > > > - if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > > > + else if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > > > { > > > > res_npatterns > > > > = std::max (VECTOR_CST_NPATTERNS (arg0), > > > > @@ -10648,11 +10653,6 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > > > > > > > res_nelts = res_npatterns * res_nelts_per_pattern; > > > > } > > > > - else if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > > > - { > > > > - res_npatterns = res_nelts; > > > > - res_nelts_per_pattern = 1; > > > > - } > > > > else > > > > return NULL_TREE; > > > > > > > > diff --git a/gcc/testsuite/gcc.dg/torture/pr111048.c b/gcc/testsuite/gcc.dg/torture/pr111048.c > > > > new file mode 100644 > > > > index 00000000000..475978aae2b > > > > --- /dev/null > > > > +++ b/gcc/testsuite/gcc.dg/torture/pr111048.c > > > > @@ -0,0 +1,24 @@ > > > > +/* { dg-do run } */ > > > > +/* { dg-additional-options "-mavx2" { target avx2_runtime } } */ > > > > + > > > > +typedef unsigned char u8; > > > > + > > > > +__attribute__((noipa)) > > > > +static void check(const u8 * v) { > > > > + if (*v != 15) __builtin_trap(); > > > > +} > > > > + > > > > +__attribute__((noipa)) > > > > +static void bug(void) { > > > > + u8 in_lanes[32]; > > > > + for (unsigned i = 0; i < 32; i += 2) { > > > > + in_lanes[i + 0] = 0; > > > > + in_lanes[i + 1] = ((u8)0xff) >> (i & 7); > > > > + } > > > > + > > > > + check(&in_lanes[13]); > > > > + } > > > > + > > > > +int main() { > > > > + bug(); > > > > +} > > > > > > > -- > > Richard Biener <rguenther@suse.de> > > SUSE Software Solutions Germany GmbH, > > Frankenstrasse 146, 90461 Nuernberg, Germany; > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm 2023-08-21 6:56 ` Richard Biener @ 2023-08-21 9:59 ` Prathamesh Kulkarni 2023-08-21 10:45 ` Richard Sandiford 0 siblings, 1 reply; 7+ messages in thread From: Prathamesh Kulkarni @ 2023-08-21 9:59 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Sandiford, gcc-patches On Mon, 21 Aug 2023 at 12:26, Richard Biener <rguenther@suse.de> wrote: > > On Sat, 19 Aug 2023, Prathamesh Kulkarni wrote: > > > On Fri, 18 Aug 2023 at 14:52, Richard Biener <rguenther@suse.de> wrote: > > > > > > On Fri, 18 Aug 2023, Richard Sandiford wrote: > > > > > > > Richard Biener <rguenther@suse.de> writes: > > > > > The following avoids running into somehow flawed logic in fold_vec_perm > > > > > for non-VLA vectors. > > > > > > > > > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. > > > > > > > > > > Richard. > > > > > > > > > > PR tree-optimization/111048 > > > > > * fold-const.cc (fold_vec_perm_cst): Check for non-VLA > > > > > vectors first. > > > > > > > > > > * gcc.dg/torture/pr111048.c: New testcase. > > > > > > > > Please don't do this as a permanent thing. It was a deliberate choice > > > > to have the is_constant be the fallback, so that the "generic" (VLA+VLS) > > > > logic gets more coverage. Like you say, if something is wrong for VLS > > > > then the chances are that it's also wrong for VLA. > > > > > > Sure, feel free to undo this change together with the fix for the > > > VLA case. > > Hi, > > The attached patch reverts the workaround, and fixes the issue. > > Bootstrapped+tested on aarch64-linux-gnu with and without SVE, and > > x64_64-linux-gnu. > > OK to commit ? > > OK. Thanks, committed to trunk in 649388462e9a3c2de0b90ce525de8044704cc521 Thanks, Prathamesh > > > Thanks, > > Prathamesh > > > > > > Richard. > > > > > > > Thanks, > > > > Richard > > > > > > > > > > > > > --- > > > > > gcc/fold-const.cc | 12 ++++++------ > > > > > gcc/testsuite/gcc.dg/torture/pr111048.c | 24 ++++++++++++++++++++++++ > > > > > 2 files changed, 30 insertions(+), 6 deletions(-) > > > > > create mode 100644 gcc/testsuite/gcc.dg/torture/pr111048.c > > > > > > > > > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > > > > > index 5c51c9d91be..144fd7481b3 100644 > > > > > --- a/gcc/fold-const.cc > > > > > +++ b/gcc/fold-const.cc > > > > > @@ -10625,6 +10625,11 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > > > > unsigned res_npatterns, res_nelts_per_pattern; > > > > > unsigned HOST_WIDE_INT res_nelts; > > > > > > > > > > + if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > > > > + { > > > > > + res_npatterns = res_nelts; > > > > > + res_nelts_per_pattern = 1; > > > > > + } > > > > > /* (1) If SEL is a suitable mask as determined by > > > > > valid_mask_for_fold_vec_perm_cst_p, then: > > > > > res_npatterns = max of npatterns between ARG0, ARG1, and SEL > > > > > @@ -10634,7 +10639,7 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > > > > res_npatterns = nelts in result vector. > > > > > res_nelts_per_pattern = 1. > > > > > This exception is made so that VLS ARG0, ARG1 and SEL work as before. */ > > > > > - if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > > > > + else if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason)) > > > > > { > > > > > res_npatterns > > > > > = std::max (VECTOR_CST_NPATTERNS (arg0), > > > > > @@ -10648,11 +10653,6 @@ fold_vec_perm_cst (tree type, tree arg0, tree arg1, const vec_perm_indices &sel, > > > > > > > > > > res_nelts = res_npatterns * res_nelts_per_pattern; > > > > > } > > > > > - else if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts)) > > > > > - { > > > > > - res_npatterns = res_nelts; > > > > > - res_nelts_per_pattern = 1; > > > > > - } > > > > > else > > > > > return NULL_TREE; > > > > > > > > > > diff --git a/gcc/testsuite/gcc.dg/torture/pr111048.c b/gcc/testsuite/gcc.dg/torture/pr111048.c > > > > > new file mode 100644 > > > > > index 00000000000..475978aae2b > > > > > --- /dev/null > > > > > +++ b/gcc/testsuite/gcc.dg/torture/pr111048.c > > > > > @@ -0,0 +1,24 @@ > > > > > +/* { dg-do run } */ > > > > > +/* { dg-additional-options "-mavx2" { target avx2_runtime } } */ > > > > > + > > > > > +typedef unsigned char u8; > > > > > + > > > > > +__attribute__((noipa)) > > > > > +static void check(const u8 * v) { > > > > > + if (*v != 15) __builtin_trap(); > > > > > +} > > > > > + > > > > > +__attribute__((noipa)) > > > > > +static void bug(void) { > > > > > + u8 in_lanes[32]; > > > > > + for (unsigned i = 0; i < 32; i += 2) { > > > > > + in_lanes[i + 0] = 0; > > > > > + in_lanes[i + 1] = ((u8)0xff) >> (i & 7); > > > > > + } > > > > > + > > > > > + check(&in_lanes[13]); > > > > > + } > > > > > + > > > > > +int main() { > > > > > + bug(); > > > > > +} > > > > > > > > > > -- > > > Richard Biener <rguenther@suse.de> > > > SUSE Software Solutions Germany GmbH, > > > Frankenstrasse 146, 90461 Nuernberg, Germany; > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) > > > > -- > Richard Biener <rguenther@suse.de> > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm 2023-08-21 9:59 ` Prathamesh Kulkarni @ 2023-08-21 10:45 ` Richard Sandiford 0 siblings, 0 replies; 7+ messages in thread From: Richard Sandiford @ 2023-08-21 10:45 UTC (permalink / raw) To: Prathamesh Kulkarni; +Cc: Richard Biener, gcc-patches Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > On Mon, 21 Aug 2023 at 12:26, Richard Biener <rguenther@suse.de> wrote: >> >> On Sat, 19 Aug 2023, Prathamesh Kulkarni wrote: >> >> > On Fri, 18 Aug 2023 at 14:52, Richard Biener <rguenther@suse.de> wrote: >> > > >> > > On Fri, 18 Aug 2023, Richard Sandiford wrote: >> > > >> > > > Richard Biener <rguenther@suse.de> writes: >> > > > > The following avoids running into somehow flawed logic in fold_vec_perm >> > > > > for non-VLA vectors. >> > > > > >> > > > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. >> > > > > >> > > > > Richard. >> > > > > >> > > > > PR tree-optimization/111048 >> > > > > * fold-const.cc (fold_vec_perm_cst): Check for non-VLA >> > > > > vectors first. >> > > > > >> > > > > * gcc.dg/torture/pr111048.c: New testcase. >> > > > >> > > > Please don't do this as a permanent thing. It was a deliberate choice >> > > > to have the is_constant be the fallback, so that the "generic" (VLA+VLS) >> > > > logic gets more coverage. Like you say, if something is wrong for VLS >> > > > then the chances are that it's also wrong for VLA. >> > > >> > > Sure, feel free to undo this change together with the fix for the >> > > VLA case. >> > Hi, >> > The attached patch reverts the workaround, and fixes the issue. >> > Bootstrapped+tested on aarch64-linux-gnu with and without SVE, and >> > x64_64-linux-gnu. >> > OK to commit ? >> >> OK. > Thanks, committed to trunk in 649388462e9a3c2de0b90ce525de8044704cc521 Thanks for the patch. Please remember to close the PR too. Richard ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-08-21 10:45 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-08-18 7:46 [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm Richard Biener 2023-08-18 8:39 ` Richard Sandiford 2023-08-18 9:22 ` Richard Biener 2023-08-19 10:35 ` Prathamesh Kulkarni 2023-08-21 6:56 ` Richard Biener 2023-08-21 9:59 ` Prathamesh Kulkarni 2023-08-21 10:45 ` Richard Sandiford
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).