* [Bug tree-optimization/111648] Wrong code at -O2/3 on x86_64-linux-gnu since r14-3243-ga7dba4a1c05
2023-09-30 14:04 [Bug tree-optimization/111648] New: Wrong code at -O2/3 on x86_64-linux-gnu since r14-3243-ga7dba4a1c05 shaohua.li at inf dot ethz.ch
@ 2023-09-30 16:48 ` prathamesh3492 at gcc dot gnu.org
2023-10-01 18:49 ` [Bug tree-optimization/111648] [14 Regression] " pinskia at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: prathamesh3492 at gcc dot gnu.org @ 2023-09-30 16:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111648
--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Hi,
Sorry for the breakage, will take a look.
Thanks,
Prathamesh
* [Bug tree-optimization/111648] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-3243-ga7dba4a1c05
From: pinskia at gcc dot gnu.org @ 2023-10-01 18:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111648
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Target Milestone|---                         |14.0
Status          |UNCONFIRMED                 |ASSIGNED
Ever confirmed  |0                           |1
Last reconfirmed|                            |2023-10-01
Summary         |Wrong code at -O2/3 on      |[14 Regression] Wrong code
                |x86_64-linux-gnu since      |at -O2/3 on
                |r14-3243-ga7dba4a1c05       |x86_64-linux-gnu since
                |                            |r14-3243-ga7dba4a1c05
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
.
* [Bug tree-optimization/111648] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-3243-ga7dba4a1c05
From: prathamesh3492 at gcc dot gnu.org @ 2023-10-03 12:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111648
--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Created attachment 56037
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56037&action=edit
Untested fix
The issue is that when a1 is a multiple of the vector length, we end up creating
the following encoding in the result: { base_elem, arg[0], arg[1], ... }, where
arg is the chosen input vector, which is incorrect.
For the above case, the vectorizer pass creates VEC_PERM_EXPR<arg0, arg1, sel> where:
arg0: { -16, -9, -10, -11 }
arg1: { -12, -5, -6, -7 }
sel = { 3, 4, 5, 6 }
arg0, arg1 and sel are encoded with npatterns = 1 and nelts_per_pattern = 3.
Since a1 = 4 and arg_len = 4, it ended up creating the result with the
following encoding:
res = { arg0[3], arg1[0], arg1[1] } // npatterns = 1, nelts_per_pattern = 3
= { -11, -12, -5 }
So for res[4], it used the step S = (-5) - (-12) = 7
and hence computed it as -5 + 7 = 2 instead of arg1[2], i.e., -6,
which is the difference we see in the output at -O0 vs -O2.
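The arithmetic above can be reproduced with a small Python sketch. It is only an illustration, not GCC code: `decode` and `permute` are hypothetical helpers that model a single-pattern stepped encoding (npatterns = 1, nelts_per_pattern = 3) and the element-wise meaning of the permute, and lanes are indexed from 0.

```python
def decode(encoded, n):
    """Expand a single-pattern stepped encoding {base, e1, e2} to n lanes.

    Lanes after the third continue the series with step e2 - e1.
    """
    base, e1, e2 = encoded
    step = e2 - e1
    out = [base, e1, e2]
    while len(out) < n:
        out.append(out[-1] + step)
    return out[:n]


def permute(arg0, arg1, sel):
    """Reference semantics: each lane indexes the concatenation arg0 ++ arg1."""
    both = arg0 + arg1
    return [both[i] for i in sel]


arg0 = decode([-16, -9, -10], 4)   # { -16, -9, -10, -11 }
arg1 = decode([-12, -5, -6], 4)    # { -12, -5, -6, -7 }
sel = decode([3, 4, 5], 4)         # { 3, 4, 5, 6 }

# What the permute should produce, lane by lane.
expected = permute(arg0, arg1, sel)

# What the fold produced: only the first three selected lanes are kept
# as the encoding, and the fourth lane is extrapolated from their step.
folded = decode(expected[:3], 4)

print(expected)  # [-11, -12, -5, -6]
print(folded)    # [-11, -12, -5, 2]
```

The fourth lane (index 3) differs because the folded encoding takes its step from arg1[0] and arg1[1], while arg1 itself only steps regularly from arg1[1] onwards.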
The patch tweaks the constraints in valid_mask_for_fold_vec_perm_cst_p to punt
if a1 is a multiple of the vector length, so that a1 ... ae only select from the
stepped part of the input vector, which seems to fix this issue.
I will run a proper bootstrap+test and post it upstream.
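As a rough sketch of the constraint described for the untested fix (an assumption about its shape, not the contents of attachment 56037), the check can be modelled in Python as follows. `fold_allowed` and `decode` are hypothetical names, and the other conditions that valid_mask_for_fold_vec_perm_cst_p enforces are not modelled.

```python
def decode(encoded, n):
    """Expand a single-pattern stepped encoding {base, e1, e2} to n lanes."""
    base, e1, e2 = encoded
    step = e2 - e1
    out = [base, e1, e2]
    while len(out) < n:
        out.append(out[-1] + step)
    return out[:n]


def fold_allowed(a1, arg_len):
    """Model only the extra constraint: punt when a1 lands on lane 0 of an
    input vector, i.e. when a1 is a multiple of the vector length."""
    return a1 % arg_len != 0


arg0 = decode([-16, -9, -10], 4)   # { -16, -9, -10, -11 }
arg1 = decode([-12, -5, -6], 4)    # { -12, -5, -6, -7 }
both = arg0 + arg1

# A selector whose a1 = 5 skips arg1[0], so a1 ... ae stay in the
# stepped part of arg1.
sel_ok = decode([3, 5, 6], 4)                # { 3, 5, 6, 7 }
elementwise = [both[i] for i in sel_ok]      # lane-by-lane result
extrapolated = decode(elementwise[:3], 4)    # result rebuilt from 3 lanes

print(fold_allowed(4, 4))            # False: the case from this report
print(fold_allowed(5, 4))            # True
print(extrapolated == elementwise)   # True
```

With a1 = 5 the three encoded result lanes already carry the input's own step, so extrapolating the remaining lanes agrees with the element-wise result.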
* [Bug tree-optimization/111648] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-3243-ga7dba4a1c05
From: prathamesh3492 at gcc dot gnu.org @ 2023-10-03 12:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111648
--- Comment #4 from prathamesh3492 at gcc dot gnu.org ---
(In reply to prathamesh3492 from comment #3)
> So for res[4], it used S = (-5) - (-12) = 7
Typo: I meant res[3], not res[4]. Sorry.
* [Bug tree-optimization/111648] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-3243-ga7dba4a1c05
From: rguenth at gcc dot gnu.org @ 2023-10-04 9:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111648
Richard Biener <rguenth at gcc dot gnu.org> changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Priority        |P3                          |P1
* [Bug tree-optimization/111648] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-3243-ga7dba4a1c05
From: cvs-commit at gcc dot gnu.org @ 2023-10-18 19:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111648
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Prathamesh Kulkarni
<prathamesh3492@gcc.gnu.org>:
https://gcc.gnu.org/g:3ec8ecb8e92faec889bc6f7aeac9ff59e82b4f7f
commit r14-4726-g3ec8ecb8e92faec889bc6f7aeac9ff59e82b4f7f
Author: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
Date: Thu Oct 19 00:29:38 2023 +0530
PR111648: Fix wrong code-gen due to incorrect VEC_PERM_EXPR folding.
gcc/ChangeLog:
PR tree-optimization/111648
* fold-const.cc (valid_mask_for_fold_vec_perm_cst_p): If a1
chooses base element from arg, ensure that it's a natural stepped
sequence.
(build_vec_cst_rand): New param natural_stepped and use it to
construct a naturally stepped sequence.
(test_nunits_min_2): Add new unit tests Case 6 and Case 7.
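One way to read the "natural stepped sequence" condition in the ChangeLog is sketched below in Python. This is an interpretation, not the committed fold-const.cc code: it assumes "natural" means that the step between the first two encoded elements equals the step between the second and third, and `is_natural_stepped` and `fold_allowed` are hypothetical names. Only this one condition is modelled.

```python
def is_natural_stepped(encoded):
    """True if {base, e1, e2} has the same step from base to e1 as from
    e1 to e2, so lane 0 already belongs to the stepped series."""
    base, e1, e2 = encoded
    return (e1 - base) == (e2 - e1)


def fold_allowed(a1, arg_len, arg_encoded):
    """If a1 selects the base element of the chosen input (a1 is a
    multiple of arg_len), allow the fold only when that input is a
    natural stepped sequence; otherwise leave the decision to the
    remaining checks, which are not modelled here."""
    if a1 % arg_len == 0:
        return is_natural_stepped(arg_encoded)
    return True


# arg1 from this report: steps are 7 and then -1, so the fold is rejected.
print(fold_allowed(4, 4, [-12, -5, -6]))     # False

# An input with a uniform step of 1 can still be folded when a1 = 4.
print(fold_allowed(4, 4, [-12, -11, -10]))   # True
```

Under this reading, inputs whose base element already follows the step can still be folded even when a1 is a multiple of the vector length, while the input from this report is rejected.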
* [Bug tree-optimization/111648] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-3243-ga7dba4a1c05
From: prathamesh3492 at gcc dot gnu.org @ 2023-10-18 19:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111648
prathamesh3492 at gcc dot gnu.org changed:
What            |Removed                     |Added
----------------------------------------------------------------------------
Status          |ASSIGNED                    |RESOLVED
Resolution      |---                         |FIXED
--- Comment #6 from prathamesh3492 at gcc dot gnu.org ---
Fixed.