public inbox for gcc-bugs@sourceware.org
From: "tnfchris at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/104408] New: SLP discovery fails due to -Ofast rewriting
Date: Sun, 06 Feb 2022 09:13:14 +0000
Message-ID: <bug-104408-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104408

            Bug ID: 104408
           Summary: SLP discovery fails due to -Ofast rewriting
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---

The following testcase:

typedef struct { float r, i; } cf;

void
f (cf *restrict a, cf *restrict b, cf *restrict c, cf *restrict d, cf e)
{
  for (int i = 0; i < 100; ++i)
    {
      b[i].r = e.r * (c[i].r - d[i].r) - e.i * (c[i].i - d[i].i);
      b[i].i = e.r * (c[i].i - d[i].i) + e.i * (c[i].r - d[i].r);
    }
}

when compiled at -O3 forms an SLP tree but fails at -Ofast because match.pd
rewrites the expression into

      b[i].r = e.r * (c[i].r - d[i].r) + e.i * (d[i].i - c[i].i);
      b[i].i = e.r * (c[i].i - d[i].i) + e.i * (c[i].r - d[i].r);

and so introduces a different interleaving in the second multiply operation.

It's unclear to me what the gain of actually doing this is as it results in
worse vector and scalar code due to you losing the sharing of the computed
value of the nodes.

Without the rewriting the first code can re-use the load from the first vector
and just reverse the elements:

.L2:
        ldr     q1, [x3, x0]
        ldr     q0, [x2, x0]
        fsub    v0.4s, v0.4s, v1.4s
        fmul    v1.4s, v2.4s, v0.4s
        fmul    v0.4s, v3.4s, v0.4s
        rev64   v1.4s, v1.4s
        fneg    v0.2d, v0.2d
        fadd    v0.4s, v0.4s, v1.4s
        str     q0, [x1, x0]
        add     x0, x0, 16
        cmp     x0, 800
        bne     .L2

While with the rewrite it forces an increase in VF to be able to handle the
interleaving:

.L2:
        ld2     {v0.4s - v1.4s}, [x3], 32
        ld2     {v4.4s - v5.4s}, [x2], 32
        fsub    v2.4s, v1.4s, v5.4s
        fsub    v3.4s, v4.4s, v0.4s
        fsub    v5.4s, v5.4s, v1.4s
        fmul    v2.4s, v2.4s, v6.4s
        fmul    v4.4s, v6.4s, v3.4s
        fmla    v2.4s, v7.4s, v3.4s
        fmla    v4.4s, v5.4s, v7.4s
        mov     v0.16b, v2.16b
        mov     v1.16b, v4.16b
        st2     {v0.4s - v1.4s}, [x1], 32
        cmp     x5, x1
        bne     .L2

In scalar code you lose the ability to re-use the subtract, so you get an
extra sub.

Thread overview: 7+ messages
2022-02-06  9:13 tnfchris at gcc dot gnu.org [this message]
2022-02-06  9:43 ` [Bug tree-optimization/104408] " tnfchris at gcc dot gnu.org
2022-02-06 12:41 ` tnfchris at gcc dot gnu.org
2022-02-07  8:55 ` rguenth at gcc dot gnu.org
2022-02-07  8:56 ` rguenth at gcc dot gnu.org
2022-02-07  9:58 ` tnfchris at gcc dot gnu.org
2022-02-07 10:55 ` rguenth at gcc dot gnu.org