public inbox for gcc-bugs@sourceware.org
From: "already5chosen at yahoo dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/97832] AoSoA complex caxpy-like loops: AVX2+FMA -Ofast 7 times slower than -O3
Date: Sat, 26 Nov 2022 18:36:09 +0000
Message-ID: <bug-97832-4-roxWRcU4J0@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-97832-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832

--- Comment #20 from Michael_S <already5chosen at yahoo dot com> ---
(In reply to Richard Biener from comment #17)
> (In reply to Michael_S from comment #16)
> > On unrelated note, why loop overhead uses so many instructions?
> > Assuming that I am as misguided as gcc about load-op combining, I would
> > write it as:
> >   sub %rax, %rdx
> > .L3:
> >   vmovupd (%rdx,%rax), %ymm1
> >   vmovupd 32(%rdx,%rax), %ymm0
> >   vfmadd213pd 32(%rax), %ymm3, %ymm1
> >   vfnmadd213pd (%rax), %ymm2, %ymm0
> >   vfnmadd231pd 32(%rdx,%rax), %ymm3, %ymm0
> >   vfnmadd231pd (%rdx,%rax), %ymm2, %ymm1
> >   vmovupd %ymm0, (%rax)
> >   vmovupd %ymm1, 32(%rax)
> >   addq $64, %rax
> >   decl %esi
> >   jb .L3
> >
> > The loop overhead in my variant is 3 x86 instructions==2 macro-ops,
> > vs 5 x86 instructions==4 macro-ops in gcc variant.
> > Also, in gcc variant all memory accesses have displacement that makes them
> > 1 byte longer. In my variant only half of accesses have displacement.
> >
> > I think, in the past I had seen cases where gcc generates optimal or
> > near-optimal
> > code sequences for loop overhead. I wonder why it can not do it here.
>
> I don't think we currently consider IVs based on the difference of two
> addresses.

It seems to me that I have seen you doing it.
But maybe I am confusing gcc with clang.

> The cost benefit of no displacement is only size,

Size is pretty important in high-IPC SIMD loops, especially on Intel and when
the number of iterations is small, because Intel has a 16-byte fetch out of the
L1I cache. SIMD instructions tend to be long, and not many of them fit within
16 bytes even when the memory accesses have no offsets. An offset adds insult
to injury.

> otherwise
> I have no idea why we have biased the %rax accesses by -32. Why we
> fail to consider decrement-to-zero for the counter IV is probably because
> IVCANON would add such IV but the vectorizer replaces that and IVOPTs
> doesn't consider re-adding that.

Sorry, I have no idea what IVCANON means.
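The two loop-overhead ideas discussed above (addressing one array through its distance from the other, and counting the trip counter down to zero) can be sketched at source level in C. This is only an illustration under stated assumptions: the kernel below is a generic AoSoA complex `y -= a*x` update and not the exact kernel from the bug report, and the names `caxpy_two_ptrs`, `caxpy_delta`, `LANES` and `BLOCK` are hypothetical. Whether a compiler keeps this shape after its own induction-variable optimization is not guaranteed.

```c
#include <stddef.h>

/* Assumed layout: each block holds 4 real parts followed by 4 imaginary parts. */
enum { LANES = 4, BLOCK = 2 * LANES };

/* Baseline: two moving pointers and a counter that counts up. */
void caxpy_two_ptrs(double *y, const double *x,
                    double a_re, double a_im, int nblocks)
{
    for (int i = 0; i < nblocks; i++) {
        for (int k = 0; k < LANES; k++) {
            double x_re = x[k], x_im = x[k + LANES];
            y[k]         -= a_re * x_re - a_im * x_im;
            y[k + LANES] -= a_re * x_im + a_im * x_re;
        }
        x += BLOCK;
        y += BLOCK;
    }
}

/* Variant: x is addressed as y + delta, so only y moves, and the
 * counter decrements to zero.
 * Assumption: x and y point into the same array object; otherwise the
 * pointer subtraction below is undefined in ISO C. */
void caxpy_delta(double *y, const double *x,
                 double a_re, double a_im, int nblocks)
{
    ptrdiff_t delta = x - y;   /* loop-invariant distance, in elements */
    for (int n = nblocks; n != 0; n--) {
        for (int k = 0; k < LANES; k++) {
            double x_re = y[delta + k], x_im = y[delta + k + LANES];
            y[k]         -= a_re * x_re - a_im * x_im;
            y[k + LANES] -= a_re * x_im + a_im * x_re;
        }
        y += BLOCK;            /* the only pointer update per iteration */
    }
}
```

Both functions compute the same result when `x` and `y` are disjoint halves of one buffer, for example `caxpy_delta(buf, buf + 16, 2.0, 3.0, 2)` on a 32-element `buf`. In the second form the `y` accesses need no index register and the loop exit test can reuse the flags of the decrement, which is the saving the hand-written assembly is after.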
Thread overview: 28+ messages
2020-11-14 20:44 [Bug target/97832] New: " already5chosen at yahoo dot com
2020-11-16  7:21 ` [Bug target/97832] " rguenth at gcc dot gnu.org
2020-11-16 11:11 ` rguenth at gcc dot gnu.org
2020-11-16 20:11 ` already5chosen at yahoo dot com
2020-11-17  9:21 ` [Bug tree-optimization/97832] " rguenth at gcc dot gnu.org
2020-11-17 10:18 ` rguenth at gcc dot gnu.org
2020-11-18  8:53 ` rguenth at gcc dot gnu.org
2020-11-18  9:15 ` rguenth at gcc dot gnu.org
2020-11-18 13:23 ` rguenth at gcc dot gnu.org
2020-11-18 13:39 ` rguenth at gcc dot gnu.org
2020-11-19 19:55 ` already5chosen at yahoo dot com
2020-11-20  7:10 ` rguenth at gcc dot gnu.org
2021-06-09 12:41 ` cvs-commit at gcc dot gnu.org
2021-06-09 12:54 ` rguenth at gcc dot gnu.org
2022-01-21  0:16 ` pinskia at gcc dot gnu.org
2022-11-24 23:22 ` already5chosen at yahoo dot com
2022-11-25  8:16 ` rguenth at gcc dot gnu.org
2022-11-25 13:19 ` already5chosen at yahoo dot com
2022-11-25 20:46 ` rguenth at gcc dot gnu.org
2022-11-25 21:27 ` amonakov at gcc dot gnu.org
2022-11-26 18:27 ` already5chosen at yahoo dot com
2022-11-26 18:36 ` already5chosen at yahoo dot com [this message]
2022-11-26 19:36 ` amonakov at gcc dot gnu.org
2022-11-26 22:00 ` already5chosen at yahoo dot com
2022-11-28  6:29 ` crazylht at gmail dot com
2022-11-28  6:42 ` crazylht at gmail dot com
2022-11-28  7:21 ` rguenther at suse dot de
2022-11-28  7:24 ` crazylht at gmail dot com