public inbox for gcc-bugs@sourceware.org
* [Bug target/113079] New: [x86] Fails to generate dot_prod instructions for 64-bit vector.
From: liuhongt at gcc dot gnu.org @ 2023-12-19  5:05 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079

            Bug ID: 113079
           Summary: [x86] Fails to generate dot_prod instructions for
                    64-bit vector.
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: liuhongt at gcc dot gnu.org
  Target Milestone: ---

int
foo (int n, unsigned char* p, char* pi)
{
  int sum = 0;
  for (int i = 0; i != 8; i++)
    {
      sum += p[i] * pi[i];
    }
  return sum;
}

We can use 128-bit dot_prod instruction + clean upper 64 bits. Currently, gcc
generates a long instruction sequence.

        vmovq   xmm0, QWORD PTR [rsi]
        vmovq   xmm2, QWORD PTR [rdx]
        vpmovzxbw       xmm1, xmm0
        vpsrlq  xmm0, xmm0, 32
        vpmovsxbw       xmm3, xmm2
        vpmullw xmm1, xmm1, xmm3
        vpsrlq  xmm2, xmm2, 32
        vpmovzxbw       xmm0, xmm0
        vpmovsxbw       xmm2, xmm2
        vpmullw xmm0, xmm0, xmm2
        vpmovsxwd       xmm2, xmm1
        vpsrlq  xmm1, xmm1, 32
        vpmovsxwd       xmm1, xmm1
        vpaddd  xmm2, xmm2, xmm1
        vpmovsxwd       xmm1, xmm0
        vpsrlq  xmm0, xmm0, 32
        vpmovsxwd       xmm0, xmm0
        vpaddd  xmm1, xmm1, xmm2
        vpxor   xmm2, xmm2, xmm2
        vpshufb xmm2, xmm2, XMMWORD PTR .LC1[rip]
        vpaddd  xmm0, xmm0, xmm1
        vpshufb xmm1, xmm0, XMMWORD PTR .LC0[rip]
        vpor    xmm1, xmm1, xmm2
        vpaddd  xmm0, xmm0, xmm1
        vmovd   eax, xmm0

* [Bug target/113079] [x86] Fails to generate dot_prod instructions for 64-bit vector.
From: liuhongt at gcc dot gnu.org @ 2023-12-19  5:21 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079

--- Comment #1 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #0)
> int
> foo (int n, unsigned char* p, char* pi)
> {
>   int sum = 0;
>   for (int i = 0; i != 8; i++)
>     {
>       sum += p[i] * pi[i];
>     }
>   return sum;
> }
>
> We can use 128-bit dot_prod instruction + clean upper 64 bits. Currently,

Clearing the upper bits is not needed since these are integer operations:
operating on the upper 64 bits has no side effects.

* [Bug target/113079] [x86] Fails to generate dot_prod instructions for 64-bit vector.
From: rguenth at gcc dot gnu.org @ 2023-12-19  9:08 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-12-19

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
It includes a lane reduction, so we need to have those lanes zero, at least on
GIMPLE (if we'd do it there), because the tree codes do not specify which lanes
are reduced. The actual x86 instruction is probably fine, so if you make the
V8QI operation available in the backend, that should work without zeroing.
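
The point made in comments #1 and #2 can be illustrated with a small model in plain C. This is only a sketch under stated assumptions: `usdot16_model` is a hypothetical stand-in for a 128-bit unsigned-by-signed byte dot product in which each 32-bit lane accumulates four adjacent byte products, and `signed char` is used where the report has plain `char` (which is signed on x86). It is not the instruction GCC emits and not code from the patch.

```c
#include <stdint.h>

/* Scalar reference: the loop from the bug report (8 elements). */
int dot8_scalar(const uint8_t *p, const int8_t *pi)
{
    int sum = 0;
    for (int i = 0; i != 8; i++)
        sum += p[i] * pi[i];
    return sum;
}

/* Hypothetical model of a 128-bit u8 x s8 dot product: lane L of the
   accumulator receives the products of bytes 4*L .. 4*L+3. */
void usdot16_model(const uint8_t a[16], const int8_t b[16], int32_t acc[4])
{
    for (int lane = 0; lane < 4; lane++)
        for (int k = 0; k < 4; k++)
            acc[lane] += (int32_t)a[4 * lane + k] * b[4 * lane + k];
}

/* V8QI view: only the two low lanes are reduced, so whatever sits in
   the upper 8 bytes of the inputs cannot influence the result. */
int dot8_low_lanes(const uint8_t a[16], const int8_t b[16])
{
    int32_t acc[4] = {0, 0, 0, 0};
    usdot16_model(a, b, acc);
    return acc[0] + acc[1];
}

/* Generic 128-bit view: all four lanes are reduced, so the upper
   8 bytes must be zero for the result to match the 8-element loop. */
int dot16_all_lanes(const uint8_t a[16], const int8_t b[16])
{
    int32_t acc[4] = {0, 0, 0, 0};
    usdot16_model(a, b, acc);
    return acc[0] + acc[1] + acc[2] + acc[3];
}
```

With arbitrary data in bytes 8..15, `dot8_low_lanes` still agrees with `dot8_scalar`, while `dot16_all_lanes` agrees only once those bytes are zeroed. That is the distinction between a backend V8QI operation and a reduction whose lanes are not specified.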

* [Bug target/113079] [x86] Fails to generate dot_prod instructions for 64-bit vector.
From: cvs-commit at gcc dot gnu.org @ 2024-05-07  7:45 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079

--- Comment #3 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:fa911365490a7ca308878517a4af6189ffba7ed6

commit r15-235-gfa911365490a7ca308878517a4af6189ffba7ed6
Author: liuhongt <hongtao.liu@intel.com>
Date:   Wed Dec 20 11:43:25 2023 +0800

    Support dot_prod optabs for 64-bit vector.

    gcc/ChangeLog:

            PR target/113079
            * config/i386/mmx.md (usdot_prodv8qi): New expander.
            (sdot_prodv8qi): Ditto.
            (udot_prodv8qi): Ditto.
            (usdot_prodv4hi): Ditto.
            (udot_prodv4hi): Ditto.
            (sdot_prodv4hi): Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr113079.c: New test.
            * gcc.target/i386/pr113079-2.c: New test.
            * gcc.target/i386/sse4-pr113079-2.c: New test.
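
For readers unfamiliar with the optab names in the ChangeLog, the loop shapes they correspond to can be sketched as follows. This is an illustration, not the contents of the new tests: the function names and the `DEFINE_DOT` macro are hypothetical, and whether GCC actually selects a given expander depends on the target flags and the vectorizer. The signedness mapping follows the GCC optab convention: `sdot_prod` is signed by signed, `udot_prod` is unsigned by unsigned, and `usdot_prod` takes an unsigned first operand and a signed second operand.

```c
#include <stdint.h>

/* Hypothetical helper: defines an n-element dot product loop with
   element types ta and tb, accumulated in type tacc.  The 16-bit
   variants can exceed 32 bits for extreme inputs; this sketch assumes
   inputs small enough that the signed accumulators do not overflow. */
#define DEFINE_DOT(name, ta, tb, tacc, n)           \
    tacc name(const ta *a, const tb *b)             \
    {                                               \
        tacc sum = 0;                               \
        for (int i = 0; i != (n); i++)              \
            sum += (tacc)a[i] * (tacc)b[i];         \
        return sum;                                 \
    }

/* v8qi: eight 8-bit elements, i.e. one 64-bit vector of bytes. */
DEFINE_DOT(loop_usdot_v8qi, uint8_t, int8_t,  int32_t,  8)
DEFINE_DOT(loop_sdot_v8qi,  int8_t,  int8_t,  int32_t,  8)
DEFINE_DOT(loop_udot_v8qi,  uint8_t, uint8_t, uint32_t, 8)

/* v4hi: four 16-bit elements, i.e. one 64-bit vector of halfwords. */
DEFINE_DOT(loop_usdot_v4hi, uint16_t, int16_t,  int32_t,  4)
DEFINE_DOT(loop_sdot_v4hi,  int16_t,  int16_t,  int32_t,  4)
DEFINE_DOT(loop_udot_v4hi,  uint16_t, uint16_t, uint32_t, 4)
```

`loop_usdot_v8qi` has the same shape as `foo` in comment #0, where `p` is unsigned and `pi` is signed on x86.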

* [Bug target/113079] [x86] Fails to generate dot_prod instructions for 64-bit vector.
From: liuhongt at gcc dot gnu.org @ 2024-05-07  7:45 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079

Hongtao Liu <liuhongt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
Fixed in GCC15.