public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
From: "olegendo at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> To: gcc-bugs@gcc.gnu.org Subject: [Bug target/53949] [SH] Add support for mac.w / mac.l instructions Date: Sun, 15 Jul 2012 12:11:00 -0000 [thread overview] Message-ID: <bug-53949-4-VgvJCb3eaP@http.gcc.gnu.org/bugzilla/> (raw) In-Reply-To: <bug-53949-4@http.gcc.gnu.org/bugzilla/> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949 --- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-15 12:11:20 UTC --- Created attachment 27799 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27799 Proof of concept patch This is a proof of concept patch just to probe around. The idea is to allow the RA to allocate macl and mach registers in DImode, and have mac insns that use the macl/mach regs as a pair in DImode. With the patch applied, the following function ... int64_t test01 (const int16_t* a, const int16_t* b) { int64_t sum = 0; for (int i = 0; i < 16; ++i) sum += (int64_t)(*a++) * (int64_t)(*b++); return sum; } compiled with -m4 -O2 results in ... __Z6test01PKsS0_: .LFB0: .cfi_startproc mov #16,r1 ! 88 movsi_ie/3 [length = 2] clrmac ! 39 clrmac/1 [length = 2] .align 2 .L3: dt r1 ! 89 dect [length = 2] bf/s .L3 ! 90 branch_false [length = 2] mac.w @r4+,@r5+ ! 61 *macw [length = 2] sts macl,r0 ! 82 movsi_ie/8 [length = 2] rts ! 99 *return_i [length = 2] sts mach,r1 ! 83 movsi_ie/8 [length = 2] ... which is not that bad already. Some notes I took while playing around with this: - When compiling for big endian the RA mistakes mach and macl when storing mach:macl to a DImode reg:reg pair. This could probably fixed by providing appropriate move insns patterns. - Move insns/splits for DImode mach:macl <-> memory have to be added. I've seen an ICE when compiling with -O1: error: unrecognizable insn: (insn 122 14 15 2 (set (mem/c:DI (plus:SI (reg/f:SI 15 r15) (const_int 8 [0x8])) [0 %sfp+-8 S8 A32]) (reg:DI 148 macl)) sh_mac.cpp:38 -1 (nil)) - In some cases the mach:macl reg pair gets swapped to a general reg pair without any obvious need. Example function: int64_t test04 (const int16_t* a, const int16_t* b, const int16_t* c, const int16_t* d) { int64_t sum0 = 0; int64_t sum1 = 0; for (int i = 0; i < 16; ++i) sum0 += (int64_t)(*a++) * (int64_t)(*b++); for (int i = 0; i < 16; ++i) sum1 += (int64_t)(*c++) * (int64_t)(*d++); return sum0 - sum1; } The IRA pass first allocates sum0 and sum1 to mach:macl, but then reload seems to think that they are conflicting and moves sum0 to a general regs pair. This results in ... mov #0,r2 mov #16,r1 mov r2,r3 .L16: lds r2,macl lds r3,mach dt r1 mac.w @r4+,@r5+ sts macl,r2 bf/s .L16 sts mach,r3 mov #16,r1 clrmac .align 2 .L18: dt r1 bf/s .L18 mac.w @r6+,@r7+ which would be better as: mov #16,r1 clrmac .L16: dt r1 bf/s .L16 mac.w @r4+,@r5+ sts macl,r2 sts mach,r3 clrmac mov #16,r1 .L18: dt r1 bf/s .L18 mac.w @r6+,@r7+ - Loops with multiple running sums like for (int i = 0; i < 16; ++i) { sum0 += (int64_t)(*a++) * (int64_t)(*b++); sum1 += (int64_t)(*c++) * (int64_t)(*d++); } result in macl:mach swapping to general reg pairs between subsequent mac.w instructions. Ideally such loops should be split into multiple loops like in the previous example. - When loop unrolling is turned on the auto-inc addresses refs are converted to displacement addresses. Because the auto-inc-dec pass currently fails to detect a lot of auto-inc-dec possibilities the mac.w pattern will not match. The same goes for manually unrolled code like sum += (int64_t)(*a++) * (int64_t)(*b++); sum += (int64_t)(*a++) * (int64_t)(*b++); - Running sum variables should be turned into DImode variables if possible: int32_t test00 (const int16_t* a, const int16_t* b) { int32_t sum = 0; for (int i = 0; i < 16; ++i) sum += (*a++) * (*b++); return sum; } - The existing multiplication patterns could be adopted to utilize macl:mach reg pair allocation, especially 32x32 -> 64 bit multiplications. - Normal multiplications that do not need a full MAC operation but use memory operands can be done with a clrmac-mac sequence. Probably there are more subtle issues. Also, I have not tried expanding the standard name 'maddmn4' pattern, maybe it would make some of the problems mentioned above automagically disappear.
next prev parent reply other threads:[~2012-07-15 12:11 UTC|newest] Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top 2012-07-13 9:01 [Bug target/53949] New: " olegendo at gcc dot gnu.org 2012-07-13 10:34 ` [Bug target/53949] " olegendo at gcc dot gnu.org 2012-07-13 11:01 ` chrbr at gcc dot gnu.org 2012-07-13 11:24 ` chrbr at gcc dot gnu.org 2012-07-15 12:11 ` olegendo at gcc dot gnu.org [this message] 2012-07-17 19:13 ` olegendo at gcc dot gnu.org 2012-07-17 23:05 ` kkojima at gcc dot gnu.org 2012-07-22 16:47 ` olegendo at gcc dot gnu.org 2012-10-11 20:43 ` olegendo at gcc dot gnu.org 2012-11-07 21:31 ` olegendo at gcc dot gnu.org 2013-05-04 13:39 ` olegendo at gcc dot gnu.org 2013-12-17 12:25 ` olegendo at gcc dot gnu.org 2013-12-17 12:37 ` olegendo at gcc dot gnu.org 2015-01-30 19:37 ` olegendo at gcc dot gnu.org 2015-02-01 0:37 ` olegendo at gcc dot gnu.org 2015-04-08 20:08 ` olegendo at gcc dot gnu.org
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=bug-53949-4-VgvJCb3eaP@http.gcc.gnu.org/bugzilla/ \ --to=gcc-bugzilla@gcc.gnu.org \ --cc=gcc-bugs@gcc.gnu.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).