public inbox for gcc-bugs@sourceware.org
From: "evstupac at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/52252] New: An opportunity for x86 gcc vectorizer (gain up to 3 times)
Date: Tue, 14 Feb 2012 22:42:00 -0000
Message-ID: <bug-52252-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52252

             Bug #: 52252
           Summary: An opportunity for x86 gcc vectorizer (gain up to 3 times)
    Classification: Unclassified
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: evstupac@gmail.com

This is an example of byte conversion from RGB (Red Green Blue) to CMYK
(Cyan Magenta Yellow blacK):

    #define byte unsigned char
    #define MIN(a, b) ((a) > (b)?(b):(a))

    void convert_image(byte *in, byte *out, int size) {
        int i;
        for(i = 0; i < size; i++) {
            byte r = in[0];
            byte g = in[1];
            byte b = in[2];
            byte c, m, y, k, tmp;
            c = 255 - r;
            m = 255 - g;
            y = 255 - b;
            tmp = MIN(m, y);
            k = MIN(c, tmp);
            out[0] = c - k;
            out[1] = m - k;
            out[2] = y - k;
            out[3] = k;
            in += 3;
            out += 4;
        }
    }

Here trunk gcc for Arm unrolls the loop by 2 and vectorizes it using NEON;
gcc for x86 does not vectorize it.

There are 2 tricky points in this loop:
1) It converts 3 bytes into 4.
2) We need to shuffle bytes after the load:
   Let 0123456789ABCDEF be 16 bytes in the "in" array (the first rgb is 012,
   the next 345, ...). To compute the vector minimum we need to place bytes
   0, 1 and 2 into 3 different vectors.

Gcc for Arm does this with 2 special loads:

    vld3.8 {d16, d18, d20}, [r2]!
    vld3.8 {d17, d19, d21}, [r2]

putting
    bytes 0 and 3 into q8 (d16, d17)
    bytes 1 and 4 into q9 (d18, d19)
    bytes 2 and 5 into q10 (d20, d21)

After all vector transformations it stores the result with 2 special stores:

    vst4.8 {d8, d10, d12, d14}, [r3]!
    vst4.8 {d9, d11, d13, d15}, [r3]

However, x86 gcc can do the same loads:

    movq   (%edi),%mm5
    movq   %mm5,%mm7
    movq   %mm5,%mm6
    pshufb %mm3,%mm5    /*0x00ffffff03ffffff*/
    pshufb %mm2,%mm6    /*0x01ffffff04ffffff*/
    pshufb %mm1,%mm7    /*0x02ffffff05ffffff*/
    /* %mm5 - r, %mm6 - g, %mm7 - b */

And the same stores:

    pslld  $0x8,%mm6
    pslld  $0x10,%mm7
    pslld  $0x18,%mm4
    pxor   %mm5,%mm6
    pxor   %mm7,%mm4
    pxor   %mm6,%mm4
    pshufb %mm0,%mm4    /*0x0001020304050607*/ /*here redundant*/
    movq   %mm4,(%esi)
    /* %mm5 - c, %mm6 - m, %mm7 - y, %mm4 - k */

The pshufb here is an identity shuffle, so it can be removed; a shuffle is
needed only when we store fewer than 4 bytes per pixel.

Moreover, x86 gcc can unroll not only by 2 but by 4, with the following
loads:

    movdqu (%edi),%xmm5
    movdqa %xmm5,%xmm7
    movdqa %xmm5,%xmm6
    pshufb %xmm3,%xmm5  /*0x00ffffff03ffffff06ffffff09ffffff*/
    pshufb %xmm2,%xmm6  /*0x01ffffff04ffffff07ffffff0affffff*/
    pshufb %xmm1,%xmm7  /*0x02ffffff05ffffff08ffffff0bffffff*/
    /* %xmm5 - r, %xmm6 - g, %xmm7 - b */

And stores:

    pslld  $0x8,%xmm6
    pslld  $0x10,%xmm7
    pslld  $0x18,%xmm4
    pxor   %xmm5,%xmm6
    pxor   %xmm7,%xmm4
    pxor   %xmm6,%xmm4
    pshufb %xmm0,%xmm4  /*0x000102030405060708090a0b0c0d0e0f*/ /*here redundant*/
    movdqa %xmm4,(%esi)
    /* %xmm5 - c, %xmm6 - m, %xmm7 - y, %xmm4 - k */
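The unroll-by-4 idea can be checked without any SIMD hardware by modelling pshufb in portable C. The sketch below is only an illustration, not GCC output: the names convert_image_scalar, pshufb_model and convert4_model are hypothetical, the masks are the three load masks quoted in the report (written in memory byte order), and the arithmetic between the load and the store is done per 32-bit lane on the low byte only, which is an assumption about how the "vector transformations" step would be arranged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: the loop from the report, with fixed-width types. */
void convert_image_scalar(const uint8_t *in, uint8_t *out, int size)
{
    for (int i = 0; i < size; i++) {
        uint8_t c = 255 - in[0], m = 255 - in[1], y = 255 - in[2];
        uint8_t tmp = m < y ? m : y;
        uint8_t k = c < tmp ? c : tmp;
        out[0] = c - k; out[1] = m - k; out[2] = y - k; out[3] = k;
        in += 3; out += 4;
    }
}

/* Model of a 16-byte pshufb: a mask byte with its high bit set yields 0,
   otherwise its low 4 bits select a source byte. Works in place, like
   the instruction (destination is also the source). */
void pshufb_model(uint8_t reg[16], const uint8_t mask[16])
{
    uint8_t src[16];
    memcpy(src, reg, 16);
    for (int i = 0; i < 16; i++)
        reg[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0f];
}

/* Convert 4 RGB pixels to 4 CMYK pixels. Reads 16 bytes (only the first
   12 are used) and writes 16 bytes, mirroring the movdqu/movdqa pair. */
void convert4_model(const uint8_t in[16], uint8_t out[16])
{
    static const uint8_t mask_r[16] = {0, 0xff, 0xff, 0xff, 3, 0xff, 0xff, 0xff,
                                       6, 0xff, 0xff, 0xff, 9, 0xff, 0xff, 0xff};
    static const uint8_t mask_g[16] = {1, 0xff, 0xff, 0xff, 4, 0xff, 0xff, 0xff,
                                       7, 0xff, 0xff, 0xff, 10, 0xff, 0xff, 0xff};
    static const uint8_t mask_b[16] = {2, 0xff, 0xff, 0xff, 5, 0xff, 0xff, 0xff,
                                       8, 0xff, 0xff, 0xff, 11, 0xff, 0xff, 0xff};
    uint8_t r[16], g[16], b[16];

    /* One load, two register copies, three shuffles. */
    memcpy(r, in, 16); memcpy(g, in, 16); memcpy(b, in, 16);
    pshufb_model(r, mask_r);
    pshufb_model(g, mask_g);
    pshufb_model(b, mask_b);

    for (int lane = 0; lane < 4; lane++) {
        /* Each channel now sits in the low byte of its 32-bit lane. */
        uint8_t c = 255 - r[4 * lane], m = 255 - g[4 * lane], y = 255 - b[4 * lane];
        uint8_t tmp = m < y ? m : y;
        uint8_t k = c < tmp ? c : tmp;
        assert(k <= c && k <= m && k <= y);   /* subtractions cannot wrap */

        /* pslld by 8/16/24 and pxor: merge the four bytes into one lane. */
        uint32_t packed = (uint32_t)(uint8_t)(c - k)
                        ^ ((uint32_t)(uint8_t)(m - k) << 8)
                        ^ ((uint32_t)(uint8_t)(y - k) << 16)
                        ^ ((uint32_t)k << 24);

        /* Little-endian store of the lane, as x86 would do. */
        for (int j = 0; j < 4; j++)
            out[4 * lane + j] = (uint8_t)(packed >> (8 * j));
    }
}
```

Calling convert4_model on a 16-byte buffer and comparing the result with convert_image_scalar(in, ref, 4) should give identical output, which also illustrates why the final identity pshufb in the store sequence is redundant: after the shifts and xors the bytes are already in c, m, y, k order.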
Thread overview: 12+ messages
2012-02-14 22:42 evstupac at gmail dot com [this message]
2012-02-15 11:55 ` [Bug tree-optimization/52252] " rguenth at gcc dot gnu.org
2012-02-29 12:34 ` evstupac at gmail dot com
2012-07-13  8:48 ` rguenth at gcc dot gnu.org
2014-02-11 14:27 ` evstupac at gmail dot com
2014-05-07 12:11 ` kyukhin at gcc dot gnu.org
2014-06-11  8:38 ` kyukhin at gcc dot gnu.org
2014-06-18  7:47 ` kyukhin at gcc dot gnu.org
2023-08-31  7:07 ` rguenth at gcc dot gnu.org
2023-11-28  6:06 ` pinskia at gcc dot gnu.org
2023-11-28 10:55 ` rguenther at suse dot de
2023-11-28 22:24 ` pinskia at gcc dot gnu.org