From: David Miller <davem@davemloft.net>
To: rth@redhat.com
Cc: gcc@gcc.gnu.org
Subject: Re: VIS2 pattern review
Date: Thu, 13 Oct 2011 19:56:00 -0000
Message-ID: <20111013.142636.1859659747859622111.davem@davemloft.net>
In-Reply-To: <4E96358F.30405@redhat.com>
From: Richard Henderson <rth@redhat.com>
Date: Wed, 12 Oct 2011 17:49:19 -0700
> There's a code sample 7-1 that illustrates a 16x16 multiply:
>
> fmul8sux16 %f0, %f1, %f2
> fmul8ulx16 %f0, %f1, %f3
> fpadd16 %f2, %f3, %f4
Be wary of code examples that don't even assemble (even-numbered
float registers are required here).
fmul8sux16 basically does, for each element:
	src1 = (rs1 >> 8) & 0xff;
	src2 = rs2 & 0xffff;
	product = src1 * src2;
	scaled = (product & 0x00ffff00) >> 8;
	if (product & 0x80)
		scaled++;
	rd = scaled & 0xffff;
fmul8ulx16 does the same except the assignment to src1 is:
	src1 = rs1 & 0xff;
So I think this "16 x 16 multiply" operation isn't the kind you think
it is, and it's therefore not appropriate to use it in the compiler
for vector multiplies.
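To make that concrete, here is a minimal C model of one 16-bit element, transcribed directly from the pseudocode above. It is only a sketch: it assumes the per-element description in this message is accurate (it has not been checked against the VIS manual), it treats everything as unsigned just as the pseudocode does, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Shared tail of the pseudocode: src1 is the 8-bit operand that has
 * already been extracted from rs1. */
static uint16_t mul8x16_scaled(uint32_t src1, uint16_t rs2)
{
    uint32_t src2 = rs2 & 0xffffu;
    uint32_t product = src1 * src2;               /* at most 24 bits */
    uint32_t scaled = (product & 0x00ffff00u) >> 8;

    if (product & 0x80u)                          /* round on the dropped bits */
        scaled++;
    return (uint16_t)(scaled & 0xffffu);
}

/* fmul8sux16, per the description above: upper byte of rs1 times rs2. */
static uint16_t fmul8sux16_elem(uint16_t rs1, uint16_t rs2)
{
    return mul8x16_scaled((rs1 >> 8) & 0xffu, rs2);
}

/* fmul8ulx16, per the description above: lower byte of rs1 times rs2. */
static uint16_t fmul8ulx16_elem(uint16_t rs1, uint16_t rs2)
{
    return mul8x16_scaled(rs1 & 0xffu, rs2);
}

/* The three-instruction sequence from the manual's sample:
 * fmul8sux16 + fmul8ulx16, combined with a wrapping 16-bit fpadd16. */
static uint16_t mul16_via_vis(uint16_t rs1, uint16_t rs2)
{
    return (uint16_t)(fmul8sux16_elem(rs1, rs2) + fmul8ulx16_elem(rs1, rs2));
}

/* What a vectorized "a[i] * b[i]" on 16-bit elements needs instead:
 * the low 16 bits of the full product. */
static uint16_t mul16_true(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}
```

Under this model, mul16_via_vis(0x0102, 0x0003) comes out as 0 while mul16_true(0x0102, 0x0003) is 0x0306: both partial products are smaller than 0x80, so the scaling step throws them away entirely. Small operands collapsing to zero like this is what you would expect if the sequence is really a scaled (fixed-point style) multiply rather than a plain integer one.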
Just for shits and grins I tried it and the slp-7 testcase, as expected,
fails. The main multiply loop in that test case is compiled to:
	sethi	%hi(.LLC6), %i3
	sethi	%hi(in2), %g1
	ldd	[%i3+%lo(.LLC6)], %f22
	sethi	%hi(.LLC7), %i4
	sethi	%hi(.LLC8), %i2
	sethi	%hi(.LLC9), %i3
	add	%fp, -256, %g2
	ldd	[%i4+%lo(.LLC7)], %f20
	or	%g1, %lo(in2), %g1
	ldd	[%i2+%lo(.LLC8)], %f18
	mov	%fp, %i5
	ldd	[%i3+%lo(.LLC9)], %f16
	mov	%g1, %g4
	mov	%g2, %g3
.LL10:
	ldd	[%g4+8], %f14
	ldd	[%g4+16], %f12
	fmul8sux16	%f14, %f22, %f26
	ldd	[%g4+24], %f10
	fmul8ulx16	%f14, %f22, %f24
	ldd	[%g4], %f8
	fmul8sux16	%f12, %f20, %f34
	fmul8ulx16	%f12, %f20, %f32
	fmul8sux16	%f10, %f18, %f30
	fpadd16	%f26, %f24, %f14
	fmul8ulx16	%f10, %f18, %f28
	fmul8sux16	%f8, %f16, %f26
	fmul8ulx16	%f8, %f16, %f24
	fpadd16	%f34, %f32, %f12
	std	%f14, [%g3+8]
	fpadd16	%f30, %f28, %f10
	std	%f12, [%g3+16]
	fpadd16	%f26, %f24, %f8
	std	%f10, [%g3+24]
	std	%f8, [%g3]
	add	%g3, 32, %g3
	cmp	%g3, %i5
	bne,pt	%icc, .LL10
	 add	%g4, 32, %g4
and it simply gives the wrong results.
The entire out2[] array is all zeros.