From: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
To: Richard Biener <rguenther@suse.de>, Jakub Jelinek <jakub@redhat.com>
Cc: nd <nd@arm.com>, David Malcolm <dmalcolm@redhat.com>,
Jonathan Wakely <jwakely.gcc@gmail.com>,
Andrew Haley <aph@redhat.com>,
Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>,
"Kay F. Jahnke" <kfjahnke@gmail.com>,
"gcc@gcc.gnu.org" <gcc@gcc.gnu.org>
Subject: Re: autovectorization in gcc
Date: Thu, 10 Jan 2019 11:11:00 -0000
Message-ID: <0859f634-49fb-f603-0f0f-351b2f49298e@arm.com>
In-Reply-To: <alpine.LSU.2.20.1901100917230.23386@zhemvz.fhfr.qr>

On 10/01/2019 08:19, Richard Biener wrote:
> On Wed, 9 Jan 2019, Jakub Jelinek wrote:
>
>> On Wed, Jan 09, 2019 at 11:10:25AM -0500, David Malcolm wrote:
>>> extern void vf1()
>>> {
>>> #pragma vectorize enable
>>> for ( int i = 0 ; i < 32768 ; i++ )
>>> data [ i ] = std::sqrt ( data [ i ] ) ;
>>> }
>>>
>>> Compiling on this x86_64 box with -fopt-info-vec-missed shows the
>>
>>> _7 = .SQRT (_1);
>>> if (_1 u>= 0.0)
>>> goto <bb 8>; [99.95%]
>>> else
>>> goto <bb 4>; [0.05%]
>>>
>>> <bb 8> [local count: 1062472912]:
>>> goto <bb 5>; [100.00%]
>>>
>>> <bb 4> [local count: 531495]:
>>> __builtin_sqrtf (_1);
>>>
>>> I'm not sure where that control flow came from: it isn't in
>>> sqrt-test.cc.104t.stdarg
>>> but is in
>>> sqrt-test.cc.105t.cdce
>>> so I think it's coming from the argument-range code in cdce.
>>>
>>> Arguably the location on the statement is wrong: it's on the loop
>>> header, when it presumably should be on the std::sqrt call.
>>
>> See my other mail, it is the result of the -fmath-errno default,
>> the inline emitted sqrt doesn't handle errno setting and we emit
>> essentially x = sqrt (arg); if (__builtin_expect (arg < 0.0, 0)) sqrt (arg); where
>> the former sqrt is inline using HW instructions and the latter is the
>> library call.
>>
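
A minimal scalar sketch of the split described above, assuming GCC or Clang (for `__builtin_expect`). The names `hw_sqrt`, `libm_sqrt_errno` and `sqrt_with_errno` are made up for illustration; the two helpers only model the inline instruction and the library call, they are not what the compiler actually emits.

```cpp
#include <cerrno>
#include <cmath>

// Hypothetical model of the inline hardware instruction: it produces the
// value but leaves errno untouched, whatever the libm underneath does.
static float hw_sqrt(float x) {
    const int saved = errno;
    const float r = std::sqrt(x);
    errno = saved;
    return r;
}

// Hypothetical model of the library call that is kept only for its side
// effect: a domain error sets errno to EDOM.
static void libm_sqrt_errno(float x) {
    if (x < 0.0f)
        errno = EDOM;
}

// Shape of the code under -fmath-errno as described above: fast inline
// result first, library call only on the unlikely negative input.
float sqrt_with_errno(float x) {
    const float r = hw_sqrt(x);
    if (__builtin_expect(x < 0.0f, 0))  // NaN compares false: fast path
        libm_sqrt_errno(x);
    return r;
}
```

The branch is what shows up as the extra basic blocks in the cdce dump quoted earlier.
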
>> With some extra work we could vectorize it; e.g. if we make it handle
>> OpenMP #pragma omp ordered simd efficiently, it would be the same thing
>> - allow non-vectorizable portions of vectorized loops by doing there a
>> scalar loop from 0 to vf-1 doing the non-vectorizable stuff + drop the limitation
>> that the vectorized loop is a single bb. Essentially, in this case it would
>> be
>>   vec1 = vec_load (data + i);
>>   vec2 = vec_sqrt (vec1);
>>   if (__builtin_expect (any (vec1 < 0.0), 0))
>>     {
>>       for (int j = 0; j < vf; j++)
>>         sqrt (vec1[j]);
>>     }
>>   vec_store (data + i, vec2);
>>
>> If that turns out to be way too hard, we could for the vectorization
>> purposes hide that into the .SQRT internal fn, say add a fndecl argument to
>> it if it should treat the exceptional cases some way so that the control
>> flow isn't visible in the vectorized loop.
>
> If we decide it's worth the trouble I'd rather do that in the epilogue
> and thus make the any (vec1 < 0.0) a reduction. Like
>
> smallest = min(smallest, vec1);
>
> and after the loop do the errno thing on the smallest element.
>
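
A rough sketch of that epilogue idea in plain C++, assuming GCC or Clang (for `__builtin_expect`). `hw_sqrt` and `sqrt_all` are hypothetical names, and `errno = EDOM` stands in for calling the library sqrt on the smallest element; this only models the control flow, it is not vectorizer output.

```cpp
#include <algorithm>
#include <cerrno>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical model of the errno-free inline square root.
static float hw_sqrt(float x) {
    const int saved = errno;
    const float r = std::sqrt(x);
    errno = saved;
    return r;
}

// Take square roots in place and defer the errno decision to the epilogue.
void sqrt_all(float* data, std::size_t n) {
    float smallest = std::numeric_limits<float>::infinity();
    for (std::size_t i = 0; i < n; ++i) {
        // Min reduction over the inputs; std::min keeps `smallest` when
        // data[i] is NaN, so NaN inputs never trigger the slow path.
        smallest = std::min(smallest, data[i]);
        data[i] = hw_sqrt(data[i]);
    }
    // Epilogue: one check for the whole loop instead of one per element.
    if (__builtin_expect(smallest < 0.0f, 0))
        errno = EDOM;  // models sqrt (smallest) via the library
}
```

The loop body then has no branch, so the single-basic-block limitation mentioned above would not be hit.
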
> That said, this is a transform that is probably worthwhile even
> on scalar code, possibly easiest to code-gen right from the start
> in the call-dce pass.

If this is useful for something other than errno handling then fine,
but I think it's a really bad idea to add optimization
complexity because of errno handling: nobody checks
errno after sqrt (other than conformance test code).

-fno-math-errno is almost surely closer to what the user
wants than trying to vectorize the errno handling.
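
For reference, a cut-down version of the test case with the suggested flag in a comment. The declaration of `data` is an assumption (the quoted dump calls `__builtin_sqrtf`, so `float` seems likely), and the build line is only a suggestion; with -fno-math-errno the errno branch should not be generated at all, which should leave a loop the vectorizer can handle.

```cpp
// Suggested build line (sqrt-test.cc is the file name from the dumps above):
//   g++ -O3 -fno-math-errno -fopt-info-vec sqrt-test.cc
#include <cmath>
#include <cstddef>

// Assumed declaration; the original message does not show it.
float data[32768];

void vf1() {
    // Without errno handling this is a single sqrt per element and no
    // fallback library call.
    for (std::size_t i = 0; i < 32768; ++i)
        data[i] = std::sqrt(data[i]);
}
```
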