From: HAO CHEN GUI <guihaoc@linux.ibm.com>
To: "Kewen.Lin" <linkw@linux.ibm.com>,
Segher Boessenkool <segher@kernel.crashing.org>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>, David <dje.gcc@gmail.com>,
Peter Bergner <bergner@linux.ibm.com>
Subject: Re: [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
Date: Thu, 22 Sep 2022 17:59:07 +0800 [thread overview]
Message-ID: <fe24d049-a800-9835-486c-0a6bd7280f43@linux.ibm.com> (raw)
In-Reply-To: <2c999590-8222-2879-3fe3-ca69159293ec@linux.ibm.com>
Hi Kewen & Segher,
Thanks so much for your review comments.
On 22/9/2022 10:28 AM, Kewen.Lin wrote:
> on 2022/9/22 05:56, Segher Boessenkool wrote:
>> Hi!
>>
>> On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
>>> This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
>>> of smin/max. So the builtins always generate xs[min/max]dp on all
>>> platforms.
>>
>> But how does this not blow up with -ffast-math?
>
> Indeed. Since it guards with "TARGET_VSX && !flag_finite_math_only",
> the bifs seem to cause ICE at -ffast-math.
>
> Haochen, could you double check it?
I tested it with "-ffast-math". The fmin/fmax functions are converted to
MIN/MAX_EXPR in the gimple lowering pass, but the built-ins are not, and they
hit the ICE. I had assumed the built-ins would be folded to MIN/MAX_EXPR like
their vec_ counterparts when fast-math is set; in fact they are not. Sorry
for that.
I made a patch that folds these two built-ins to MIN/MAX_EXPR when fast-math
is set. The built-ins are then converted to MIN/MAX_EXPR and expanded to
smin/smax.
Thanks for pointing out the problem!
>
>>
>> In the other direction I am worried that the unspecs will degrade
>> performance (relative to smin/smax) when -ffast-math *is* active (and
>> this new builtin code and pattern doesn't blow up).
>
> For fmin/fmax it would be fine, since they are transformed to
> {MAX,MIN}_EXPR in the middle end; and yes, it can degrade for the bifs,
> although IMHO the previous expansion to smin/smax contradicts the bif names
> (users expect them to map to xs{min,max}dp rather than anything else).
>
>>
>> I still think we should get RTL codes for this, to have access to proper
>> floating point min/max semantics always and everywhere. "fmin" and
>> "fmax" seem to be good names :-)
>
> It would be good, especially if we have observed some uses of these bifs
> and further opportunities around them. :)
>
Shall we submit a PR to add fmin/fmax as RTL codes?
> BR,
> Kewen
Thread overview: 11+ messages
2022-06-24 2:02 HAO CHEN GUI
2022-07-04 6:32 ` Ping " HAO CHEN GUI
2022-08-01 2:03 ` Ping^2 " HAO CHEN GUI
2022-09-21 5:20 ` Ping^3 " HAO CHEN GUI
2022-09-21 9:34 ` Kewen.Lin
2022-09-21 21:56 ` Segher Boessenkool
2022-09-22 2:28 ` Kewen.Lin
2022-09-22 9:59 ` HAO CHEN GUI [this message]
2022-09-22 13:56 ` Segher Boessenkool
2022-09-22 14:05 ` Segher Boessenkool
2022-09-26 5:58 ` Kewen.Lin