public inbox for gcc-patches@gcc.gnu.org
From: xiezhiheng <xiezhiheng@huawei.com>
To: Richard Sandiford <richard.sandiford@arm.com>
Cc: Richard Biener <richard.guenther@gmail.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: RE: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3
Date: Thu, 20 Aug 2020 08:24:33 +0000
Message-ID: <4dc070cbaca04eb59d2cac94bed1a9c3@huawei.com>
In-Reply-To: <mptk0xvlzku.fsf@arm.com>

[-- Attachment #1: Type: text/plain, Size: 2284 bytes --]

> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandiford@arm.com]
> Sent: Wednesday, August 19, 2020 6:06 PM
> To: xiezhiheng <xiezhiheng@huawei.com>
> Cc: Richard Biener <richard.guenther@gmail.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
> emitted at -O3
> 
> xiezhiheng <xiezhiheng@huawei.com> writes:
> > I add FLAGS for part of intrinsics in aarch64-simd-builtins.def first for a try,
> > including all the add/sub arithmetic intrinsics.
> >
> > Something like faddp intrinsic which only handles floating-point operations,
> > both FP and NONE flags are suitable for it because FLAG_FP will be added
> > later if the intrinsic handles floating-point operations.  And I prefer FP
> > since it would be more clear.
> 
> Sounds good to me.
> 
> > But for qadd intrinsics, they would modify FPSR register which is a scenario
> > I missed before.  And I consider to add an additional flag
> > FLAG_WRITE_FPSR to represent it.
> 
> I don't think we make any attempt to guarantee that the Q flag is
> meaningful after saturating intrinsics.  To do that, we'd need to model
> the modification of the flag in the .md patterns too.
> 
> So my preference would be to leave this out and just use NONE for the
> saturating forms too.

The problem is that the test case in the attachment gives different results at -O0 and -O2.

In the gimple phase, the statement:
  _9 = __builtin_aarch64_uqaddv2si_uuu (op0_4, op1_6);
would be treated as dead code if we set the NONE flag for saturating intrinsics, since its result is never used.
Adding FLAG_WRITE_FPSR would help fix this problem.

Even if we set FLAG_WRITE_FPSR, the uqadd insn:
  (insn 11 10 12 2 (set (reg:V2SI 97)
        (us_plus:V2SI (reg:V2SI 98)
            (reg:V2SI 99))) {aarch64_uqaddv2si}
     (nil))
could still be eliminated in the RTL phase, because it would be treated as a dead insn.
So I think we might also need to modify the saturating instruction patterns to add the side effect of setting the FPSR register.

So if we used the NONE flag for saturating intrinsics, the descriptions of the function attributes and of the patterns would both be incorrect.
I think I can propose another patch to fix the patterns, if you agree?

Thanks,
Xie Zhiheng

[-- Attachment #2: test.c --]
[-- Type: text/plain, Size: 852 bytes --]

#include <arm_neon.h>
#include <stdlib.h>

typedef union {
  struct {
    int _xxx:24;
    unsigned int FZ:1;
    unsigned int DN:1;
    unsigned int AHP:1;
    unsigned int QC:1;
    int V:1;
    int C:1;
    int Z:1;
    int N:1;
  } b;
  unsigned int word;
} _ARM_FPSCR;

static volatile int __read_neon_cumulative_sat (void) {
    _ARM_FPSCR _afpscr_for_qc;
    asm volatile ("mrs %0,fpsr" : "=r" (_afpscr_for_qc));
    return _afpscr_for_qc.b.QC;
}

int main()
{
  uint32x2_t op0, op1, res;

  op0 = vdup_n_u32 ((uint32_t)0xfffffff0);
  op1 = vdup_n_u32 ((uint32_t)0x20);

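  /* Clear the cumulative saturation (QC) bit in FPSR before the test.  */
  _ARM_FPSCR _afpscr_for_qc;
  asm volatile ("mrs %0,fpsr" : "=r" (_afpscr_for_qc));
  _afpscr_for_qc.b.QC = (0);
  asm volatile ("msr fpsr,%0" :  : "r" (_afpscr_for_qc));

  /* 0xfffffff0 + 0x20 saturates, so QC should be set.  res itself is
     never read; only the FPSR side effect is checked.  */
  res = vqadd_u32 (op0, op1);
  if (__read_neon_cumulative_sat () != 1)
    abort ();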

  return 0;
}
