Re: [PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837]

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: guojiufu <guojiufu@linux.ibm.com>
To: Richard Biener <rguenther@suse.de>
Cc: Segher Boessenkool <segher@kernel.crashing.org>,
	gcc-patches@gcc.gnu.org,  wschmidt@linux.ibm.com,
	dje.gcc@gmail.com, jlaw@tachyum.com, amker.cheng@gmail.com
Subject: Re: [PATCH] Check type size for doloop iv on BITS_PER_WORD [PR61837]
Date: Mon, 12 Jul 2021 16:21:13 +0800	[thread overview]
Message-ID: <e752f0a62db41698722413ac8eb4698b@imap.linux.ibm.com> (raw)
In-Reply-To: <nycvar.YFH.7.76.2107120816190.10711@zhemvz.fhfr.qr>

On 2021-07-12 14:20, Richard Biener wrote:
> On Fri, 9 Jul 2021, Segher Boessenkool wrote:
> 
>> On Fri, Jul 09, 2021 at 08:43:59AM +0200, Richard Biener wrote:
>> > I wonder if there's a way to query the target what modes the doloop
>> > pattern can handle (not being too familiar with the doloop code).
>> 
>> You can look what modes are allowed for operand 0 of doloop_end,
>> perhaps?  Although that is a define_expand, not a define_insn, so it 
>> is
>> hard to introspect.
>> 
>> > Why do you need to do any checks besides the new type being able to
>> > represent all IV values?  The original doloop IV will never wrap
>> > (OTOH if niter is U*_MAX then we compute niter + 1 which will become
>> > zero ... I suppose the doloop might still do the correct thing here
>> > but it also still will with a IV with larger type).

The issue comes from U*_MAX (original short MAX), as you said: on which
niter + 1 becomes zero.  And because the step for doloop is -1; then, on
larger type 'zero - 1' will be a very large number on larger type
(e.g. 0xff...ff); but on the original short type 'zero - 1' is a small 
value
(e.g. "0xff").

>> 
>> doloop_valid_p guarantees it is simple and doesn't wrap.
>> 
>> > I'd have expected sth like
>> >
>> >    ntype = lang_hooks.types.type_for_mode (word_mode, TYPE_UNSIGNED
>> > (ntype));
>> >
>> > thus the decision made using a mode - which is also why I wonder
>> > if there's a way to query the target for this.  As you say,
>> > it _may_ be fast, so better check (somehow).


I was also thinking of using hooks like type_for_size/type_for_mode.
     /* Use type in word size may fast.  */
     if (TYPE_PRECISION (ntype) < BITS_PER_WORD
         && Wi::ltu_p (niter_desc->max, wi::to_widest (TYPE_MAX_VALUE 
(ntype))))
       {
         ntype = lang_hooks.types.type_for_size (BITS_PER_WORD, 1);
         base = fold_convert (ntype, base);
       }

As you pointed out, this does not query the mode from targets.
As Segher pointed out "doloop_end" checks unsupported mode, while it 
seems
not easy to use it in tree-ssa-loop-ivopts.c.
For implementations of doloop_end, tartgets like rs6000/aarch64/ia64 
requires
Pmode/DImode; while there are other targets that work on other 'mode' 
(e.g. SI).


In doloop_optimize, there is code:

```
     mode = desc->mode;
.....
     doloop_reg = gen_reg_rtx (mode);
     rtx_insn *doloop_seq = targetm.gen_doloop_end (doloop_reg, 
start_label);

     word_mode_size = GET_MODE_PRECISION (word_mode);
     word_mode_max = (HOST_WIDE_INT_1U << (word_mode_size - 1) << 1) - 1;
     if (! doloop_seq
         && mode != word_mode
         /* Before trying mode different from the one in that # of 
iterations is
            computed, we must be sure that the number of iterations fits 
into
            the new mode.  */
         && (word_mode_size >= GET_MODE_PRECISION (mode)
             || wi::leu_p (iterations_max, word_mode_max)))
       {
         if (word_mode_size > GET_MODE_PRECISION (mode))
           count = simplify_gen_unary (ZERO_EXTEND, word_mode, count, 
mode);
         else
           count = lowpart_subreg (word_mode, count, mode);
         PUT_MODE (doloop_reg, word_mode);
         doloop_seq = targetm.gen_doloop_end (doloop_reg, start_label);
       }
     if (! doloop_seq)
       {
         if (dump_file)
           fprintf (dump_file,
                    "Doloop: Target unwilling to use doloop pattern!\n");
         return false;
       }
```
The above code first tries the mode of niter_desc by call 
targetm.gen_doloop_end
to see if the target can generate doloop insns, if fail, then try to use
'word_mode' against gen_doloop_end.


>> 
>> Almost all targets just use Pmode, but there is no such guarantee I
>> think, and esp. some targets that do not have machine insns for this
>> (but want to generate different code for this anyway) can do pretty 
>> much
>> anything.
>> 
>> Maybe using just Pmode here is good enough though?
> 
> I think Pmode is a particularly bad choice and I'd prefer word_mode
> if we go for any hardcoded mode.  s390x for example seems to handle
> both SImode and DImode (but names the helper gen_doloop_si64
> for SImode?!).  But indeed it looks like somehow querying doloop_end
> is going to be difficult since the expander doesn't have any mode,
> so we'd have to actually try emit RTL here.

Instead of using hardcode mode, maybe we could add a hook for targets to 
return
the preferred mode.


Thanks for those valuable comments!

Jiufu Guo



> 
> Richard.

next prev parent reply	other threads:[~2021-07-12  8:21 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-09  2:07 Jiufu Guo
2021-07-09  6:43 ` Richard Biener
2021-07-09 17:08   ` Segher Boessenkool
2021-07-12  6:20     ` Richard Biener
2021-07-12  8:21       ` guojiufu [this message]
2021-07-12  8:57         ` Richard Biener
2021-07-12  9:43           ` guojiufu
2021-07-12 10:02             ` Richard Biener
2021-07-12 14:08               ` guojiufu
2021-07-12 14:46                 ` Richard Biener
2021-07-12 15:53                   ` guojiufu
2021-07-12 15:59                     ` guojiufu
2021-07-13  2:09                     ` guojiufu
2021-07-13  7:09                       ` Richard Biener
2021-07-13  8:16                         ` guojiufu
2021-07-13 15:51                       ` Segher Boessenkool
2021-07-14  2:38                         ` guojiufu
2021-07-13 15:38       ` Segher Boessenkool
2021-07-14  3:10         ` guojiufu
2021-07-14  7:17           ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e752f0a62db41698722413ac8eb4698b@imap.linux.ibm.com \
    --to=guojiufu@linux.ibm.com \
    --cc=amker.cheng@gmail.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jlaw@tachyum.com \
    --cc=rguenther@suse.de \
    --cc=segher@kernel.crashing.org \
    --cc=wschmidt@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).