Re: Re: [PATCH] middle-end: skipp stepped vector test of poly_int (1, 1) and allow the machine_mode definition with poly_uint16 (1, 1)

From: 钟居哲 <juzhe.zhong@rivai.ai>
To: richard.sandiford <richard.sandiford@arm.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	rguenther <rguenther@suse.de>,  kito.cheng <kito.cheng@gmail.com>
Subject: Re: Re: [PATCH] middle-end: skipp stepped vector test of poly_int (1, 1) and allow the machine_mode definition with poly_uint16 (1, 1)
Date: Fri, 19 Aug 2022 22:34:35 +0800
Message-ID: <781C254B1237B319+202208192234349812258@rivai.ai>
In-Reply-To: <mpt5yiol7k2.fsf@arm.com>

I rewrote test_vector_subregs_fore_back as follows:

static void
test_vector_subregs_fore_back (machine_mode inner_mode)
{
  poly_uint64 nunits = GET_MODE_NUNITS (inner_mode);
  unsigned int min_nunits = constant_lower_bound (nunits);
  scalar_mode int_mode = GET_MODE_INNER (inner_mode);
  unsigned int count = gcd (min_nunits, 4);

  rtx_vector_builder builder (inner_mode, count, 2);
  unsigned int nelts_per_pattern = count == 1 ? 2 : count;
  for (unsigned int i = 0; i < nelts_per_pattern; ++i)
    builder.quick_push (gen_int_mode (i, int_mode));
  for (unsigned int i = 0; i < count; ++i)
    builder.quick_push (gen_int_mode (-(int) i, int_mode));
  rtx x = builder.build ();

  test_vector_subregs_modes (x);
  if (!nunits.is_constant ())
    test_vector_subregs_modes (x, nunits - min_nunits, count);
}

I added the line: unsigned int nelts_per_pattern = count == 1 ? 2 : count;
and then replaced "count" with "nelts_per_pattern" in the bound of the first loop.

The test passes now, and the value of "x" that I printed out seems to be correct:
(const_vector:VNx1DI [
        (const_int 0 [0])
        repeat [
            (const_int 1 [0x1])
        ]
    ])
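
One way to sanity-check the change is to count how many values the two loops push against how many the builder was constructed for. The sketch below is a standalone model, not GCC code: gcd_u, pattern_count, pushed_elts and encoded_elts are hypothetical helper names, and it assumes that a builder constructed as builder (inner_mode, count, 2) holds count * 2 encoded elements.

```cpp
// Standalone model; none of these helpers exist in GCC.
constexpr unsigned gcd_u (unsigned a, unsigned b)
{
  return b == 0 ? a : gcd_u (b, a % b);
}

// Mirrors "count = gcd (min_nunits, 4)" from the function above.
constexpr unsigned pattern_count (unsigned min_nunits)
{
  return gcd_u (min_nunits, 4);
}

// Values pushed by the rewritten loops: nelts_per_pattern in the first
// loop (2 when count == 1, otherwise count) plus count in the second.
constexpr unsigned pushed_elts (unsigned min_nunits)
{
  return (pattern_count (min_nunits) == 1 ? 2 : pattern_count (min_nunits))
         + pattern_count (min_nunits);
}

// ASSUMPTION: the builder holds npatterns * nelts_per_pattern elements,
// with npatterns == count and nelts_per_pattern == 2.
constexpr unsigned encoded_elts (unsigned min_nunits)
{
  return pattern_count (min_nunits) * 2;
}

static_assert (pushed_elts (1) == 3 && encoded_elts (1) == 2,
               "min_nunits == 1: three pushes, two encoded elements");
static_assert (pushed_elts (2) == 4 && encoded_elts (2) == 4,
               "min_nunits == 2: counts agree");
static_assert (pushed_elts (4) == 8 && encoded_elts (4) == 8,
               "min_nunits == 4: counts agree");
```

Under that assumption, a minimum of 1 element gives 3 pushed values (0, 1, 0) against 2 encoded elements, while the counts agree for 2 and 4. Whether the extra push matters depends on the real rtx_vector_builder, which this model does not exercise.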

Is this the correct solution?

Thanks.


juzhe.zhong@rivai.ai
 
From: Richard Sandiford
Date: 2022-08-19 20:52
To: juzhe.zhong@rivai.ai
CC: gcc-patches; rguenther; kito.cheng
Subject: Re: [PATCH] middle-end: skipp stepped vector test of poly_int (1, 1) and allow the machine_mode definition with poly_uint16 (1, 1)
"juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai> writes:
>> Ah, right, sorry for the bogus suggestion.
>> In that case, what specifically goes wrong?  Which test in
>> test_vector_subregs_modes fails, and what does the ICE look like?
>
> Because maybe_gt (nunits, 1) returns true for nunits = (1,1), test_vector_subregs_modes is still executed.
> The failure and ICE:
> ../../../riscv-gcc/gcc/simplify-rtx.cc:8396: test_vector_subregs_modes: FAIL: ASSERT_RTX_EQ (expected, simplify_subreg (QImode, x, inner_mode, byte))
>   expected: (nil)
>
>   actual: (const_int 0 [0])
> cc1: internal compiler error: in assert_rtx_eq_at, at selftest-rtl.cc:57
> 0x1304ee1 selftest::assert_rtx_eq_at(selftest::location const&, char const*, rtx_def*, rtx_def*)
>         ../../../riscv-gcc/gcc/selftest-rtl.cc:57
> 0x1332504 test_vector_subregs_modes
>         ../../../riscv-gcc/gcc/simplify-rtx.cc:8395
> 0x1332988 test_vector_subregs_fore_back
>         ../../../riscv-gcc/gcc/simplify-rtx.cc:8442
> 0x1332ae7 test_vector_subregs
>         ../../../riscv-gcc/gcc/simplify-rtx.cc:8467
> 0x1332c57 test_vector_ops
>         ../../../riscv-gcc/gcc/simplify-rtx.cc:8487
> 0x1332c7b selftest::simplify_rtx_cc_tests()
>         ../../../riscv-gcc/gcc/simplify-rtx.cc:8547
> 0x21318fa selftest::run_tests()
>         ../../../riscv-gcc/gcc/selftest-run-tests.cc:115
> 0x1362a76 toplev::run_self_tests()
>         ../../../riscv-gcc/gcc/toplev.cc:2205
>
> I analyzed the code:
> In test_vector_subregs_fore_back, when nunits = (1,1), expected = NULL_RTX but simplify_subreg (QImode, x, inner_mode, byte) = const_int 0,
> so the assertion fails.
 
Hmm, ok, so the subreg operation is unexpectedly succeeding.
 
> This is the test for a stepped vector using 2 elements per pattern.  For poly_uint16 (1,1), it is indeed possible that the vector has only 1 element.
 
The stepped case is 3 elements per pattern rather than 2.  In a stepped
pattern: a, b, b+n are represented explicitly, then the rest are
implicitly b+n*2, b+n*3, etc.
 
The case being handled by this code is instead the 2-element case:
a, b are represented explicitly, then the rest are implicitly all b.
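
As an illustration of the two encodings just described, here is a standalone sketch (not GCC's vector_builder) that expands element i from the explicitly encoded values. The function name encoded_elt and the sample arrays are hypothetical, and the layout assumed is that the elements of each pattern are interleaved, so element i belongs to pattern i % npatterns.

```cpp
// Standalone model; requires C++14 for the multi-statement constexpr.
constexpr long long
encoded_elt (const long long *encoded, unsigned npatterns,
             unsigned nelts_per_pattern, unsigned i)
{
  if (i < npatterns * nelts_per_pattern)
    return encoded[i];                        // represented explicitly
  unsigned pattern = i % npatterns;
  // Index of the last explicit element of this pattern.
  unsigned last = (nelts_per_pattern - 1) * npatterns + pattern;
  if (nelts_per_pattern < 3)
    return encoded[last];                     // a, b, then b forever
  // Stepped case: a, b, b+n, then b+n*2, b+n*3, ...
  long long step = encoded[last] - encoded[last - npatterns];
  return encoded[last] + (long long) (i / npatterns - 2) * step;
}

constexpr long long two_elt[] = {7, 9};         // a, b
constexpr long long stepped[] = {10, 20, 23};   // a, b, b+n with n == 3
// Two interleaved stepped patterns (npatterns == 2).
constexpr long long interleaved[] = {0, 100, 1, 101, 2, 102};

static_assert (encoded_elt (two_elt, 1, 2, 5) == 9, "implicit b");
static_assert (encoded_elt (stepped, 1, 3, 3) == 26, "b + n*2");
static_assert (encoded_elt (stepped, 1, 3, 4) == 29, "b + n*3");
static_assert (encoded_elt (interleaved, 2, 3, 6) == 3, "pattern 0");
static_assert (encoded_elt (interleaved, 2, 3, 7) == 103, "pattern 1");
```

The 2-element case never computes a step, which is why it is a different test from the stepped one.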
 
Why is (1,1) different though?  The test is constructing:
 
  nunits: 1 + 1x
  shape: nelts_per_pattern == 2, npatterns == 1
  elements: a, b[, b, b, b, b, ...]
 
It then tests subregs starting at 0 + 1x (so starting at the first b).
But for (2,2) we should have:
 
  nunits: 2 + 2x
  shape: nelts_per_pattern == 2, npatterns == 2
  elements: a1, a2, b1, b2[, b1, b2, b1, b2, ...]
 
and we should test subregs starting at 0 + 2x (so starting at the
first b1).  The two cases should be very similar, it's just that the
(2,2) case doubles the number of patterns.
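
To make the comparison concrete, here is a small standalone model (not GCC code) of the offset the test passes as its first element, nunits - min_nunits. The names poly2, eval_at, subreg_start and is_b_element are hypothetical, x stands for the runtime multiplier, and x == 1 is only an illustrative value.

```cpp
// value == c0 + c1 * x for a runtime x >= 0
struct poly2 { unsigned c0, c1; };

constexpr unsigned eval_at (poly2 p, unsigned x)
{
  return p.c0 + p.c1 * x;
}

// Models nunits - min_nunits, where min_nunits is the constant term.
constexpr poly2 subreg_start (poly2 nunits)
{
  return poly2 {nunits.c0 - nunits.c0, nunits.c1};
}

// With 2 elements per pattern, the first npatterns elements form the
// "a" group and every later element belongs to the "b" group.
constexpr bool is_b_element (unsigned i, unsigned npatterns)
{
  return i >= npatterns;
}

// (1,1), npatterns == 1: at x == 1 the subreg starts at element 1,
// which is the first b.
static_assert (eval_at (subreg_start (poly2 {1, 1}), 1) == 1, "0 + 1x");
static_assert (is_b_element (1, 1), "element 1 is a b");

// (2,2), npatterns == 2: at x == 1 the subreg starts at element 2,
// which is the first b1 (pattern index 2 % 2 == 0).
static_assert (eval_at (subreg_start (poly2 {2, 2}), 1) == 2, "0 + 2x");
static_assert (is_b_element (2, 2) && 2 % 2 == 0, "element 2 is b1");
```

In both cases the start lands on the first element of the "b" group for this value of x, which is the sense in which the two cases are similar.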
 
> I think it makes sense for the test to fail.  However, for a poly (1,1) machine mode, is it possible for a target to define this
> kind of machine mode for use only in intrinsics?  I have already developed full RVV support in GCC (including intrinsics and auto-vectorization).
> I only enable auto-vectorization with modes larger than (2,2) and have tested it fully.
> In my experience, stepped vectors seem to be created only during VLA auto-vectorization, so I think allowing a poly (1,1) mode only in intrinsics will
> not create issues.  Am I misunderstanding?  Feel free to correct me.  Thanks ~
 
Following on from what I said above, it doesn't look like this particular
case is related to stepped vectors.
 
(1,1) shouldn't (need to) be a special case though.  Any potential
problems that would occur for (1,1) with npatterns==1 would also occur
for (n,n) with npatterns==n.  E.g. if stepped vectors are problematic
for (1,1) then an interleaving of 2 stepped vectors (i.e. npatterns==2)
would be problematic for (2,2).
 
So yeah, preventing a mode being used for autovectorisation would allow
the target to have a bit more control over which constants are actually
generated.  But it shouldn't be necessary to do that for correctness.
 
Thanks,
Richard
 
> juzhe.zhong@rivai.ai
>  
> From: Richard Sandiford
> Date: 2022-08-19 17:35
> To: juzhe.zhong@rivai.ai
> CC: rguenther; gcc-patches; kito.cheng
> Subject: Re: [PATCH] middle-end: skipp stepped vector test of poly_int (1, 1) and allow the machine_mode definition with poly_uint16 (1, 1)
> "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai> writes:
>> Hi, Richard. I tried this code:
>> if (!nunits.is_constant () && maybe_gt (nunits, 1)) 
>> test_vector_subregs_modes (x, nunits - min_nunits, count);
>>
>> It still failed. For nunits = (1,1), maybe_gt (nunits, 1) returns true.
>  
> Ah, right, sorry for the bogus suggestion.
>  
> In that case, what specifically goes wrong?  Which test in
> test_vector_subregs_modes fails, and what does the ICE look like?
>  
> Thanks,
> Richard
>  
>> But I tried:
>> if (!nunits.is_constant () && known_gt (nunits, 1)) 
>> test_vector_subregs_modes (x, nunits - min_nunits, count);
>> It passes, but it reports a warning during compilation: "warning: comparison between signed and unsigned integer expressions [-Wsign-compare]".
>>
>> Finally, I tried:
>> if (!nunits.is_constant () && known_gt (GET_MODE_NUNITS (inner_mode), 1)) 
>> test_vector_subregs_modes (x, nunits - min_nunits, count);
>> It passed with no warning.
>>
>> Is 'known_gt (GET_MODE_NUNITS (inner_mode), 1)' a good solution for this?
>> Thanks!
>>
>>
>> juzhe.zhong@rivai.ai
>>  
>> From: Richard Sandiford
>> Date: 2022-08-19 16:03
>> To: juzhe.zhong
>> CC: gcc-patches; rguenther; kito.cheng
>> Subject: Re: [PATCH] middle-end: skipp stepped vector test of poly_int (1, 1) and allow the machine_mode definition with poly_uint16 (1, 1)
>> juzhe.zhong@rivai.ai writes:
>>> From: zhongjuzhe <juzhe.zhong@rivai.ai>
>>>
>>> Hello. This patch prepares for the upcoming RVV support.
>>>
>>> Both ARM SVE and RVV (RISC-V 'V' Extension) support length-agnostic vectors.
>>> The minimum vector length of ARM SVE is 128 bits, and the runtime vector length of ARM SVE is always a multiple of 128-bit blocks.
>>> However, the minimum vector length of RVV can be 32 bits in the 'Zve32*' sub-extensions and 64 bits in the 'Zve64*' sub-extensions.
>>>
>>> So I define the machine_mode as follows:
>>> VECTOR_MODE_WITH_PREFIX (VNx, INT, DI, 1, 0);
>>> ADJUST_NUNITS (MODE, riscv_vector_chunks);
>>> where riscv_vector_chunks = poly_uint16 (1, 1).
>>>
>>> Compilation fails in the stepped vector test:
>>> (const_vector:VNx1DI repeat [
>>>         (const_int 8 [0x8])
>>>         (const_int 7 [0x7])
>>>     ])
>>>
>>> I understand that a stepped vector should always have at least 2 elements and that stepped vector initialization is common
>>> in VLA (vector-length agnostic) auto-vectorization. It makes sense to report a failure for a stepped vector of poly_uint16 (1, 1).
>>>
>>> A machine mode with nunits = poly_uint16 (1, 1) needs to be implemented for intrinsics. I would like to enable RVV auto-vectorization
>>> only for vector modes whose nunits is larger than poly_uint16 (2, 2) in the RISC-V backend. I think it will not create issues if we define
>>> a vector mode with nunits = poly_uint16 (1, 1). Feel free to correct me or suggest a better solution. Thanks!
>>>
>>>   
>>>
>>> gcc/ChangeLog:
>>>
>>>         * simplify-rtx.cc (test_vector_subregs_fore_back): Skip test for poly_uint16 (1, 1).
>>>
>>> ---
>>>  gcc/simplify-rtx.cc | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
>>> index 7d09bf7103d..61e0dfa00d0 100644
>>> --- a/gcc/simplify-rtx.cc
>>> +++ b/gcc/simplify-rtx.cc
>>> @@ -8438,7 +8438,7 @@ test_vector_subregs_fore_back (machine_mode inner_mode)
>>>    rtx x = builder.build ();
>>>  
>>>    test_vector_subregs_modes (x);
>>> -  if (!nunits.is_constant ())
>>> +  if (!nunits.is_constant () && known_ne (nunits, poly_uint16 (1, 1)))
>>>      test_vector_subregs_modes (x, nunits - min_nunits, count);
>>  
>> I think instead we should use maybe_gt (nunits, 1), on the basis that
>> the fore_back tests require vectors that have a minimum of 2 elements.
>> Something like poly_uint16 (1, 2) would have the same problem as
>> poly_uint16 (1, 1).  ({1, 2} is an unlikely value, but it's OK in
>> principle.)
>>  
>> This corresponds to the minimum of 3 elements for the stepped tests:
>>  
>>   if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT
>>       && maybe_gt (GET_MODE_NUNITS (mode), 2))
>>     {
>>       test_vector_ops_series (mode, scalar_reg);
>>       test_vector_subregs (mode);
>>     }
>>  
>> Thanks,
>> Richard
>>  
>
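
The maybe_gt and known_gt results quoted in this thread can be reproduced with a small standalone model of a two-coefficient poly_int compared against a constant. This is a sketch rather than GCC's implementation: poly2, maybe_gt_const and known_gt_const are hypothetical names, and it assumes unsigned coefficients with the runtime multiplier x ranging over all values >= 0.

```cpp
// value == c0 + c1 * x for a runtime x >= 0
struct poly2 { unsigned c0, c1; };

// True if SOME x >= 0 makes p greater than k: either the constant term
// already exceeds k, or the value grows with x.
constexpr bool maybe_gt_const (poly2 p, unsigned k)
{
  return p.c0 > k || p.c1 > 0;
}

// True if EVERY x >= 0 makes p greater than k.  The smallest value is
// at x == 0, so only the constant term matters.
constexpr bool known_gt_const (poly2 p, unsigned k)
{
  return p.c0 > k;
}

// nunits == (1,1): may exceed 1 (x == 1 gives 2) but is not known to
// (x == 0 gives 1), matching the results reported above.
static_assert (maybe_gt_const (poly2 {1, 1}, 1), "maybe_gt ((1,1), 1)");
static_assert (!known_gt_const (poly2 {1, 1}, 1), "!known_gt ((1,1), 1)");

// nunits == (2,2): always at least 2.
static_assert (known_gt_const (poly2 {2, 2}, 1), "known_gt ((2,2), 1)");
```

In this model a known_gt guard skips the second call for (1,1), whereas a maybe_gt guard does not.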
