Re: Discussion about arm/aarch64 testcase failures seen with patch for PR111673

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>
Cc: Surya Kumari Jangala <jskumari@linux.vnet.ibm.com>,
	 Peter Bergner <bergner@linux.ibm.com>,
	 GCC Development <gcc@gcc.gnu.org>,
	 vmakarov@redhat.com
Subject: Re: Discussion about arm/aarch64 testcase failures seen with patch for PR111673
Date: Tue, 28 Nov 2023 15:41:14 +0000	[thread overview]
Message-ID: <mpty1ehrilh.fsf@arm.com> (raw)
In-Reply-To: <6ca90437-7564-4339-b652-46587efe828e@foss.arm.com> (Richard Earnshaw's message of "Tue, 28 Nov 2023 13:48:45 +0000")

Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
> On 28/11/2023 12:52, Surya Kumari Jangala wrote:
>> Hi Richard,
>> Thanks a lot for your response!
>> 
>> Another failure reported by the Linaro CI is as follows :
>> (Note: I am planning to send a separate mail for each failure, as this will make
>> the discussion easy to track)
>> 
>> FAIL: gcc.target/aarch64/sve/acle/general/cpy_1.c -march=armv8.2-a+sve -moverride=tune=none  check-function-bodies dup_x0_m
>> 
>> Expected code:
>> 
>>        ...
>>        add     (x[0-9]+), x0, #?1
>>        mov     (p[0-7])\.b, p15\.b
>>        mov     z0\.d, \2/m, \1
>>        ...
>>        ret
>> 
>> 
>> Code obtained w/o patch:
>>          addvl   sp, sp, #-1
>>          str     p15, [sp]
>>          add     x0, x0, 1
>>          mov     p3.b, p15.b
>>          mov     z0.d, p3/m, x0
>>          ldr     p15, [sp]
>>          addvl   sp, sp, #1
>>          ret
>> 
>> Code obtained w/ patch:
>> 	addvl   sp, sp, #-1
>>          str     p15, [sp]
>>          mov     p3.b, p15.b
>>          add     x0, x0, 1
>>          mov     z0.d, p3/m, x0
>>          ldr     p15, [sp]
>>          addvl   sp, sp, #1
>>          ret
>> 
>> As we can see, with the patch, the following two instructions are interchanged:
>>          add     x0, x0, 1
>>          mov     p3.b, p15.b
>
> Indeed, both look acceptable results to me, especially given that we 
> don't schedule results at -O1.
>
> There's two ways of fixing this:
> 1) Simply swap the order to what the compiler currently generates (which 
> is a little fragile, since it might flip back someday).
> 2) Write the test as
>
>
> ** (
> **       add     (x[0-9]+), x0, #?1
> **       mov     (p[0-7])\.b, p15\.b
> **       mov     z0\.d, \2/m, \1
> ** |
> **       mov     (p[0-7])\.b, p15\.b
> **       add     (x[0-9]+), x0, #?1
> **       mov     z0\.d, \1/m, \2
> ** )
>
> Note, we need to swap the match names in the third insn to account for 
> the different order of the earlier instructions.
>
> Neither is ideal, but the second is perhaps a little more bomb proof.
>
> I don't really have a strong feeling either way, but perhaps the second 
> is slightly preferable.
>
> Richard S: thoughts?

Yeah, I agree the second is probably better.  The | doesn't reset the
capture numbers, so I think the final instruction needs to be:

**       mov     z0\.d, \3/m, \4

Thanks,
Richard

>
> R.
>
>> I believe that this is fine and the test can be modified to allow it to pass on
>> aarch64. Please let me know what you think.
>> 
>> Regards,
>> Surya
>> 
>> 
>> On 24/11/23 4:18 pm, Richard Earnshaw wrote:
>>>
>>>
>>> On 24/11/2023 08:09, Surya Kumari Jangala via Gcc wrote:
>>>> Hi Richard,
>>>> Ping. Please let me know if the test failure that I mentioned in the mail below can be handled by changing the expected generated code. I am not conversant with arm, and hence would appreciate your help.
>>>>
>>>> Regards,
>>>> Surya
>>>>
>>>> On 03/11/23 4:58 pm, Surya Kumari Jangala wrote:
>>>>> Hi Richard,
>>>>> I had submitted a patch for review (https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631849.html)
>>>>> regarding scaling save/restore costs of callee save registers with block
>>>>> frequency in the IRA pass (PR111673).
>>>>>
>>>>> This patch has been approved by VMakarov
>>>>> (https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632089.html).
>>>>>
>>>>> With this patch, we are seeing performance improvements with spec on x86
>>>>> (exchange: 5%, xalancbmk: 2.5%) and on Power (perlbench: 5.57%).
>>>>>
>>>>> I received a mail from Linaro about some failures seen in the CI pipeline with
>>>>> this patch. I have analyzed the failures and I wish to discuss the analysis with you.
>>>>>
>>>>> One failure reported by the Linaro CI is:
>>>>>
>>>>> FAIL: gcc.target/arm/pr111235.c scan-assembler-times ldrexd\tr[0-9]+, r[0-9]+, \\[r[0-9]+\\] 2
>>>>>
>>>>> The diff in the assembly between trunk and patch is:
>>>>>
>>>>> 93c93
>>>>> <       push    {r4, r5}
>>>>> ---
>>>>>>         push    {fp}
>>>>> 95c95
>>>>> <       ldrexd  r4, r5, [r0]
>>>>> ---
>>>>>>         ldrexd  fp, ip, [r0]
>>>>> 99c99
>>>>> <       pop     {r4, r5}
>>>>> ---
>>>>>>         ldr     fp, [sp], #4
>>>>>
>>>>>
>>>>> The test fails with patch because the ldrexd insn uses fp & ip registers instead
>>>>> of r[0-9]+
>>>>>
>>>>> But the code produced by patch is better because it is pushing and restoring only
>>>>> one register (fp) instead of two registers (r4, r5). Hence, this test can be
>>>>> modified to allow it to pass on arm. Please let me know what you think.
>>>>>
>>>>> If you need more information, please let me know. I will be sending separate mails
>>>>> for the other test failures.
>>>>>
>>>
>>> Thanks for looking at this.
>>>
>>>
>>> The key part of this test is that the compiler generates LDREXD.  The registers used for that are pretty much irrelevant as we don't match them to any other operations within the test.  So I'd recommend just testing for the mnemonic and not for any of the operands (ie just match "ldrexd\t").
>>>
>>> R.
>>>
>>>>> Regards,
>>>>> Surya
>>>>>
>>>>>
>>>>>

next prev parent reply	other threads:[~2023-11-28 15:41 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-03 11:28 Discussion about arm " Surya Kumari Jangala
2023-11-24  8:09 ` Surya Kumari Jangala
2023-11-24 10:48   ` Richard Earnshaw
2023-11-28 12:52     ` Discussion about arm/aarch64 " Surya Kumari Jangala
2023-11-28 13:48       ` Richard Earnshaw
2023-11-28 15:41         ` Richard Sandiford [this message]
2023-12-14  7:17         ` Surya Kumari Jangala
2023-12-14 16:11           ` Richard Earnshaw (lists)
2023-12-15 17:04             ` Surya Kumari Jangala
2024-01-29  6:14               ` Surya Kumari Jangala

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mpty1ehrilh.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=Richard.Earnshaw@foss.arm.com \
    --cc=bergner@linux.ibm.com \
    --cc=gcc@gcc.gnu.org \
    --cc=jskumari@linux.vnet.ibm.com \
    --cc=vmakarov@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).