Re: [Patch, rs6000, aarch64, middle-end] Add implementation for different targets for pair mem fusion

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Ajit Agarwal <aagarwa1@linux.ibm.com>
To: Alex Coplan <alex.coplan@arm.com>,
	"Kewen.Lin" <linkw@linux.ibm.com>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	Michael Meissner <meissner@linux.ibm.com>,
	Peter Bergner <bergner@linux.ibm.com>,
	David Edelsohn <dje.gcc@gmail.com>,
	gcc-patches <gcc-patches@gcc.gnu.org>,
	richard.sandiford@arm.com
Subject: Re: [Patch, rs6000, aarch64, middle-end] Add implementation for different targets for pair mem fusion
Date: Tue, 4 Jun 2024 21:30:53 +0530	[thread overview]
Message-ID: <13fea2af-a64f-45c7-ae47-131191ac5871@linux.ibm.com> (raw)
In-Reply-To: <6a87ceb9-0de2-49f2-8998-17391c213c7d@linux.ibm.com>

Hello Richard:

On 03/06/24 9:28 pm, Ajit Agarwal wrote:
> Hello Richard:
> 
> On 03/06/24 8:24 pm, Richard Sandiford wrote:
>> Ajit Agarwal <aagarwa1@linux.ibm.com> writes:
>>> Hello Richard:
>>>
>>> On 03/06/24 7:47 pm, Richard Sandiford wrote:
>>>> Ajit Agarwal <aagarwa1@linux.ibm.com> writes:
>>>>> On 03/06/24 5:03 pm, Richard Sandiford wrote:
>>>>>> Ajit Agarwal <aagarwa1@linux.ibm.com> writes:
>>>>>>>> [...]
>>>>>>>> If it is intentional, what distinguishes things like vperm and xxinsertw
>>>>>>>> (and all other unspecs) from plain addition?
>>>>>>>>
>>>>>>>>   [(set (match_operand:VSX_F 0 "vsx_register_operand" "=wa")
>>>>>>>>         (plus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "wa")
>>>>>>>> 		    (match_operand:VSX_F 2 "vsx_register_operand" "wa")))]
>>>>>>>>
>>>>>>>
>>>>>>> Plain addition are not supported currently.
>>>>>>> We have not seen many cases with plain addition and this patch
>>>>>>> will not accept plain addition.
>>>>>>>
>>>>>>>  
>>>>>>>> This is why the intention behind the patch is important.  As it stands,
>>>>>>>> it isn't clear what criteria the patch is using to distinguish "valid"
>>>>>>>> fuse candidates from "invalid" ones.
>>>>>>>>
>>>>>>>
>>>>>>> Intention behind this patch all variants of UNSPEC instructions are
>>>>>>> supported and uses without UNSPEC are not supported in this patch.
>>>>>>
>>>>>> But why make the distinction this way though?  UNSPEC is a very
>>>>>> GCC-specific concept.  Whether something is an UNSPEC or some other
>>>>>> RTL code depends largely on historical accident.  E.g. we have specific
>>>>>> codes for VEC_SELECT, VEC_MERGE, and VEC_DUPLICATE, but don't have one
>>>>>> for VEC_PERM (even for VEC_PERM_EXPR exists in gimple).
>>>>>>
>>>>>> It seems unlikely that GCC's choice about whether to represent something
>>>>>> as an UNSPEC or as another RTL code lines up neatly with the kind of
>>>>>> codegen decisions that a good assembly programmer would make.
>>>>>>
>>>>>> I suppose another way of asking is to turn this around and say: what
>>>>>> kind of uses are you trying to exclude?  Presumably things are worse
>>>>>> if you remove this function override.  But what makes them worse?
>>>>>> What kind of uses cause the regression?
>>>>>>
>>>>>
>>>>> Uses of fused load where load with low address uses are modified with load with high address uses.
>>>>>
>>>>> Similarly load with high address uses are modified with load low address
>>>>> uses.
>>>>
>>>> It sounds like something is going wrong the subreg updates.
>>>> Can you give an example of where this occurs?  For instance...
>>>>
>>>>> This is the semantics of lxvp instructions which can occur through
>>>>> UNSPEC uses otherwise it breaks the functionality and seen failure
>>>>> in almost all vect regressions and SPEC benchmarks.
>>>>
>>>> ...could you take one of the simpler vect regressions, show the before
>>>> and after RTL, and why the transformation is wrong?
>>>
>>> Before the change:
>>>
>>> (insn 32 30 103 5 (set (reg:V16QI 127 [ _32 ])
>>>         (mem:V16QI (reg:DI 130 [ ivtmp.37 ]) [1 MEM <vector(8) short unsigned int> [(short unsigned int *)_55]+0 S16 A128])) {vsx_movv16qi_64bit}
>>>      (nil))
>>> (insn 103 32 135 5 (set (reg:V16QI 173 [ _32 ])
>>>         (mem:V16QI (plus:DI (reg:DI 130 [ ivtmp.37 ])
>>>                 (const_int 16 [0x10])) [1 MEM <vector(8) short unsigned int> [(short unsigned int *)_55]+0 S16 A128])) {vsx_movv16qi_64bit}
>>>      (nil))
>>> (insn 135 103 34 5 (set (reg:DI 155)
>>>         (plus:DI (reg:DI 130 [ ivtmp.37 ])
>>>             (const_int 16 [0x10]))) 66 {*adddi3}
>>>      (nil))
>>> (insn 34 135 104 5 (set (reg:V16QI 143 [ _27 ])
>>>         (unspec:V16QI [
>>>                 (reg:V16QI 127 [ _32 ]) repeated x2
>>>                 (reg:V16QI 152)
>>>             ] UNSPEC_VPERM))  {altivec_vperm_v16qi_direct}
>>>      (expr_list:REG_DEAD (reg:V16QI 127 [ _32 ])
>>>         (nil)))
>>> (insn 104 34 35 5 (set (reg:V16QI 174 [ _27 ])
>>>         (unspec:V16QI [
>>>                 (reg:V16QI 173 [ _32 ]) repeated x2
>>>                 (reg:V16QI 152)
>>>             ] UNSPEC_VPERM)) 
>>>  {altivec_vperm_v16qi_direct}
>>>
>>>
>>> After the change:
>>>
>>> (insn 103 30 135 5 (set (reg:OO 127 [ _32 ])
>>>         (mem:OO (reg:DI 130 [ ivtmp.37 ]) [1 MEM <vector(8) short unsigned int> [(short unsigned int *)_55]+0 S16 A128])) {*movoo}
>>>      (nil))
>>> (insn 135 103 34 5 (set (reg:DI 155)
>>>         (plus:DI (reg:DI 130 [ ivtmp.37 ])
>>>             (const_int 16 [0x10]))) 66 {*adddi3}
>>>      (nil))
>>> (insn 34 135 104 5 (set (reg:V16QI 143 [ _27 ])
>>>         (unspec:V16QI [
>>>                 (subreg:V16QI (reg:OO 127 [ _32 ]) 16)
>>>                 (subreg:V16QI (reg:OO 127 [ _32 ]) 16)
>>>                 (reg:V16QI 152)
>>>             ] UNSPEC_VPERM)) {altivec_vperm_v16qi_direct}
>>>      (expr_list:REG_DEAD (reg:OO 127 [ _32 ])
>>>         (nil)))
>>> (insn 104 34 35 5 (set (reg:V16QI 174 [ _27 ])
>>>         (unspec:V16QI [
>>>                 (subreg:V16QI (reg:OO 127 [ _32 ]) 0)
>>>                 (subreg:V16QI (reg:OO 127 [ _32 ]) 0)
>>>                 (reg:V16QI 152)
>>>             ] UNSPEC_VPERM))  {altivec_vperm_v16qi_direct}
>>>
>>> After the change the tests passes.
>>
>> But isn't this an example of the optimisation working on unspecs,
>> and working correctly?
>>
> 
> Yes this is working fine.
> 
>> I meant instead: could you give an example of the vect regressions
>> that you saw with the unspec test removed?  You mentioned that many
>> vect tests regressed without the unspec test, so it would be helpful
>> to see these failures in action.  That is, it'd be helpful to take
>> a compiler that doesn't have the unspec tests and show:
>>
>> - the relevant rtl of one of the failing tests before the pass runs
>>   (when the rtl is still correct)
>>
>> - the relevant rtl of one of the failing tests after the pass runs
>>   (when the rtl is now incorrect)
>>
>> - the reason why the rtl after the pass is wrong
>>
> 
> I meant to say this is the semantics of lxvp instructions which can occur through
> UNSPEC uses. If we dont use above semantics in UNSPEC the vect regressions and spec 
> fails functionality.
> 
> I will find a test without UNSPEC and let you know.
>

I have fixed all the issues with all RTL uses of fused load other than UNSPEC.
With the fixes all RTL codes uses of fused load are supported along with
UNSPEC.

Thanks for suggesting to use all RTL for fused load along with UNSPEC.

I will separate patch with all the fixes soon.

Thanks & Regards
Ajit 
> Thanks & Regards
> Ajit
>  
>> Thanks,
>> Richard

next prev parent reply	other threads:[~2024-06-04 16:01 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-30 19:51 Ajit Agarwal
2024-05-30 21:34 ` Segher Boessenkool
2024-05-31  8:14   ` Richard Sandiford
2024-05-31 14:19     ` Segher Boessenkool
2024-05-31  9:53 ` Richard Sandiford
2024-05-31 10:28   ` Richard Sandiford
2024-05-31 13:54   ` Ajit Agarwal
2024-05-31 14:38     ` Richard Sandiford
2024-05-31 16:59       ` Ajit Agarwal
2024-06-02  5:52         ` Ajit Agarwal
2024-06-03  8:37         ` Richard Sandiford
2024-06-03 11:05           ` Ajit Agarwal
2024-06-03 11:33             ` Richard Sandiford
2024-06-03 13:47               ` Ajit Agarwal
2024-06-03 14:17                 ` Richard Sandiford
2024-06-03 14:34                   ` Ajit Agarwal
2024-06-03 14:54                     ` Richard Sandiford
2024-06-03 15:58                       ` Ajit Agarwal
2024-06-04 16:00                         ` Ajit Agarwal [this message]
2024-06-02 13:16   ` Ajit Agarwal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=13fea2af-a64f-45c7-ae47-131191ac5871@linux.ibm.com \
    --to=aagarwa1@linux.ibm.com \
    --cc=alex.coplan@arm.com \
    --cc=bergner@linux.ibm.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=linkw@linux.ibm.com \
    --cc=meissner@linux.ibm.com \
    --cc=richard.sandiford@arm.com \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).