From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=ChzF=6L=redhat.com=amacleod@sourceware.org>
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124])
	by sourceware.org (Postfix) with ESMTPS id E54F33858004
	for <gcc-patches@gcc.gnu.org>; Wed, 15 Feb 2023 16:05:45 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E54F33858004
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1676477141;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=BJ8QPEHfFWGPD2ce6lScnHnQbHTMCal/IXdqe+dWzZY=;
	b=D67t1BCTyIZ7LlmaVcmBnLsEZu06FCiZl4aJLQvgrpRH82gg87ykOhZqkcdj9fvTPCItEa
	ylqL8C3zHjNsf4pAs3ZIREwxakD3g1UUcnXLvRcAy+Kmc17xDw/bOcrJnQ5lHR8KyBAL3C
	ZLsrYX99CvcPtfNR1Ldm+Y/t2RKxz10=
Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com
 [209.85.222.197]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id
 us-mta-617-aCHmPgkoMaGyRc0VlTftQg-1; Wed, 15 Feb 2023 11:05:30 -0500
X-MC-Unique: aCHmPgkoMaGyRc0VlTftQg-1
Received: by mail-qk1-f197.google.com with SMTP id o24-20020a05620a22d800b007389d2f57f3so11785842qki.21
        for <gcc-patches@gcc.gnu.org>; Wed, 15 Feb 2023 08:05:29 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=content-transfer-encoding:in-reply-to:from:references:cc:to
         :content-language:subject:user-agent:mime-version:date:message-id
         :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=BJ8QPEHfFWGPD2ce6lScnHnQbHTMCal/IXdqe+dWzZY=;
        b=f2p2S+C/Xo+IQT3eR+O76apwsu/xu/YF+2soBtvsBi4d1xwcojEIw7Emgd4sTfncOh
         U4uHH0zWpOV/v4Dqg2W2lNGQG6FOBDkNKtAp9h9JieIMeBPbUgClN+vELUTPNn37wsZv
         89axLSmvqONVnDb4QjwLSDgw4AI/9qOMcnpqF3XFdjjvBXY0jLozzeW7O3hUqdb876ME
         jsBRPYjoyo4bhIbGJectfeAufVFBEZPIzm+s1GJQzulXRxw6hGqIZ+yTCnfAroKqYcCh
         +Lcq9PXdu5hL+og7FhbpXupXNtrre9DypEKqNxPl0OCF65Lb/EL/XjIu14xcGwTBpY76
         byRA==
X-Gm-Message-State: AO0yUKVCN4J8qjJbwRh7zizxGMkDrBqYkshzB8ss3iTMByZgr2ZGFhcE
	RObT918ckutashkGNCJlbuaKrpRbc8zRFfkM0a2l714yXeH6vr4idPno5PCD9fCzBFvgWCB+Iir
	UR1xsmA7VmUyVezX63A8SiM8=
X-Received: by 2002:a05:622a:153:b0:3a8:e9e:e194 with SMTP id v19-20020a05622a015300b003a80e9ee194mr4514036qtw.40.1676477128161;
        Wed, 15 Feb 2023 08:05:28 -0800 (PST)
X-Google-Smtp-Source: AK7set9nPKFmMY5R5U2+yskhCCP6AVfHthGWrYqTq9poQwsUcr8ptwXLrfjt9Wx3tKmPgHk812cvkw==
X-Received: by 2002:a05:622a:153:b0:3a8:e9e:e194 with SMTP id v19-20020a05622a015300b003a80e9ee194mr4514000qtw.40.1676477127802;
        Wed, 15 Feb 2023 08:05:27 -0800 (PST)
Received: from ?IPV6:2607:fea8:a263:f600::de2a? ([2607:fea8:a263:f600::de2a])
        by smtp.gmail.com with ESMTPSA id v2-20020a379302000000b007112aa42c4fsm14002163qkd.135.2023.02.15.08.05.26
        (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
        Wed, 15 Feb 2023 08:05:27 -0800 (PST)
Message-ID: <f31a7866-95ee-6fe3-63c3-5bb77936d990@redhat.com>
Date: Wed, 15 Feb 2023 11:05:25 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.6.0
Subject: Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by
 using new optabs [PR108583]
To: Tamar Christina <Tamar.Christina@arm.com>,
 Richard Biener <rguenther@suse.de>,
 Richard Sandiford <Richard.Sandiford@arm.com>
Cc: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>, nd
 <nd@arm.com>, "jlaw@ventanamicro.com" <jlaw@ventanamicro.com>
References: <mpt1qmx1jbc.fsf@arm.com>
 <F6AE5834-ACA9-4FE2-9697-3E20F8A53D47@suse.de>
 <d3f26ea2-34fe-7a0b-4881-7129037bd002@redhat.com>
 <VI1PR08MB5325B60757CAE14D4A12E2E6FFDD9@VI1PR08MB5325.eurprd08.prod.outlook.com>
 <VI1PR08MB53256CAEC1E6A172223C3537FFA39@VI1PR08MB5325.eurprd08.prod.outlook.com>
From: Andrew MacLeod <amacleod@redhat.com>
In-Reply-To: <VI1PR08MB53256CAEC1E6A172223C3537FFA39@VI1PR08MB5325.eurprd08.prod.outlook.com>
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=-5.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>


On 2/15/23 07:51, Tamar Christina wrote:
>>>>>> In any case, if you disagree I don’t' really see a way forward
>>>>>> aside from making this its own pattern running it before the
>>>>>> overwidening
>>> pattern.
>>>>> I think we should look to see if ranger can be persuaded to provide
>>>>> the range of the 16-bit addition, even though the statement that
>>>>> produces it isn't part of a BB.  It shouldn't matter that the
>>>>> addition originally came from a 32-bit one: the range follows
>>>>> directly from the ranges of the operands (i.e. the fact that the
>>>>> operands are the results of widening conversions).
>>>> I think you can ask ranger on operations on names defined in the IL,
>>>> so you can work yourself through the sequence of operations in the
>>>> pattern sequence to compute ranges on their defs (and possibly even
>>>> store them in the SSA info).  You just need to pick the correct
>>>> ranger API for this…. Andrew CCed
>>>>
>>>>
>>> Its not clear to me whats being asked...
>>>
>>> Expressions don't need to be in the IL to do range calculations.. I
>>> believe we support arbitrary tree expressions via range_of_expr.
>>>
>>> if you have 32 bit ranges that you want to do 16 bit addition on, you
>>> can also cast those ranges to a 16bit type,
>>>
>>> my32bitrange.cast (my16bittype);
>>>
>>> then invoke range-ops directly via getting the handler:
>>>
>>> handler = range_op_handler (PLUS_EXPR, 16bittype_tree); if (handler)
>>>      handler->fold (result, my16bittype, mycasted32bitrange,
>>> myothercasted32bitrange)
>>>
>>> There are higher level APIs if what you have on hand is closer to IL
>>> than random ranges
>>>
>>> Describe exactly what it is you want to do... and I'll try to direct
>>> you to the best way to do it.
>> The vectorizer has  a pattern matcher that runs at startup on the scalar code.
>> This pattern matcher can replace one or more statements with alternative
>> ones, these can be either existing tree_code or new internal functions.
>>
>> One of the patterns here is a overwidening detection pattern which reduces
>> the precision that an operation is to be done in during vectorization.
>>
>> Another one is widening multiplication, which replaced PLUS_EXPR with
>> WIDEN_PLUS_EXPR.
>>
>> These can be chained, so e.g. a widening addition done on ints can be
>> reduced to a widen addition done on shorts.
>>
>> The question is whether given the new expression that the vectorizer has
>> created whether ranger can tell what the precision is.  get_range_query fails
>> because presumably it has no idea about the new operations created  and
>> also doesn't know about any new IFNs.
> Hi,
>
> I have been trying to use ranger as requested. I've tried:
>
> 	  gimple_ranger ranger;
> 	  int_range_max r;
> 	  /* Check that no overflow will occur.  If we don't have range
> 	     information we can't perform the optimization.  */
> 	  if (ranger.range_of_expr (r, oprnd0, stmt))
> 	    {
> 	      wide_int max = r.upper_bound ();
>                      ....
>
> Which works for non-patterns, but still doesn't work for patterns.
> On a stmt:
> patt_27 = (_3) w+ (level_15(D));
>
> it gives me a range:
>
> $2 = {
>    <wide_int_storage> = {
>      val = {[0x0] = 0xffffffffffffffff, [0x1] = 0x7fff95bd8b00, [0x2] = 0x7fff95bd78b0, [0x3] = 0x3fa1dd0, [0x4] = 0x3fa1dd0, [0x5] = 0x344a706f832d4f00, [0x6] = 0x7fff95bd7950, [0x7] = 0x1ae7f11, [0x8] = 0x7fff95bd79f8},
>      len = 0x1,
>      precision = 0x10
>    },
>    members of generic_wide_int<wide_int_storage>:
>    static is_sign_extended = 0x1
> }
>
> The precision is fine, but range seems to be -1?
>
> Should I use range_op_handler (WIDEN_PLUS_EXPR, ...) in this case?

It's easier to see the range if you dump it, i.e.:

p r.dump(stderr)

I'm way behind the curve on exactly what's going on.  I'm not sure how
the above 2 things relate.  I presume $2 is 'max'?  I have no context;
what did you expect the range of _3 to be?

We have no entry in range-op.cc for a WIDEN_PLUS_EXPR, so ranger would
no doubt only give back VARYING for that.  However, I doubt it would be
too difficult to write the fold_range() method for it.

It's unclear to me what you mean by "it doesn't work on patterns", so
let's do some basics.

You have a stmt  "patt_27 = (_3) w+ (level_15(D));"

I gather that's a WIDEN_PLUS_EXPR, and if I read it right, patt_27 is a
type that is twice as wide as _3, and will contain the value "_3 +
level_15"?

Your query above is asking for the range of _3 at this stmt in the IL.

And you are trying to determine whether the expression "_3 + level_15"
would still fit in the type of _3, so that you could avoid the WIDEN_*
paradigm and revert to a simple plus?

And you also want to be able to do this for expressions which are not 
currently in the IL?

If that is all true, then I would suggest one of 2 possible routes.
1) We add WIDEN_PLUS_EXPR to range-ops.  This involves writing
fold_range() for it, whereby it would create a range of a type double
the precision of _3, then take the 2 ranges for op1 and op2, cast them
to this new type and add them.

2) Manually doing the same thing.  But if you are going to do it
manually, we might as well put that same code into fold_range; then the
entire ecosystem will benefit.
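
To make route 1 concrete, here is a toy, self-contained C++ sketch of
what such a fold would compute.  It is not GCC code: Range16, Range32,
widen and fold_widen_plus are hypothetical stand-ins for irange and a
fold_range() method, and I am assuming signed 16-bit operands with a
single closed interval each.

```cpp
#include <cstdint>

// Toy closed intervals [lo, hi].  Hypothetical stand-ins for irange,
// assuming a signed 16-bit operand type and a 32-bit result type.
struct Range16 { std::int16_t lo, hi; };
struct Range32 { std::int32_t lo, hi; };

// Step 1: cast an operand range to the double-width type.
// Widening a signed range never loses information.
constexpr Range32 widen (Range16 r)
{
  return Range32{r.lo, r.hi};
}

// Step 2: add the two widened ranges.  The sum of two 16-bit values
// always fits in 32 bits, so no overflow handling is needed here.
constexpr Range32 fold_widen_plus (Range16 op1, Range16 op2)
{
  return Range32{widen (op1).lo + widen (op2).lo,
                 widen (op1).hi + widen (op2).hi};
}
```

For example, fold_widen_plus (Range16{0, 255}, Range16{0, 255}) yields
the 32-bit range [0, 510].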

Once the operation can be performed in range-ops, you can cast the new
range back to the type of _3 and see if it's fully represented, i.e.:

int_range_max r1, r2;
if (ranger.range_of_stmt (r1, stmt))
   {
     r2 = r1;
     r2.cast (TREE_TYPE (_3));
     r2.cast (TREE_TYPE (patt_27));
     if (r1 == r2)
       {
         // No info was lost casting back and forth, so r1 must fit
         // into the type of _3.
       }
   }

That should work for within the IL.  And if you want to do the same
thing outside of the IL, you have to come up with the values you want to
use for op1 and op2, and replace the ranger query with a direct range-op
fold:

range_op_handler handler (WIDEN_PLUS_EXPR, TREE_TYPE (patt_27));
if (handler
    && handler.fold_range (r1, TREE_TYPE (patt_27),
                           range_of__3, range_of_level_15))
   {
     // same casting song and dance
   }
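
The "casting song and dance" can be illustrated with a toy,
self-contained C++ model.  Again this is not GCC code: Range,
cast_signed and fits_after_round_trip are hypothetical names, and I am
assuming a simplified cast that keeps a range that fits the target type
and otherwise falls back to the full range of that type (the real cast
may be more precise).

```cpp
#include <cstdint>

// Toy closed interval [lo, hi]; bounds are held in 64 bits so that
// any precision up to 32 bits can be modelled.  Not GCC's irange.
struct Range { std::int64_t lo, hi; };

constexpr bool same (Range a, Range b)
{
  return a.lo == b.lo && a.hi == b.hi;
}

// Smallest and largest value of a signed type with BITS precision.
constexpr std::int64_t type_min (int bits) { return -(std::int64_t{1} << (bits - 1)); }
constexpr std::int64_t type_max (int bits) { return (std::int64_t{1} << (bits - 1)) - 1; }

// Simplified cast (an assumption of this sketch): a range that fits
// is kept, anything else becomes the full range of the target type.
constexpr Range cast_signed (Range r, int bits)
{
  return (r.lo >= type_min (bits) && r.hi <= type_max (bits))
         ? r
         : Range{type_min (bits), type_max (bits)};
}

// Cast down to the narrow type and back up.  If nothing changed,
// every value of WIDE is representable in the narrow type.
constexpr bool fits_after_round_trip (Range wide, int narrow_bits, int wide_bits)
{
  return same (wide, cast_signed (cast_signed (wide, narrow_bits), wide_bits));
}
```

With these definitions, [0, 510] survives a 32 -> 16 -> 32 round trip
unchanged, whereas [0, 65534] comes back as [-32768, 32767] and so
compares unequal.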


If you don't want to go through this process, in theory you could try
simply adding _3 and level_15 in their own precision; if max/min aren't
+INF/-INF, then you can probably assume there is no overflow.
In that case, the path you are on above should work for within a stmt:

	  gimple_ranger ranger;
	  int_range_max r0, r1, def;
	  /* Check that no overflow will occur.  If we don't have range
	     information we can't perform the optimization.  */
	  if (ranger.range_of_expr (r0, oprnd0, stmt)
	      && ranger.range_of_expr (r1, oprnd1, stmt))
	    {
	      range_op_handler handler (PLUS_EXPR, TREE_TYPE (_3));
	      if (handler
		  && handler.fold_range (def, TREE_TYPE (_3), r0, r1))
		{
		  // examine def.upper_bound () and def.lower_bound ()
		}
	    }
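
As a toy illustration of that last check (not GCC code), the
self-contained C++ sketch below adds two ranges in their own 16-bit
precision and then looks at the bounds.  Range16, fold_plus and
sum_known_to_fit are hypothetical names, and I am assuming a signed type
and a fold that gives up and returns the full range whenever the sum
could overflow.

```cpp
#include <cstdint>
#include <limits>

// Toy closed interval [lo, hi] over a signed 16-bit type.  Not irange.
struct Range16 { std::int16_t lo, hi; };

constexpr int kMin = std::numeric_limits<std::int16_t>::min ();
constexpr int kMax = std::numeric_limits<std::int16_t>::max ();

// Add in the operands' own precision.  Assumption of this sketch: if
// either bound of the sum leaves the type, return the full range
// (the "-INF/+INF" case).  Operands promote to int, so the additions
// below cannot themselves overflow.
constexpr Range16 fold_plus (Range16 a, Range16 b)
{
  return (a.lo + b.lo < kMin || a.hi + b.hi > kMax)
         ? Range16{std::int16_t (kMin), std::int16_t (kMax)}
         : Range16{std::int16_t (a.lo + b.lo), std::int16_t (a.hi + b.hi)};
}

// Conservative test: claim "no overflow" only when neither bound of
// the folded range sits at the limit of the type.
constexpr bool sum_known_to_fit (Range16 a, Range16 b)
{
  return fold_plus (a, b).lo != kMin && fold_plus (a, b).hi != kMax;
}
```

The test is deliberately conservative: a sum whose upper bound is
exactly 32767 is also rejected, even though it does not overflow.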

Am I grasping some of the issue here?

Andrew