From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <77142b9b-7af8-eb04-e596-6dd2f97aff9a@redhat.com>
Date: Wed, 15 Feb 2023 12:50:56 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0
Subject: Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]
To: Tamar Christina, Richard Biener, Richard Sandiford
Cc: Tamar Christina via Gcc-patches, nd, "jlaw@ventanamicro.com"
From: Andrew MacLeod
Content-Type: multipart/mixed; boundary="------------4VvvgwIAgGTJjcW65AcJFVNH"
Content-Language: en-US

This is a multi-part message in MIME format.
--------------4VvvgwIAgGTJjcW65AcJFVNH
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 2/15/23 12:13, Tamar Christina wrote:
>> On 2/15/23 07:51, Tamar Christina wrote:
>> Thanks, lots of useful context there.
> This second pattern replaces the above into:
>
>   _6 = _3 +w level_14(D);
>   _7 = _6 / 255;
>   _8 = (unsigned char) _7;
>
> Thus removing the need to promote before the addition.  What I'm working on
> is an optimization for division.  So I am after what the range of _6 is.  oprnd0 in my
> example is the first operand of the division.
>
> I need to know the range of _6 because based on the range we can optimize this
> division into something much more efficient.
>
>> ----  If that is all true, then I would suggest one of 2 possible routes.
>> 1) we add WIDEN_PLUS_EXPR to range-ops.  This involves writing
>> fold_range() for it whereby it would create a range of a type double the
>> precision of _3, then take the 2 ranges for op1 and op2, cast them to this new
>> type and add them.
>>
> Right, so I guess none of the widening operations are currently there.  Can you
> point me in the right direction of where I need to add them?

sure, details below

>> 2) manually doing the same thing.  But if you are going to manually do it, we
>> might as well put that same code into fold_range then the entire ecosystem
>> will benefit.
>>
>> Once the operation can be performed in range ops, you can cast the new
>> range back to the type of _3 and see if it's fully represented.  ie
>>
>> int_range_max r1, r2;
>> if (ranger.range_of_stmt (r1, stmt))
>>   {
>>     r2 = r1;
>>     r2.cast (TREE_TYPE (_3));
>>     r2.cast (TREE_TYPE (patt_27));
>>     if (r1 == r2)
>>       // No info was lost casting back and forth, so r1 must fit into type of _3
>>
>> That should work for within the IL.
>> And if you want to do the same thing
>> outside of the IL, you have to come up with the values you want to use for
>> op1 and op2, replace the ranger query with a direct range-op fold:
>>
>> range_op_handler handler (WIDEN_PLUS_EXPR, TREE_TYPE (patt_27));
>> if (handler && handler->fold_range (r1, range_of__3, range_of_level_15))
>>   {
>>     // same casting song and dance
>>
>>
> Just for my own understanding, does the fold_range here update the information
> in the IL? Or is it just for this computation? So when I hit this pattern again it
> recomputes it?

fold_range does not update anything.  It just performs the calculation,
and passes like VRP etc are responsible for if, and when, that is
reflected in some way/transformation in the IL.

The IL is primarily used for context to look back and try to determine
the range of the inputs to the statement.  That's why, if you aren't
using an expression in the IL, you need to provide the ranges yourself.
By default, you end up with the full range for the type, ie VARYING.
But if ranger can determine through branches and such that it's
something different, it will.  ie, if your case is preceded by

    if (_3 < 20 && level_15 < 20)
      // the range of _3 will be [0, 19] and level_15 will be [0, 19],
      // and the addition will end up with a range of [0, 38]

In your case, I see the ranges are the range of the 8 bit type:

    [irange] int [0, 255] NONZERO 0xff

>> If you don't want to go thru this process, in theory, you could try simply
>> adding _3 and level_15 in their own precision, and if max/min aren't +INF/-INF
>> then you can probably assume there is no overflow?
>> in which case, all you do is the path you are on above for within a stmt should
>> work:
>>
>> gimple_ranger ranger;
>> int_range_max r0, r1, def;
>> /* Check that no overflow will occur.  If we don't have range
>>    information we can't perform the optimization.
>>  */
>> if (ranger.range_of_expr (r0, oprnd0, stmt)
>>     && ranger.range_of_expr (r1, oprnd1, stmt))
>>   {
>>     range_op_handler handler (PLUS_EXPR, TREE_TYPE (_3));
>>     if (handler && handler->fold_range (def, r0, r1))
>>       // examine def.upper_bound() and def.lower_bound()
>>
>> Am I grasping some of the issue here?
> You are, and this was helpful.  I would imagine that Richard wouldn't accept me
> to do it locally though.  So I guess if it's safe to do for this PR fix, I can add the basic
> widening operations to ranger-ops if you can show me where.
>

all the range-op integer code is in gcc/range-op.cc.  As this is a basic
binary operation, you should be able to get away with implementing a
single routine, wi_fold (), which adds 2 wide int bounds together and
returns a result.  This is the implementation for operator_plus.

void
operator_plus::wi_fold (irange &r, tree type,
                        const wide_int &lh_lb, const wide_int &lh_ub,
                        const wide_int &rh_lb, const wide_int &rh_ub) const
{
  wi::overflow_type ov_lb, ov_ub;
  signop s = TYPE_SIGN (type);
  wide_int new_lb = wi::add (lh_lb, rh_lb, s, &ov_lb);
  wide_int new_ub = wi::add (lh_ub, rh_ub, s, &ov_ub);
  value_range_with_overflow (r, type, new_lb, new_ub, ov_lb, ov_ub);
}

you shouldn't have to do any of the overflow stuff at the end, just take
the 2 sets of wide int, double their precision to start, add them
together (it can't possibly overflow, right) and then return an
int_range<2> with those bounds...
ie

void
operator_widen_plus::wi_fold (irange &r, tree type,
                              const wide_int &lh_lb, const wide_int &lh_ub,
                              const wide_int &rh_lb, const wide_int &rh_ub) const
{
  wi::overflow_type ov_lb, ov_ub;
  signop s = TYPE_SIGN (type);
  // Do whatever wide_int magic is required to do these adds in higher precision
  wide_int new_lb = wi::add (lh_lb, rh_lb, s, &ov_lb);
  wide_int new_ub = wi::add (lh_ub, rh_ub, s, &ov_ub);
  r = int_range<2> (type, new_lb, new_ub);
}

The operator needs to be registered; I've attached the skeleton for it.
You should just have to finish implementing wi_fold().

in theory :-)

--------------4VvvgwIAgGTJjcW65AcJFVNH
Content-Type: text/x-patch; charset=UTF-8; name="tam.diff"
Content-Disposition: attachment; filename="tam.diff"
Content-Transfer-Encoding: 7bit

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 5c67bce6d3a..c425c496c25 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -1730,6 +1730,29 @@ operator_minus::op2_range (irange &r, tree type,
   return fold_range (r, type, op1, lhs);
 }
 
+class operator_widen_plus : public range_operator
+{
+public:
+  virtual void wi_fold (irange &r, tree type,
+			const wide_int &lh_lb,
+			const wide_int &lh_ub,
+			const wide_int &rh_lb,
+			const wide_int &rh_ub) const;
+} op_widen_plus;
+
+void
+operator_widen_plus::wi_fold (irange &r, tree type,
+			      const wide_int &lh_lb,
+			      const wide_int &lh_ub,
+			      const wide_int &rh_lb,
+			      const wide_int &rh_ub) const
+{
+  wi::overflow_type ov_lb, ov_ub;
+  signop s = TYPE_SIGN (type);
+  wide_int new_lb = wi::add (lh_lb, rh_lb, s, &ov_lb);
+  wide_int new_ub = wi::add (lh_ub, rh_ub, s, &ov_ub);
+  r = int_range<2> (type, new_lb, new_ub);
+}
 
 class operator_pointer_diff : public range_operator
 {
@@ -4505,6 +4528,7 @@ integral_table::integral_table ()
   set (ABSU_EXPR, op_absu);
   set (NEGATE_EXPR, op_negate);
   set (ADDR_EXPR, op_addr);
+  set (WIDEN_PLUS_EXPR, op_widen_plus);
 }
 
 // Instantiate a range op table for pointer operations.
--------------4VvvgwIAgGTJjcW65AcJFVNH--
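
The two ideas discussed in the message, folding a widening add in double precision and then checking whether the result survives a cast back to the narrow type, can be modelled outside GCC. What follows is only a standalone sketch under stated assumptions: `Range`, `umax`, `widen_plus_fold`, `cast_unsigned` and `fits_narrow_type` are hypothetical names invented for illustration, not part of GCC's range-op API. Bounds are plain 64-bit integers rather than `wide_int`, only unsigned types are handled, and the cast is deliberately conservative (a range that does not fit simply becomes the full range of the target type).

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a single-pair integer range [lo, hi].  This is NOT GCC's
// irange; bounds are kept in 64 bits so sums of narrow values cannot overflow.
struct Range
{
  std::int64_t lo;
  std::int64_t hi;
};

inline bool operator== (const Range &a, const Range &b)
{
  return a.lo == b.lo && a.hi == b.hi;
}

// Largest value of an unsigned type that is BITS bits wide.
inline std::int64_t umax (unsigned bits)
{
  assert (bits > 0 && bits < 63);
  return (std::int64_t{1} << bits) - 1;
}

// Model of a widening add: both operands are unsigned BITS-bit ranges and the
// result lives in a 2*BITS-bit type, so adding the bounds never wraps.
inline Range widen_plus_fold (const Range &lh, const Range &rh, unsigned bits)
{
  assert (bits <= 31);
  assert (0 <= lh.lo && lh.lo <= lh.hi && lh.hi <= umax (bits));
  assert (0 <= rh.lo && rh.lo <= rh.hi && rh.hi <= umax (bits));
  return Range{lh.lo + rh.lo, lh.hi + rh.hi};
}

// Conservative model of casting R to an unsigned BITS-bit type: a range that
// fits is unchanged, anything else degrades to the full range of the type.
inline Range cast_unsigned (const Range &r, unsigned bits)
{
  if (r.lo >= 0 && r.hi <= umax (bits))
    return r;
  return Range{0, umax (bits)};
}

// The "cast back and forth" test: R (a range of the 2*NARROW_BITS-bit type)
// fits the narrow type exactly when narrowing and re-widening leaves it
// unchanged.
inline bool fits_narrow_type (const Range &r, unsigned narrow_bits)
{
  assert (narrow_bits <= 31);
  Range narrowed = cast_unsigned (r, narrow_bits);
  Range rewidened = cast_unsigned (narrowed, 2 * narrow_bits);
  return rewidened == r;
}
```

With the full 8-bit ranges from the message, `widen_plus_fold ({0, 255}, {0, 255}, 8)` gives `[0, 510]`, which does not survive the round trip. With both inputs limited to `[0, 19]`, as in the `if (_3 < 20 && level_15 < 20)` example, the sum is `[0, 38]` and the check succeeds.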